extract subset load balancer metadata matches from request #2436
Comments
@zuercher Do you have a preference between the two options? Either would work for my use case but option 1 is probably a better fit. This option also feels a little more consistent with options like
In my use case, I'm looking to inject the necessary headers used for subset load balancing via a lua filter so I'd probably just remove any headers from the initial request that I use internally for routing to prevent probing. |
I think the choice is down to performance. There are two parts:
|
Since the number of times regex matches are needed is somewhat dependent on the use case, what about a third option where a new HTTP filter is introduced that enables the addition of new headers based on regex processing of existing headers? Then the metadata processing in the router doesn't need to do any regex - it simply pulls the metadata as-is from a configurable set of headers. Loosely based on the example from #2334, the config might look something like:
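[The inline example was lost in formatting; purely as a hypothetical sketch of that idea, with the filter name, rule fields, and the route-level header-to-metadata mapping all invented for illustration, the two pieces might look roughly like:]

"http_filters": [
  {
    "name": "header_rewrite",                        --> hypothetical filter
    "config": {
      "rules": [
        {
          "header": ":path",
          "regex": "^/api/(\\d+\\.\\d+)/.*$",
          "new_header": "x-api-version",
          "value": "\\1"
        }
      ]
    }
  }
]

"route": {
  "cluster": "service",
  "metadata_match_headers": [                        --> hypothetical router field
    { "header": "x-api-version", "metadata_namespace": "envoy.lb", "key": "version" }
  ]
}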
We do the regex processing once in the new filter, and then the metadata step just grabs the header values directly without any need for regex. If header matching is required for route selection, you could eliminate the need for a duplicate regex there too and instead have the route match on the header added by the new filter. I think this approach potentially gives the flexibility to avoid duplicate regex matches in all use cases. The new filter isn't even strictly necessary because it could be handled by the lua filter, although performance would likely be better if handled in a C++ filter. |
@ufodone I think this approach makes sense. BTW, when rewriting header values, where do $1 and $2 come from?
So in the above example, a cluster is selected based on |
The $1, $2 refer to the matched sub-expressions from the regex. I pulled the example directly from the other ticket. It might make more sense to use \1 and \2 to be consistent with typical regex syntax? I hadn't considered using it for that. The side-effect of this approach is that these extra headers will be passed to the upstream host. That probably isn't generally an issue, but in many cases this is really an internal detail of Envoy and it might be preferred not to leak that info upstream. I wonder if it would be worth having a way to add "private" headers which are used internally but not propagated upstream? Or alternately have a different way of passing info between filters and the router? |
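[To make the capture-group substitution concrete: a hypothetical rewrite rule (field names invented for illustration) could turn matched path segments into a header value using \1 / \2 references:]

{
  "header": ":path",
  "regex": "^/(v\\d+)/([^/]+)/.*$",
  "new_header": "x-service-version",
  "value": "\\2.\\1"                                 --> e.g. /v2/billing/... yields "billing.v2"
}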
Thanks for taking the time to think it through. In general, this looks like a good approach to me. One odd bit is that the |
@ufodone I think one way to not pass these header values to the upstream host may be naming them under a different namespace like |
I knew there were namespaces for metadata but wasn't aware that headers could also be namespaced such that the router could remove them. But if that's the case, it seems like it would do the trick. |
@zuercher @ufodone what do you think of something like #3254 as a workaround? This would at least — for now — allow us to do this from a filter. And even with regex support, some more advanced use cases would need to be able to do this from a filter. So I am curious about your thoughts on using |
@rgs1 I'll have to look into the details a bit more, but at first glance I think I could use it to fulfill my use case. I'd originally wanted to avoid writing a custom filter so that a vanilla envoy executable could be used, but that's probably less important now as the deployment of such instances will be more contained than I'd originally thought. I'd been planning on using a LUA filter so I suppose that it would be fairly easy to expose |
I am in need of this feature as part of our envoy migration from the v1 to the v2 API. In v1 we were routing based on cluster_header. Do we have any timeline for when this fix will be available? |
@sahanaha cluster header is supported in v2 here: https://github.com/envoyproxy/envoy/blob/master/api/envoy/api/v2/route/route.proto#L326 |
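[For reference, cluster_header selects the upstream cluster named by a request header; a minimal v2 route using it might look like the following, where the x-cluster header name is just an example:]

"routes": [
  {
    "match": { "prefix": "/" },
    "route": { "cluster_header": "x-cluster" }
  }
]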
@mattklein123
I tried the above configuration but it did not work. |
@sahanaha sorry do you just need to match on the path header? Can you describe what you are trying to accomplish exactly? |
Yes, the idea here is to dynamically match the :path value in the route metadata against the host metadata. My host metadata from the EDS response would be something like the below, so that I just define a single metadata_match in the route configuration. As you can see, the metadata value for url is different for each host, but I want to configure a single route match criterion where the url value is dynamic. |
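[The inline route and EDS snippets were lost in formatting. A hypothetical reconstruction of their shape, with placeholder values, is below; note that with a static metadata_match the url value is fixed per route, which is exactly the limitation being described:]

"route": {
  "cluster": "versioned-cluster",
  "metadata_match": {
    "filter_metadata": {
      "envoy.lb": { "url": "/v3/Mllib-try1" }        --> static, one fixed value per route
    }
  }
}

and the host metadata on each LbEndpoint in the EDS response:

"metadata": {
  "filter_metadata": {
    "envoy.lb": { "url": "/v3/Mllib-try2" }          --> differs per host
  }
}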
@sahanaha we achieve this with a filter, by doing something like:
However, it would be nice to be able to express this via config. |
@rgs1 Do we have any ETA for achieving dynamic metadata matching via config? |
I have 2 questions. #1: With the above filter example you have given, what would my route metadata_match configuration look like? Should I remove the metadata_match config from the route? #2: Can we achieve setting a dynamic metadata value with the LUA filter? I was just exploring the LUA filter and I see that we can extract the ":path" value from the header. Is there any way to set the :path value extracted in the LUA filter into the metadata_match? Since I am able to extract the :path value in an inline function, is there a way that I can assign this value back to the metadata match? Please share any config example if you have done it this way. |
@sahanaha

// Expose requestInfo in Decoder/Encoder callbacks
// Add to lua exported functions
// Implement the lua call with something like
const char* key = luaL_checkstring(state, 2);
const char* value = luaL_checkstring(state, 3); // assuming the value is passed as the next Lua argument
ENVOY_LOG(info, "Dynamic metadata key={} value={} ", key, value);
auto keyval = MessageUtil::keyValueStruct(key, value);
// ... then hand keyval to the exposed requestInfo (e.g. via its setDynamicMetadata()) before returning
return 0;

You then can just add the corresponding line to your LUA script. Hope that helps. I have dreams of contributing something like that back, but time is lacking, and we haven't decided if we're going this route or not. |
@moss94 stealing your idea, I have a WIP in here: dio@293fcac. I put the API as follows:

function envoy_on_request(request_handle)
  request_handle:requestInfo():dynamicMetadata():set("envoy.lb", "foo", "bar")
  request_handle:requestInfo():dynamicMetadata():get("envoy.lb")["foo"]
end

As exhibited in the test file here. Not sure if we should default to the "envoy.lb" key whenever possible. |
I modified the lua_filter.cc/h files as you suggested…

file lua_filter.h
// added the following line in FilterCallbacks
// added to lua exported functions
{"dynamicMetadata", static_luaDynamicMetadata}};
// declared the following for luaDynamicMetadata
// added the following in DecoderCallbacks and EncoderCallbacks

file lua_filter.cc
// contains the implementation of luaDynamicMetadata
int StreamHandleWrapper::luaDynamicMetadata(lua_State* state) {
  ASSERT(state_ == State::Running);

I am getting the following linking error when I build envoy with ./ci/run_envoy_docker.sh './ci/do_ci.sh bazel.dev':

ERROR: /source/source/exe/BUILD:21:1: Linking of rule '//source/exe:envoy-static' failed (Exit 1): envoy_cc_wrapper failed: error executing command

Is there anything else I am missing here? I am new to the envoy code, kindly help! |
@sahanaha |
FYI: @ggreenway and I are currently working on implementing this in an HTTP filter, by utilizing the changes from #3254. We're planning on supporting regex matching a header and constructing metadata from the regex capture groups. |
@rgs1 That would be great if you have something built already. Here's what we were thinking roughly, for configuration:
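[The snippet itself was lost in formatting. Purely as a hypothetical illustration of the general shape being discussed, a rule that regex-matches a header and builds metadata entries from the capture groups, with all field names invented, it might look roughly like:]

"request_rules": [
  {
    "header": "x-service",
    "regex": "^(\\w+)-(canary|prod)$",
    "on_match": {
      "metadata_namespace": "envoy.lb",
      "entries": [
        { "key": "service", "value": "\\1" },
        { "key": "stage", "value": "\\2" }
      ]
    }
  }
]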
|
@ggreenway we are not that far away then. This is what ours looks like currently:
A couple of notes:
Let me know what you think of this approach. I'll try to push a clean branch today. |
@bmetzdorf @ggreenway this is what we are using: #3633, lmk what you think — thanks! |
Add support for extracting dynamic metadata from requests. This can then be used as static metadata would be used (e.g.: for subset load balancer metadata matches, logging, etc). Risk Level: Low Testing: unit-test Docs Changes: Basic docs, more later. Release Notes: N/A Signed-off-by: Raul Gutierrez Segales <[email protected]>
@moss94 |
Add support for extracting dynamic metadata from requests. This can then be used as static metadata would be used (e.g.: for subset load balancer metadata matches, logging, etc). Risk Level: Low Testing: unit-test Docs Changes: Basic docs. Release Notes: N/A Signed-off-by: Raul Gutierrez Segales <[email protected]>
@rgs1
The document looks good! Thanks!
I was just trying to map it with my requirement of ":path" value matching.
Please check if the below configuration is correct
{
"http_filters": [
{
"name": "envoy.filters.http.header_to_metadata",
"config": {
"request_rules": [
{
"header": ":path", —> I guess the value
of ":path" will be extracted here
"on_header_present": {
"metadata_namespace": "envoy.lb",
"key": "url",
"type": "STRING"
},
"remove": false
}
]
}
}
]
}
The corresponding upstream cluster configuration will be ..
{
"clusters": [
{
"name": "versioned-cluster",
"type": "EDS",
"lb_policy": "ROUND_ROBIN",
"lb_subset_config": {
"fallback_policy": "DEFAULT_SUBSET",
"default_subset": {
"fallback" : "true" —> fallback to
default subset with key "fallback" and value "true"
},
"subset_selectors": [
{
"keys": [
"url" —> the subsets will be formed
based on the ":path" value extracted in http_filter
]
}
]
}
}
]
}
Thanks
Sahana
On Thu, Jun 21, 2018 at 2:19 AM, Raúl Gutiérrez Segalés wrote:
@sahanaha fyi, here are the draft docs for the header-to-metadata filter (#3633), lmk what you think. thanks!
|
@sahanaha yeah, that should work. LMK if otherwise. |
Hi, hope I can ask one more question... I have a scenario where the same IP address is part of multiple subsets based on metadata match criteria. My EDS response has an "lb_endpoints" list like the one sketched below. I was hoping to see 2 subsets created, one for "/v3/Mllib-try1" and another for "/v3/Mllib-try2". However, in the envoy trace I see that only one subset was created, for url="/v3/Mllib-try1", and the second one was eliminated (probably because it has the same IP as the first one). As per the below document, the same host can be part of multiple subsets (the e7 host is part of 3 subsets): https://github.com/envoyproxy/envoy/blob/master/source/docs/subset_load_balancer.md Is there any way that we can still create subsets when the IP address is duplicated? |
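[The EDS snippet did not survive the formatting; reconstructed from the description, with a placeholder address and port, its shape would have been roughly the following. Note that both entries share the same address and port:]

"lb_endpoints": [
  {
    "endpoint": { "address": { "socket_address": { "address": "10.0.0.1", "port_value": 8080 } } },
    "metadata": { "filter_metadata": { "envoy.lb": { "url": "/v3/Mllib-try1" } } }
  },
  {
    "endpoint": { "address": { "socket_address": { "address": "10.0.0.1", "port_value": 8080 } } },
    "metadata": { "filter_metadata": { "envoy.lb": { "url": "/v3/Mllib-try2" } } }
  }
]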
I just pulled the latest master branch today and tested my scenario. I still don't see a subset being created for url=/v3/Mllib-try2. Please note that url=/v3/Mllib-try2 has the same IP address as url=/v3/Mllib-try1. Is there anything I am missing here? A snippet of the envoy trace is below: [2018-07-14 17:59:06.831][302][debug][upstream] source/common/upstream/subset_lb.cc:263] subset lb: creating load balancer for url="/v3/Mllib-try1" The entry for /v3/Mllib-try2 is missing. |
@sahanaha you've got the same IP and port... that won't work, since they are both the same endpoint. Or did you just forget to update the port in your example? |
@rgs1
In my scenario I do end up getting same IP and Port in multiple subsets. Is
there any workaround solution for this problem ?
|
@sahanaha so an endpoint (ip+port) can be in multiple subsets, but the endpoint should only appear once in EDS (with all the corresponding metadata that would cause it to be in multiple subsets) — does that make sense? |
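[A minimal sketch of what is being described, loosely following the e7 example from the subset_load_balancer.md doc linked above: the endpoint is listed once, and it lands in several subsets because several selectors apply to its metadata. Keys and values are illustrative:]

"lb_subset_config": {
  "subset_selectors": [
    { "keys": ["version"] },
    { "keys": ["stage"] },
    { "keys": ["version", "stage"] }
  ]
}

"lb_endpoints": [
  {
    "endpoint": { "address": { "socket_address": { "address": "10.0.0.1", "port_value": 8080 } } },
    "metadata": {
      "filter_metadata": {
        "envoy.lb": { "version": "1.0", "stage": "prod" }   --> one entry, member of the version, stage, and version+stage subsets
      }
    }
  }
]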
Yes, it makes sense, Thanks!. |
Hi, I still have some issues fitting our requirement into the v2 envoy. Let me just start with some background on our requirement… Let's assume I have 3 endpoints in a cluster (ip:port) and multiple URLs (you can think of a URL as some data/file which resides inside the mentioned ip:port) assigned to each endpoint. The number of URLs assigned to each endpoint grows during runtime, and the same URL can be part of multiple endpoints (we duplicate it in multiple endpoints as a backup). As shown in the diagram, Endpoint 1 contains URL1 and URL2, and URL2 is also part of Endpoint 2. When a user sends a request to envoy, we will have URL1 or URL2 … or URLn as part of the http request. We fetch this information via dynamic metadata matching and then route the request to the corresponding endpoint which contains the respective URL. The issue here is that the number of URLs grows during runtime and is not fixed, hence we cannot configure the exact metadata key names in subset_selectors. We could think of 2 possible ways of solving this:

Approach 1: carry the URLs as metadata keys on each endpoint (Endpoint 1 { … }, and similarly for the others). However, in this approach we don't know how many keys get generated during runtime. This could be 10,000, which would be a huge number to statically configure in the envoy subset_selector config. Also, we might have to put a threshold on how far the number of keys can grow, but then we would be losing the URLs which cross beyond this threshold.

Approach 2: create a separate cluster per URL. In this approach we can solve the issue of having a duplicate ip:port (as the same ip:port can appear in multiple clusters), however the main disadvantage is that the number of clusters grows with the number of URLs, i.e. if there are 10,000 URLs we will end up creating 10,000 cluster entries.

Do you have any suggestions to overcome this situation? What is the best approach we can take in v2 envoy to fit our requirement? Any help on this will be greatly appreciated. |
Hi team, is there any update on the question about my requirement which I posted above? |
@rgs1 hi, in a multi-tenant scenario, instances of a service may have different labels. We could write route configuration for every instance to match on a header, however that config is troublesome because there are too many tenants, the matching is linear rather than an LB rule, and we would have to update the config across every instance's life-cycle. |
Belatedly closing this because I think the header-to-metadata filter fulfilled the original feature request.
Per #2334, the idea is to allow metadata matches (used to select subsets via the subset load balancer) to be computed from request data. Specifically, the intent is to match request header values against a regular expression and allow matched groups to be used as metadata values.
Configuration
One question is how to encode this configuration. Two proposals have been made.
1. Extend the metadata_match field of RouteAction.
2. Use the regex field of RouteMatch and introduce an additional field to allow mapping matched groups to metadata.

Option 1 allows headers to be optionally matched and converted into match criteria for the purposes of load balancing. Several optional headers could be defined and the resulting match criteria (if any) applied to routing. This option may cause the same header to be matched against multiple regular expressions (one to match the route, and one to extract metadata values).
Option 2 requires producing a route for each valid combination of headers. N optional, independent headers that produce match criteria would require 2^N routes. For a given route regexes are only matched once.
In either case, care will have to be taken to make sure the match criteria delivered to the subset load balancer are correctly sorted.
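[Neither option was ever specified in detail here, and the feature ultimately landed as the header-to-metadata filter discussed earlier in the thread. Purely as a hypothetical sketch of option 1, metadata_match might have been extended with header-extraction rules along these lines (all field names invented):]

"route": {
  "cluster": "service",
  "metadata_match": {
    "filter_metadata": {
      "envoy.lb": { "stage": "prod" }
    },
    "header_extractions": [                          --> hypothetical extension
      {
        "header": "x-api-version",
        "regex": "^(\\d+)\\.(\\d+)$",
        "key": "version",
        "value": "\\1.\\2"
      }
    ]
  }
}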
Security Considerations
My analysis is that any configuration using dynamically generated match criteria for the subset load balancer can be written using static match criteria, given a sufficient number of routes. There may be a need to limit the length of header values or of extracted match criteria to avoid large memory allocations in the request path (existing header limits may be sufficient). Allowing arbitrary matches might allow an attacker to craft special requests to probe for metadata matches that were not meant to be exposed (e.g. internal-use versions, etc.).