Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

extract subset load balancer metadata matches from request #2436

Closed
zuercher opened this issue Jan 23, 2018 · 44 comments
Closed

extract subset load balancer metadata matches from request #2436

zuercher opened this issue Jan 23, 2018 · 44 comments
Labels
area/load balancing enhancement Feature requests. Not bugs or questions. help wanted Needs help!

Comments

@zuercher
Copy link
Member

Per #2334, the idea is to allow metadata matches (used to select subsets via the subset load balancer) to be computed from request data. Specifically, the intent is to match request header values against a regular expression and allow matched groups to be used as metadata values.

Configuration
One question is how to encode this configuration. Two proposals have been made.

  1. Encode the regexes and metadata mappings in a structure parallel to the existing metadata_match field of RouteAction.
  2. Use the regexes already encoded in the regex field of RouteMatch and introduce an additional field to allow mapping matched groups to metadata.

Option 1 allows headers to be optionally matched and converted into match criteria for the purposes of load balancing. Several optional headers could be defined and the resulting match criteria (if any) applied to routing. This option may cause the same header to be matched against multiple regular expressions (one to match the route, and one to extract metadata values).

Option 2 requires producing a route for each valid combination of headers. N optional, independent headers that produce match criteria would require 2^N routes. For a given route regexes are only matched once.

In either case, care will have to be taken to make sure the match criteria delivered to the subset load balancer are correctly sorted.

Security Considerations
My analysis is that any configuration using dynamically generated match criteria for the subset load balancer can be written using static match criteria, given a sufficient number of routes. There may be need to limit the length of header values or extract match criteria to avoid large memory allocations in the request path (existing header limits may be sufficient). Allowing arbitrary matches might allow an attacker to craft special requests to probe for metadata matches that were not meant to be exposed (e.g. internal-use versions, etc.)

@ufodone
Copy link

ufodone commented Jan 26, 2018

@zuercher Do you have a preference between the two options? Either would work for my use case but option 1 is probably a better fit. This option also feels a little more consistent with options like cluster_header in RouteAction since both are driving upstream node selection based on headers (one to choose the cluster and the other to select a subset within the cluster).

Allowing arbitrary matches might allow an attacker to craft special requests to probe for metadata matches that were not meant to be exposed

In my use case, I'm looking to inject the necessary headers used for subset load balancing via a lua filter so I'd probably just remove any headers from the initial request that I use internally for routing to prevent probing.

@zuercher
Copy link
Member Author

I think the choice is down to performance. There are two parts:

  1. Matching. I thought option 2 would avoid re-matching regexes. But in the case where you have to produce 2^N routes for N optional headers, you end up re-evaluating regexes multiple times across routes. For the case where there's a single route has a required header (that's not repeated in another route), option 2 does have the advantage of only processing the regex once.

  2. Generating the match criteria. As long as you're able insert the metadata values as you generate them (as opposed to doing sorting pass afterwards) I think either is ok. Option 2 might make this slightly easer, but I don't think it's faster per se.

@ufodone
Copy link

ufodone commented Jan 26, 2018

Since the number of times regex matches are needed is somewhat dependent on the use case, what about a third option where a new http filter is introduced that enables the addition of new headers based on regex processing of existing headers and then the metadata processing in the router doesn't need to do any regex - it simply pulls the metadata as-is from a configurable set of headers.

Loosly based on the example from #2334, the config might look something like:

"route_config" : {
  "name" : "local_route",
  "virtual_hosts" : [
	{
	  "name" : "local_service",
	  "domains" : ["*"],
	  "routes": [
		{
		  "match" : { "prefix" : "/" },
		  "route" : {
			  "cluster" : "foo",
              "header_metadata_match" : {
                  "filter_metadata" : {
                      "envoy.lb" : {
                          "env" : "x-foo-env-name",
                          "loc" : "x-foo-location",
                          "stage" : "x-foo-stage"
                      }
                  }
              }
		  }
		}
	  ]
	}
  ]
},
"http_filters" : [
  {
	"name" : "envoy.header_rewrite",
	"config" : {
        "rewrite_rules" : [
            {
                "name" : ":authority",
                "match" : "([a-z]*)-[a-z]*)-foo-service",
                "add" : {
                    "x-foo-env-name" : "$1",
                    "x-foo-location" : "$2"
                }
            },
            {
                "name" : "x-do-something",
                "match" : "stage=([a-z]*)",
                "add" : {
                    "x-foo-stage" : "$1"
                }
            }
        ]
	}
  },
  {
	"name" : "envoy.router"
  }
]

We do the regex processing once in the new filter and then the metadata is just grabbing the header values directly without any need for regex. If header matching is required for route selection, you could eliminate the need to have a duplicate regex and instead have the match section reference a header you processed in the new filter. Potentially the header match could be expanded to support matching strictly based on presence of a header rather than matching a specific value to avoid possibly needing to add an extra header simply for matching.

I think this approach potentially gives the flexibility to avoid duplicate regex matches in all use cases.

The new filter isn't event strictly necessary because it could be handled by the lua filter although performance would likely be better if handled in a C++ filter.

@ramaraochavali
Copy link
Contributor

@ufodone I think this approach makes sense. BTW, when rewriting header values, from where does $1 and $2 come from?
Thinking further, this can be used to select a cluster also (not just hosts with in the cluster) dynamically. Let us say for example, we want to route requests based on cluster_header and cluster_header value needs to be derived from :authority , we could write a rule like the following

"route_config" : {
  "name" : "local_route",
  "virtual_hosts" : [
	{
	  "name" : "local_service",
	  "domains" : ["*"],
	  "routes": [
		{
		  "match" : { "prefix" : "/" },
		  "route" : {
			  "cluster_header" : "foo_header",
		  }
		}
	  ]
	}
  ]
},
"http_filters" : [
  {
	"name" : "envoy.header_rewrite",
	"config" : {
        "rewrite_rules" : [
            {
                "name" : ":authority",
                "match" : "([a-z]*)-[a-z]*)-foo-service",
                "add" : {
                    "foo_header" : "$1" // Here foo_header can be derived from :authority header
                }
            }
        ]
	}
  },
  {
	"name" : "envoy.router"
  }
]

So in the above example, a cluster is selected based on foo_header attribute and foo_header attribute is computed based on ":authority" header, added dynamically by this new filter.
What do you think?

@ufodone
Copy link

ufodone commented Jan 29, 2018

The $1, $2 refer to the matched sub-expressions from the regex. I pulled the example directly from the other ticket. It might make more sense to use \1 and \2 to be consistent with typical regex syntax?

I hadn't considered using this for cluster_header as well but it makes sense. There are probably other use cases where the same pattern could apply.

The side-effect of this approach is that these extra headers will be passed to the upstream host. That probably isn't generally an issue but in many cases this is really an internal detail of Envoy and it might be preferred not to leak that info upstream. I wonder if it would be worth having a way to add "private" headers which are used internally but not propagated upstream? Or alternately have a different way of passing info between filters and the router?

@zuercher
Copy link
Member Author

Thanks for taking the time to think it though. In general, this looks like a good approach to me.

One odd bit is that the Metadata message in the API contains a map of strings to protobuf Struct messages but in the case of the header_metadata_match field, the values can really only be strings (header names). I don't think there's necessarily any better type to use, though.

@ramaraochavali
Copy link
Contributor

@ufodone I think one way to not pass these header values to upstream host is may be naming them with a different name space like envoy.router.foo_header so that router filter can remove them after route decisions has been made?

@ufodone
Copy link

ufodone commented Jan 30, 2018

I knew there were namespaces for metadata but wasn't aware that headers could also be namespaced such that the router could remove them. But if that's the case, it seems like it would do the trick.

@rgs1
Copy link
Member

rgs1 commented Apr 30, 2018

@zuercher @ufodone what do you think of something like #3254 as a workaround? This would at least — for now — allow us to do this from a filter. And even with regex support, some more advanced use cases would need to be able to do this from a filter.

So I am curious on your thoughts of using RequestInfo.setDynamicMetadata() to tag a request and then construct MetadataMatchCriteria from that.

@ufodone
Copy link

ufodone commented May 17, 2018

@rgs1 I'll have to look into the details a bit more but at first glance I think I could use to to fulfill my use case. I'd originally wanted to avoid writing a custom filter so that a vanilla envoy executable could be used but that's probably less important now as the deployment of such instances will be more contained than I'd originally thought.

I'd been planning on using a LUA filter so I suppose that it would be fairly easy to expose RequestInfo.setDynamicMetadata() to the LUA filter and that would eliminate my need to build a C++ filter (although I may eventually find that I want a C++ one for performance reasons anyway).

@sahanaha
Copy link

sahanaha commented Jun 6, 2018

I am in need of this feature as part of our envoy migration from V1 to V2 API. As in V1 we were routing based on cluster_header
https://www.envoyproxy.io/docs/envoy/latest/api-v1/route_config/route.html?highlight=cluster_header
Now in V2 API we are in need to provide dynamic value for metadata match criteria.

Do we have any timeline when this fix will be available ?

@mattklein123
Copy link
Member

@sahanaha
Copy link

sahanaha commented Jun 7, 2018

@mattklein123
Thanks for the response.
Do we also support dynamic value in metadata match criteria ? In my use case I need to match ":path" in route metadata_match, something like below...

                   "route": {
                          "cluster": "some_service",
                          "metadata_match": {
                            "filter_metadata": {
                              "envoy.lb": {
                                "url":      ":path",
                                "type":   "dev"
                              }
                            }
                          }
                          }

I tried the above configuration but it did not work.

@mattklein123
Copy link
Member

@sahanaha sorry do you just need to match on the path header? Can you describe what you are trying to accomplish exactly?

@sahanaha
Copy link

@mattklein123

Yes, idea here is to dynamically match the :Path value in route metadata with the host metadata ex:

"route": {
"cluster": "some_service",
"metadata_match": {
"filter_metadata": {
"envoy.lb": {
"url": ":path", //match the dynamic value with below EDS response
"type": "dev"
}
}
}
}

and my host metadata from EDS response would be something like below..

{
   "version_info": "0",
   "resources": [
   {
     "@type": "type.googleapis.com/envoy.api.v2.ClusterLoadAssignment",
     "cluster_name": "some_service",
     "endpoints": [
     {
      "lb_endpoints": [
      {
       "endpoint": {
        "address": {
         "socket_address": {
          "address": "host_ip1",
          "port_value": 80
         }
        }
       },
       "metadata": {
        "filter_metadata": {
          "envoy.lb" : {
“url” : "/v1/heartbeat”, //match the URL with :path & route it.
           "type": "dev"
         }
        }
       }
      },
      {
       "endpoint": {
        "address": {
         "socket_address": {
          "address": "host_ip3",
          "port_value": 80
         }
        }
       },
       "metadata": {
        "filter_metadata": {
          "envoy.lb" : {
“url” : "/v1/heartbeat123”, //match the URL with :path & route it.
           "type": "test"
         }
        }
       }
      }

 ]

    }
   ]
  }
 ]
}

So that I just define a single metadata_match in route configuration, as you see metadata value for url is different for each host, however I want to configure a single route match_criteria where url value is dynamic.

@mattklein123
Copy link
Member

@sahanaha sorry don't know off the top of my head. @zuercher is this is the same issue as the original issue?

@rgs1
Copy link
Member

rgs1 commented Jun 11, 2018

@sahanaha we achieve this with a filter, by doing something like:

ProtobufWkt::Struct keyval;
ProtobufWkt::Value val;

const auto* header_entry = headers.get(header);
auto header_value = std::string(header_entry->value().getStringView());

val.set_string_value(header_value);
(*keyval.mutable_fields())["url"] = val;  // url is the key from your example

callbacks.requestInfo().setDynamicMetadata("envoy.lb", keyval);

However, it would be nice to be able to express this via config.

@sahanaha
Copy link

@rgs1
Thanks! I will work on adding the above mentioned filter to suite my requirement.

Do we have any ETA to achieve dynamic metadata matching via config ?

@sahanaha
Copy link

sahanaha commented Jun 12, 2018

@rgs1

I have 2 questions..

#1 with the above filter example you have given how does my route metadata_match configuration looks like ? Should I remove metadata_match config from route ?

#2 can we achieve setting dynamic metadata value with LUA filter ?

I was just exploring the LUA filter. I see that we can extract the ":path" value from the header. Is there any way that we can set :path value extracted from LUA filter into the metadata_match ?

"route": {
"cluster": "some_service",
"metadata_match": {
"filter_metadata": {
"envoy.lb": {
"url" : "/v1/heartbeat", // Not sure how to set "url" from the :path value in LUA filter below
"type": "dev"
}
}
}
}

"http_filters": [
{
"name": "envoy.lua",
"config": {
"inline_code": "
function envoy_on_request(request_handle)
headers = request_handle:headers()
request_handle:logTrace(headers:get(":path")) //this extracts the value which I need for "url"
end"
}
},
{
"name": "envoy.router",
"config": {}
}
],

Since I am able to extract the :path value in inline function, is there a way that I can assign this value back to metadata match, please share any config example that you have done this way.

@rgs1
Copy link
Member

rgs1 commented Jun 12, 2018

@sahanaha:

  1. that's correct, you don't need metadata_match per route anymore because the dynamic metadata that's set by the filter is matched directly against the endpoints' metadata. If you look at 2d25067, you'll see that the request's metadata overrides the route's metadata (if any).

  2. i haven't played with LUA filters, but if the requestInfo object is reachable and you can call setDynamicMetadata, you are good to go...

@moss94
Copy link

moss94 commented Jun 12, 2018

@sahanaha
Unfortunately the LUA filter included with Envoy doesn't expose access to the dynamic meta data feature that was added. Would be great if it was...
I had the same use case, and hacked this together by modifying lua_filter.cc/h (this is a rough outline).

`//Expose requestInfo in Decoder/Encoder callbacks
RequestInfo::RequestInfo& getRequestInfo() override { return callbacks->requestInfo(); }

//Add to lua exported functions
{"dynamicMetadata", static_luaDynamicMetadata}};

//Impelment the lua call with something like
int StreamHandleWrapper::luaDynamicMetadata(lua_State* state) {
ASSERT(state_ == State::Running);

const char* key = luaL_checkstring(state, 2);
const char* value = luaL_checkstring(state, 3);

ENVOY_LOG(info, "Dynamic metadata key={} value={} ", key, value);

auto keyval = MessageUtil::keyValueStruct(key, value);
callbacks_.getRequestInfo().setDynamicMetadata("envoy.lb", keyval);

return 0;
}`

You then can just add the following line to your LUA script
somevalue = request_handle:headers():get(":path")
request_handle:dynamicMetadata("somekey", "somevalue");

Hope that helps. I have dreams of contributing something like that back, but time is lacking, and we haven't decided if we're going this route or not.

@dio
Copy link
Member

dio commented Jun 13, 2018

@moss94 stealing your idea, I have a WIP in here: dio@293fcac

I put the API as follows

function envoy_on_request(request_handle)
  request_handle:requestInfo():dynamicMetadata():set("envoy.lb", "foo", "bar")
  request_handle:requestInfo():dynamicMetadata():get("envoy.lb")["foo"]
end

As exhibited in the test file here. Not sure if we should default to "envoy.lb" key whenever possible.

@sahanaha
Copy link

@moss94

I modified the lua_filter.cc/h files as you suggested…

file lua_filter.h

// added the following line in FilterCallbacks
class FilterCallbacks {

virtual RequestInfo::RequestInfo& getRequestInfo();
..
}

//added to lua exported functions
{"dynamicMetadata", static_luaDynamicMetadata}};

// declared the following for luaDynamicMetadata
DECLARE_LUA_FUNCTION(StreamHandleWrapper, luaDynamicMetadata);

// added the following in DecoderCallbacks and EncoderCallbacks
RequestInfo::RequestInfo& getRequestInfo() override { return callbacks_->requestInfo(); }

file lua_filter.cc

// contains the implementation of luaDynamicMetadata

int StreamHandleWrapper::luaDynamicMetadata(lua_State* state) {
ASSERT(state_ == State::Running);
const char* key = luaL_checkstring(state, 2);
const char* value = luaL_checkstring(state, 3);
ENVOY_LOG(info, "Dynamic metadata key={} value={} ", key, value);
auto keyval = MessageUtil::keyValueStruct(key, value);
callbacks_.getRequestInfo().setDynamicMetadata("envoy.lb", keyval);
return 0;
}

I am getting the following linking error when I build envoy with ./ci/run_envoy_docker.sh './ci/do_ci.sh bazel.dev'

ERROR: /source/source/exe/BUILD:21:1: Linking of rule '//source/exe:envoy-static' failed (Exit 1): envoy_cc_wrapper failed: error executing command
(cd /build/tmp/_bazel_bazel/436badd4919a15958fa3800a4e21074a/execroot/ci &&
exec env -
PWD=/proc/self/cwd
/build/tmp/_bazel_bazel/436badd4919a15958fa3800a4e21074a/external/local_config_cc/extra_tools/envoy_cc_wrapper -o bazel-out/k8-fastbuild/bin/source/exe/envoy-static -pthread '-Wl,--hash-style=gnu' -static-libstdc++ -static-libgcc -Wl,@bazel-out/k8-fastbuild/genfiles/external/envoy/bazel/gnu_build_id.ldscript -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread -pthread '-fuse-ld=gold' -B/usr/bin -Wl,-S -Wl,@bazel-out/k8-fastbuild/bin/source/exe/envoy-static-2.params)
/usr/bin/ld.gold: error: external/envoy/ci/prebuilt/thirdparty_build/lib/libnghttp2.a:1:1: invalid character
bazel-out/k8-fastbuild/bin/source/extensions/filters/http/lua/_objs/config/source/extensions/filters/http/lua/config.pic.o:source/extensions/filters/http/lua/config.cc:function Envoy::Extensions::HttpFilters::Lua::FilterCallbacks::FilterCallbacks(): error: undefined reference to 'vtable for Envoy::Extensions::HttpFilters::Lua::FilterCallbacks'
/usr/bin/ld.gold: the vtable symbol may be undefined because the class is missing its key function
bazel-out/k8-fastbuild/bin/source/extensions/filters/http/lua/_objs/lua_filter_lib/source/extensions/filters/http/lua/lua_filter.pic.o:source/extensions/filters/http/lua/lua_filter.cc:typeinfo for Envoy::Extensions::HttpFilters::Lua::Filter::DecoderCallbacks: error: undefined reference to 'typeinfo for Envoy::Extensions::HttpFilters::Lua::FilterCallbacks'
bazel-out/k8-fastbuild/bin/source/extensions/filters/http/lua/_objs/lua_filter_lib/source/extensions/filters/http/lua/lua_filter.pic.o:source/extensions/filters/http/lua/lua_filter.cc:typeinfo for Envoy::Extensions::HttpFilters::Lua::Filter::EncoderCallbacks: error: undefined reference to 'typeinfo for Envoy::Extensions::HttpFilters::Lua::FilterCallbacks'
bazel-out/k8-fastbuild/bin/source/common/http/http2/_objs/codec_lib/source/common/http/http2/codec_impl.pic.o:source/common/http/http2/codec_impl.cc:function Envoy::Http::Http2::ConnectionImpl::sendPendingFrames(): error: undefined reference to 'nghttp2_session_send'
bazel-out/k8-fastbuild/bin/source/common/http/http2/_objs/codec_lib/source/common/http/http2/codec_impl.pic.o:source/common/http/http2/codec_impl.cc:function Envoy::Http::Http2::ConnectionImpl::sendPendingFrames(): error: undefined reference to 'nghttp2_strerror'
bazel-out/k8-fastbuild/bin/source/common/http/http2/_objs/codec_lib/source/common/http/http2/codec_impl.pic.o:source/common/http/http2/codec_impl.cc:function Envoy::Http::Http2::ConnectionImpl::StreamImpl::submitTrailers(Envoy::Http::HeaderMap const&): error: undefined reference to 'nghttp2_submit_trailer'
bazel-out/k8-fastbuild/bin/source/common/http/http2/_objs/codec_lib/source/common/http/http2/codec_impl.pic.o:source/common/http/http2/codec_impl.cc:function Envoy::Http::Http2::ConnectionImpl::StreamImpl::readDisable(bool): error: undefined reference to 'nghttp2_session_consume'
bazel-out/k8-fastbuild/bin/source/common/http/http2/_objs/codec_lib/source/common/http/http2/codec_impl.pic.o:source/common/http/http2/codec_impl.cc:function Envoy::Http::Http2::ConnectionImpl::ClientStreamImpl::submitHeaders(std::vector<nghttp2_nv, std::allocator<nghttp2_nv> > const&, nghttp2_data_provider*): error: undefined reference to 'nghttp2_submit_request'

Is there anything else I am missing here ? I am new to envoy code , kindly help!

@moss94
Copy link

moss94 commented Jun 13, 2018

@sahanaha
Looks like I left out the interface pure virtual in FitlerCallbacks in lua_filter.h
virtual RequestInfo::RequestInfo& getRequestInfo() PURE;

@bmetzdorf
Copy link
Contributor

FYI: @ggreenway and I are currently working on implementing this in a HTTP filter, by utilizing the changes from #3254. We're planning on supporting regex matching a header and constructing metadata from the regex capture groups.

@ggreenway
Copy link
Contributor

@rgs1 That would be great if you have something built already.

Here's what we were thinking roughly, for configuration:

  message Dest {
    // Defaults to "envoy.lb"
    string namespace = 1;
    string key = 2 [(validate.rules).string.min_bytes = 1];
    uint32 value_regex_group_number = 3;
  }
  message Rule {
    string header_name = 1 [(validate.rules).string.min_bytes = 1];
    string regex = 2 [(validate.rules).string.min_bytes = 1];
    repeated Dest dest = 3 [(validate.rules).repeated .min_items = 1];
  };

  repeated Rule rules = 1 [(validate.rules).repeated .min_items = 1];

@rgs1
Copy link
Member

rgs1 commented Jun 14, 2018

@ggreenway we are not that far away then. This is what ours looks like currently:

enum ValueType {
  STRING = 0;
  NUMBER = 1;
}

message MetadataKeyValuePair {
  // The namespace. If this is empty, the filter's namespace will be used.
  string metadata_namespace = 1;

  // The key — must be present.
  string key = 2;

  // The value — may be absent.
  string value = 3;

  // The type.
  ValueType type = 4;
}

message HeaderToMetadataRule {
    // The header from which to extract a value.
    string header = 1;

    // The metadata key with which the extracted value will be paired.
    MetadataKeyValuePair on_header_present = 2;

    // If the header is not present — and this is set — insert this key/value pair instead.
    MetadataKeyValuePair on_header_missing = 3;

    // Whether or not to remove the header after it's been copied to metadata. By allowing the
    // configuration to express header removal, we easily prevent internal metadata from leaking.
    bool remove = 4;
}

message HeaderToMetadataConfig {
    repeated HeaderToMetadataRule request_rules = 1;
    repeated HeaderToMetadataRule response_rules = 2;
}

A couple of notes:

  • we don't currently use regexes, just direct matches. Easy to fix.
  • we support this both ways, for requests and responses via a StreamFilter.
  • we also support — very important for us — adding metadata when a header is missing (on_header_missing).
  • the type is important for us, because we use some of the dynamic metadata fields from our logs via %DYNAMIC_METADATA()% and in some cases we need ints (not quoted).
  • a corrollary feature given these constructs is that you can match a header but override it's value with the value set in the corresponding MetadataKeyValuePair struct. This is mostly just to make things more ergonomic and allowing reuse of the same struct for both header present and header missing.

Let me know what you think of this approach. I'll try to push a clean branch today.

@rgs1
Copy link
Member

rgs1 commented Jun 14, 2018

@bmetzdorf @ggreenway this is what we are using: #3633, lmk what you think — thanks!

rgs1 pushed a commit to rgs1/envoy that referenced this issue Jun 14, 2018
Add support for extracting dynamic metadata from requests. This can then
be used as static metadata would be used (e.g.: for subset load balancer
metadata matches, logging, etc).

Risk Level: Low

Testing: unit-test

Docs Changes:
Basic docs, more later.

Release Notes:
N/A

Signed-off-by: Raul Gutierrez Segales <[email protected]>
@sahanaha
Copy link

@moss94
Thanks for sharing your solution. This works fine for my requirement, will continue with this until the original FIX is available in envoy.

@rgs1
Copy link
Member

rgs1 commented Jun 20, 2018

@sahanaha fyi, here are the draft docs for the header-to-metadata filter (#3633), lmk what you think. thanks!

zuercher pushed a commit that referenced this issue Jun 21, 2018
Add support for extracting dynamic metadata from requests. This can then
be used as static metadata would be used (e.g.: for subset load balancer
metadata matches, logging, etc).

Risk Level: Low
Testing: unit-test
Docs Changes: Basic docs.
Release Notes: N/A

Signed-off-by: Raul Gutierrez Segales <[email protected]>
@sahanaha
Copy link

sahanaha commented Jun 22, 2018 via email

@rgs1
Copy link
Member

rgs1 commented Jun 22, 2018

@sahanaha yeah, that should work. LMK if otherwise.

@sahanaha
Copy link

Hi,

Hope I can ask one more question...

I have a scenario where same IP address is part of multiple subsets based on metadata match criteria..

EDS response:

"lb_endpoints": [
{
"endpoint": {
"address": {
"socket_address": {
"address": "172.03.88.173",
"port_value": 16503
}
}
},
"metadata": {
"filter_metadata": {
"envoy.lb" : {
"url" : "/v3/Mllib-try1"
}
}
}
},
{
"endpoint": {
"address": {
"socket_address": {
"address": "172.03.94.255",
"port_value": 16503
}
}
},
"metadata": {
"filter_metadata": {
"envoy.lb" : {
"url" : "/v3/Mllib-try1"
}
}
}
},
{
"endpoint": {
"address": {
"socket_address": {
"address": "172.03.88.173", --> IP address is same as url: "/v3/Mllib-try1"
"port_value": 16503
}
}
},
"metadata": {
"filter_metadata": {
"envoy.lb" : {
"url" : "/v3/Mllib-try2"
}
}
}
},
{
"endpoint": {
"address": {
"socket_address": {
"address": "172.03.94.255", IP address is same as url "/v3/Mllib-try1"
"port_value": 16503
}
}
},
"metadata": {
"filter_metadata": {
"envoy.lb" : {
"url" : "/v3/Mllib-try2"
}
}
}
},

I was hoping to see 2 subset creation, one for "/v3/Mllib-try1" and another for "/v3/Mllib-try2". However in the envoy trace I see that only subset was created for url= "/v3/Mllib-try1" and then second one was eliminated (probably because it has the same IP as the first one)

As per the below document same host can be part of multiple subsets (e7 host is part of 3 subsets)

https://github.com/envoyproxy/envoy/blob/master/source/docs/subset_load_balancer.md

Is there any way that we can still create subsets when the IP address is duplicated ?

@rgs1
Copy link
Member

rgs1 commented Jul 11, 2018

@sahanaha I think you'll need two recent PRs:

#3814 (merged)
#3840 (waiting for review)

@sahanaha
Copy link

@rgs1

I just pulled the latest master branch today and tested my scenario. I still don't see subset being created for url=/v3/Mllib-try2. Please note url=/v3/Mllib-try2 is having the same IP address as url=/v3/Mllib-try1. Is there anything I am missing here ?

Snippet of envoy trace is below..

[2018-07-14 17:59:06.831][302][debug][upstream] source/common/upstream/subset_lb.cc:263] subset lb: creating load balancer for url="/v3/Mllib-try1"
[2018-07-14 17:59:06.831][302][debug][upstream] source/common/upstream/subset_lb.cc:263] subset lb: creating load balancer for url="default_fallback"

/v3/Mllib-try2 is missing.

@rgs1
Copy link
Member

rgs1 commented Jul 17, 2018

@sahanaha you've got the same IP and port... that won't work, since they are both the same endpoint. Or did you just forget to update the port in your example?

@sahanaha
Copy link

sahanaha commented Jul 17, 2018 via email

@rgs1
Copy link
Member

rgs1 commented Jul 17, 2018

@sahanaha so an endpoint (ip+port) can be in multiple subsets, but the endpoint should only appear ones in EDS (with all the corresponding metadata that would cause it to be in multiple subsets) — does that make sense?

@sahanaha
Copy link

@rgs1

Yes, it makes sense, Thanks!.
Let me see how I can fit my requirement into this.

@sahanaha
Copy link

Hi,

I still have some issues fitting our requirement in the V2 envoy. Let me just start with a background of our requirement…

screen_shot_2018-07-19_at_4 11 05_pm

Lets assume I have 3 endpoints in a cluster (ip:port) and multiple URLs (you can assume URLs as some data/file which resides inside the mentioned ip:port) assigned to each endpoint.

The number of URLs assigned to each endpoint grows during runtime and the same URL can be part of multiple endpoints (we duplicate it in multiple endpoints as a backup)

As shown in the diagram Endpoint 1 contains URL1, URL2 and URL2 is also part of Endpoint 2.

When users sends request to ENVOY, we will be having URL1 or URL2 ….. URLn as part of the http request. We fetch this information via dynamic metadata matching and then route the request to corresponding Endpoint which contains the respective URL number.

The issue here is.. the number of URLs grows during runtime and is not fixed, hence we can not configure the exact metadata key names in subset_selectors.

We could think of 2 possible ways of solving this..

  1. create each metadata key for every URL in the EDS response. for example …

Endpoint 1 {
ip: "1.2.3.4"
port: 1600
}
metadata {
"key1" : "URL 1",
"key2" : "URL 2" —> URLs can grow more than 10,000 and with that our metadata key count.
}

similarly …
Endpoint 2 {
ip: "1.2.3.5"
port: 1600
}
metadata {
"key2" : "URL 2",
"key3" : "URL 3"
}

However in this approach we don’t know how many number of keys gets generated during runtime. This could be in 10,000. It will be a huge number to statically configure in the envoy config subset_selector. Also we might have to put a threshold on number of keys will can grow but we will be loosing the other URLs which crosses beyond this threshold.

  1. Second approach is to create one cluster for each URL, example if I have URL1, URL2, URL3 ……. URLn, we will be construction that many clusters each having the corresponding endpoint.

In this approach we can solve the issue of having a duplicate IP:port (as the ip:port can be same in multiple clusters) however the main disadvantage is that the cluster number grows with respect to number of URLs… ie if number of URLs are 10,000 we will end up creating 10,000 cluster entries.

Do you have any suggestions to overcome this situation ? What is the best approach which we can take in V2 envoy to fit our requirement ? Any help on this will be greatly appreciated .

@sahanaha
Copy link

Hi Team,

Is there any update on my requirement question which I posted above ?

@wansuiye
Copy link

wansuiye commented Nov 12, 2018

@rgs1 hi, in multi-tenant scene, instances of a service may have different label, we could config roure configuration for every instance to match header, however the config is troublesome because the tenat is too much and the match is linear not a LB rule, and we should update the config in ever instance's life-cycle.
so if exist a way to custom the lb rule to dynamicly match the request to select instance instead of static configs?

@zuercher
Copy link
Member Author

zuercher commented Sep 2, 2020

Belatedly closing this because I think the header-to-metadata filter fulfilled the origin feature request.

@zuercher zuercher closed this as completed Sep 2, 2020
jpsim pushed a commit that referenced this issue Nov 28, 2022
Description: Add support for Kotlin robolectric tests by making `envoy_mobile_android_test` macro use `kt_android_local_test` maco as opposed to `android_local_test` macro as the former supports both Java and Kotlin files while the latter supports Java files only. The change should be additive when it comes to functionality. Bump `rules_kotlin` version to v1.7.0-RC-3 to pick up changes from bazelbuild/rules_kotlin#810.
Risk Level: None
Testing: Existing tests using envoy_mobile_android_test macro.
Docs Changes: N/A
Release Notes: N/A

Signed-off-by: Rafal Augustyniak <[email protected]>
Signed-off-by: JP Simard <[email protected]>
jpsim pushed a commit that referenced this issue Nov 29, 2022
Description: Add support for Kotlin robolectric tests by making `envoy_mobile_android_test` macro use `kt_android_local_test` maco as opposed to `android_local_test` macro as the former supports both Java and Kotlin files while the latter supports Java files only. The change should be additive when it comes to functionality. Bump `rules_kotlin` version to v1.7.0-RC-3 to pick up changes from bazelbuild/rules_kotlin#810.
Risk Level: None
Testing: Existing tests using envoy_mobile_android_test macro.
Docs Changes: N/A
Release Notes: N/A

Signed-off-by: Rafal Augustyniak <[email protected]>
Signed-off-by: JP Simard <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/load balancing enhancement Feature requests. Not bugs or questions. help wanted Needs help!
Projects
None yet
Development

No branches or pull requests