ConsistentHashLB: Hash based on request path/url #11554

Closed
berstend opened this issue Feb 6, 2019 · 11 comments
Labels
area/networking kind/enhancement lifecycle/automatically-closed Indicates a PR or issue that has been closed automatically.

Comments

@berstend

berstend commented Feb 6, 2019

I have a use-case where it'd be amazing to use the consistent hash-based load balancer based on the path (or URL) of the request.

Unfortunately it seems like I cannot use httpHeaderName to extract the path or request url.

Or is there a way to do this currently that I missed?

Thanks!

@murarisumit
Contributor

@berstend could you please explain the request in more detail? From what I found in the Envoy documentation (https://www.envoyproxy.io/docs/envoy/v1.9.0/intro/arch_overview/load_balancing/load_balancers), Envoy doesn't provide this kind of load-balancing functionality.

Do we need something like what HAProxy offers? See: https://www.haproxy.com/blog/haproxys-load-balancing-algorithm-for-static-content-delivery-with-varnish/

Hashing the whole URL, including the query string:

    backend bk_static
      balance uri whole
      hash-type consistent

Hashing a single query string parameter (`id` is only an example parameter name):

    backend bk_static
      balance url_param id
      hash-type consistent

Do we need to implement this kind of load balancing on Envoy's end?

@berstend
Author

@murarisumit thanks for your followup questions.

Unfortunately I'm not familiar enough with the Envoy proxy internals (yet) to know with certainty whether or not Envoy currently supports URL-based load balancing.

Let me instead lay out my use-case in more detail:

https://app.com/projectid1/
https://app.com/projectid2/
https://app.com/projectid3/

As a developer using Istio, I wish to use the URL (or path) as the ring key for load balancing/sticky sessions. Currently that doesn't seem possible.

In my scenario it's not too relevant if query strings are part of the hash or not.

My current workaround:

Add a Cloudflare Worker which will add the URL to the request headers:

addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

/**
 * Add `x-url` header to fetch request
 * @param {Request} request
 */
async function handleRequest(request) {
  const headers = new Headers(request.headers)
  headers.set('x-url', request.url)
  return fetch(request, { headers })
}

Use the x-url request header as loadBalancer key:

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: {{ $.Release.Name }}-dest-{{ $versionId }}
spec:
  host: {{ $.Release.Name }}
  trafficPolicy:
    loadBalancer:
      consistentHash:
        httpHeaderName: x-url
  subsets:
    - name: version-{{ $versionId }}
      labels:
        app: {{ $.Release.Name }}
        version: {{ $versionId }}

This works as expected (sticky sessions based on the request URL), but it depends on having Cloudflare in front of the cluster. I would love to simplify this and have Istio let me use the request URL directly.

Thanks!

@berstend
Author

berstend commented Feb 19, 2019

This envoy issue might be related: envoyproxy/envoy#2436 (found through the more specifically related envoyproxy/envoy#2334).

@stale

stale bot commented May 20, 2019

This issue has been automatically marked as stale because it has not had activity in the last 90 days. It will be closed in the next 30 days unless it is tagged "help wanted" or other activity occurs. Thank you for your contributions.

@stale stale bot added the stale label May 20, 2019
@stale

stale bot commented Jun 19, 2019

This issue has been automatically closed because it has not had activity in the last month and a half. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted". Thank you for your contributions.

@stale stale bot closed this as completed Jun 19, 2019
@rlenglet rlenglet modified the milestones: Nebulous Future, 1.2 Jul 10, 2019
@epa095

epa095 commented Oct 1, 2019

Is it possible to get this re-opened @rlenglet ?

@istio-policy-bot

🚧 This issue or pull request has been closed due to not having had activity in the last 195 days. If you feel this issue or pull request deserves attention, please reopen the issue. Please see this wiki page for more information. Thank you for your contributions.

Created by the issue and PR lifecycle manager.

@istio-policy-bot istio-policy-bot added the lifecycle/automatically-closed Indicates a PR or issue that has been closed automatically. label Nov 6, 2019
@kaiburjack

In case you want to have consistent hash-based load balancing (i.e. for "sharding" purposes) that is only dependent on the request's path (and query string), you can use a DestinationRule like the following:

spec:
  host: {{ $.Release.Name }}
  trafficPolicy:
    loadBalancer:
      consistentHash:
        httpHeaderName: ':path'

This uses the pseudo-header :path (written with the colon) that Envoy understands (also works for HTTP/1.1 requests).
Earlier, I actually added an intermediate proxy (nginx) between the clients/callers and the load-balanced/sharded destination service (in our case Varnish) where that nginx proxy would add a custom header based on the URL of the request, but that was too complicated in the end.
Then I found that for our case, at least, sharding/load-balancing on the :path pseudo-header was sufficient (no need to also make the destination 'host' part of the hash).
This does work and we use it in production right now.

@junoriosity

@kaiburjack Many thanks for your input. Could you provide some examples, so that I can see whether this would be well suited to our use case?

Specifically, I do not quite understand how to interpret :path.

@kaiburjack

@junoriosity :path is a so-called "pseudo-header" in Envoy.
See for example: https://www.envoyproxy.io/docs/envoy/latest/configuration/http/http_conn_man/headers#path
That's because it is not actually a header in an HTTP request, but is the "request URI" (in terms of the HTTP/1.1 specification) in the "request line" of an HTTP request, such as the /some/path?query=123&abc=xyz in GET /some/path?query=123&abc=xyz HTTP/1.1.
So, it will effectively include the path as well as the query string of the request URI.
You can treat it like a header name anywhere Envoy expects a header name, such as in the hash_policy of a virtual host's routes configuration (expressed in Istio via the DestinationRule's trafficPolicy.loadBalancer.consistentHash.httpHeaderName field).
So, like I said, you can use :path as described in the DestinationRule spec above for hash-based load balancing when you want to hash based on the HTTP's request URI (path + query string).
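To make concrete what ends up in :path, here is a minimal JavaScript sketch that pulls the request target out of an HTTP/1.1 request line. The helper name `requestTarget` is hypothetical and only for illustration; Envoy does this parsing itself, and this is not its implementation.

```javascript
// Hypothetical helper (not an Envoy API): returns the request target,
// i.e. the value Envoy exposes as the :path pseudo-header.
function requestTarget(requestLine) {
  const parts = requestLine.trim().split(/\s+/)
  if (parts.length !== 3) {
    throw new Error(`malformed request line: ${requestLine}`)
  }
  // parts = [method, target, protocol version]
  return parts[1]
}

const a = requestTarget('GET /some/path?query=123&abc=xyz HTTP/1.1')
const b = requestTarget('GET /some/path?query=456 HTTP/1.1')

console.log(a)       // /some/path?query=123&abc=xyz
console.log(a === b) // false: same path, different query string, different hash key
```

Because the query string is part of the value, two requests for the same path with different query strings produce different hash keys.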
Our use case was horizontal load balancing for HTTP caches: identical requests (same HTTP request line) always go to the same cache instance, so each object is fetched from the backend servers by only one cache rather than by every cache individually. This effectively implements "sharding" based on the request URI. It had the nice effect of reducing time-to-first-byte (and, in turn, the "Largest Contentful Paint" metric).
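The sharding effect can be sketched with a toy consistent-hash ring in JavaScript. Everything here is an assumption for illustration: the instance names (`varnish-0` etc.), the number of virtual nodes, and the FNV-1a hash, which was picked only because it is short. Envoy's real ring hash differs in its details.

```javascript
// FNV-1a 32-bit hash. Chosen for brevity only; not the hash Envoy uses.
function fnv1a(str) {
  let h = 0x811c9dc5
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i)
    h = Math.imul(h, 0x01000193) >>> 0
  }
  return h >>> 0
}

// Place several virtual nodes per (hypothetical) cache instance on the ring.
function buildRing(instances, replicas = 100) {
  const ring = []
  for (const name of instances) {
    for (let r = 0; r < replicas; r++) {
      ring.push({ point: fnv1a(`${name}#${r}`), name })
    }
  }
  return ring.sort((x, y) => x.point - y.point)
}

// Pick the first ring point at or after the key's hash, wrapping around.
function pick(ring, key) {
  const h = fnv1a(key)
  const hit = ring.find(node => node.point >= h)
  return (hit || ring[0]).name
}

const ring = buildRing(['varnish-0', 'varnish-1', 'varnish-2'])
const first = pick(ring, '/some/path?query=123&abc=xyz')
const second = pick(ring, '/some/path?query=123&abc=xyz')
console.log(first === second) // true: identical request URIs land on the same instance
```

With a ring like this, removing one instance only remaps the keys that instance owned; keys owned by the remaining instances stay where they were, which is what keeps the caches warm.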

@musabshak

musabshak commented Nov 29, 2024

This is super neat.

I wonder if there's a similar pseudo-header that takes only the path into account and not the query parameters. I have requests of the form /<bucketName>/<gitLFSObjecthash>?awsSigV4QueryParams.... I would like requests for the same git object to go to the same pod consistently. :path doesn't work because the AWS SigV4 pre-signed URL query params will be different every time (different signature, different timestamp, etc.), even for the same git object.

As mentioned above, I could always set up a reverse proxy to add a custom header (ignoring the query params). But wondering if there is an Envoy hack to accomplish the same.

EDIT - doesn't seem like there is, unfortunately.
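For the reverse-proxy fallback, the header value could be derived as in the JavaScript sketch below. The header name `x-hash-path`, the helper names, and the example URLs are all hypothetical; the only requirement is that the DestinationRule's httpHeaderName matches whatever header the proxy sets.

```javascript
// Hypothetical header name; must match httpHeaderName in the DestinationRule.
const HASH_HEADER = 'x-hash-path'

// Derive the hash key from the path only, dropping the query string.
function pathOnlyKey(requestUrl) {
  return new URL(requestUrl).pathname
}

// Return a copy of the headers with the path-only hash key added.
function withHashHeader(requestUrl, headers = {}) {
  return { ...headers, [HASH_HEADER]: pathOnlyKey(requestUrl) }
}

// Two pre-signed URLs for the same object with different signatures (made-up values).
const url1 = 'https://lfs.example.com/my-bucket/abc123?X-Amz-Signature=aaa&X-Amz-Date=20241129T000000Z'
const url2 = 'https://lfs.example.com/my-bucket/abc123?X-Amz-Signature=bbb&X-Amz-Date=20241129T000500Z'

console.log(pathOnlyKey(url1))                      // /my-bucket/abc123
console.log(pathOnlyKey(url1) === pathOnlyKey(url2)) // true
```

Since both URLs yield the same header value, hashing on that header would send requests for the same object to the same pod regardless of the signature parameters.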
