Aeron epoll #1637
Comments
We have discussed it a few times, but currently, no. How many subscriptions do you have? Aeron scales quite well in terms of throughput, so I'm curious where your scaling limitations come from.
Thanks for your answer. Here are a few more details about our use case: we have more subscriptions than threads available to listen to them, and the traffic on each subscription varies significantly across subscriptions and over time. We want to keep the subscriptions separate because we save them independently with Aeron Archive, which helps other downstream sharded systems. Because there's no epoll, we put threads to sleep for a fixed amount of time when there's nothing to read from Aeron, to avoid context-switching active threads with idle ones that are just no-op spinning. Given there's no way to trigger an "early wake-up" when data becomes available, we end up with very low IPC and either:
Please let me know your thoughts on this use case and how to best leverage Aeron to:
Thank you!
Scaling up the number of subscriptions depends on the QoS needed for them. Assume the typical case of some streams being latency-sensitive and some not. The streams with low-latency demands should be isolated from all other threads. No multiplexing will solve that for you; it would add latency to that path. So you would want to poll those and assign them to isolated, pinned threads, taking into account the data path from the NIC to the CPU, etc. For latency, you don't want those sharing a duty cycle with a lot of other subscriptions that don't have latency demands. Nothing really saves you from having to do this for latency-sensitive streams. It's work that has to be done.
The less latency-demanding streams (perhaps more throughput-demanding) can be combined into a single thread (or a couple) and polled round-robin or in other ratios, i.e. you don't have to poll ALL subscriptions on each duty cycle iteration. You can proportion it, for example by polling half of the subscriptions each iteration, and use a more aggressive idle scheme such as no-op/pause/yield. A sleeping idle for many subscriptions isn't normally a great idea because the sleep will do exactly what you mention.
If the system has to do a lot of other things and needs those threads, then you have to figure out how to balance the latency demands with the thread demands. In essence, if you have a set of latency-sensitive streams, place them on specific CPUs. They don't even have to be pinned, just kept away from other threads. The rest can then be balanced out.
If, on the other hand, your streams all have the same demands, then experiment. Perhaps service a proportion each time with a round-robin, or poll certain more active ones more often, with an idle that is NOT sleeping but may be yielding at most. Aeron drivers do this with publications in that there is a ratio of polls of the network to send attempts. Polls of the network are about 1 out of 4 to send attempts.
More than happy to set up a chat to talk more about this if desired.
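The proportional polling described above can be sketched roughly as follows. This is only an illustration, not Aeron code: `ProportionalPoller` and `Pollable` are hypothetical names, with `Pollable` standing in for anything that returns a fragment count when polled (in Aeron that role would be played by `Subscription.poll(FragmentHandler, int)`). The batch size and the choice of `Thread.yield()` as the idle are assumptions to tune by experiment.

```java
import java.util.ArrayList;
import java.util.List;

public class ProportionalPoller {
    /** Hypothetical stand-in for a subscription: returns the number of fragments read. */
    public interface Pollable {
        int poll(int fragmentLimit);
    }

    private final List<Pollable> sources;
    private final int batchSize; // how many sources are polled per duty cycle
    private int cursor = 0;      // index where the next duty cycle starts

    public ProportionalPoller(List<Pollable> sources, int batchSize) {
        if (sources.isEmpty() || batchSize < 1) {
            throw new IllegalArgumentException("need at least one source and batchSize >= 1");
        }
        this.sources = new ArrayList<>(sources);
        this.batchSize = Math.min(batchSize, sources.size());
    }

    /** Polls the next batch of sources in round-robin order and returns the total work done. */
    public int dutyCycle(int fragmentLimit) {
        int workCount = 0;
        for (int i = 0; i < batchSize; i++) {
            workCount += sources.get(cursor).poll(fragmentLimit);
            cursor = (cursor + 1) % sources.size();
        }
        return workCount;
    }

    /** Runs a fixed number of duty cycles, yielding (never sleeping) after an empty cycle. */
    public long run(int cycles, int fragmentLimit) {
        long total = 0;
        for (int c = 0; c < cycles; c++) {
            int work = dutyCycle(fragmentLimit);
            if (work == 0) {
                Thread.yield(); // non-sleeping idle: give up the time slice at most
            }
            total += work;
        }
        return total;
    }
}
```

With `batchSize` set to half the number of sources, each source is visited every second iteration, which matches the "half of the subscriptions each iteration" suggestion. Weighting busier streams more heavily would need a different schedule than the plain cursor used here.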
Thanks for your thoughts. Replying in line:
None of the streams are particularly latency-sensitive, but the point is to have the smallest possible latency while maximising throughput across all subscriptions. Basically, maximise the number of useful instructions retired per cycle and overall CPU usage. Ideally, when a thread is running, it's doing useful work, and no CPU should be idle when there is useful work to do. That's easy to do with an
We have no particularly latency-sensitive streams, but there are many streams (more than CPUs available) and their data rate is variable. Therefore, we don't want to isolate anything. In general, we'd rather avoid any unnecessary complexity on our end.
I understand what you mean, but no matter how clever we are with busy polling, we are still wasting cycles when there's nothing to read from a subscription. We have hundreds of subscriptions and only a dozen CPUs to handle them. Most importantly, we don't want to maintain this complex scheduler on our end. All we want is a single thread on a
That's where we are right now - experimenting. It's hard to maximise efficiency without complex IO scheduling logic like the one you're proposing, which would still not be as good and easy as an epoll. Latency is not a problem for this use case, but we want to do as much as possible as quickly as possible with what we've got. In summary, it feels like
We do see some value in having a demux API for Aeron such as
Thank you. Let me take this back to my team and discuss supporting its development. I'll be in touch.
Is there a way to get epoll-like behaviour in Aeron, where we can put one or more subscriptions in a set and yield the thread until there's data available to read in any of them? This would greatly help us scale a number of systems with high throughput loads and no particularly low latency requirements.
Thanks.
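For readers wondering what "a set of subscriptions plus a wait" might look like in the absence of a real epoll, here is a minimal user-space approximation. It is not an Aeron API and it does not block on a kernel or driver signal: it only polls every member of the set and backs off (spin, then yield, then a short park) while all of them are empty. `PollSet`, `awaitAny`, and all thresholds are hypothetical names and values chosen for illustration; each `IntSupplier` stands in for a poll call that returns a work count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.LockSupport;
import java.util.function.IntSupplier;

public class PollSet {
    private final List<IntSupplier> sources; // each returns its work count when polled
    private final int maxSpins;              // idle rounds spent busy-spinning
    private final int maxYields;             // further idle rounds spent yielding
    private final long parkNanos;            // park duration once spins and yields are used up

    public PollSet(List<IntSupplier> sources, int maxSpins, int maxYields, long parkNanos) {
        this.sources = new ArrayList<>(sources);
        this.maxSpins = maxSpins;
        this.maxYields = maxYields;
        this.parkNanos = parkNanos;
    }

    /**
     * Polls every source repeatedly until at least one reports work or
     * maxAttempts rounds have passed. Returns the work found, or 0 on give-up.
     */
    public int awaitAny(int maxAttempts) {
        int idleCount = 0;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int work = 0;
            for (IntSupplier source : sources) {
                work += source.getAsInt();
            }
            if (work > 0) {
                return work;
            }
            idle(idleCount++);
        }
        return 0;
    }

    /** Escalating backoff: spin first, then yield, then park for a bounded time. */
    private void idle(int idleCount) {
        if (idleCount < maxSpins) {
            Thread.onSpinWait();
        } else if (idleCount < maxSpins + maxYields) {
            Thread.yield();
        } else {
            LockSupport.parkNanos(parkNanos);
        }
    }
}
```

The trade-off is the one discussed in the comments: the park stage frees the CPU for other threads but nothing wakes the thread early when data arrives, so worst-case latency is bounded by `parkNanos` rather than by an event notification.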