fix: Use PSS instead of RSS to estimate children process memory usage on Linux #1210
def _get_used_memory(memory_full_info: Any) -> int:
This internal type hint does not seem to be available. The actual type is dependent on the OS as well.
Based on the docs: Using |
Title changed: "USS instead of RSS to estimate children process memory usage" → "PSS instead of RSS to estimate children process memory usage on Linux"
Coming up with the test was really hard. The test is not nice at all, but testing the memory usage estimation is really tricky because Python is too high-level for precise memory control.
vdusek left a comment
LGTM, thanks for looking into this.
Btw, so on Windows we are giving up, since PSS isn't a concept there? Or is there a chance to use some alternative metric? If so, maybe we can open a follow-up issue?
Couple of questions 🙂
So far I can only guess. I have to run some experiments with the JS version to get some data. But if it relies on RSS only, then I think it could also overestimate used memory.
Yes, I think it could be a good safety measure to bound it like that, regardless of this change.
In Crawlee alone probably not, but my guess is that multiple Playwright processes could actually use some shared memory, which would be overestimated by RSS and probably underestimated by USS. So PSS seems to me like the best fit in our case, as it accounts for shared memory in a somewhat predictable way.
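To make the RSS / USS / PSS comparison above concrete, here is a small sketch with made-up numbers. The sizes, the process count, and the function name `summed_estimates` are purely illustrative assumptions, not measurements from Crawlee or Playwright. It assumes every process has some private memory and that all processes map the same shared region, with PSS charging each process an equal share of that region.

```python
from typing import Dict, List


def summed_estimates(private_mb: List[int], shared_mb: int) -> Dict[str, float]:
    """Sum per-process metrics for processes that all map one shared region.

    Hypothetical model: each process owns `private_mb[i]` of private memory
    and all of them map the same `shared_mb` region.
    """
    n = len(private_mb)
    return {
        # RSS counts the whole shared region once per process.
        "rss_sum": sum(p + shared_mb for p in private_mb),
        # USS ignores shared memory entirely.
        "uss_sum": sum(private_mb),
        # PSS charges each process 1/n of the shared region.
        "pss_sum": sum(p + shared_mb / n for p in private_mb),
        # Physical memory actually occupied in this model.
        "actual": sum(private_mb) + shared_mb,
    }


result = summed_estimates([100, 100, 100], shared_mb=300)
print(result)
```

In this toy model the summed RSS is twice the memory actually in use, the summed USS is half of it, and the summed PSS matches it, which is the behaviour the comment above is guessing at.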
Keep in mind that on the platform, memory usage (and pretty much all the scaling metrics) comes over websockets; we don't measure it ourselves. So it's very much possible we don't do it perfectly and nobody noticed, since on localhost we use 1/4 of the available memory by default. Also, given that memory scales with CPU, you usually run things with enough memory.
Description
Use Proportional Set Size (PSS) to estimate the memory usage of the process and all its children. This avoids overestimating used memory, which happens with Resident Set Size (RSS) because the same shared memory is counted multiple times. PSS is available only on Linux, so this improved estimation will work only there.
Add test.
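A minimal sketch of the fallback described above, not the code from this PR. It assumes the input object looks like what `psutil.Process.memory_full_info()` returns, where the `pss` field exists only on Linux and `rss` exists everywhere. The names `get_used_memory` and `estimate_total` and the stand-in objects are hypothetical, used here so the example runs without `psutil`.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def get_used_memory(memory_full_info: Any) -> int:
    """Return PSS in bytes when the platform reports it, otherwise RSS."""
    pss = getattr(memory_full_info, "pss", None)
    if pss is not None:
        return int(pss)
    # No PSS field (assumed non-Linux): fall back to RSS, which may overcount
    # memory shared between processes.
    return int(memory_full_info.rss)


def estimate_total(infos: Iterable[Any]) -> int:
    """Sum the estimate over a process and its children."""
    return sum(get_used_memory(info) for info in infos)


# Stand-ins for the per-process memory info objects (illustrative values).
linux_like = [SimpleNamespace(rss=300, pss=120), SimpleNamespace(rss=300, pss=130)]
other_os = [SimpleNamespace(rss=300), SimpleNamespace(rss=300)]

print(estimate_total(linux_like))
print(estimate_total(other_os))
```

Checking for the attribute rather than the OS name keeps the helper independent of platform detection, at the cost of accepting any object that happens to expose `pss` or `rss`.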
Issues