Judging from the system monitor, `metastore -s` uses only one thread. I'm naively assuming that at some point it has to walk a file and directory tree and visit its nodes recursively or iteratively. I propose putting the file paths of a directory into queues in groups of <= 100, from which `n` threads can poll and produce the output for each file, which can then be written into a large buffer (to avoid an I/O bottleneck). If the output needs to be ordered, every thread needs a sequence number for its group, and the others must not write until the one holding the lowest number has finished. Since the threads do nothing but `stat` calls, the load on each thread should be fairly equal.
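To make the proposal concrete, here is a minimal sketch in Python rather than metastore's own C. It is only an illustration of the idea, not how metastore works: `walk_paths`, `batches`, `stat_batch`, `collect_metadata`, and the tab-separated line format are all made up for this example. A thread pool stands in for the queue plus `n` polling threads, and the ordering requirement is met by consuming results in submission order instead of passing explicit sequence numbers around.

```python
import io
import os
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

BATCH_SIZE = 100  # the "groups of <= 100" from the proposal


def walk_paths(root):
    """Yield root, then every entry below it, in a deterministic order."""
    yield root
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sort in place so os.walk descends in a stable order
        for name in sorted(dirnames + filenames):
            yield os.path.join(dirpath, name)


def batches(iterable, size):
    """Split an iterable into lists of at most `size` items."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def stat_batch(paths):
    """Worker: lstat every path in one batch and return the formatted text.

    The line format is hypothetical, not metastore's on-disk format.
    A path that disappears between the walk and the lstat call raises here.
    """
    lines = []
    for path in paths:
        st = os.lstat(path)
        lines.append(
            f"{path}\t{st.st_mode:o}\t{st.st_uid}\t{st.st_gid}\t{st.st_mtime_ns}\n"
        )
    return "".join(lines)


def collect_metadata(root, workers=4):
    """Stat the tree under root with `workers` threads; return ordered text."""
    buffer = io.StringIO()  # the "large buffer" that is written out once
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in submission order, so batch k is
        # appended only after batches 0..k-1, whichever thread finishes first.
        for chunk in pool.map(stat_batch, batches(walk_paths(root), BATCH_SIZE)):
            buffer.write(chunk)
    return buffer.getvalue()


if __name__ == "__main__":
    print(collect_metadata(".", workers=4), end="")
```

In this sketch the tree walk itself is still single-threaded; only the `stat` calls and the formatting are spread over the workers. Whether that actually helps depends on whether the run is limited by system calls or by the disk, so it would need to be measured.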
Good idea. It's not something for upcoming v1.1, though, which is mostly meant to constitute what metastore was so far - no behavioral or bigger changes, only simple fixes.