doc/cephfs: document purge queue and its perf counters #60794
Request @ceph/cephfs for suggestions/corrections.
jenkins test docs
@dparmar18, I'll take care of all the corrections and alterations. Thanks for this contribution!
doc/cephfs/purge-queue.rst
MDS maintains a data structure known as **Purge Queue** which is responsible
for managing and executing the sequential deletion of files.
There is a Purge queue for every MDS rank. Purge queues consist of purge items
The first Purge here should be lowercased.
There's a lot of stuff that needs to be rewritten here. I'm going to make a "purge queue" PR to hash out the purge queue definition, and I will also make a grammar-and-elegance pass. This suggestion will be incorporated into that latter PR. As always, good catch.
From what I understand, @zdover23 will take care of all the grammar and touch-ups in this PR, but just FYI, we'd still need approval from @ceph/cephfs to validate the technical accuracy of the content.
@dparmar18, this is correct. If you get someone from CephFS to verify the technical accuracy of this content, I will merge this PR and raise a new PR in which I will clean up the English and make sure that the RST file is ready to be backported to release branches. @vshankar, could you assign someone to check the technical accuracy of the information added to the docs in this PR?
I will have a look at this now.
Cool.
doc/cephfs/purge-queue.rst
.. note:: Generally, the defaults are adequate for most clusters. However, in
case of huge clusters, if the need arises, values might be tuned to
4-5 times of the default value as a starting point and further
4x-5x for all the above configurations?
You should also provide a sample configuration.
> 4x-5x for all the above configurations?
Users can start with the most basic change, i.e. setting filer_max_purge_ops to 40-50, which should work in most cases. If it doesn't, the other three options mentioned above can be tuned to 4x-5x their defaults.
> You should also provide a sample configuration.
I can, but it's usually just one or two config value changes (as I mentioned above). Do you want me to add an example of how it's done to the doc?
Yeah, I think so. Providing a sample would save users from second-guessing. The configs may not be perfect for their cluster, but a sample gives them an understanding of what to change.
> Yeh, I think so. Providing a sample would help users not to second guess. The configs may not be perfect for their cluster, but it provides an understanding of what all to change.
Added some examples, PTAL.
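For readers following this thread, the 4x-5x starting point can be sketched as a small helper that prints candidate `ceph config set` commands. This is only an illustration, not the example that was added to the doc: the default of 10 for filer_max_purge_ops is inferred from the "40-50" figure quoted above, the `mds` target is an assumption, and `scaled_commands` is a hypothetical name.

```python
# Illustrative sketch only. The default below is an assumption inferred from
# the thread ("40-50" being 4x-5x of the default); check it on your cluster.
ASSUMED_DEFAULTS = {"filer_max_purge_ops": 10}


def scaled_commands(defaults, factor, who="mds"):
    """Build `ceph config set` command strings with each default scaled by factor.

    `who` is the config target; "mds" is an assumption made for this sketch.
    """
    return [
        f"ceph config set {who} {name} {value * factor}"
        for name, value in sorted(defaults.items())
    ]


if __name__ == "__main__":
    # Print the 4x and 5x starting points discussed in the review.
    for factor in (4, 5):
        for command in scaled_commands(ASSUMED_DEFAULTS, factor):
            print(command)
```

The helper only builds strings; it does not touch a cluster. Adding the other options to the dictionary, with their real defaults, would extend it to the fallback case described above.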
doc/cephfs/purge-queue.rst
When a client requests deletion of a directory (say ``rm -rf``):

- MDS queues the files and subdirectories (purge items) from journal in the
So, there is a purge queue journal (pq). It's a bit unclear which journal is being referred to here.
The journal here comes from osdc/Journaler.h. Since this is on the MDS side, would it be correct to call it the MDS journal?
So, there is the MDS metadata journal (mdlog) and the MDS purge queue journal (pq). Both of course use the osdc/Journaler.h class, but my question here is which of these two journals is being referred to.
It's the purge queue journal.
Lines 129 to 131 in 5061b31
Right. So, let's mention that explicitly in the doc.
@vshankar, I'll get this in the document when I get back to my office. It'll be in today.
At ease @zdover23. I have requested one small update from @dparmar18 and then it's good to go 👍
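As an aside for readers of this thread, the behaviour described in the quoted doc text (one purge queue per MDS rank, purge items deleted sequentially) can be pictured with a toy model. This is not Ceph code: `ToyPurgeQueue`, its methods, and the `max_items` throttle are hypothetical names made up for illustration, and the model ignores journaling entirely.

```python
from collections import deque


class ToyPurgeQueue:
    """Toy stand-in for one MDS rank's purge queue (not Ceph code)."""

    def __init__(self, rank):
        self.rank = rank
        self._items = deque()  # purge items, oldest first

    def enqueue(self, purge_item):
        # Record a file or subdirectory that is waiting to be deleted.
        self._items.append(purge_item)

    def execute(self, max_items):
        # Purge up to max_items entries in the order they were queued.
        purged = []
        while self._items and len(purged) < max_items:
            purged.append(self._items.popleft())
        return purged


# One queue per rank, as the quoted text states.
queues = {rank: ToyPurgeQueue(rank) for rank in (0, 1)}
for path in ("dir/a", "dir/b", "dir/sub"):
    queues[0].enqueue(path)

# Only two items are purged in this pass; the third waits for the next one.
first_batch = queues[0].execute(max_items=2)
```

The `max_items` argument loosely mirrors the idea of a throttle on purge work; it does not correspond to any specific Ceph option.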
@zdover23 made some minor changes and fixes while also adding some more content I felt was necessary. PTAL https://github.com/ceph/ceph/compare/ddd994291dba374fe122a7d59dfc3835fe665356..65bd5c2f0e08c68f18e02e0c21466bdd8fa4c8ee
@dparmar18 I'm on it.
Oh shoot, while resolving conflicts, I messed up this part:
Removed redundant entries. https://github.com/ceph/ceph/compare/65bd5c2f0e08c68f18e02e0c21466bdd8fa4c8ee..611478de3af6a38c593949cc3101d14a3e549311
@dparmar18 Just add the information that Venky asked for, and I'll make sure that this builds correctly. Let me know when it's ready for me to fix up.
Fixes: https://tracker.ceph.com/issues/68571
Signed-off-by: Dhairya Parmar <[email protected]>
vshankar left a comment
Good start documenting this. Nice work @dparmar18
@zdover23 FYI
@vshankar, which release branches should I backport this change to?
Updated the tracker - quincy, reef, squid
Fixes: https://tracker.ceph.com/issues/68571