wasteful re-compression #543
Comments
What jobsub_lite is doing is rewriting the tarfile with the permissions modified, to prevent people from putting things into cvmfs that they cannot read.
We tried that, but users complained they were
Also, decompressing and reading the whole tarfile to check the permissions on everything is not significantly faster than copying it and modifying it. Just how big is this tarfile you're sending?
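For reference, a check-only pass of the kind discussed here could be sketched as below. This is an illustration using Python's standard `tarfile` module, not jobsub_lite's actual code; the function name `unreadable_members` and the choice of "world-readable" as the criterion are assumptions. Note that it still has to decompress the entire archive to walk the member headers, which is the cost mentioned above.

```python
import stat
import tarfile


def unreadable_members(tar_path):
    """Return names of archive members that lack world-read permission.

    Hypothetical helper: the exact permission policy jobsub_lite enforces
    is not stated in this thread, so "world-readable" is an assumption.
    """
    bad = []
    # "r:*" lets tarfile detect gzip/bzip2/xz compression automatically.
    with tarfile.open(tar_path, "r:*") as tar:
        for member in tar:
            # Link permissions are not meaningful; skip them.
            if member.issym() or member.islnk():
                continue
            needed = stat.S_IROTH
            if member.isdir():
                # Directories also need the execute bit to be traversed.
                needed |= stat.S_IXOTH
            if member.mode & needed != needed:
                bad.append(member.name)
    return bad
```

An empty result would mean the tarball could, in principle, be uploaded as is.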
Also, why are you asking that a file already in /pnfs/mu2e/resilient be re-copied to a dropbox: location, when it is already in resilient? Just leave the dropbox: off of the front and use it where it is...
I had to run a test job and knew of a code file I could use; it happened to be in resilient. Let's not get sidetracked by this specific example. The issue is that jobsub_submit unconditionally performs moves that are not needed for well-formed inputs, and those moves inconvenience some users. Code tarballs can be large, in the Mu2e case many hundreds of MB, so rewriting them is slow. And this rewriting was probably the root cause of the TMPDIR issue (#545). Without rewriting code tarballs, the rest of the temp files would fit into /tmp without problems. I think it will be best to streamline job submission by removing the tarball rewrite from jobsub_submit. It can have an option like --redo-my-tarball-just-in-case, but this "extra help" functionality may be better placed into a separate script to prepare a job submission tarball.
Andrei
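A separate preparation script of the kind suggested above might look roughly like the following sketch. It is only an illustration built on Python's standard `tarfile` module; the function name `rewrite_world_readable`, the bzip2 output format, and the policy of adding read bits for everyone are assumptions, not a description of what jobsub_submit does internally.

```python
import stat
import tarfile

READ_BITS = stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH
EXEC_BITS = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH


def rewrite_world_readable(src_path, dst_path):
    """Copy a tarball member by member, adding read permission for everyone.

    Hypothetical helper; the permission policy is an assumption.
    """
    with tarfile.open(src_path, "r:*") as src, \
            tarfile.open(dst_path, "w:bz2") as dst:
        for member in src:
            if not (member.issym() or member.islnk()):
                member.mode |= READ_BITS
                if member.isdir():
                    # Directories must be searchable to be usable.
                    member.mode |= EXEC_BITS
            # Only regular files carry data; other types are header-only.
            fileobj = src.extractfile(member) if member.isreg() else None
            dst.addfile(member, fileobj)
```

Run once by the user before submission, a step like this would pay the recompression cost a single time rather than on every jobsub_submit call.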
…________________________________________
From: Marc Mengel ***@***.***>
Sent: Thursday, February 29, 2024 9:25 AM
To: fermitools/jobsub_lite
Cc: Andrei Gaponenko; Author
Subject: Re: [fermitools/jobsub_lite] wasteful re-compression (Issue #543)
Also, why are you asking that a file already in /pnfs/mu2e/resilient be re-copied to a dropbox: location, when it is already in resilient? Just leave the dropbox: off of the front and use it where it is...
Hello,
I have just waited for several minutes as jobsub_submit was
re-compressing an already compressed code tarball. It was specified
with the
--tar_file_name dropbox:///pnfs/mu2e/resilient/users/gandr/gridexport/tmp.9I7Gv1adwT/Code.tar.bz
option, and then I saw a large file named Code.tar.bz2473.tbz2 appear
in my working directory as I was waiting for the submission to
complete.
Maybe the compression step should be delegated to the user, and
jobsub_submit should not try to re-pack the user-provided file. Just
upload it as is from its original location.
Andrei