-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Re enable ASM speedups on Win #701
Conversation
zlib 1.2.9 added "z_streamp strm" in "struct inflate_state" in inflate.h, new offset/binding is required zlib 1.2.9 started "enum inflate_mode" in inflate.h @ 16180 zlib 1.2.3.4 added 3 values in "enum inflate_mode" in inflate.h zlib 1.2.2.1 removed GUNZIP condition in "struct inflate_state" in inflate.h, compilation directive was no longer required
inffas32.asm struct/enum binding to zlib 1.2.9
Did you verify that the code actually works as expected after all security updates to the C code? Several bound checks were added to the C code, but never copied/implemented in the Several years ago it was easy to prove data corruption with the then easy to enable asm code on Windows. (There is a regression test with the proof in the Apache Subversion code to make sure repackagers don't accidentally enable that knob and silently break their users) |
Note that some of the original optimizations are no longer optimizations... They used to be better than compiler generated code from the 1990s... and in my testing the automatically vectorized and inlined code from recent C compilers was usually as good as or better than the old asm code in separated code. |
option(ASM686 "Enable building i686 assembly implementation") | ||
option(AMD64 "Enable building amd64 assembly implementation") | ||
option(ASM032 "Enable building i686 assembly implementation") | ||
option(ASM064 "Enable building amd64 assembly implementation") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
x64 or ARM64, (or Sparc64, ... ?)
I think the old name makes more sense.
Thank you for your work on this, but @rhuijben's comments are one of the reasons I removed this particular assembler code. The commit (288f108) is "Remove old assembler code in which bugs have manifested.", with the comment "In addition, there is not sufficient gain from the inflate assembler code to warrant its inclusion." |
Thank you for inputs. There seems to be more to this than meets the eye. The problem was that (on Win, with assemblies):
This is (mainly) what the PR proposes. I didn't go that far to check security updates (I'd give it a try if provided a method). Regarding naming: AMD64 is an architecture name, and is misleading. Also, is not consistent with the other. These names are supposed to say whether assemblies are enabled on 032 or 064 bits (whatever underlying architecture is). I updated the SO question with some test results that I did, and it's true that in some cases C code is faster. Anyway, even if this PR is rejected (which makes total sense if ASM code is error prone to attacks - buffer overruns and stuff), it will be self sufficient for anyone who want to use them. |
I plan to leave the assembly code deleted. Thanks. |
Fair enough. I will keep building the speedup variants as well (for a while) and post it to [GitHub]: CristiFati/Prebuilt-Binaries - (master) Prebuilt-Binaries/ZLib. |
While working on #376 (closed it as it's contained in this one), I didn't notice the speedups have been removed. Anyway added them back and also fixed some bugs (build and run time). More details: [SO]: CMake zlib build on Windows - (@CristiFati's answer).
Tests:
I built (via CMake: NMake Makefiles - VStudio 2019) 8 variants ((064bit / 032bit) X (with / without speedups) X (Debug / Release)) and ran each minigzip.exe (which links to zlib(d).dll) on a bunch of big (~100MiB) files (text / binary); both compression (levels: 1, 5, 9) and decompression, and each time checked md5sum of original file and decompressed one.