Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re enable ASM speedups on Win #701

Closed
wants to merge 6 commits into from

Conversation

CristiFati
Copy link

@CristiFati CristiFati commented Sep 3, 2022

While working on #376 (closed it as it's contained in this one), I didn't notice the speedups have been removed. Anyway added them back and also fixed some bugs (build and run time). More details: [SO]: CMake zlib build on Windows - (@CristiFati's answer).

Tests:

I built (via CMake: NMake Makefiles - VStudio 2019) 8 variants ((064bit / 032bit) X (with / without speedups) X (Debug / Release)) and ran each minigzip.exe (which links to zlib(d).dll) on a bunch of big (~100MiB) files (text / binary); both compression (levels: 1, 5, 9) and decompression, and each time checked md5sum of original file and decompressed one.

chris19782 and others added 6 commits July 11, 2018 16:21
zlib 1.2.9 added "z_streamp strm" in "struct inflate_state" in
inflate.h, new offset/binding is required

zlib 1.2.9 started "enum inflate_mode" in inflate.h @ 16180

zlib 1.2.3.4 added 3 values in "enum inflate_mode" in inflate.h

zlib 1.2.2.1 removed GUNZIP condition in "struct inflate_state" in
inflate.h, compilation directive was no longer required
inffas32.asm struct/enum binding to zlib 1.2.9
@rhuijben
Copy link

rhuijben commented Sep 5, 2022

Did you verify that the code actually works as expected after all security updates to the C code?

Several bound checks were added to the C code, but never copied/implemented in the unmaintained assembler code.

Several years ago it was easy to prove data corruption with the then easy to enable asm code on Windows. (There is a regression test with the proof in the Apache Subversion code to make sure repackagers don't accidentally enable that knob and silently break their users)

@rhuijben
Copy link

rhuijben commented Sep 5, 2022

Note that some of the original optimizations are no longer optimizations... They used to be better than compiler generated code from the 1990s... and in my testing the automatically vectorized and inlined code from recent C compilers was usually as good as or better than the old asm code in separated code.

option(ASM686 "Enable building i686 assembly implementation")
option(AMD64 "Enable building amd64 assembly implementation")
option(ASM032 "Enable building i686 assembly implementation")
option(ASM064 "Enable building amd64 assembly implementation")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x64 or ARM64, (or Sparc64, ... ?)
I think the old name makes more sense.

@madler
Copy link
Owner

madler commented Sep 5, 2022

Thank you for your work on this, but @rhuijben's comments are one of the reasons I removed this particular assembler code. The commit (288f108) is "Remove old assembler code in which bugs have manifested.", with the comment "In addition, there is not sufficient gain from the inflate assembler code to warrant its inclusion."

@CristiFati
Copy link
Author

Thank you for inputs. There seems to be more to this than meets the eye. The problem was that (on Win, with assemblies):

  • 064bit: didn't build

  • 032bit: deflate crashed

This is (mainly) what the PR proposes. I didn't go that far to check security updates (I'd give it a try if provided a method).

Regarding naming: AMD64 is an architecture name, and is misleading. Also, is not consistent with the other. These names are supposed to say whether assemblies are enabled on 032 or 064 bits (whatever underlying architecture is).

I updated the SO question with some test results that I did, and it's true that in some cases C code is faster.

Anyway, even if this PR is rejected (which makes total sense if ASM code is error prone to attacks - buffer overruns and stuff), it will be self sufficient for anyone who want to use them.

@madler
Copy link
Owner

madler commented Oct 7, 2022

I plan to leave the assembly code deleted. Thanks.

@madler madler closed this Oct 7, 2022
@CristiFati
Copy link
Author

Fair enough. I will keep building the speedup variants as well (for a while) and post it to [GitHub]: CristiFati/Prebuilt-Binaries - (master) Prebuilt-Binaries/ZLib.

@Neustradamus Neustradamus mentioned this pull request Jan 1, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants