Benchmark: zlib-ng vs isa-l, zlib, libdeflate, brotli #1486
I assume memcpy is just raw memory bandwidth with no compression? Both libdeflate and igzip have the advantage/disadvantage of not being forks of zlib but ground-up implementations (on the flip side, with incompatible APIs). Not to say there's no room for improvement for zlib-ng, but that is at least a footnote worth providing here. It looks like everybody's compression speed is a bit anemic. Naturally we'd expect compression to be slower, I guess, but I do wonder what's left on the table there without sacrificing compression ratios.
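On the memcpy row: presumably it really is just a straight copy, so it measures raw memory bandwidth. A minimal sketch of how such a baseline could be measured (my illustration, not TurboBench's actual harness; the buffer size and repeat count are arbitrary assumptions):

```c
/* Illustrative sketch only: measure raw memcpy bandwidth as a no-compression baseline. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

int main(void) {
    size_t n = 256 * 1024 * 1024;            /* 256 MB working set (arbitrary) */
    char *src = malloc(n), *dst = malloc(n);
    if (!src || !dst)
        return 1;
    memset(src, 'x', n);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < 8; i++)
        memcpy(dst, src, n);                 /* "decompression" is just a copy */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("%.1f MB/s\n", 8.0 * n / 1e6 / secs);
    free(src);
    free(dst);
    return 0;
}
```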
AFAIK libdeflate requires the whole buffer up front and does not support streaming, so if all you look at is raw speed, libdeflate will always win. At the higher levels zlib-ng compresses about twice as fast as zlib, with a slight decrease in compression ratio, which is expected.
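For reference, libdeflate's one-shot API looks roughly like this; a minimal sketch of whole-buffer zlib-format compression (the wrapper function is illustrative, and error handling and allocation policy are simplified):

```c
/* Minimal sketch of libdeflate's one-shot (whole-buffer) zlib-format compression.
 * There is no streaming interface here: input and output must fit in memory. */
#include <libdeflate.h>
#include <stdlib.h>

size_t compress_whole_buffer(const void *in, size_t in_len, void **out, int level) {
    struct libdeflate_compressor *c = libdeflate_alloc_compressor(level);
    if (c == NULL)
        return 0;

    /* Worst-case output size for this input length. */
    size_t bound = libdeflate_zlib_compress_bound(c, in_len);
    *out = malloc(bound);

    /* Returns the compressed size, or 0 if the output did not fit. */
    size_t out_len = *out ? libdeflate_zlib_compress(c, in, in_len, *out, bound) : 0;

    libdeflate_free_compressor(c);
    return out_len;
}
```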
Somewhat interesting is that at level 6 the decompression speed is higher than at level 3, but perhaps that speaks to having an inflate that works more in inflate_fast rather than decompressing literals? I'm just guessing; I don't have profiles to really evaluate that delta today. I've definitely wanted something that chews through literals in the main inflate loop faster for a while now. Err, well, all the implementations share that trait. I guess it really is just not having to rely on memory read bandwidth as much with those higher compression ratios.
I've added a web content benchmark now.
Heh, the sorting by size rather than throughput is throwing me off a bit. Looks like we don't do too much worse (in terms of compression throughput) than libdeflate at level 9, albeit with some trade-offs in compression. Better compression algorithms, of course, do better, but I'm not losing sleep over that; those things aren't zlib-ng's purview. It might be worth a weekend dive into the techniques libdeflate is benefiting from for decompression.
Why libdeflate is faster than vanilla zlib for decompression:
- Pretty sure we do this, at least.
- I would hope we do this, but I'll have to look at the main loop to be sure. I'm not entirely convinced we couldn't do multiple words at a time and try to decode every possible op at once (see the sketch after this list).
- That merits a deeper dive to figure out what they're talking about.
- I think we're doing a form of this now.
- 100% doing this, now.
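On the "multiple words at a time" point: one common way this shows up in fast inflate implementations is refilling the bit buffer a whole 64-bit word at a time rather than byte by byte. A minimal sketch of that refill idea (my illustration of the general technique, not zlib-ng's or libdeflate's actual code; it assumes little-endian loads and that the caller guarantees at least 8 readable bytes, e.g. via padding at the end of the input):

```c
/* Illustrative sketch of a branchless, word-at-a-time bit-buffer refill. */
#include <stdint.h>
#include <string.h>

static inline void refill64(const uint8_t **pp, uint64_t *bitbuf, unsigned *bitcount) {
    uint64_t word;
    memcpy(&word, *pp, sizeof(word));   /* one unaligned 64-bit load */
    *bitbuf |= word << *bitcount;       /* append fresh bits above the ones still held */
    *pp += (63 - *bitcount) >> 3;       /* advance by the number of whole bytes consumed */
    *bitcount |= 56;                    /* the buffer now holds at least 56 valid bits */
}
```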
I do envy the quality of docstrings that libdeflate has.
Additionally, libdeflate uses an extra 32 KB hash table for 3-byte matches. I don't think we want to consume that much more memory.
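For scale, 32 KB of 16-bit entries is 2^14 slots. A sketch of what a separate 3-byte hash head table can look like (the constants, hash function, and layout here are my assumptions for illustration, not libdeflate's exact implementation):

```c
/* Illustrative sketch of a 3-byte match hash table sized at 32 KB
 * (16384 entries x 2 bytes). Constants are assumptions for the example. */
#include <stdint.h>

#define HASH3_BITS 14u                  /* 2^14 * sizeof(uint16_t) = 32 KB */
#define HASH3_SIZE (1u << HASH3_BITS)

static uint16_t hash3_head[HASH3_SIZE]; /* most recent position of each 3-byte hash */

static inline uint32_t hash3(const uint8_t *p) {
    uint32_t v = (uint32_t)p[0] | ((uint32_t)p[1] << 8) | ((uint32_t)p[2] << 16);
    return (v * 2654435761u) >> (32 - HASH3_BITS);   /* multiplicative hash */
}

/* Record the current position and return the previous one with the same hash,
 * i.e. a candidate for a length-3 match. (pos kept to 16 bits for simplicity.) */
static inline uint16_t hash3_insert(const uint8_t *p, uint16_t pos) {
    uint32_t h = hash3(p);
    uint16_t prev = hash3_head[h];
    hash3_head[h] = pos;
    return prev;
}
```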
Which one is the isa-l entry in the results?
igzip
Extended benchmark TurboBench: Dynamic/Static web content compression benchmark, including zstd and memory usage.
@powturbo Can you provide some detail on how you ran the benchmarks and how memory was measured?
This is done by TurboBench. The allocate/free functions are intercepted and the memory usage is monitored.
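In other words, the allocator entry points are wrapped and a running total is kept; a minimal sketch of that idea (my illustration, not TurboBench's actual interception code; glibc's malloc_usable_size is assumed to be available):

```c
/* Illustrative sketch of tracking peak heap usage by wrapping malloc/free. */
#include <malloc.h>   /* malloc_usable_size (glibc) */
#include <stdlib.h>

static size_t cur_bytes, peak_bytes;

void *tracked_malloc(size_t n) {
    void *p = malloc(n);
    if (p) {
        cur_bytes += malloc_usable_size(p);   /* count what the allocator actually reserved */
        if (cur_bytes > peak_bytes)
            peak_bytes = cur_bytes;
    }
    return p;
}

void tracked_free(void *p) {
    if (p)
        cur_bytes -= malloc_usable_size(p);
    free(p);
}

size_t tracked_peak(void) { return peak_bytes; }
```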
There is a fork of libdeflate now which added streaming support (and multithreaded compression/decompression):
The zlib implementation that ships with most distributions is fairly slow, even when you allow for the zlib algorithm itself being quite slow (vs lz4 and zstd). There are faster implementations. `libdeflate` [1] is one such example. According to benchmarks [2], it's maybe 2-3x faster than zlib. This PR updates helper.cpp's compression routines to use libdeflate. It saves ~2-3% of total execution time for me:

Creating a GB mbtiles (zlib):

```
real 2m1.706s
user 28m24.186s
sys 0m41.886s
```

Creating a GB mbtiles (libdeflate):

```
real 1m58.450s
user 27m32.579s
sys 0m51.848s
```

Snapshotting an external dependency is sorta distasteful - I worry about how much is too much from a maintenance POV. To mitigate that, the snapshot is a direct copy of the upstream folders, so it should be easy to update in the future if needed. This also lets us drop the zlib1g-devel and boost-iostreams dependencies, which is nice.

[1]: https://github.com/ebiggers/libdeflate
[2]: zlib-ng/zlib-ng#1486
TurboBench: Build or download executables and test with your own data.
Benchmark 1:
TurboBench: Dynamic/Static web content compression benchmark
Benchmark 2:
```
turbobench silesia.tar -eigzip,0,1,2,3/zlib_ng,1,3,6,9/libdeflate,1,3,6,9,12/zlib,1,3,6,9/memcpy
```
Hardware: Lenovo IdeaPad 5 Pro, Ryzen 6600HS. (bold = Pareto frontier; MB = 1,000,000 bytes)