Port deflate_quick to ARM #205

sebpop · 2018-09-17T18:22:14Z

If only we could port deflate_quick to ARM too.

sebpop · 2019-01-15T22:05:30Z

deflate_quick is only called at compression level 1 on x86_64.
There is too much code to implement for aarch64/aarch32 before the release.
We may revisit this after the release.

Myriachan · 2019-04-19T00:02:30Z

ARM doesn't have anything similar to the pcmpestri SSE4.2 instruction used by the compare258 function of deflate_quick. This instruction is a huge part of the performance: it's used to do a miniature strstr-like operation to find matching patterns in the dictionary.

nmoinvaz · 2020-02-09T21:58:24Z

It seems like we could replace match_len = compare258(s->window + s->strstart, s->window + s->strstart - dist) with match_len = longest_match(s, hash_head) if SSE4.2 is not supported. If that is the case, perhaps we could extend compare258 to deflate_slow, deflate_medium, etc. if SSE4.2 is supported. Sort of like a functable.longest_match. What do you think?
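The function-table idea floated above could be sketched roughly as follows. This is only an illustration: functable_sketch, functable_init_sketch, compare258_generic, compare258_fast_stub and the has_sse42 flag are hypothetical names, not the project's actual API, and the "fast" variant is a stub that forwards to the generic code rather than real SSE4.2.

```c
#include <stdint.h>

/* Signature shared by every compare258 variant in this sketch. */
typedef uint32_t (*compare258_fn)(const unsigned char *src0, const unsigned char *src1);

/* Byte-by-byte fallback: counts equal leading bytes, capped at 258. */
static uint32_t compare258_generic(const unsigned char *src0, const unsigned char *src1) {
    uint32_t len = 0;
    while (len < 258 && src0[len] == src1[len])
        len++;
    return len;
}

/* Stand-in for an accelerated variant (assumption: a real build would put
 * SIMD code here). It forwards to the generic version so the sketch runs
 * on any CPU. */
static uint32_t compare258_fast_stub(const unsigned char *src0, const unsigned char *src1) {
    return compare258_generic(src0, src1);
}

/* Hypothetical function table: one slot per dispatched routine. */
struct functable_sketch {
    compare258_fn compare258;
};

/* Pick the implementation once, based on a CPU feature flag supplied by
 * the caller (feature detection itself is outside this sketch). */
static struct functable_sketch functable_init_sketch(int has_sse42) {
    struct functable_sketch ft;
    ft.compare258 = has_sse42 ? compare258_fast_stub : compare258_generic;
    return ft;
}
```

With a table like this, callers such as deflate_quick would call ft.compare258(...) and never need to know which variant was selected.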

mtl1979 · 2020-02-10T10:24:04Z

compare258 is essentially a compare of 256 bytes followed by a compare of the trailing 2 bytes... The only difference is that on every iteration of the loop it needs to find the first non-matching index in the vector and calculate the length from it... If we assume that all lanes are equal most of the time, there should be no performance penalty when the trailing bytes are not equal. Trailing bytes can be handled using uint8_t, uint16_t, uint32_t and uint64_t.
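A minimal portable sketch of the scheme described above, assuming 64-bit words stand in for the vector lanes: compare 256 bytes in 8-byte chunks, locate the first mismatch only when a chunk differs, then check the trailing 2 bytes. The name compare258_sketch is hypothetical and this is not the code that was eventually merged.

```c
#include <stdint.h>
#include <string.h>

/* Returns the number of equal leading bytes (0..258) in src0 and src1.
 * Both buffers must have at least 258 readable bytes. */
static uint32_t compare258_sketch(const unsigned char *src0, const unsigned char *src1) {
    uint32_t len = 0;

    /* Main part: 256 bytes, compared 8 bytes at a time. */
    while (len < 256) {
        uint64_t a, b;
        /* memcpy avoids unaligned-access problems on strict targets. */
        memcpy(&a, src0 + len, sizeof(a));
        memcpy(&b, src1 + len, sizeof(b));
        if (a != b) {
            /* The words differ, so a mismatch exists within these 8
             * bytes; a short byte scan finds its index. */
            while (src0[len] == src1[len])
                len++;
            return len;
        }
        len += 8;
    }

    /* Trailing 2 bytes (indices 256 and 257). */
    if (src0[256] != src1[256])
        return 256;
    if (src0[257] != src1[257])
        return 257;
    return 258;
}
```

The byte scan inside the mismatch branch keeps the sketch independent of endianness and compiler builtins; a tuned version would more likely XOR the two words and count trailing or leading zero bits.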

nmoinvaz · 2020-06-08T16:38:15Z

This is now complete.

Dead2 added the optimization and help wanted labels Jan 17, 2019

nmoinvaz mentioned this issue May 3, 2020

Clean up longest_match variants and abstract match comparisons to compare258 #550

Merged

nmoinvaz added a commit to nmoinvaz/zlib-ng that referenced this issue May 24, 2020

Make deflate_quick algorithm available to all platforms. zlib-ng#205

392cfdd

nmoinvaz mentioned this issue May 24, 2020

Make deflate_quick algorithm available to all platforms. #590

Merged

nmoinvaz added a commit to nmoinvaz/zlib-ng that referenced this issue May 24, 2020

Make deflate_quick algorithm available to all platforms. zlib-ng#205

3f9864a

nmoinvaz added a commit to nmoinvaz/zlib-ng that referenced this issue May 25, 2020

Make deflate_quick algorithm available to all platforms. zlib-ng#205

cbeca6a

nmoinvaz added a commit to nmoinvaz/zlib-ng that referenced this issue May 25, 2020

Make deflate_quick algorithm available to all platforms. zlib-ng#205

5287edd

nmoinvaz added a commit to nmoinvaz/zlib-ng that referenced this issue May 25, 2020

Make deflate_quick algorithm available to all architectures. zlib-ng#205

07215f9

nmoinvaz removed the help wanted label May 26, 2020

nmoinvaz added a commit to nmoinvaz/zlib-ng that referenced this issue May 31, 2020

Make deflate_quick algorithm available to all architectures. zlib-ng#205

3cf5e90

Dead2 pushed a commit that referenced this issue Jun 8, 2020

Make deflate_quick algorithm available to all architectures. #205

b6b8ad6

nmoinvaz closed this as completed Jun 8, 2020
