-
Notifications
You must be signed in to change notification settings - Fork 0
fix compile error of clang #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: st_table_with_array2
Are you sure you want to change the base?
fix compile error of clang #1
Conversation
add more hash benchmarks to test alternatives (patch author is Vladimir Makarov)
Previous implementation had an issues: - macros murmur1 assumes murmur_step takes rotation value as a second argument - but murmur_step second argument is "next block" - this makes st_hash_uint and st_hash_end to not mix high bits of hash value into lower bits - this leads to pure hash behaviour on doubles and other. It didn't matter when bins amount were prime numbers, but it hurts when bins are powers of two. Mistake were created when implementor tries to co-exist Murmur1 and Murmur2 in a same code. Change it to single "Murmur3" implementation. It is not exact implementation: - 64bit version is an extrapolation of 32bit version instead of separate algorithm - 64bit finalizer is taken from http://zimbry.blogspot.ru/2011/09/better-bit-mixing-improving-on.html - on final block call murmur_step (do not shortcut last rotation of h and multiplication) instead of custom block. Also remove ST_USE_FNV1: it lack many function, and looks to be abandoned
Main Murmur weakness is permutation of a block without seed. Just adding a random seed to each block and finalization we greatly increase security of Murmur.
SipHash is secure PseudoRandom Function, and could be used in a counter mode to produce strong pseudorandom stream of bytes. So use it to securely initialize both hashseed and sipseed. (so no way to recover sipseed from hashseed)
Note, that ident becomes a bit slower, cause hash value is more "random". New versions calculates fair hash value, and so provides statistically fair collision rate. Previous version relies on internal pattern of symbols and "usually" provides lesser collisions. Though it could be compromised easely.
…hash24 SipHash13 is secure enough to be used in hash-tables, and SipHash's author confirms that. Rust already considered switch to SipHash13: rust-lang/rust#29754 (comment) Jean-Philippe Aumasson confirmation: rust-lang/rust#29754 (comment) Merged pull request: rust-lang/rust#33940
While for cryptography it has halved security, it is just enough for hash tables. See Jean-Philipee Aumasson answeres to Ocaml pull request: ocaml/ocaml#24 (comment) ocaml/ocaml#24 (comment)
- instead of allocating st_table_entry per element with malloc, store all elements in one array - use order in the array as an insertion order instead of list -- after array's last element is filled, hash is rebuilt and grow/shrunk if needed. - reduce sizeof(st_table_entry) to 24byte on 64bit platform by default -- use 32bit next index instead of pointer and 32bit hash -- there is compile time option to use 64bit index and 64bit hash to have ability to use huge hashes. - reduce sizeof(st_table) to 32bytes (on x86_64) by couple of tricks: -- alloc bins together with entries, so only 1 pointer needed -- pack allocated size into 8bit and rebuilt count into 24bit - fix test_st_foreach -- disable test_st_foreach(:unpack_delete) cause it behaves differently on already unpacked table (it may lead to segfault in previous st implementation) -- remove checks for packed/unpacked - bins are dinamically sized: -- if array's size is less than 256, then bins are 8bit indices -- if array is less than 65536 elements, then bins are 16bit indices -- otherwise bins are indices of maximum size (32bit or 64bit depending on architecture and compile time option)
|
Answering first comment-question: good catch. But, it is better just change type of new_sz argument, then check will be done in a loop of new_sz. (and, probably add assert when compiled outside of ruby instead of Answering second question: no, it never overflows. Check is in st_rehash after comment 'else grow table'. Thank you for catch. Would you mind if i embed fixes into my commit instead of merging? |
|
btw, has you gain any benefits from patch? |
|
Thank you for replying.
Sorry, I can’t get it. What benefit are you mentioning? |
e38ae90 to
2ea42c6
Compare
17+ years passed since standardized in ISO, 8 years since we added AC_HEADER_STDBOOL to configure.in. I'm quite confident that it's already safe to use <stdbool.h>. I understand that when we introduced AC_HEADER_STDBOOL, <stdbool.h> was remain not included because C standard and SVR4 curses conflicted miserably back then (#1). Though I believe such situation has been fixed already(ruby#2), I'm afraid of your operating system might ship a proprietary curses that still conflicts with the standard. So to avoid potential problem, we limit the inclusion to our internal use only. #1 : 1997 version of SUSv2 said bool is "defined though typedef" in <curses.h>, while C99 said bool is a macro, plus in C++ bool is a keyword. AFASIK the curses library has never been a part of POSIX. ruby#2 : In reality ncurses and NetBSD curses both just follow C99 to include <stdbool.h> from <curses.h>. I think C99 is now widely adopted. ---- * internal.h: #include <stdbool.h> if present. That is believed to be the case for 99.9% systems that lives today. Non-C99, non-C++ situations are intentionally left undefined, advised by Motohiro Kosaki. If you have such compiler, please fill the definition appropriately. git-svn-id: svn+ssh://ci.ruby-lang.org/ruby/trunk@57460 b2dd03c8-39d4-4d8f-98ff-823fe69b080e
I fixed compilation error on Clang (probably for -Wshorten-64-to-32).
It may be inappropriate fixes since I have not read through the implementation yet.
I embedded XXX comments into codes. Check them please.