We use hashing.
Introduction
Some assumptions are needed. If this were possible for arbitrary sequences, it would seem to give us sorting in expected $O(n)$ time*1.
Suppose we are given $a = (a_0, a_1, \dots, a_{n-1})$ with $a_0 \lt a_1 \lt \dots \lt a_{n-1}$.
Alternatively, assume the word size is favorable enough that $a$ can be sorted in $\Theta(n)$ time with radix sort or something similar.
After preprocessing $a$, we want to answer the query "only the value of $a_i$ is given; return $i$" in worst-case $O(1)$ time. Values not contained in $a$ are never given*2.
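The radix-sort assumption above can be made concrete with a minimal sketch in Rust. It assumes 64-bit keys processed 8 bits at a time, so the number of passes is a constant and the total work is $\Theta(n)$. The function name `radix_sort` and the digit width are choices made for this illustration only, not something taken from the measurements in this article.

```rust
/// LSD radix sort on u64 keys, 8 bits per pass (8 passes in total).
/// Each pass is a stable counting sort on one byte of the key.
fn radix_sort(a: &mut Vec<u64>) {
    let mut buf = vec![0u64; a.len()];
    for pass in 0..8 {
        let shift = 8 * pass;
        // count[d + 1] first holds the number of keys whose current digit is d.
        let mut count = [0usize; 257];
        for &x in a.iter() {
            count[((x >> shift) & 0xFF) as usize + 1] += 1;
        }
        // After the prefix sum, count[d] is the first output position for digit d.
        for d in 0..256 {
            count[d + 1] += count[d];
        }
        for &x in a.iter() {
            let d = ((x >> shift) & 0xFF) as usize;
            buf[count[d]] = x;
            count[d] += 1;
        }
        // The freshly sorted data now lives in `buf`; make it the current array.
        std::mem::swap(a, &mut buf);
    }
}
```

With a fixed key width the pass count does not depend on $n$, which is the "favorable word size" situation described above.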
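Introducing the technique
This is abrupt, but there is a technique called cuckoo hashing. I get the feeling that hash maps and the like are not very well known in the competitive programming community*3.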
It is introduced with illustrated slides in, for example, CS166.1226/6.
It is a data structure that supports insertion in expected $O(1)$ time, and deletion and lookup in worst-case $O(1)$ time. Insert the mapping $a_i\mapsto i$ into it and we are done*4. Huh.
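Introducing everything else
I had actually planned to research the topics below and write them up properly, but then I realized that cuckoo hashing already settles the question, so this will be just a brief introduction. These topics come from a context close to succinct data structures, so they are more careful about space complexity than cuckoo hashing is, but few people in the competitive programming community are likely to care about that in the first place*5.
Below, I explain that the hash function corresponding to coordinate compression is called a monotone minimal perfect hash function, and point to literature that constructs one in expected $O(n)$ time.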
About hashing
Let $\mathcal{X}$ be the set of all possible keys. For example, it could be the set of all integers $\Z$ or the set of strings $\Sigma^*$. Let $\mathcal{K}$ be a set of $n$ keys chosen from it ($\mathcal{K}\subseteq\mathcal{X}$, $|\mathcal{K}| = n$).
A common task is to decide whether $x\in\mathcal{K}$ for a query $x\in\mathcal{X}$ (set). Another is to have a value $y$ associated with each $x\in\mathcal{K}$ and to return that $y$ for a given $x\in\mathcal{K}$ (dict, map).
Below, $[n]$ denotes $\{0, 1, \dots, n-1\}$.
Using some function $h: \mathcal{X}\to[m]$ ($m \ge n$), let us try storing $A[h(x)] = x$ for each $x\in\mathcal{K}$ in an array $A$ of length $m$. Then, for a query $x$, it seems we can conclude $x\in\mathcal{K}$ if $A[h(x)] = x$*6. This $h$ is called a hash function.
However, depending on the choice of $h$, it can happen that $h(x) = h(y)$ for $x, y\in\mathcal{K}$ ($x\neq y$). This is called a collision. A collision is bad news: we may wrongly conclude $y\notin\mathcal{K}$, or we may have to sacrifice time complexity to work around it.
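The array $A$ and the effect of a collision can be sketched in a few lines of Rust. The hash function `h` below ($x \bmod m$) and the names `build` and `contains` are hypothetical, chosen only to make a collision easy to produce. Keys that lose their slot are simply reported, with no collision handling at all.

```rust
/// Toy hash function h: u64 -> [m], chosen only for illustration (m must be > 0).
fn h(x: u64, m: usize) -> usize {
    (x % m as u64) as usize
}

/// Stores each key at A[h(x)]. Returns the array and the keys that collided
/// with an earlier key and therefore could not be stored.
fn build(keys: &[u64], m: usize) -> (Vec<Option<u64>>, Vec<u64>) {
    let mut table = vec![None; m];
    let mut collided = Vec::new();
    for &x in keys {
        let slot = &mut table[h(x, m)];
        if slot.is_none() {
            *slot = Some(x);
        } else {
            collided.push(x);
        }
    }
    (table, collided)
}

/// Membership test "A[h(x)] == x"; wrong for keys that were lost to a collision.
fn contains(table: &[Option<u64>], x: u64) -> bool {
    table[h(x, table.len())] == Some(x)
}
```

For the keys 3, 8, 14 with $m = 5$, both 3 and 8 hash to slot 3, so 8 is not stored and `contains` wrongly answers `false` for it. This is the misjudgment described above.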
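"We introduced hashing to get constant time, and the worst case is linear time or something? That's ridiculous." Probably many competitive programmers react this way. I get it.
Below we do not worry about collisions, so I will not discuss how to handle them when they occur (open hashing, closed hashing) or the probability that they occur.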
+perfect
A hash function with no collisions is called a perfect hash function. That is, an $h$ such that $h(x) \neq h(y)$ for $x, y\in\mathcal{K}$ ($x\neq y$)*7.
If $\mathcal{K}$ is static*8, a perfect hash function can be constructed. Also, for dict queries, the problem setting often guarantees $x\in\mathcal{K}$, so it is enough to return the associated value. In other words, it suffices to construct $h:\mathcal{K}\to[m]$.
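To see why a static $\mathcal{K}$ helps, here is a naive Rust sketch that just tries seeds until no two keys collide. It is a brute-force illustration, not one of the space-efficient constructions from the literature. The seeded mixer `hash_with_seed`, the function `find_perfect_seed`, and the `max_tries` cutoff are all assumptions introduced for this example.

```rust
/// Seeded mixing function into [m], built from a SplitMix64-style finalizer.
/// This is a stand-in for a proper hash family (assumption of this sketch).
fn hash_with_seed(x: u64, seed: u64, m: usize) -> usize {
    let mut z = x ^ seed;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    ((z ^ (z >> 31)) % m as u64) as usize
}

/// Tries seeds 0, 1, 2, ... and returns the first one under which
/// no two keys share a slot in [m], or None if `max_tries` is exhausted.
fn find_perfect_seed(keys: &[u64], m: usize, max_tries: u64) -> Option<u64> {
    (0..max_tries).find(|&seed| {
        let mut used = vec![false; m];
        // `replace` returns the previous flag, so `true` means the slot was taken.
        keys.iter()
            .all(|&x| !std::mem::replace(&mut used[hash_with_seed(x, seed, m)], true))
    })
}
```

Because the key set never changes, the seed found once stays collision-free forever. With $m$ around $n^2$ a random seed succeeds with constant probability, while for $m$ close to $n$ this naive search needs exponentially many tries and is only usable for tiny sets.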
In that situation, the $\log(\binom{|\mathcal{X}|}{n})$ bits for the array $A$ are no longer needed, and apparently there is a method that constructs $h$ using $2.46n + o(n)$ bits in expected $O(n)$ time.
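The size of $\log\binom{|\mathcal{X}|}{n}$ (estimated in footnote 6) can be checked with standard bounds on binomial coefficients. Write $N = |\mathcal{X}|$, a shorthand introduced only for this derivation, and assume $1 \le n \le N$. Since $\frac{N-i}{n-i} \ge \frac{N}{n}$ for $0 \le i \lt n$ and $n! \ge (n/e)^n$,

$$
\left(\frac{N}{n}\right)^n \le \binom{N}{n} \le \frac{N^n}{n!} \le \left(\frac{eN}{n}\right)^n .
$$

Taking logarithms gives

$$
n\log\frac{N}{n} \le \log\binom{N}{n} \le n\log\frac{N}{n} + n\log e ,
$$

so $\log\binom{N}{n} = n\log(N/n) + O(n)$ bits. This grows with the universe size, whereas the $2.46n + o(n)$ bits quoted above do not depend on $N$ at all.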
+minimal
Furthermore, a perfect hash function for which $m = n$, that is, $h: \mathcal{K}\to[n]$ ($x\neq y \implies h(x) \neq h(y)$), is called a minimal perfect hash function.
Even so, this can apparently still be done in $2.46n + o(n)$ bits.
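One practical payoff of $m = n$ is that a static dict needs a value array with exactly $n$ entries and no stored keys. The Rust sketch below illustrates this with a brute-force seed search. `StaticDict`, `hash_with_seed`, and `max_tries` are hypothetical names for this example, and the search takes exponentially many tries in $n$, so this is a toy and not the $2.46n + o(n)$-bit construction.

```rust
/// Seeded mixing function into [m] (SplitMix64-style finalizer; an assumption here).
fn hash_with_seed(x: u64, seed: u64, m: usize) -> usize {
    let mut z = x ^ seed;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    ((z ^ (z >> 31)) % m as u64) as usize
}

/// Static dictionary over a fixed key set using a minimal perfect hash (m = n).
/// Only the seed and the n values are stored; the keys themselves are not.
struct StaticDict<V> {
    seed: u64,
    values: Vec<V>,
}

impl<V: Clone> StaticDict<V> {
    /// Searches seeds 0, 1, 2, ... for a bijection K -> [n]; None if none is found.
    fn build(pairs: &[(u64, V)], max_tries: u64) -> Option<Self> {
        let n = pairs.len();
        for seed in 0..max_tries {
            let mut slots: Vec<Option<V>> = vec![None; n];
            // `replace` returns the old content, so `is_none()` means "slot was free".
            let ok = pairs.iter().all(|(k, v)| {
                slots[hash_with_seed(*k, seed, n)].replace(v.clone()).is_none()
            });
            if ok {
                // n keys in n slots without collision: every slot is filled.
                let values = slots.into_iter().flatten().collect();
                return Some(Self { seed, values });
            }
        }
        None
    }

    /// The caller must guarantee that `key` is in K; for any other key this
    /// returns some unrelated value, since no keys are stored for checking.
    fn get(&self, key: u64) -> &V {
        &self.values[hash_with_seed(key, self.seed, self.values.len())]
    }
}
```

This matches the dict setting described earlier, where $x\in\mathcal{K}$ is guaranteed and only the associated value has to be returned.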
+monotone
Let the key universe be the integers $\mathcal{X} = [u]$. A minimal perfect hash function that satisfies $x\lt y \implies h(x)\lt h(y)$ is called a monotone minimal perfect hash function*9.
In short, what is commonly called coordinate compression corresponds to a monotone minimal perfect hash function. Almost nobody in the competitive programming community seems to call it that.
Some schools also distinguish hash functions that preserve an arbitrary given order, not just the order of integers ($x\prec y \implies h(x)\lt h(y)$), by calling them order-preserving.
This can apparently be done with $O(n\log(\log(u)))$ bits in expected $O(n)$ time. If $O(\log(\log(u)))$ query time is acceptable, there is reportedly also a method that brings the space down to $O(n\log(\log(\log(u))))$ bits.
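For comparison, the everyday coordinate compression mentioned above looks like this in Rust when viewed as a monotone minimal perfect hash function. The name `Compressor` is made up for this sketch. It keeps all keys and answers in $O(\log n)$ time by binary search, so it is the familiar baseline rather than any of the space-efficient structures.

```rust
/// Coordinate compression viewed as a monotone minimal perfect hash function.
struct Compressor {
    sorted: Vec<u64>,
}

impl Compressor {
    /// Sorts the keys and removes duplicates; the remaining n keys form K.
    fn new(mut keys: Vec<u64>) -> Self {
        keys.sort_unstable();
        keys.dedup();
        Self { sorted: keys }
    }

    /// h(x) = rank of x in K, found by binary search. Returns None if x is not in K.
    /// Ranks are 0..n and increase with x, so h is minimal, perfect, and monotone.
    fn h(&self, x: u64) -> Option<usize> {
        self.sorted.binary_search(&x).ok()
    }
}
```

For the keys 50, 7, 1000, 7, 23 this gives $h(7) = 0$, $h(23) = 1$, $h(50) = 2$, $h(1000) = 3$.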
References
These topics are covered in Section 4.5.3, Dictionaries, Sets, and Hashing, of Navarro, Gonzalo. Compact data structures: A practical approach. Cambridge University Press, 2016. It is fairly expensive, but I received some New Year's gift money, so I bought it right at the start of the new year*10. Money is a bit tight this month.
The following also look like interesting reads. えびちゃん has not read them yet.
- Belazzougui, Djamal, Paolo Boldi, Rasmus Pagh, and Sebastiano Vigna. "Theory and practice of monotone minimal perfect hashing." Journal of Experimental Algorithmics (JEA) 16 (2008): 3-1.
- Belazzougui, Djamal, Paolo Boldi, Rasmus Pagh, and Sebastiano Vigna. "Monotone minimal perfect hashing: searching a sorted table with $O(1)$ accesses." In Proceedings of the twentieth annual ACM-SIAM symposium on Discrete algorithms, pp. 785-794. Society for Industrial and Applied Mathematics, 2009.
Measurements
I gave it a try, for what it's worth.
- Naive approach with BTreeMap, 51 ms
- Radix sort and cuckoo hashing, 46 ms
- Binary search on an array, 40 ms
Meh. Maybe more tuning would make it impressive.
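The benchmark setup itself is not shown, so the following is only a hypothetical Rust harness for comparing two of the approaches listed above. The helper names `timed`, `ranks_btree`, and `ranks_bsearch` are invented for this sketch, and it makes no attempt to reproduce the timings reported here.

```rust
use std::collections::BTreeMap;
use std::time::{Duration, Instant};

/// Runs `f` once and returns its result together with the elapsed wall-clock time.
fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}

/// Answers every query with a BTreeMap built from `a[i] -> i`.
/// Panics if a query is not in `a`, matching the problem's guarantee.
fn ranks_btree(a: &[u64], queries: &[u64]) -> Vec<usize> {
    let map: BTreeMap<u64, usize> = a.iter().enumerate().map(|(i, &x)| (x, i)).collect();
    queries.iter().map(|q| map[q]).collect()
}

/// Answers every query by binary search on the sorted slice `a`.
fn ranks_bsearch(a: &[u64], queries: &[u64]) -> Vec<usize> {
    queries
        .iter()
        .map(|q| a.binary_search(q).expect("query must be contained in a"))
        .collect()
}
```

A call such as `timed(|| ranks_bsearch(&a, &queries))` returns the answers and a `Duration`. Comparing the answer vectors of both functions before looking at the times is a cheap sanity check.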
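Thoughts
I get the feeling that hash-based data structures and the like do not really catch on. Maybe some people use hashing in heuristic contests; えびちゃん cannot get interested in heuristic contests, so I don't know.
Most of what shows up in competitive programming is integers (including strings over an integer alphabet), so integer-oriented data structures ought to be more popular. Yet even the wavelet matrix (though it does seem to have become somewhat widely known recently) still feels like something only those in the know are aware of.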
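No, if anything, people rarely come up with problems that go well beyond "the basic operations of WM" (and that cannot be replaced by something like segment tree + plane sweep), so as a result problems that intend it do not appear in major contests, and maybe that is why it gets treated as a fringe data structure
えびちゃん (@rsk0315_h4x), December 11, 2022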
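WM and other "data structures that exploit properties of integers": I have the impression they were considered cheating in the ancient competitive programming community? But that does not seem to be the case for (binary) tries, so was it WM that was the cheat?
えびちゃん (@rsk0315_h4x), December 11, 2022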
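Techniques that shave off a log factor do not seem very popular either. Maybe that can't be helped.
Some people steer clear of them, saying "log-shaving techniques often have bad constant factors...", and yet also say "speeding things up with bit parallelism is just a non-essential constant-factor speedup...".
Everyone has their own interests, so it is probably best to look into whichever algorithms interest you.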
The end
Best wishes for this year too.
*1:I did not say worst case, so maybe it is not impossible? I don't know, but it does not feel easy.
*2:In particular, we do not consider predecessor queries of the form "what is the largest $a_i$ that is at most the given $x$?".
*3:There may or may not be a climate of "no interest in anything but worst-case complexity; expected complexity is FAKE". I understand the feeling.
*4:We assume that a suitable hash value for $a_i$ can be computed in constant time.
*5:Not that I write articles to cater to the interests of the competitive programming community.
*6:This $A$ requires $\log(\binom{|\mathcal{X}|}{n}) = n\log(|\mathcal{X}|/n) + O(n)$ bits.
*7:It seems fine even if $h(x) = h(y)$ for $x\in\mathcal{X}, y\in\mathcal{X}\setminus\mathcal{K}$, or is it not?
*8:Meaning that no insertions happen. Deletions do not seem to be a problem here, but it may be usual to assume there are no deletions either.
*9:Non-minimal ones might be called something like monotone perfect hash functions, but we do not consider those here.
*10:@rsk0315_h4x, thank you very much.