It is well known that Clang turns $\sum_{i=0}^n i$ into $n(n+1)/2$ *1.
It also turns $\sum_{i=0}^n i^2$ into $n(n+1)(2n+1)/6$. In the process it performs a computation like the following:

    unsigned v1 = n * (n - 1) * (n - 2) / 2 * 1431655766u;
    unsigned v2 = n * (n - 1) / 2;
    return 3 * v2 + v1 + n;

Here $1431655766 = (2^{32}+2)/3$, so for a 32-bit value $x$ we have $$ x\equiv 0\pmod{3} \implies (x\times 1431655766)\bmod 2^{32} = \tfrac23 x. $$
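As a quick sanity check of this constant, here is a minimal Python sketch. Python integers do not wrap, so the 32-bit wrap-around is applied explicitly with `% 2**32`; the helper name `times_c` is made up for this illustration.

```python
M = 1 << 32          # arithmetic on `unsigned` wraps modulo 2**32
C = 1431655766       # the constant from the snippet above

# C is exactly (2**32 + 2) / 3.
assert 3 * C == M + 2

def times_c(x):
    """Multiply by C the way a 32-bit unsigned multiply would."""
    return (x * C) % M

# For multiples of 3 the wrapped product equals 2x/3.
samples = list(range(0, 3000, 3)) + [M - 1]   # 2**32 - 1 is a multiple of 3
for x in samples:
    assert x % 3 == 0 and times_c(x) == 2 * x // 3
```

In other words, on multiples of 3 the multiplication behaves like "divide by 3, then double", without any division instruction.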
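This article asks: if we let Clang optimize $\sum_{i=0}^n i^k$ for larger $k$, might we discover something new? Had there been no discovery, the article would have been shelved; there was one, so here it is.
Addendum
↑ You may be better off reading that article instead.
What follows observes the emitted assembly and guesses what is going on, whereas the article mentioned above admirably reads the LLVM source code. I think there is still something to learn from what follows, but it is closer to learning from the result of the optimization than to studying the Clang/LLVM optimization technique itself.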
Taking a look
As always, I am much obliged. We set the compiler to x86-64 clang 16.0.0 with `-O3`.
I gave a rough introduction to assembly in another article, so I will assume you can read it to some extent and skip detailed explanations.
Sum of 0th powers
The dubious term is perhaps $i = 0$, but we take $0^0 = 1$.

    unsigned sum(unsigned n) {
        unsigned res = 0;
        for (unsigned i = 0; i <= n; ++i) res += 1;
        return res;
    }

    sum(unsigned int):
            lea     eax, [rdi + 1]
            ret

That is, $\sum_{i=0}^n i^0 = n + 1$. No surprises there.
Sum of 1st powers

    unsigned sum(unsigned n) {
        unsigned res = 0;
        for (unsigned i = 0; i <= n; ++i) res += i;
        return res;
    }

    sum(unsigned int):
            mov     ecx, edi
            lea     eax, [rdi - 1]
            imul    rax, rcx
            shr     rax
            add     eax, edi
            ret

It appears to compute $\sum_{i=0}^n i = \tfrac12 n(n-1) + n$, forming the product $n(n-1)$ in 64 bits so that halving it loses nothing.
Sum of 2nd powers
Integers have nothing like `.pow(2)`, so our artisan lovingly writes out each `* i` by hand.

    unsigned sum(unsigned n) {
        unsigned res = 0;
        for (unsigned i = 0; i <= n; ++i) res += i * i;
        return res;
    }

    sum(unsigned int):
            mov     eax, edi
            lea     ecx, [rdi - 1]
            imul    rcx, rax
            lea     eax, [rdi - 2]
            imul    rax, rcx
            shr     rax
            imul    edx, eax, 1431655766
            shr     rcx
            lea     eax, [rcx + 2*rcx]
            add     eax, edi
            add     eax, edx
            ret

From the final `lea` onward the answer is accumulated into `eax`, as shown below. The simplification part is just a finger exercise.
$$ \begin{aligned} \sum_{i=0}^n i^2 &= \underbrace{\tfrac32 n(n-1)}_{\text{\texttt{3*rcx}}} + \underbrace{\vphantom{\tfrac12} n}_{\text{\texttt{edi}}} + \underbrace{\tfrac12 n(n-1)(n-2) \times 1431655766}_{\text{\texttt{edx}}} \\ &\equiv \tfrac32 n(n-1) + n + \tfrac13 n(n-1)(n-2) \pmod{2^{32}} \\ &= \tfrac12 n(n-1) + n + n(n-1) + \tfrac13 n(n-1)(n-2) \\ &= \tfrac12 n(n+1) + \tfrac13 n(n-1)(3+(n-2)) \\ &= \tfrac12 n(n+1) + \tfrac13 n(n-1)(n+1) \\ &= \tfrac16 n(n+1)(3+2(n-1)) \\ &= \tfrac16 n(n+1)(2n+1). \end{aligned} $$
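The register arithmetic for the sum of squares can be replayed step by step. The following is only a sketch, not Clang's code: `sum_sq_clang` is a hypothetical helper that mimics each instruction with explicit wrap-around, assuming `0 <= n < 2**32`, and compares the result with a naive loop.

```python
M32, M64 = 1 << 32, 1 << 64

def sum_sq_clang(n):
    """Replay the listing for 0 <= n < 2**32, wrapping like the registers do."""
    rcx = ((n - 1) % M32) * n                # lea ecx, [rdi - 1]; imul rcx, rax
    rax = (((n - 2) % M32) * rcx) % M64      # lea eax, [rdi - 2]; imul rax, rcx
    edx = ((rax >> 1) * 1431655766) % M32    # shr rax; imul edx, eax, 1431655766
    rcx >>= 1                                # shr rcx
    return (3 * rcx + n + edx) % M32         # lea eax, [rcx + 2*rcx]; add; add

def sum_sq_naive(n):
    return sum(i * i for i in range(n + 1)) % M32

assert all(sum_sq_clang(n) == sum_sq_naive(n) for n in range(1000))
```

The 64-bit product may overflow for large `n`, but only bits 1 to 32 of it are used, so the result is still correct modulo $2^{32}$.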
Sum of 3rd powers

    unsigned sum(unsigned n) {
        unsigned res = 0;
        for (unsigned i = 0; i <= n; ++i) res += i * i * i;
        return res;
    }

    sum(unsigned int):
            mov     eax, edi
            lea     ecx, [rdi - 1]
            imul    rcx, rax
            lea     eax, [rdi - 2]
            mov     edx, ecx
            lea     esi, [rdi - 3]
            imul    rsi, rax
            imul    rsi, rcx
            shr     rcx
            lea     r8d, [8*rcx]
            sub     r8d, ecx
            imul    edx, eax
            and     edx, -2
            shr     rsi, 2
            and     esi, -2
            add     r8d, edi
            lea     eax, [r8 + 2*rdx]
            add     eax, esi
            ret

It is getting long. Registers such as `rsi` and `r8d` now appear; `r8d` is the low 32 bits (dword) of `r8`.
The part that looks new is probably `and edx, -2`.
Bear with me while I interpret this for humans. Written in a high-level-language style, based on the final value of each register, it becomes the following.

    using ul = unsigned long;  // assumes a 64-bit unsigned long (e.g. x86-64 Linux)
    unsigned sum(unsigned n) {
        unsigned edx = (ul(n - 2) * ul(n - 1) * ul(n)) & -2;
        unsigned long rsi = ul(n - 3) * ul(n - 2) * ul(n - 1) * ul(n) / 4;
        unsigned esi = rsi & -2;
        unsigned long r8d = ul(n - 1) * ul(n) / 2 * 7 + n;
        return ul(r8d) + 2 * ul(edx) + esi;
    }

`-2 == 0xfffffffe`, that is, `~1` (every bit set except the lowest). So `k & -2` means the following, where `&` is written $\wedge$ in formulas.
$$ k \wedge (-2) = \begin{cases} k, & \text{if }k\equiv 0\pmod{2}; \\ k-1, & \text{if }k\equiv 1\pmod{2}. \end{cases} $$
Now, $n(n-1)(n-2)$ is a multiple of $2$, and so is $n(n-1)(n-2)(n-3)/4$, so dropping `& -2` should not change the value *2.
So let us rewrite it a little further.

    using ul = unsigned long;  // assumes a 64-bit unsigned long (e.g. x86-64 Linux)
    unsigned sum(unsigned n) {
        unsigned edx = ul(n - 2) * ul(n - 1) * ul(n);
        unsigned long rsi = ul(n - 3) * ul(n - 2) * ul(n - 1) * ul(n) / 4;
        unsigned esi = rsi;
        unsigned long r8d = ul(n - 1) * ul(n) / 2 * 7 + n;
        return ul(r8d) + 2 * ul(edx) + esi;
    }

I could not think of a neat way to organize the simplification, so I will skip it, but the result is the desired one.
$$ \begin{aligned} \sum_{i=0}^n i^3 &= \underbrace{\tfrac72 n(n-1)+n}_{\text{\texttt{r8d}}} + \underbrace{\vphantom{\tfrac12} 2n(n-1)(n-2)}_{\text{\texttt{2*edx}}} + \underbrace{\tfrac14 n(n-1)(n-2)(n-3)}_{\text{\texttt{esi}}} \\ &= \tfrac14 n^2(n+1)^2. \end{aligned} $$
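Both claims (that the two `and ..., -2` masks change nothing, and that the three terms add up to the sum of cubes) can be checked numerically. The sketch below is an illustration under the assumption `0 <= n < 2**32`; `falling` and `sum_cubes_clang` are hypothetical helper names, and the factors are wrapped to 32 bits and the products to 64 bits to mirror the registers.

```python
M32, M64 = 1 << 32, 1 << 64

def falling(n, j):
    """n (n-1) ... (n-j+1) with 32-bit factors and a 64-bit wrapping product."""
    p = 1
    for t in range(j):
        p = (p * ((n - t) % M32)) % M64
    return p

def sum_cubes_clang(n, mask=True):
    edx = falling(n, 3) % M32              # imul edx, eax (32-bit product)
    esi = (falling(n, 4) >> 2) % M32       # shr rsi, 2
    if mask:                               # and edx, -2 / and esi, -2
        edx &= M32 - 2
        esi &= M32 - 2
    r8d = 7 * (falling(n, 2) >> 1) + n     # 8*rcx - rcx, then add edi
    return (r8d + 2 * edx + esi) % M32

for n in list(range(500)) + [4_000_000_000, M32 - 1]:
    expected = (n * (n + 1) // 2) ** 2 % M32
    assert sum_cubes_clang(n) == sum_cubes_clang(n, mask=False) == expected
```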
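A bit of tidying up
Judging from what the assembly computes, $\gdef\perm#1#2{{{}_{#1}\mathrm{P}_{#2}}}$ it seems to be finding coefficients $c_{\ast, \ast}$ such that the sum can be written in the form $$\sum_{i=0}^n i^k = c_{k, 0}\cdot \perm{n}{k+1} + c_{k, 1}\cdot \perm{n}{k} + \dots + c_{k, k}\cdot \perm{n}{1}.$$ Noting that $\perm{n}{j}$ is a polynomial of degree $j$ in $n$, the coefficients can be fixed one at a time from the highest degree down so as to match the desired polynomial, so they should be uniquely determined. A moment's thought also shows that the constant term is $0$.
Let us write down the table of $c_{k, j}$ obtained so far.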
$k$ \ $j$ | $0$ | $1$ | $2$ | $3$ |
---|---|---|---|---|
$1$ | $\tfrac12$ | $1$ | - | - |
$2$ | $\tfrac13$ | $\tfrac32$ | $1$ | - |
$3$ | $\tfrac14$ | $2$ | $\tfrac72$ | $1$ |
By the way, this looks like a "please do polynomial interpolation" problem. With $k$ fixed, the sum is a polynomial of degree $k+1$, and we want its $k+2$ coefficients including the constant term, so the first $k+2$ values of the $k$-th power sum should be enough to determine it. Since we know the constant term is $0$, we leave it out below.
Let $A_k$ be the $(k+1)\times(k+1)$ matrix whose $(i, j)$ entry is $\perm{i}{k+1-j}$ for $1\le i\le k+1$ and $0\le j\le k$, and let $b$ be the vector whose $i$-th entry is $\sum_{u=0}^i u^k$. Then the solution $x$ of $A_k x=b$ should be $x=(c_{k, 0}, c_{k, 1}, \dots, c_{k, k})^{\top}$. Let us peek ahead at the coefficients for the sum of 4th powers.
$$
\begin{pmatrix}
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 2 & 2 \\
0 & 0 & 6 & 6 & 3 \\
0 & 24 & 24 & 12 & 4 \\
120 & 120 & 60 & 20 & 5
\end{pmatrix}
\cdot
\begin{pmatrix}
c_{4, 0} \\ c_{4, 1} \\ c_{4, 2} \\ c_{4, 3} \\ c_{4, 4}
\end{pmatrix}
=
\begin{pmatrix}
1 \\ 17 \\ 98 \\ 354 \\ 979
\end{pmatrix}
$$
Here the right-hand side entries are $1^4$, $1^4+2^4$, $1^4+2^4+3^4$, $1^4+2^4+3^4+4^4$, and $1^4+2^4+3^4+4^4+5^4$.
This gives $c_4 = (\tfrac15, \tfrac52, \tfrac{25}3, \tfrac{15}2, 1)^{\top}$.
Let us do the sum of $5$th powers as well.
This gives $c_5 = (\tfrac16, 3, \tfrac{65}4, 30, \tfrac{31}2, 1)^{\top}$.
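Because the matrix is triangular (up to reversing the columns), the system can be solved by substitution, one row at a time. Here is a minimal sketch with exact rational arithmetic; the function name `falling_coeffs` is an assumption of this example, and it presumes $k\ge 1$ so that the $i=0$ term of the sum vanishes.

```python
from fractions import Fraction
from math import perm

def falling_coeffs(k):
    """Return [c_{k,0}, ..., c_{k,k}] such that
    sum_{i=0}^{n} i^k = sum_j c_{k,j} * P(n, k+1-j), for k >= 1."""
    a = {}                       # a[i] is the coefficient of P(n, i)
    power_sum = 0
    for i in range(1, k + 2):    # row i of the system: plug in n = i
        power_sum += i ** k      # 1^k + ... + i^k
        known = sum(a[j] * perm(i, j) for j in range(1, i))
        a[i] = Fraction(power_sum - known, perm(i, i))
    return [a[k + 1 - j] for j in range(k + 1)]

assert falling_coeffs(4) == [Fraction(1, 5), Fraction(5, 2), Fraction(25, 3), Fraction(15, 2), 1]
assert falling_coeffs(5) == [Fraction(1, 6), 3, Fraction(65, 4), 30, Fraction(31, 2), 1]
```

Row $i$ only involves $\perm{i}{1}, \dots, \perm{i}{i}$ because $\perm{i}{j} = 0$ for $j > i$, which is why each step introduces exactly one new unknown.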
Let us revisit the table.
$k$ \ $j$ | $0$ | $1$ | $2$ | $3$ | $4$ | $5$ |
---|---|---|---|---|---|---|
$1$ | $\tfrac12$ | $1$ | - | - | - | - |
$2$ | $\tfrac13$ | $\tfrac32$ | $1$ | - | - | - |
$3$ | $\tfrac14$ | $\tfrac63$ | $\tfrac72$ | $1$ | - | - |
$4$ | $\tfrac15$ | $\tfrac{10}4$ | $\tfrac{25}3$ | $\tfrac{15}2$ | $1$ | - |
$5$ | $\tfrac16$ | $\tfrac{15}5$ | $\tfrac{65}4$ | $\tfrac{90}3$ | $\tfrac{31}2$ | $1$ |
Wait, what? Could it be that $c_{k, j} = \frac{{k+1 \brace k+1-j}}{k+1-j}$?? I had no reason whatsoever to expect this. Why? I do not understand at all; do such numbers really come out this neatly?
I got flustered with surprise. Here ${n\brace k}$ denotes the Stirling numbers of the second kind *3.
The justification is given in the second half of the article.
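The conjectured pattern is easy to tabulate. The sketch below computes Stirling numbers of the second kind with the standard recurrence ${m\brace j} = {m-1\brace j-1} + j\,{m-1\brace j}$ and forms $c_{k,j}$ from them; `stirling2_row` and `conjectured_coeffs` are names invented for this example.

```python
from fractions import Fraction

def stirling2_row(n):
    """Stirling numbers of the second kind S(n, 0), ..., S(n, n)."""
    row = [1]                                  # S(0, 0) = 1
    for m in range(1, n + 1):
        prev = row + [0]
        row = [0] + [prev[j - 1] + j * prev[j] for j in range(1, m + 1)]
    return row

def conjectured_coeffs(k):
    """c_{k,j} = S(k+1, k+1-j) / (k+1-j) for j = 0, ..., k."""
    s = stirling2_row(k + 1)
    return [Fraction(s[k + 1 - j], k + 1 - j) for j in range(k + 1)]

for k in range(1, 6):
    print(k, [str(c) for c in conjectured_coeffs(k)])
```

The printed rows can be compared directly with the table above.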
えびちゃん (@rsk0315_h4x), September 17, 2023:
"Can a degree-n polynomial f(x) be rewritten quickly from a linear combination of x^i into a linear combination of P(x, i)? Or is that impossible? (If needed, you may assume the values f(0), ..., f(n) are given.)"
hotman (@hotmanww), September 17, 2023:
"It's O(N(logN)^2)!!! https://t.co/V04MYjYJIj"
The polynomial for the $k$-th power sum can itself be obtained by polynomial interpolation in $O(n\log(n)^2)$ time, so combining it with the above, ${n\brace 0}, {n\brace 1}, \dots, {n\brace n}\pmod{998244353}$ can apparently all be computed together in $O(n\log(n)^2)$ time. Surprising, isn't it? (Did you know??)
↑ In the first place, the Stirling numbers of the second kind for a fixed $n$ could already be computed in $O(n\log(n))$ time. ↑ Using that, the $k$-th power sum should also be computable in $O(k\log(k))$ time *4.
At any rate, I think we can now predict what the assembly emitted by almighty Clang will look like.
Sum of 4th powers
Pulling ourselves together, let us get on with the original programme.
I just realized that there is no need to paste C++ code that merely says `+= i * i * i * i`. Here is the assembly that Clang served up.

    sum(unsigned int):
            mov     ecx, edi
            lea     eax, [rdi - 1]
            imul    rax, rcx
            lea     ecx, [rdi - 2]
            imul    rcx, rax
            lea     edx, [rdi - 3]
            imul    rdx, rcx
            lea     esi, [rdi - 4]
            imul    rsi, rdx
            shr     rsi, 3
            imul    esi, esi, 1717986920
            shr     rcx
            imul    ecx, ecx, 1431655782
            shr     rdx, 3
            lea     edx, [rdx + 4*rdx]
            shr     rax
            lea     eax, [rax + 4*rax]
            lea     r8d, [rax + 2*rax]
            add     ecx, edi
            add     ecx, esi
            lea     eax, [rcx + 4*rdx]
            add     eax, r8d
            ret

Written immutably, it becomes the following. `ecx` has turned into something impressive.

    unsigned sum(unsigned n) {
        unsigned ecx = unsigned(ul(n - 2) * ul(n - 1) * n / 2) * 1431655782u
                     + n
                     + unsigned((ul(n - 4) * (n - 3) * (n - 2) * (n - 1) * n) / 8) * 1717986920u;
        unsigned edx = 5 * (ul(n - 3) * (n - 2) * (n - 1) * n / 8);
        unsigned r8d = 3 * 5 * (ul(n - 1) * n / 2);
        return ecx + 4 * edx + r8d;
    }

First, let us think about `* 1431655782u` and `* 1717986920u`.
We have $1431655782 = \tfrac13\,(2^{32}+50)$ and $1717986920 = \tfrac25\,(2^{32}+4)$.
Hence the following holds.
$$
\begin{aligned}
3x\times 1431655782
&= 3x\times \tfrac13\,(2^{32}+50) \\
&= x\times(2^{32}+50) \\
&\equiv 50x \pmod{2^{32}}, \\
5x\times 1717986920
&= 5x\times \tfrac25\,(2^{32}+4) \\
&= 2x\times(2^{32}+4) \\
&\equiv 8x \pmod{2^{32}}.
\end{aligned}
$$
Noting that $\tfrac12 n(n-1)(n-2)$ is a multiple of $3$ and $\tfrac18 n(n-1)(n-2)(n-3)(n-4)$ is a multiple of $5$, we obtain, modulo $2^{32}$, $$ \begin{aligned} \sum_{i=0}^n i^4 &= \underbrace{\tfrac{50}3\,\tfrac12\,n(n-1)(n-2) + n + \tfrac85\,\tfrac18\, n(n-1)(n-2)(n-3)(n-4)}_{\text{\texttt{ecx}}} + {} \\ &\phantom{{}={}} \qquad \underbrace{4\cdot\tfrac58 n(n-1)(n-2)(n-3)}_{\text{\texttt{4*edx}}} + \underbrace{3\cdot \tfrac52 n(n-1)}_{\text{\texttt{r8d}}} \\ &= \tfrac15\, \perm{n}{5} + \tfrac52\, \perm{n}{4} + \tfrac{25}3\, \perm{n}{3} + \tfrac{15}2\, \perm{n}{2} + n. \end{aligned} $$
This agrees with the coefficients found earlier. From here on I will no longer expand into a linear combination of $n^i$.
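The whole computation for the sum of 4th powers can be replayed in the same spirit. This is a sketch under the assumption `0 <= n < 2**32`; `falling` and `sum4_clang` are hypothetical helpers that wrap factors to 32 bits and products to 64 bits, mirroring the registers in the listing.

```python
M32, M64 = 1 << 32, 1 << 64

# The two magic constants, written as exact identities.
assert 3 * 1431655782 == M32 + 50
assert 5 * 1717986920 == 2 * (M32 + 4)

def falling(n, j):
    """n (n-1) ... (n-j+1) with 32-bit factors and a 64-bit wrapping product."""
    p = 1
    for t in range(j):
        p = (p * ((n - t) % M32)) % M64
    return p

def sum4_clang(n):
    p2, p3, p4, p5 = (falling(n, j) for j in (2, 3, 4, 5))
    ecx = ((p3 >> 1) * 1431655782 + n + (p5 >> 3) * 1717986920) % M32
    edx = 5 * (p4 >> 3)
    r8d = 15 * (p2 >> 1)
    return (ecx + 4 * edx + r8d) % M32

assert all(sum4_clang(n) == sum(i ** 4 for i in range(n + 1)) % M32 for n in range(300))
```

The 64-bit products overflow for large `n`, but only their low bits (after the shifts) reach the 32-bit result, so the wrap-around is harmless.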
Sum of 5th powers
Clang is apparently not ready to give up yet.

    sum(unsigned int):
            mov     ecx, edi
            lea     eax, [rdi - 1]
            imul    rax, rcx
            lea     edx, [rdi - 2]
            imul    rdx, rax
            lea     r8d, [rdi - 3]
            imul    r8, rdx
            lea     ecx, [rdi - 4]
            imul    rcx, r8
            lea     esi, [rdi - 5]
            imul    rsi, rcx
            shr     rsi, 4
            imul    esi, esi, 1431655768
            shr     r8, 3
            mov     r9d, r8d
            shl     r9d, 7
            lea     r8d, [r9 + 2*r8]
            shr     rdx
            imul    edx, edx, 60
            shr     rax
            mov     r9d, eax
            shl     r9d, 5
            sub     r9d, eax
            shr     rcx, 3
            lea     eax, [rcx + 2*rcx]
            add     r8d, edi
            add     r8d, edx
            add     r8d, r9d
            add     r8d, esi
            lea     eax, [r8 + 8*rax]
            ret

Converting this to immutable form is done by hand by our artisan, and it is hard work.

    unsigned sum(unsigned n) {
        unsigned edi = n;
        unsigned esi = ((ul(n - 5) * (n - 4) * (n - 3) * (n - 2) * (n - 1) * n) / 16) * 1431655768u;
        unsigned r8d = ul(n - 3) * (n - 2) * (n - 1) * n / 8 * 128 + 2 * ul(n - 3) * (n - 2) * (n - 1) * n / 8;
        unsigned edx = (ul(n - 2) * (n - 1) * n / 2) * 60;
        unsigned r9d = ul(n - 1) * n / 2 * 31;
        unsigned eax = 3 * ul(n - 4) * (n - 3) * (n - 2) * (n - 1) * n / 8;
        return edi + edx + r9d + esi + r8d + 8 * eax;
    }

Consider `* 1431655768u`. Since $1431655768 = \tfrac13\,(2^{32}+8)$,
$$3x\times1431655768 \equiv 8x\pmod{2^{32}}$$
holds. We are getting used to this.
$$ \begin{aligned} \sum_{i=0}^n i^5 &= \underbrace{\vphantom{\tfrac12}n}_{\text{\texttt{edi}}} + \underbrace{60\cdot\tfrac12 n(n-1)(n-2)}_{\text{\texttt{edx}}} + \underbrace{31\cdot \tfrac12 n(n-1)}_{\text{\texttt{r9d}}} + {} \\ &\phantom{{}={}}\qquad \underbrace{\tfrac83\tfrac1{16} n(n-1)(n-2)(n-3)(n-4)(n-5)}_{\text{\texttt{esi}}} + {} \\ &\phantom{{}={}}\qquad \underbrace{128\cdot\tfrac18 n(n-1)(n-2)(n-3) + 2\cdot\tfrac18n(n-1)(n-2)(n-3)}_{\text{\texttt{r8d}}} + {} \\ &\phantom{{}={}}\qquad \underbrace{8\cdot 3\cdot\tfrac18 n(n-1)(n-2)(n-3)(n-4)}_{\text{\texttt{8*eax}}} \\ &= \tfrac16\, \perm{n}{6} + 3\, \perm{n}{5} + \tfrac{65}4\, \perm{n}{4} + 30\, \perm{n}{3} + \tfrac{31}2\, \perm{n}{2} + n \end{aligned} $$
This matches the earlier table.
It is supposed to be an impressive optimization, yet the novelty is wearing off; that is a sign that we have grasped the pattern.
Sum of 6th powers
We continue a little longer, because the pattern is about to change.

    sum(unsigned int):
            mov     eax, edi
            lea     ecx, [rdi - 1]
            imul    rcx, rax
            lea     eax, [rdi - 2]
            imul    rax, rcx
            lea     r8d, [rdi - 3]
            imul    r8, rax
            lea     r9d, [rdi - 4]
            imul    r9, r8
            lea     esi, [rdi - 5]
            imul    rsi, r9
            lea     edx, [rdi - 6]
            imul    rdx, rsi
            shr     rdx, 4
            imul    edx, edx, 1840700272
            shr     rax
            imul    eax, eax, 1431655966
            shr     r8, 3
            imul    r8d, r8d, 700
            shr     r9, 3
            imul    r9d, r9d, 224
            shr     rcx
            mov     r10d, ecx
            shl     r10d, 6
            sub     r10d, ecx
            shr     rsi, 4
            imul    ecx, esi, 56
            add     eax, edi
            add     eax, r8d
            add     eax, r9d
            add     eax, r10d
            add     eax, ecx
            add     eax, edx
            ret

The artisan is getting used to it too, so the work goes faster. The code first loads $\perm{n}{i}$ into the registers and then simply multiplies by cleverly chosen coefficients.

    unsigned sum(unsigned n) {
        unsigned edi = n;
        unsigned edx = ul(n - 6) * (n - 5) * (n - 4) * (n - 3) * (n - 2) * (n - 1) * n / 16 * 1840700272u;
        unsigned eax = ul(n - 2) * (n - 1) * n / 2 * 1431655966u;
        unsigned r8d = ul(n - 3) * (n - 2) * (n - 1) * n / 8 * 700;
        unsigned r9d = ul(n - 4) * (n - 3) * (n - 2) * (n - 1) * n / 8 * 224;
        unsigned r10d = ul(n - 1) * n / 2 * 63;
        unsigned ecx = ul(n - 5) * (n - 4) * (n - 3) * (n - 2) * (n - 1) * n / 16 * 56;
        return eax + edi + r8d + r9d + r10d + ecx + edx;
    }

Consider `* 1840700272u` and `* 1431655966u`.
The artisan finds them as follows.

    >>> 2**32 / 1840700272  # just divide, for a start
    2.3333333304358854
    >>> 3 * 2**32 / 1840700272  # looks like a value with 3 in the denominator, so multiply by 3
    6.999999991307656
    >>> 7 * 1840700272 % 2**32  # close to 7, so see what multiplying by 7 does
    16
    >>> divmod((3*2**32 + 16), 7)  # sanity check
    (1840700272, 0)

We have $1840700272 = \tfrac17\,(3\cdot 2^{32}+16)$ and $1431655966 = \tfrac13\,(2^{32}+602)$, so $$ \begin{aligned} 7x\times 1840700272 &= x\times(3\cdot 2^{32}+16) \\ &\equiv 16x \pmod{2^{32}}, \\ 3x\times 1431655966 &= x\times(2^{32} + 602) \\ &\equiv 602x \pmod{2^{32}} \end{aligned} $$ holds.
Looking at `edx`, $\perm{n}{7}/16$ is a multiple of $7$, so $1840700272$ can be read as $\tfrac{16}7$.
Likewise, for `eax`, $\perm{n}{3}/2$ is a multiple of $3$, so $1431655966$ can be read as $\tfrac{602}3$.
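The manual search for "which fraction does this constant stand for" can be automated. The following is a rough sketch, not how the compiler derives the constants: `decode_magic` is a hypothetical helper that tries small denominators `a` and reports the first one for which `a * m` wraps to a small number `c`, so that `m` acts as `c / a` on multiples of `a`. The search bounds `max_den` and `max_num` are arbitrary choices for this illustration.

```python
M32 = 1 << 32

def decode_magic(m, max_den=16, max_num=1 << 16):
    """Return (c, a) with a * m ≡ c (mod 2**32) for the smallest a giving a small c,
    or None if no such pair is found within the bounds."""
    for a in range(1, max_den + 1):
        c = (a * m) % M32
        if c < max_num:
            return c, a
    return None

for m in (1431655766, 1431655782, 1717986920, 1431655768,
          1840700272, 1431655966, 2863312240):
    print(m, decode_magic(m))
```

For example, `decode_magic(1840700272)` yields `(16, 7)` and `decode_magic(1431655966)` yields `(602, 3)`, matching the readings $\tfrac{16}7$ and $\tfrac{602}3$.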
After that, it suffices to pick out the desired coefficients.
$$ \sum_{i=0}^n i^6 = \tfrac17\,\perm n7 + \tfrac72\,\perm n6 + 28\,\perm n5 + \tfrac{175}2\,\perm n4 + \tfrac{301}3\,\perm n3 + \tfrac{63}2\,\perm n2 + n. $$
Listing the coefficients from the lowest degree gives $(\tfrac11, \tfrac{63}2, \tfrac{301}3, \tfrac{350}4, \tfrac{140}5, \tfrac{21}6, \tfrac17)$. Looks right. Please consult a table of Stirling numbers as needed.
Sum of 7th powers
Unfortunately the pattern still does not change, and the listing is getting really long.

    sum(unsigned int):
            mov     eax, edi
            lea     ecx, [rdi - 1]
            imul    rcx, rax
            lea     edx, [rdi - 2]
            imul    rdx, rcx
            lea     eax, [rdi - 3]
            imul    rax, rdx
            lea     esi, [rdi - 4]
            imul    rsi, rax
            shr     rax, 3
            imul    eax, eax, 3402
            lea     r8d, [rdi - 5]
            imul    r8, rsi
            shr     rsi, 3
            imul    esi, esi, 1680
            shr     rdx
            imul    edx, edx, 644
            shr     rcx
            mov     r9d, ecx
            shl     r9d, 7
            sub     r9d, ecx
            lea     ecx, [rdi - 6]
            mov     r10d, r8d
            imul    r10d, ecx
            and     r10d, -16
            lea     r11d, [rdi - 7]
            imul    r11, rcx
            imul    r11, r8
            shr     r11, 3
            and     r11d, -16
            shr     r8, 4
            imul    ecx, r8d, -1431655056
            add     eax, edi
            add     eax, edx
            add     eax, r9d
            add     eax, esi
            lea     eax, [rax + 4*r10]
            add     eax, r11d
            add     eax, ecx
            ret

The artisan is quite used to this by now, but the pattern changes from the next one. How sad.

    unsigned sum(unsigned n) {
        unsigned edi = n;
        unsigned eax = ul(n - 3) * (n - 2) * (n - 1) * n / 8 * 3402;
        unsigned esi = ul(n - 4) * (n - 3) * (n - 2) * (n - 1) * n / 8 * 1680;
        unsigned edx = ul(n - 2) * (n - 1) * n / 2 * 644;
        unsigned r9d = ul(n - 1) * n / 2 * 127;
        unsigned r10d = ul(n - 6) * (n - 5) * (n - 4) * (n - 3) * (n - 2) * (n - 1) * n;
        // r10d &= -16;
        unsigned long r11 = ul(n - 7) * (n - 6) * (n - 5) * (n - 4) * (n - 3) * (n - 2) * (n - 1) * n / 8;
        // r11 &= -16;
        unsigned ecx = ul(n - 5) * (n - 4) * (n - 3) * (n - 2) * (n - 1) * n / 16 * -1431655056u;
        eax += edi;
        eax += edx;
        eax += r9d;
        eax += esi;
        eax += 4 * r10d;
        eax += r11;
        eax += ecx;
        return eax;
    }
There are `and r10d, -16` and `and r11d, -16`. `-16 == 0xfffffff0`, so $k \wedge (-16) = \lfloor k/16\rfloor\cdot 16$.
However, as before, `r10d` is a multiple of $2\cdot 4\cdot 2=16$, and `r11d` is likewise a multiple of $2\cdot 4\cdot 2\cdot 8/8 = 16$, so these masks appear to do nothing.
That leaves `-1431655056u`. As a 32-bit unsigned value it is the same as `2863312240u`.
Since $2863312240 = \tfrac23\,(2^{32}+1064)$, we have $3x\times 2863312240\equiv 2128x \pmod{2^{32}}$.
Keeping in mind that `r10d` contributes to `eax` with a factor of `4 *`, what is being computed is the following.
$$ \sum_{i=0}^n i^7 = \tfrac18\,\perm n8 + 4\,\perm n7 + \tfrac{133}3\,\perm n6 + 210\,\perm n5 + \tfrac{1701}4\,\perm n4 + 322\,\perm n3 + \tfrac{127}2\,\perm n2 + n. $$
Listing the coefficients from the lowest degree gives $(\tfrac11, \tfrac{127}2, \tfrac{966}3, \tfrac{1701}4, \tfrac{1050}5, \tfrac{266}6, \tfrac{28}7, \tfrac18)$. Looks right.
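As a check on the reading of the 7th-power listing, including the two `and ..., -16` masks and the negative constant, the register arithmetic can be replayed. This is a sketch assuming `0 <= n < 2**32`; `falling` and `sum7_clang` are hypothetical helpers, with factors wrapped to 32 bits and products to 64 bits.

```python
M32, M64 = 1 << 32, 1 << 64

def falling(n, j):
    """n (n-1) ... (n-j+1) with 32-bit factors and a 64-bit wrapping product."""
    p = 1
    for t in range(j):
        p = (p * ((n - t) % M32)) % M64
    return p

def sum7_clang(n, mask=True):
    p = {j: falling(n, j) for j in range(2, 9)}
    r10d = p[7] % M32                       # imul r10d, ecx (32-bit product)
    r11d = (p[8] >> 3) % M32                # shr r11, 3
    if mask:                                # and r10d, -16 / and r11d, -16
        r10d &= M32 - 16
        r11d &= M32 - 16
    eax = (p[4] >> 3) * 3402
    esi = (p[5] >> 3) * 1680
    edx = (p[3] >> 1) * 644
    r9d = (p[2] >> 1) * 127
    ecx = (p[6] >> 4) * (-1431655056 % M32)  # same bits as 2863312240u
    return (eax + n + edx + r9d + esi + 4 * r10d + r11d + ecx) % M32

for n in range(200):
    naive = sum(i ** 7 for i in range(n + 1)) % M32
    assert sum7_clang(n) == sum7_clang(n, mask=False) == naive
```

The assertion passing with and without the masks is consistent with the observation that both masked values are already multiples of 16.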
Sum of 8th powers
Here it comes: the pattern changes. As a result it turns into a down-to-earth SIMD optimization, so (although I worked hard on writing it up) you may skip this section.

    .LCPI0_0:
            .long   0                               # 0x0
            .long   1                               # 0x1
            .long   2                               # 0x2
            .long   3                               # 0x3
    .LCPI0_1:
            .long   4                               # 0x4
            .long   4                               # 0x4
            .long   4                               # 0x4
            .long   4                               # 0x4
    .LCPI0_2:
            .long   8                               # 0x8
            .long   8                               # 0x8
            .long   8                               # 0x8
            .long   8                               # 0x8
    sum(unsigned int):
            inc     edi
            xor     ecx, ecx
            mov     eax, 0
            cmp     edi, 8
            jb      .LBB0_4
            mov     ecx, edi
            and     ecx, -8
            pxor    xmm0, xmm0
            movdqa  xmm1, xmmword ptr [rip + .LCPI0_0] # xmm1 = [0,1,2,3]
            movdqa  xmm3, xmmword ptr [rip + .LCPI0_1] # xmm3 = [4,4,4,4]
            movdqa  xmm4, xmmword ptr [rip + .LCPI0_2] # xmm4 = [8,8,8,8]
            mov     eax, ecx
            pxor    xmm2, xmm2
    .LBB0_2:                                # =>This Inner Loop Header: Depth=1
            movdqa  xmm5, xmm1
            paddd   xmm5, xmm3
            movdqa  xmm6, xmm1
            pmuludq xmm6, xmm1
            pshufd  xmm7, xmm1, 245                 # xmm7 = xmm1[1,1,3,3]
            pmuludq xmm7, xmm7
            pshufd  xmm8, xmm5, 245                 # xmm8 = xmm5[1,1,3,3]
            pmuludq xmm5, xmm5
            pmuludq xmm8, xmm8
            pmuludq xmm7, xmm7
            pmuludq xmm6, xmm6
            pmuludq xmm8, xmm8
            pmuludq xmm5, xmm5
            pmuludq xmm6, xmm6
            pshufd  xmm6, xmm6, 232                 # xmm6 = xmm6[0,2,2,3]
            pmuludq xmm7, xmm7
            pshufd  xmm7, xmm7, 232                 # xmm7 = xmm7[0,2,2,3]
            punpckldq       xmm6, xmm7              # xmm6 = xmm6[0],xmm7[0],xmm6[1],xmm7[1]
            paddd   xmm0, xmm6
            pmuludq xmm5, xmm5
            pshufd  xmm5, xmm5, 232                 # xmm5 = xmm5[0,2,2,3]
            pmuludq xmm8, xmm8
            pshufd  xmm6, xmm8, 232                 # xmm6 = xmm8[0,2,2,3]
            punpckldq       xmm5, xmm6              # xmm5 = xmm5[0],xmm6[0],xmm5[1],xmm6[1]
            paddd   xmm2, xmm5
            paddd   xmm1, xmm4
            add     eax, -8
            jne     .LBB0_2
            paddd   xmm2, xmm0
            pshufd  xmm0, xmm2, 238                 # xmm0 = xmm2[2,3,2,3]
            paddd   xmm0, xmm2
            pshufd  xmm1, xmm0, 85                  # xmm1 = xmm0[1,1,1,1]
            paddd   xmm1, xmm0
            movd    eax, xmm1
            jmp     .LBB0_5
    .LBB0_4:
            mov     edx, ecx
            imul    edx, ecx
            imul    edx, edx
            imul    edx, edx
            add     eax, edx
            inc     ecx
    .LBB0_5:
            cmp     edi, ecx
            jne     .LBB0_4
            ret

I sense that this is not $O(1)$ time, but let us decode it anyway.
Many new instructions and registers appear. There are many instructions with `dq` in their names, and registers named `xmm` followed by a number show up.
The `inc` instruction increments its argument. The `cmp` instruction compares its arguments.
`xmm0`, `xmm1`, ... are 128-bit registers.
They support a variety of instructions, such as processing sixteen 8-bit integers or two 64-bit floating-point numbers in parallel, but here they are only used to process four 32-bit integers in parallel. On that premise, we write them in a notation like `xmm0 = [a, b, c, d]`.
The instructions mean the following.

    movdqa xmm, [e, f, g, h]              # xmm = [e, f, g, h]
    paddd [a, b, c, d], [e, f, g, h]      # a += e; b += f; c += g; d += h;
    pmuludq [a, b, c, d], [e, f, g, h]    # [a, b] = a * e; [c, d] = c * g;
    pxor xmm, xmm                         # xmm = [0, 0, 0, 0]
    pshufd xmm, [e, f, g, h], 245         # xmm = [f, f, h, h]
    punpckldq [a, b, c, d], [e, f, g, h]  # [a, b, c, d] = [a, e, b, f]

The ones that probably need further explanation are `pmuludq`, `pshufd`, and `punpckldq`.
`pmuludq` computes the products of the even-indexed elements as 64-bit values and stores them.
Here the upper 32 bits of each product are never used, so it is fine to read it as `a *= e; c *= g`.
`pshufd` copies elements of `[e, f, g, h]` according to the indices given by the third argument.
Since $245 = 3311_{(4)}$, it fetches elements `[1]`, `[1]`, `[3]`, `[3]` (the lowest digit corresponds to the first element).
Here it is mostly used so that the desired values land in `[0]` and `[2]`, while `[1]` and `[3]` do not matter.
`punpckldq` interleaves the first two elements of each operand.
With this much understood, the code should be mostly readable. The processing is organized as follows.
Range | Operation |
---|---|
before `.LBB0_2` | initialize constants and the loop count |
from `.LBB0_2` to `.LBB0_4` | compute 8th powers, eight values at a time |
from `.LBB0_4` to `.LBB0_5` | handle the remainder one value at a time |
from `.LBB0_5` onward | return |
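To make the instruction semantics described above concrete, here is a small Python emulation on four-lane lists. It is only a sketch of the behaviour as described in this section (32-bit lanes, lowest lane first), not an exhaustive model of the instructions; `eighth_powers` is a hypothetical helper that follows the order of operations in the loop body for one group of four values.

```python
M32 = 1 << 32

def pshufd(src, imm):
    """Lane i of the result is src[(imm >> 2*i) & 3]."""
    return [src[(imm >> (2 * i)) & 3] for i in range(4)]

def pmuludq(a, b):
    """64-bit products of lanes 0 and 2, each stored as (low half, high half)."""
    out = []
    for i in (0, 2):
        p = a[i] * b[i]
        out += [p % M32, (p >> 32) % M32]
    return out

def punpckldq(a, b):
    """Interleave the low two lanes of a and b."""
    return [a[0], b[0], a[1], b[1]]

def paddd(a, b):
    """Lane-wise 32-bit addition."""
    return [(x + y) % M32 for x, y in zip(a, b)]

def eighth_powers(x):
    """[x0, x1, x2, x3] -> [x0^8, x1^8, x2^8, x3^8] (mod 2**32)."""
    even = x                       # lanes 0 and 2 hold x0, x2
    odd = pshufd(x, 245)           # lanes 0 and 2 hold x1, x3
    for _ in range(3):             # square three times: x^2, x^4, x^8
        even = pmuludq(even, even)
        odd = pmuludq(odd, odd)
    return punpckldq(pshufd(even, 232), pshufd(odd, 232))

assert eighth_powers([0, 1, 2, 3]) == [0, 1, 256, 6561]
```

Note how the odd lanes need a shuffle first because `pmuludq` only multiplies lanes 0 and 2, and how `pshufd ..., 232` followed by `punpckldq` puts the four results back in order.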
$\gdef\register#1{r_{\text{\texttt{#1}}}}$
In formulas, the value of, say, `ecx` is written $\register{ecx}$.
At initialization, $\register{ecx} = \lfloor\tfrac{n+1}8\rfloor\cdot 8$, and the code prepares to compute $(8k+0)^8 + (8k+1)^8 + \dots + (8k+7)^8$ in one go for each $k$ with $8k\lt \register{ecx}$.
On reaching `.LBB0_2` for the $k$-th time ($0\le k\lt \lfloor\tfrac{n+1}8\rfloor$), the registers are as follows.
`xmm0` and `xmm2` are the outputs, `xmm1` is the loop variable, and `xmm3` and `xmm4` are constants holding the increment (step; stride) of the loop variable.
$$
\begin{aligned}
\register{xmm0} &= \left[\sum_{i=0}^{k-1} (8i+0)^8, \sum_{i=0}^{k-1} (8i+1)^8, \sum_{i=0}^{k-1} (8i+2)^8, \sum_{i=0}^{k-1} (8i+3)^8\right], \\
\register{xmm2} &= \left[\sum_{i=0}^{k-1} (8i+4)^8, \sum_{i=0}^{k-1} (8i+5)^8, \sum_{i=0}^{k-1} (8i+6)^8, \sum_{i=0}^{k-1} (8i+7)^8\right], \\
\register{xmm1} &= [8k+0, 8k+1, 8k+2, 8k+3], \\
\register{xmm3} &= [4, 4, 4, 4], \\
\register{xmm4} &= [8, 8, 8, 8].
\end{aligned}
$$
Inside the loop, `xmm5` through `xmm8` are used to perform repeated squaring and compute $\register{xmm6} = [(8k+0)^8, (8k+1)^8, (8k+2)^8, (8k+3)^8]$ and $\register{xmm5} = [(8k+4)^8, (8k+5)^8, (8k+6)^8, (8k+7)^8]$, which are then added to `xmm0` and `xmm2` respectively.
When the `.LBB0_2` loop ends, the code uses `pshufd` to add up all lanes of $\register{xmm0}+\register{xmm2}$, so that
$$\register{eax} = \sum_{k=0}^{\lfloor\tfrac{n+1}8\rfloor\cdot 8-1} k^8$$
and then proceeds to the `.LBB0_4` loop.
In `.LBB0_4`, repeated squaring gives $\register{edx} = k^8$, which is added to $\register{eax}$.
Since `.LBB0_4` handles the remainder, it runs at most 7 times. `ecx` is the loop variable.
In the end $\register{eax} = \sum_{i=0}^n i^8$, which is returned.
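The overall control flow (a main loop over blocks of eight, then a scalar tail) can be summarized in a few lines. This sketch only models the structure, with `pow(i, 8, 2**32)` standing in for the repeated squaring; `sum8_blocked` is a made-up name, and it assumes `n + 1` does not wrap around.

```python
M32 = 1 << 32

def sum8_blocked(n):
    """Sum of i^8 for 0 <= i <= n (mod 2**32), split like the vectorized code."""
    count = n + 1                  # inc edi (assumes no wrap-around)
    main = count & ~7              # and ecx, -8: largest multiple of 8 <= count
    lo = [0, 0, 0, 0]              # xmm0: running sums for residues 0..3 mod 8
    hi = [0, 0, 0, 0]              # xmm2: running sums for residues 4..7 mod 8
    for base in range(0, main, 8):
        for lane in range(4):
            lo[lane] = (lo[lane] + pow(base + lane, 8, M32)) % M32
            hi[lane] = (hi[lane] + pow(base + 4 + lane, 8, M32)) % M32
    total = sum(lo + hi) % M32     # horizontal add after the loop
    for i in range(main, count):   # scalar tail, at most 7 iterations
        total = (total + pow(i, 8, M32)) % M32
    return total

assert all(sum8_blocked(n) == sum(i ** 8 for i in range(n + 1)) % M32 for n in range(100))
```

Unlike the earlier cases, the running time here is proportional to `n`, which matches the impression that this is no longer an $O(1)$ computation.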
Sums of 9th and 10th powers
These used `xmm` just like the sum of 8th powers. It is a lot of work, so I will not explain them; there seems to be nothing new.
Interested readers may want to try it themselves.
Sum of 11th powers
`xmm` is no longer used. Did the compiler lose its motivation, or did it judge this to be more efficient?

    sum(unsigned int):
            lea     edx, [rdi + 1]
            test    edi, edi
            je      .LBB0_1
            mov     esi, edx
            and     esi, -2
            xor     ecx, ecx
            xor     eax, eax
    .LBB0_6:                                # =>This Inner Loop Header: Depth=1
            mov     edi, ecx
            imul    edi, ecx
            imul    edi, edi
            imul    edi, ecx
            mov     r8d, edi
            imul    r8d, ecx
            imul    r8d, edi
            add     r8d, eax
            lea     eax, [rcx + 1]
            mov     edi, eax
            imul    edi, eax
            imul    edi, edi
            imul    edi, eax
            imul    eax, edi
            imul    eax, edi
            add     eax, r8d
            add     ecx, 2
            cmp     esi, ecx
            jne     .LBB0_6
            test    dl, 1
            je      .LBB0_4
    .LBB0_3:
            mov     edx, ecx
            imul    edx, ecx
            imul    edx, edx
            imul    edx, ecx
            imul    ecx, edx
            imul    ecx, edx
            add     ecx, eax
            mov     eax, ecx
    .LBB0_4:
            ret
    .LBB0_1:
            xor     ecx, ecx
            xor     eax, eax
            test    dl, 1
            jne     .LBB0_3
            jmp     .LBB0_4

You can presumably read this to some extent by now, so I will not explain it in detail.
Roughly, the loop variable `ecx` is increased by `2` each time, and each iteration computes $\register{ecx}^{11} + (\register{ecx}+1)^{11}$. The upper bound is $\register{esi} = \lfloor\tfrac{n+1}{2}\rfloor\cdot 2$.
I thought there would be little worth noting here, but that was not the case. Competitive programmers tend to feel that "of course powers are computed with repeated squaring or the like to bring the order down", and so take it for granted that the compiler does the same; but on reflection, this is another thing it handles cleverly, just like the sums.
Incidentally, repeated squaring is not necessarily optimal in terms of the number of multiplications. Searching for addition-chain exponentiation or Knuth's power tree turns up interesting material. Finding the optimal number in general is reportedly NP-complete (as far as I know, this is established for the variant that must reach several exponents at once). The code emitted by the compiler appears to be based on repeated squaring. Perhaps reducing the number of multiplications is undesirable because it tends to need more temporary registers.
Here is me playing around with it:

    unsigned pow(unsigned n) {
        return n * n * n * n * n * n * n * n
             * n * n * n * n * n * n * n * n
             * n * n * n * n * n * n * n * n
             * n * n * n * n * n * n * n;
    }

    pow(unsigned int):
            mov     eax, edi
            imul    eax, edi
            imul    eax, edi
            imul    eax, eax
            imul    eax, edi
            imul    eax, eax
            imul    eax, edi
            imul    edi, eax
            imul    eax, edi
            ret

$\gdef\mulgets{\xleftarrow{\times}}$
- $\register{eax} \gets \register{edi}$ ($\register{eax} = n^1$)
- $\register{eax} \mulgets \register{edi}$ ($\register{eax} = n^2$)
- $\register{eax} \mulgets \register{edi}$ ($\register{eax} = n^3$)
- $\register{eax} \mulgets \register{eax}$ ($\register{eax} = n^6$)
- $\register{eax} \mulgets \register{edi}$ ($\register{eax} = n^7$)
- $\register{eax} \mulgets \register{eax}$ ($\register{eax} = n^{14}$)
- $\register{eax} \mulgets \register{edi}$ ($\register{eax} = n^{15}$)
- $\register{edi} \mulgets \register{eax}$ ($\register{edi} = n^{16}$)
- $\register{eax} \mulgets \register{edi}$ ($\register{eax} = n^{31}$)
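It is interesting that, just before returning, it multiplies into `edi` instead. Perhaps that is more convenient for it.
In terms of the number of multiplications, one could save one multiplication, for example as follows.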
- $\register{eax} \gets \register{edi}$ ($\register{eax} = n^1$)
- $\register{eax} \mulgets \register{edi}$ ($\register{eax} = n^2$)
- $\register{eax} \mulgets \register{edi}$ ($\register{eax} = n^3$)
- $\register{eax} \mulgets \register{eax}$ ($\register{eax} = n^6$)
- $\register{ecx} \gets \register{eax}$ ($\register{ecx} = n^6$)
- $\register{eax} \mulgets \register{eax}$ ($\register{eax} = n^{12}$)
- $\register{eax} \mulgets \register{eax}$ ($\register{eax} = n^{24}$)
- $\register{eax} \mulgets \register{ecx}$ ($\register{eax} = n^{30}$)
- $\register{eax} \mulgets \register{edi}$ ($\register{eax} = n^{31}$)
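The two multiplication chains above can be compared mechanically. In this sketch a chain is a list of index pairs, each meaning "multiply two previously computed values"; the representation and the name `run_chain` are assumptions made for the illustration.

```python
def run_chain(n, steps, mod=1 << 32):
    """Evaluate a multiplication chain. vals[0] = n, and each step (i, j)
    appends vals[i] * vals[j]. Returns (final value, number of multiplications)."""
    vals = [n % mod]
    for i, j in steps:
        vals.append(vals[i] * vals[j] % mod)
    return vals[-1], len(steps)

# Exponents reached by the compiler's code: 2, 3, 6, 7, 14, 15, 16, 31
compiler_chain = [(0, 0), (1, 0), (2, 2), (3, 0), (4, 4), (5, 0), (6, 0), (7, 6)]
# Exponents reached by the shorter chain: 2, 3, 6, 12, 24, 30, 31
shorter_chain = [(0, 0), (1, 0), (2, 2), (3, 3), (4, 4), (5, 3), (6, 0)]

for n in (0, 1, 2, 3, 12345, 4000000000):
    expected = pow(n, 31, 1 << 32)
    assert run_chain(n, compiler_chain) == (expected, 8)
    assert run_chain(n, shorter_chain) == (expected, 7)
```

The shorter chain needs to keep $n^6$ around while computing $n^{24}$, which is the extra temporary register mentioned in the text.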
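Beyond that
The sums of 12th through 16th powers use `xmm`; I then checked up to the sum of 40th powers, and from the 17th through the 40th `xmm` is not used.
Mathematics
We now prove various things.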
Lemma 1: $$ \sum_{k=0}^n \textstyle{n \brace k}\, \perm{x}{k} = x^n $$
Proof
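We interpret this combinatorially.
Consider balls numbered $1$ through $n$, each painted in one of $x$ colors. Clearly there are $x^n$ ways to paint them.
Now count in another way. Group the balls by color. If exactly $k$ colors are used, there are ${n\brace k}$ ways to form the groups (since ${n\brace k}$ is the number of ways to partition an $n$-element set into $k$ nonempty subsets), and there are $\perm{x}{k}$ ways to assign distinct colors to those $k$ groups. Summing over $k$ gives $\sum_{k=0}^n {n\brace k}\, \perm{x}{k}$ ways.
Hence $\sum_{k=0}^n {n\brace k}\, \perm{x}{k} = x^n$ for every non-negative integer $x$. Both sides are polynomials in $x$, so the identity holds in general. $\square$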
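Lemma 1 is easy to spot-check numerically. The following sketch uses the standard recurrence for Stirling numbers of the second kind and Python's `math.perm` for ${}_x\mathrm{P}_k$; the function names are chosen for this example only.

```python
from functools import lru_cache
from math import perm

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Number of ways to partition an n-element set into k nonempty subsets."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)

def lemma1_lhs(n, x):
    """sum_{k=0}^{n} S(n, k) * P(x, k); math.perm returns 0 when k > x."""
    return sum(stirling2(n, k) * perm(x, k) for k in range(n + 1))

assert all(lemma1_lhs(n, x) == x ** n for n in range(10) for x in range(12))
```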
Theorem 2: $$ \sum_{i=1}^{k+1} \tfrac{1}{i} {\textstyle{k+1\brace i}}\,\perm{n}{i} = \sum_{i=1}^n i^k. $$
Proof
First, when $k = 0$, $$ \begin{aligned} \sum_{i=1}^{1} \tfrac{1}{i} {\textstyle{1\brace i}}\,\perm{n}{i} &= \tfrac11 {\textstyle{1\brace 1}}\,\perm{n}{1} \\ &= n = \sum_{i=1}^n 1. \end{aligned} $$
From here on, fix $k\ge 1$. Note that the left-hand side is a polynomial of degree $k+1$ in $n$.
When $n = 0$, $$ \sum_{i=1}^{k+1} \tfrac{1}{i} {\textstyle{k+1\brace i}}\,\perm{0}{i} = \sum_{i=1}^0 i^k = 0 $$ holds. Next, consider $(a_1, \dots, a_{k+1})$ such that $$ \sum_{j=1}^{k+1} a_j\,\perm nj = \sum_{j=1}^n j^k $$ holds for $n = 1, \dots, k+1$.
When $n = 1$, noting that $n\lt j\implies \perm nj = 0$, $$ \sum_{j=1}^{k+1} a_j\,\perm 1j = a_1 = \sum_{j=1}^1 j^k = 1 = \tfrac11 {\textstyle {k+1\brace 1}}. $$
Fixing $n = i$ and noting the range where $\perm nj$ is nonzero, $$ \sum_{j=1}^{k+1} a_j\,\perm{n}{j} = \sum_{j=1}^i a_j\,\perm{i}{j}, $$ so the expression involves only $a_1, \dots, a_i$.
We show that if $a_j = \tfrac 1j {\textstyle{k+1\brace j}}$ holds for all $j\lt i$, then $a_i = \tfrac 1i {\textstyle{k+1\brace i}}$ holds. That is, we solve $$ \sum_{j=1}^{i-1} \tfrac 1j {\textstyle{k+1\brace j}}\,\perm ij + a_i\,\perm ii = \sum_{j=1}^i j^k $$ for $a_i$.
$$ \begin{aligned} a_i &= \frac1{i!} \left(\sum_{j=1}^i j^k - \sum_{j=1}^{i-1} \tfrac 1j {\textstyle{k+1\brace j}}\, \perm ij\right) \\ &= \frac1{i!} \left(i^k + \sum_{j=1}^{i-1} j^k - \sum_{j=1}^{i-1} \tfrac 1j {\textstyle{k+1\brace j}}\, \perm ij\right). \end{aligned} $$
We multiply both sides by $i\cdot i!$, apply the induction hypothesis (the equality at $n = i-1$), and use $i\cdot\perm{i-1}j = \perm i{j+1}$:
$$ \begin{aligned} i\cdot i!\cdot a_i &= i^{k+1} + i\sum_{j=1}^{i-1} j^k - i\sum_{j=1}^{i-1} \tfrac 1j {\textstyle{k+1\brace j}}\, \perm ij \\ &= i^{k+1} + i\sum_{j=1}^{i-1} \tfrac 1j{\textstyle{k+1\brace j}}\,\perm {i-1}j - i\sum_{j=1}^{i-1} \tfrac 1j {\textstyle{k+1\brace j}}\, \perm ij \\ &= i^{k+1} + \sum_{j=1}^{i-1} \tfrac 1j{\textstyle{k+1\brace j}}\,(\perm i{j+1} - i\cdot\perm ij). \end{aligned} $$
Using Lemma 1, and noting the range where $\perm ij \gt 0$, that ${k+1\brace 0} = 0$, and that $\perm i{j+1} = (i-j)\cdot\perm ij$,
$$ \begin{aligned} i\cdot i!\cdot a_i &= \sum_{j=0}^{k+1} {\textstyle{k+1 \brace j}}\, \perm ij + \sum_{j=1}^{i-1} \tfrac 1j{\textstyle{k+1\brace j}}\,(\perm i{j+1} - i\cdot\perm ij) \\ &= {\textstyle {k+1\brace i}}\,\perm ii + \sum_{j=1}^{i-1} {\textstyle{k+1 \brace j}}\, \perm ij + \sum_{j=1}^{i-1} \tfrac 1j{\textstyle{k+1\brace j}}\,(\perm i{j+1} - i\cdot\perm ij) \\ &= {\textstyle {k+1\brace i}}\,i! + \sum_{j=1}^{i-1} \tfrac 1j{\textstyle{k+1\brace j}}\,\underbrace{(j\cdot\perm ij + (i-j)\cdot\perm ij - i\cdot\perm ij)}_0. \end{aligned} $$
This yields $a_i = \tfrac1i {\textstyle {k+1\brace i}}$.
For $1\le i\le k+1$ we chose $a_i$ so that the equality holds at $n=i$; together with the case $n=0$, the equality holds at $k+2$ points. Both sides are polynomials in $n$ of degree at most $k+1$ (the right-hand side because the sum of $k$-th powers is a polynomial of degree $k+1$ in $n$), so the equality holds for every $n$.
Hence the claim is also shown for every $k\ge 1$, and therefore for every $k\ge 0$ $$ \sum_{i=1}^{k+1} \tfrac{1}{i} {\textstyle{k+1\brace i}}\,\perm{n}{i} = \sum_{i=1}^n i^k $$ holds. $\square$
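Theorem 2 can be spot-checked in the same way with exact rational arithmetic. This is a minimal sketch; `stirling2` and `theorem2_lhs` are names chosen for the example.

```python
from fractions import Fraction
from functools import lru_cache
from math import perm

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling numbers of the second kind via the standard recurrence."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return stirling2(n - 1, k - 1) + k * stirling2(n - 1, k)

def theorem2_lhs(n, k):
    """sum_{i=1}^{k+1} (1/i) * S(k+1, i) * P(n, i)."""
    return sum(Fraction(stirling2(k + 1, i) * perm(n, i), i) for i in range(1, k + 2))

for k in range(10):
    for n in range(15):
        assert theorem2_lhs(n, k) == sum(i ** k for i in range(1, n + 1))
```

Each term is a rational number in general, but the total is always an integer, as the identity requires.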
I found it interesting that both sides are multiplied by $i$ in order to produce the form ${\textstyle{k+1\brace j}}$ to which Lemma 1 applies.
At the start of the article the sums began at $i=0$, but considering the value that the formula above naturally yields for the sum of $0$th powers, I have come to think that summing from $i=1$ is better.
Related material
These are sites that came up near the top when I googled "clang sum of power optimization" after nearly finishing the article. I have not read them yet.
Thoughts
I already knew that Clang does not compute $\tfrac12 n(n+1)$ directly but does something special; having actually tried it, I am now convinced that it does so in a form that indeed looks convenient. Stirling numbers of the second kind are easy to obtain by DP, and I believe it is common for compilers to optimize by preparing constants of the form $\tfrac1a(b\cdot 2^{32}+c)$, so the optimization itself makes sense; still, I find it impressive that it goes to such lengths to optimize sums of $k$-th powers.
I do not know the internal implementation, so I cannot say whether it actually uses DP. It should also work for general polynomials of degree $k$ rather than just $k$-th powers, and there may well be some other clever method.
Also, this article examined the optimization for `unsigned`, but since $1^k+2^k\ge 2^{31}$ when $k\ge 31$, with `signed` the computation overflows for $n\ge 2$ and should be undefined behavior. The result is $0$ for $n\le 0$ and $1$ for $n = 1$, so an optimization like the following should also be possible.

    signed sum(signed n) {
        // returns the sum of the 31st powers of the integers from 1 to n
        return n > 0;
    }

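As far as I tried, no such optimization is performed.
I also recall that there is a related notion called Bernoulli numbers, which I suspect does not suit this optimization well, though I have not actually looked into it properly.
For the C++-style code written from the assembly for the sums of 3rd through 7th powers, I verified that for every input from `0` to `-1u` the result equals that of the naive method.
Addendum
↑ I have a feeling that the approach in that article (probably) handles the broader class of sums of degree-$k$ polynomials naturally.
The end
Lately I have been writing rather heavy articles every now and then, which must be hard on the readers too.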
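*1: In the sense of "I, for one, know about it".
*2: That does not change even when overflow occurs. I could not figure out why the masking is done. I suppose either there is some reason for it or I am overlooking something.
*3: I am writing this as I go, so I really was flustered. I think one quickly notices that the rightmost entry is $1$, that its neighbor is $\tfrac12(2^k-1)$, and that the leftmost entry is $\tfrac1{k+1}$, but I did not notice that the denominators could be aligned to $k+1-j$ until around the time I computed $k=5$. The sequence 3, 6, 7 looked familiar, so I went to look at a table of Stirling numbers and was surprised.
*4: That said, with a linear sieve the values $i^k$ for $1\le i\le k+1$ can be computed in $O(k)$ time, so by carefully doing polynomial interpolation for the case where the sample points form an arithmetic progression, $O(k)$ time should be achievable.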