Competitive programmers estimate computational complexity all the time. Most often the estimate is a bound from above: "the complexity of this is $O(\dots)$, so it is fast enough." Bounds from below come up often as well: "the complexity of this is $\Omega(\dots)$, so it will TLE."
note: Writing "the complexity of this is $O(2^{2^n})$, so it will TLE" (that is, trying to bound from below with $O$) is an inaccurate use of the notation, so be careful. If you have been using it that way without knowing, please study it properly.
On "bounding from above"
To bound something from above means to find a value (called an upper bound) that the quantity being estimated cannot exceed. If $a\le x$ holds for some $a$, we say "$x$ is bounded from below by $a$." Conversely, $x\le b$ reads "$x$ is bounded from above by $b$."
"Bounded from above (below)" is also phrased as "estimated from above (below)," and finding such a bound is called "an estimate from above (below)."
By definition, $O$ is notation for bounding from above (saying the complexity is $O(f(n))$ means the complexity is at most on the order of $f(n)$), so it fits poorly in contexts where you want an estimate from below.
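For reference, the standard definitions behind these symbols can be written out as follows. This is a sketch for nonnegative functions $f$ and $g$; the constant $c$ and the threshold $n_0$ are names introduced here for illustration and are not used elsewhere in this post.

$$
\begin{aligned}
f(n) = O(g(n)) &\iff \exists c > 0,\ \exists n_0,\ \forall n\ge n_0:\ f(n) \le c\,g(n), \\
f(n) = \Omega(g(n)) &\iff \exists c > 0,\ \exists n_0,\ \forall n\ge n_0:\ f(n) \ge c\,g(n), \\
f(n) = \Theta(g(n)) &\iff f(n) = O(g(n)) \text{ and } f(n) = \Omega(g(n)).
\end{aligned}
$$

In particular, an $O(\dots)$ statement on its own never rules out the program being much faster than the stated order.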
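Today's topic: looking at source code and concluding "this has such-and-such complexity, so it will TLE" is not always correct. I will not be talking about cases like "the complexity really is $\Theta(n^2)$, but the memory access pattern is efficient and the constant factor is tiny, so it gets AC."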
!! Bonus (2) may be the most surprising part. !!
Posing the problem
Now, what is the complexity of the following C++ code? To pin it down from both above and below, I will use $\Theta$ rather than $O$ or $\Omega$.

    int sum_n(int n) {
      int res = 0;
      for (int i = 0; i <= n; ++i) res += i;
      return res;
    }

disclaimer: This post was not prompted by the tweet that has been making the rounds lately*1.
Many people would probably answer $\Theta(n)$, because the loop executes `res += i` $n+1$ times (the `i <= n` and `++i` need to be counted too, of course).
In fact, it runs in $\Theta(n)$ time with GCC but in $\Theta(1)$ time with Clang (assuming optimization with `-O2` in both cases).
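It is worth remembering that there are cases where the compiler performs this kind of optimization for you.
Explanation
What the machine actually executes is not C++ source code but machine code, so let's read the assembly that corresponds to it. This is what the GCC and Clang mentioned above generate. Setting up a variety of compilers locally is a hassle, so we will use an online service built for that.
Enter the source code from earlier in the window on the left of the screen, and specify the following in the settings at the top (only one combination can be selected at a time).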
| Language | Compiler | Options |
|---|---|---|
| C++ | x86-64 gcc 13.2 | -O2 |
| C++ | x86-64 clang 16.0 | -O2 |
The generated assembly appears in the window on the right of the screen, so let's go through it. I had meant to explain carefully how to read assembly as well, but it became a chore partway through and I gave up. I list some material that looks useful at the end of the post, so please study on your own.
Here I will settle for a brief explanation, just enough to read the assembly obtained above.
The first argument is placed in a register*2 called `edi`. `rdi` and `edi` share their lower 32 bits: `rdi` is 64 bits wide and `edi` is 32 bits wide (see the picture below). The return value is placed in a register called `eax`.

    [ ---------------- ---------------- ] # <- rdi
                       ^^^^^^^^^^^^^^^^   # <- edi

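The relationship between `rdi` and `edi` can be mimicked in C++ as below. This is only an illustrative model: `Reg64`, `low32`, `write_low32`, and `check_register_model` are made-up names, not part of any real API. It also encodes one x86-64 rule that matters when reading the listings later: writing a 32-bit register clears the upper 32 bits of the corresponding 64-bit register.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model only: `Reg64` stands in for a 64-bit register such as
// rdi, and low32()/write_low32() stand in for reading/writing edi.
struct Reg64 {
    std::uint64_t full = 0;  // the whole register (rdi)

    // Reading edi: the lower 32 bits of rdi.
    std::uint32_t low32() const { return static_cast<std::uint32_t>(full); }

    // Writing edi: on x86-64, a 32-bit write clears the upper 32 bits.
    void write_low32(std::uint32_t value) { full = value; }
};

// Small self-check of the model.
void check_register_model() {
    Reg64 rdi;
    rdi.full = 0x1122334455667788ULL;
    assert(rdi.low32() == 0x55667788u);  // edi sees only the lower half
    rdi.write_low32(5u);
    assert(rdi.full == 5u);              // the upper half is now zero
}
```

Calling `check_register_model()` from a `main` exercises both directions of the aliasing.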
Each line has the form `op arg, ...` (everything after `;` is a comment; some toolchains apparently use `#` instead). `op` is the name of the instruction and each `arg` is an operand.
Each time an instruction is executed, the value of a register called the flags register may or may not change.
The flags register holds information such as whether the result of a computation was 0, whether it overflowed, and whether it was negative (that is, whether the sign bit is set).
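The instruction `test x, y` computes `x & y`, discards the value, and updates the flags according to it. The instruction `js .label` then either jumps or does not jump to the position marked `.label`, depending on the state of the flags register (determined, for example, by the result of `x & y`). The `xx` part of `jxx` indicates which flag is consulted; `js` looks at SF (the sign flag, which is true when the result was negative).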
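As a rough sketch of what `test` followed by `js` decides, the flag computation can be modeled in C++ like this. `Flags`, `test32`, `js_taken`, and `check_test_js` are hypothetical names for illustration, and only the two flags discussed here are modeled.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative model only: the two flags relevant to the jumps in this post.
struct Flags {
    bool zf;  // zero flag: the result was 0
    bool sf;  // sign flag: the top bit of the result was set
};

// Models `test x, y` on 32-bit operands: compute x & y, keep only the flags.
Flags test32(std::uint32_t x, std::uint32_t y) {
    const std::uint32_t r = x & y;
    return Flags{r == 0, (r >> 31) != 0};
}

// Models `test edi, edi` followed by `js`: the jump is taken exactly when
// edi, read as a signed 32-bit integer, is negative.
bool js_taken(std::int32_t edi) {
    const std::uint32_t u = static_cast<std::uint32_t>(edi);
    return test32(u, u).sf;
}

// Small self-check of the model.
void check_test_js() {
    assert(js_taken(-1));
    assert(!js_taken(0));
    assert(!js_taken(5));
    assert(test32(0u, 0u).zf);
}
```

Since `x & x` is just `x`, `test edi, edi` is a cheap way to ask "is `edi` zero?" or "is `edi` negative?" without modifying it.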
Instructions such as `xor`, `mov`, and `add` do what their names suggest. There are several schools of notation; in the one used here, the result of the computation goes into the left operand.
For example, `add eax, 2` corresponds to something like `eax += 2`.
`lea` is a tricky one. It was originally an instruction for address calculation, but it is convenient in various ways, so it often shows up when the compiler wants to add or multiply. For the exact computation being done, see the comments in the code.
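To make the arithmetic side of `lea` concrete, here is a minimal C++ sketch of the value it produces. `lea32` and `check_lea` are hypothetical helper names; the model assumes a 32-bit destination register, which keeps only the lower 32 bits of the computed address.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative helper: the value that `lea dst32, [base + index*scale + disp]`
// stores into a 32-bit destination. No memory is read; only the address
// arithmetic is performed.
std::uint32_t lea32(std::uint64_t base, std::uint64_t index,
                    std::uint64_t scale, std::int64_t disp) {
    const std::uint64_t addr =
        base + index * scale + static_cast<std::uint64_t>(disp);
    return static_cast<std::uint32_t>(addr);  // keep the lower 32 bits
}

// Small self-check using operand shapes that appear in this post.
void check_lea() {
    assert(lea32(10, 0, 1, 1) == 11u);  // lea ecx, [rdi+1]       with rdi = 10
    assert(lea32(3, 4, 2, 1) == 12u);   // lea edx, [rdx+1+rax*2] with rdx = 3, rax = 4
    assert(lea32(5, 0, 1, -1) == 4u);   // lea ecx, [rdi - 1]     with rdi = 5
    assert(lea32(7, 7, 2, 0) == 21u);   // lea eax, [rcx + 2*rcx] with rcx = 7
}
```

The last case shows why `lea` is popular for small multiplications: `[rcx + 2*rcx]` is simply `3 * rcx`.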
I have added comments to the listings. In them, `n` denotes the value of the argument at the moment it is passed to the function.
First, GCC. Read `edi` and `rdi` as if they were special variables that share their lower 32 bits (and are implicitly kept in sync). The same goes for `eax` and `rax`, `edx` and `rdx`, and so on.

    sum(int):                           ; int sum(int edi) {
        test edi, edi                   ;   if ((edi & edi) < 0)
        js .L4                          ;     goto L4;
        lea ecx, [rdi+1]                ;   ecx = rdi + 1;
        xor eax, eax                    ;   eax ^= eax;
        xor edx, edx                    ;   edx ^= edx;
        and edi, 1                      ;   edi &= 1;
        jne .L3                         ;   if (edi != 0) goto L3;
        mov eax, 1                      ;   eax = 1;
        cmp eax, ecx                    ;   if (eax == ecx)
        je .L1                          ;     goto L1;
    .L3:                                ; L3:
        lea edx, [rdx+1+rax*2]          ;   edx = rdx + 1 + rax*2;
        add eax, 2                      ;   eax += 2;
        cmp eax, ecx                    ;   if (eax != ecx)
        jne .L3                         ;     goto L3;
    .L1:                                ; L1:
        mov eax, edx                    ;   eax = edx;
        ret                             ;   return eax;
    .L4:                                ; L4:
        xor edx, edx                    ;   edx ^= edx;
        mov eax, edx                    ;   eax = edx;
        ret                             ;   return eax;
                                        ; }

If we give each register a name reflecting the meaning of what it holds and annotate accordingly, we get something like the following.

    int sum(int edi) {
      if ((edi & edi) < 0) goto L4;  // if (n < 0) goto L4;
      ecx = rdi + 1;                 // limit = n + 1;
      eax ^= eax;                    // i = 0;
      edx ^= edx;                    // res = 0;
      edi &= 1;
      if (edi != 0) goto L3;         // if (n % 2 == 1) goto L3;
      eax = 1;                       // i = 1;
      if (eax == ecx) goto L1;       // if (i == limit) goto L1;
    L3:
      edx = rdx + 1 + rax*2;         // res += 1 + 2 * i;
      eax += 2;                      // i += 2;
      if (eax != ecx) goto L3;       // if (i != limit) goto L3;
    L1:
      eax = edx;
      return eax;                    // return res;
    L4:
      edx ^= edx;
      eax = edx;
      return eax;                    // return res;
    }

The boundary cases are fiddly, but the optimization adds adjacent elements in pairs: `(1+2)+(3+4)+...` when `n` is even, and `1+(2+3)+(4+5)+...` when `n` is odd.
That said, the complexity is still $\Theta(n)$.
Next, Clang.
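The pairing strategy described above can be restated as ordinary C++ and checked against the plain loop. This is a sketch that follows the structure of the annotated listing; the function names are my own, and unsigned arithmetic is used so that wraparound is well defined.

```cpp
#include <cassert>
#include <cstdint>

// Reference: the straightforward loop, accumulated in unsigned arithmetic.
std::uint32_t sum_naive(std::int32_t n) {
    std::uint32_t res = 0;
    for (std::int32_t i = 0; i <= n; ++i) res += static_cast<std::uint32_t>(i);
    return res;
}

// The pairing strategy: each iteration adds i + (i + 1) == 1 + 2 * i.
std::uint32_t sum_paired(std::int32_t n) {
    if (n < 0) return 0;                      // corresponds to .L4
    const std::uint32_t limit = static_cast<std::uint32_t>(n) + 1;
    std::uint32_t i = 0, res = 0;
    if ((n & 1) == 0) {                       // even n: skip the term 0, start at 1
        i = 1;
        if (i == limit) return res;           // n == 0
    }
    do {                                      // corresponds to .L3
        res += 1 + 2 * i;
        i += 2;
    } while (i != limit);
    return res;                               // corresponds to .L1
}

// Small self-check: both versions agree on a range of inputs.
void check_paired_sum() {
    for (std::int32_t n = -5; n <= 300; ++n) {
        assert(sum_paired(n) == sum_naive(n));
    }
}
```

The loop still runs about $n/2$ times, which is why this version remains $\Theta(n)$.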

    sum(int):                       ; int sum(int edi) {
        mov eax, edi                ;   eax = edi;
        lea ecx, [rdi - 1]          ;   ecx = rdi - 1;
        imul rcx, rax               ;   rcx *= rax;
        shr rcx                     ;   rcx >>= 1;
        add ecx, edi                ;   ecx += edi;
        xor eax, eax                ;   eax ^= eax;
        test edi, edi               ;   if ((edi & edi) >= 0)
        cmovns eax, ecx             ;     eax = ecx;
        ret                         ;   return eax;
                                    ; }

Reading this one for meaning as well gives something like the following.

    int sum(int edi) {
      eax = edi;                        // tmp = n;
      ecx = rdi - 1;                    // sum = n - 1;
      rcx *= rax;                       // sum *= tmp;  // i.e. sum *= n
      rcx >>= 1;                        // sum /= 2;    // sum == n * (n - 1) / 2;
      ecx += edi;                       // sum += n;    // sum == n * (n + 1) / 2;
      eax ^= eax;                       // res = 0;
      if ((edi & edi) >= 0) eax = ecx;  // if (n >= 0) res = sum;
      return eax;                       // return res;
    }

Using $n\ge 0\implies \sum_{i=0}^n i = n(n+1)/2$, the loop has been optimized into a $\Theta(1)$ computation.
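As a sanity check, the same sequence of steps can be written in C++ and compared with the loop. This is a sketch under my own naming (`sum_closed`, `check_closed_sum`); it forms the product in 64 bits and truncates to 32 bits only at the end, mirroring the register widths in the listing.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the Clang output: n * (n - 1) / 2 computed in 64 bits, then + n.
std::uint32_t sum_closed(std::int32_t n) {
    if (n < 0) return 0;                                         // cmovns not taken
    const std::uint64_t a = static_cast<std::uint32_t>(n);      // mov eax, edi
    const std::uint64_t b = static_cast<std::uint32_t>(n - 1);  // lea ecx, [rdi - 1]
    const std::uint64_t half = (a * b) >> 1;                    // imul, shr
    return static_cast<std::uint32_t>(half) +
           static_cast<std::uint32_t>(n);                       // add ecx, edi
}

// Small self-check against the straightforward loop (unsigned accumulation).
void check_closed_sum() {
    for (std::int32_t n = -5; n <= 2000; ++n) {
        std::uint32_t expected = 0;
        for (std::int32_t i = 0; i <= n; ++i) {
            expected += static_cast<std::uint32_t>(i);
        }
        assert(sum_closed(n) == expected);
    }
}
```

For `n == 0` the factor `n - 1` becomes `0xFFFFFFFF` after the 32-bit truncation, but it is multiplied by 0, so the result is still 0.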
Bonus
Rust turned out to be fun to play with too, so let's play.

    pub fn sum(n: u32) -> u32 {
        (0..=n).sum()
    }

    pub fn sum_128(n: u128) -> u128 {
        (0..=n).sum()
    }

Let's look at the functions above. Note that they need to be `pub`. If you forget it, you get a message like

    <No assembly to display (~5 lines filtered)>

For the options, use something like `-C opt-level=3`.
The output looks like this. Read through it as you see fit.

    example::sum:                   ; fn sum(edi: u32) -> u32 {
        test edi, edi               ;   if edi & edi == 0 {
        je .LBB0_1                  ;     goto 'LBB0_1; }
        lea eax, [rdi - 1]          ;   eax = rdi - 1;
        lea ecx, [rdi - 2]          ;   ecx = rdi - 2;
        imul rcx, rax               ;   rcx *= rax;
        shr rcx                     ;   rcx >>= 1;       // (n - 1) * (n - 2) / 2
        lea eax, [rdi + rcx]        ;   eax = rdi + rcx;
        dec eax                     ;   eax -= 1;
        add eax, edi                ;   eax += edi;      // (n - 1) * (n - 2) / 2 + n - 1 + n
        ret                         ;   return eax;      // == (n - 1) * n / 2 + n == (n + 1) * n / 2
    .LBB0_1:                        ; 'LBB0_1:
        xor eax, eax                ;   eax ^= eax;
        add eax, edi                ;   eax += edi;
        ret                         ;   return eax;      // 0
                                    ; }

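The identity used in the comments of this listing can be verified by expanding it. For $n\ge 1$ (the branch that does not jump to `.LBB0_1`):

$$
\begin{aligned}
\frac{(n-1)(n-2)}{2} + (n-1) + n
&= \frac{(n^2 - 3n + 2) + (4n - 2)}{2} \\
&= \frac{n^2 + n}{2} = \frac{n(n+1)}{2}.
\end{aligned}
$$

So the value returned is again $\sum_{i=0}^n i$, just reached through $(n-1)(n-2)/2$ instead of $n(n-1)/2$.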
The 128-bit version is longer, but that is only because each operation on 128-bit integers takes several instructions; what it does is presumably not much different. To be honest, I have not read it carefully.

    example::sum_128:               ; fn sum_128(rdi:rsi: u128) -> u128 {
        mov rax, rdi                ;   rax = rdi;
        or rax, rsi                 ;   rax |= rsi;
        je .LBB2_1                  ;   if rax == 0 { goto 'LBB2_1; }
        push r14                    ;   tmp_r14 = r14;
        push rbx                    ;   tmp_rbx = rbx;
        mov r8, rdi                 ;   r8 = rdi;
        add r8, -1                  ;   (add, carry) = r8.carrying_add(18446744073709551615);
                                    ;   r8 = add;
        mov r9, rsi                 ;   r9 = rsi;
        adc r9, -1                  ;   r9 += 18446744073709551615 + carry;
        mov rbx, rdi                ;   rbx = rdi;
        add rbx, -2                 ;   (add, carry) = rbx.carrying_add(18446744073709551614);
                                    ;   rbx = add;
        mov rcx, rsi                ;   rcx = rsi;
        adc rcx, -1                 ;   rcx += 18446744073709551615 + carry;
        mov rax, r9                 ;   rax = r9;
        mul rbx                     ;   rdx:rax = rax * rbx;
        mov r10, rdx                ;   r10 = rdx;
        mov r11, rax                ;   r11 = rax;
        mov rax, r8                 ;   rax = r8;
        mul rbx                     ;   rdx:rax = rax * rbx;
        mov rbx, rax                ;   rbx = rax;
        mov r14, rdx                ;   r14 = rdx;
        add r14, r11                ;   (add, carry) = r14.carrying_add(r11);
                                    ;   r14 = add;
        adc r10d, 0                 ;   r10 += 0 + carry;
        mov rax, r8                 ;   rax = r8;
        mul rcx                     ;   rdx:rax = rax * rcx;
        add rax, r14                ;   (add, carry) = rax.carrying_add(r14);
                                    ;   rax = add;
        adc edx, r10d               ;   edx += r10d + carry;
        imul ecx, r9d               ;   ecx *= r9d;
        add ecx, edx                ;   ecx += edx;
        shld rcx, rax, 63           ;   rcx = (rcx << 63) | (rax >> 1);
        shld rax, rbx, 63           ;   rax = (rax << 63) | (rbx >> 1);
        add rax, r8                 ;   (add, carry) = rax.carrying_add(r8);
                                    ;   rax = add;
        adc rcx, r9                 ;   rcx += r9 + carry;
        pop rbx                     ;   rbx = tmp_rbx;
        pop r14                     ;   r14 = tmp_r14;
        add rax, rdi                ;   (add, carry) = rax.carrying_add(rdi);
                                    ;   rax = add;
        adc rcx, rsi                ;   rcx += rsi + carry;
        mov rdx, rcx                ;   rdx = rcx;
        ret                         ;   return rax:rdx;
    .LBB2_1:                        ; 'LBB2_1:
        xor eax, eax                ;   eax ^= eax;
        xor ecx, ecx                ;   ecx ^= ecx;
        add rax, rdi                ;   (add, carry) = rax.carrying_add(rdi);
                                    ;   rax = add;
        adc rcx, rsi                ;   rcx += rsi + carry;
        mov rdx, rcx                ;   rdx = rcx;
        ret                         ;   return rax:rdx;
                                    ; }


    pub fn sum_3(n: u32) -> u32 {
        (0..=n).step_by(3).sum()
    }

Something like this, on the other hand, did not get turned into $\Theta(1)$.
Bonus (2)
This one is probably the interesting part.

    int square_sum(int n) {
      int res = 0;
      for (int i = 0; i <= n; ++i) res += i * i;
      return res;
    }

Clang takes care of this one too.

    square_sum(int):
        test edi, edi
        js .LBB0_1
        mov eax, edi
        lea ecx, [rdi - 1]
        imul rcx, rax
        lea eax, [rdi - 2]
        imul rax, rcx
        shr rax
        imul edx, eax, 1431655766
        shr rcx
        lea eax, [rcx + 2*rcx]
        add eax, edi
        add eax, edx
        ret
    .LBB0_1:
        xor eax, eax
        ret

If you are not used to this, you will probably wonder what on earth `1431655766` is.
Written out properly as C++ that actually compiles, it comes out something like this.

    #include <cstdio>

    int square_sum(int n) {
      if (n < 0) goto LBB0_1;
      {
        unsigned edi = n;
        unsigned eax = edi;
        unsigned ecx = edi - 1;
        unsigned long rcx = (long)ecx * (long)eax;
        eax = edi - 2;
        ecx = rcx;
        unsigned long rax = (long)eax * (long)ecx;
        rax >>= 1;
        eax = rax;
        unsigned edx = eax * 1431655766u;
        rcx >>= 1;
        eax = 3 * rcx;
        eax += edi;
        eax += edx;
        return eax;
      }
    LBB0_1:
      return 0;
    }

    int main() {
      for (int i = -10; i <= 10; ++i) {
        std::printf("%d%c", square_sum(i), i < 10 ? ' ' : '\n');
      }
      // 0 0 0 0 0 0 0 0 0 0 0 1 5 14 30 55 91 140 204 285 385
    }

Tidying things up, the part we need to think about is roughly the following.

    unsigned edx = n * (n - 1) * (n - 2) / 2 * 1431655766u;
    unsigned ecx = n * (n - 1) / 2;
    return 3 * ecx + edx + n;

hint: `1431655766 == 0x55555556`.
Let's look at a few examples. We work with 32-bit unsigned integers and treat overflow as wrapping (that is, arithmetic modulo $2^{32}$).

    >>> (8000 * 1431655766) % (2**32)
    2863316864
    >>> (8001 * 1431655766) % (2**32)
    5334
    >>> (8002 * 1431655766) % (2**32)
    1431661100

A bold conjecture: $$ x\equiv 0\pmod{3} \implies (x\times 1431655766)\bmod 2^{32} = \tfrac23 x. $$
To give the trick away, so to speak: $1431655766 = (2^{32} + 2)/3$. So, writing $x = 3y$, we get the following. $$ \begin{aligned} 3y\times ((2^{32} + 2)/3) &= y\times (2^{32}+2) \\ &\equiv y\times 2 = 2y \pmod{2^{32}}. \end{aligned} $$ Ah, I see, is about how it feels.
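The multiply-by-magic-constant trick can also be checked directly in C++, where `std::uint32_t` multiplication wraps modulo $2^{32}$ by definition. This is a small sketch; `kMagic`, `two_thirds`, and `check_two_thirds` are names I made up for it.

```cpp
#include <cassert>
#include <cstdint>

// 1431655766 == 0x55555556 == (2^32 + 2) / 3
constexpr std::uint32_t kMagic = 1431655766u;
static_assert(kMagic == ((1ULL << 32) + 2) / 3, "magic constant mismatch");

// Wrapping 32-bit multiplication by kMagic. For x divisible by 3 this
// equals 2 * x / 3; for other x the result is not meaningful as a quotient.
std::uint32_t two_thirds(std::uint32_t x) { return x * kMagic; }

// Small self-check over multiples of 3, plus the three REPL examples.
void check_two_thirds() {
    for (std::uint32_t y = 0; y <= 100000u; ++y) {
        assert(two_thirds(3u * y) == 2u * y);
    }
    assert(two_thirds(8000u) == 2863316864u);
    assert(two_thirds(8001u) == 5334u);
    assert(two_thirds(8002u) == 1431661100u);
}
```

Only the middle example, 8001, is a multiple of 3, which is why it is the only one that produces a small, recognizable value.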
Now, carrying out the computation with this in mind, we can see that the code is computing $\tfrac16 n(n+1)(2n+1)$ (note that $n(n-1)(n-2)/2$ is a multiple of $3$). The intermediate values are computed using 64-bit integers where needed, so even when overflow occurs the result presumably comes out as $\tfrac16 n(n+1)(2n+1)\bmod 2^{32}$.
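Spelled out, with the multiplication by $1431655766$ acting as multiplication by $\tfrac23$ on the multiple of $3$, the three terms add up as follows:

$$
\begin{aligned}
3\cdot\frac{n(n-1)}{2} + \frac23\cdot\frac{n(n-1)(n-2)}{2} + n
&= \frac{n\,\bigl(9(n-1) + 2(n-1)(n-2) + 6\bigr)}{6} \\
&= \frac{n\,(2n^2 + 3n + 1)}{6} \\
&= \frac{n(n+1)(2n+1)}{6}.
\end{aligned}
$$

This is the familiar closed form for $\sum_{i=0}^n i^2$.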
Incidentally, over a suitable range there is apparently also a fact of the form $\lfloor (x\times 1431655766)/2^{32}\rfloor = \lfloor x/3\rfloor$.
The integer part of $n$ divided by $2^{32}$ can be computed quickly, either as `n >> 32` or by taking the upper dword, so this helps speed up division (and compilers do in fact perform this kind of optimization).
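A minimal C++ sketch of this division-free $\lfloor x/3\rfloor$ is shown below. The names `div3_by_magic` and `check_div3` are mine, and the sketch assumes $0\le x < 2^{31}$; the extra term $2x/(3\cdot 2^{32})$ introduced by the constant stays below $1/3$ only in that range, and at $x = 2^{31}$ the formula already gives a result that is one too large.

```cpp
#include <cassert>
#include <cstdint>

// floor(x / 3) without a division instruction: multiply by
// (2^32 + 2) / 3 in 64 bits and keep the upper 32 bits.
// Assumption: 0 <= x < 2^31.
std::uint32_t div3_by_magic(std::uint32_t x) {
    const std::uint64_t product = static_cast<std::uint64_t>(x) * 1431655766ULL;
    return static_cast<std::uint32_t>(product >> 32);
}

// Small self-check against ordinary integer division.
void check_div3() {
    for (std::uint32_t x = 0; x <= 1000000u; ++x) {
        assert(div3_by_magic(x) == x / 3u);
    }
    assert(div3_by_magic(2147483647u) == 2147483647u / 3u);  // largest valid x
}
```

The shift by 32 is exactly the "take the upper dword" step mentioned above.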
Hands-on
I think studying goes faster once you are able to write assembly yourself and try it out, so let's set that up.
-- foo.s --

        .intel_syntax
        .file "foo.s"
        .text
        .globl foo
        .type foo, @function
    foo:
        mov %eax, %edi
        imul %eax, %eax
        add %eax, 2
        ret
        .section .note.GNU-stack,"",@progbits

-- main.c --

    #include <stdio.h>

    int foo(int);

    int main(void) {
      printf("%d\n", foo(5));
    }

-- Compile and run --

    % as -o foo.o foo.s
    % gcc foo.o main.c -o main
    % ./main

Alternatively, you can write some program `prog.c` and run something like `gcc -S prog.c` to obtain `prog.s`, and reading that may be a good idea too.
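Also, if you are on something like an M2 Mac, the assembly above will apparently not work (the toolchain yelled at me), so you need to handle that case separately. The instruction set and the register names, among other things, are different.
-- foo.s --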

        .file "foo.s"
        .text
        .global _foo
    _foo:
        mov w0, w0
        mul w0, w0, w0
        add w0, w0, #2
        ret
        .align 8

Some people may also enjoy doing something like the following.

    % objdump -D foo.o

Related resources
- Introduction to x64 Assembly
  - An introduction to x64 assembly. There sure are a lot of registers.
- Assembly 1: Basics, Assembly 2: Calling convention
  - Cover the basics, calling conventions, and so on.
- Intel® 64 and IA-32 Architectures Software Developer's Manuals
  - Intel® 64 and IA-32 Architectures Software Developer's Manual Combined Volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D, and 4, among others.
  - Describes the specification in detail.
  - Each instruction is explained along with pseudocode.
  - It helps to get used to notation such as `add rax, imm32` and `shufpd xmm1, xmm2/m128, imm8`.
    - Read 3.1.1.3 Instruction Column in the Opcode Summary Table in Volume 1.
ä¾ã«ãã£ã¦æ¥æ¬èªã®è³æã¯ãã¾ãæ¢ãã¦ãã¾ããããããªã«ãã¦ããããããããã§ãã
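Also worth reading
- Σ電卓 - てきとーな日記
- Hacker's Delight
  - Covers optimizations for integer division like the one seen in Bonus (2), along with plenty of other bit manipulation.
  - A Japanese edition is also available.
Whether you actually want to read it depends on the person, admittedly.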
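Afterword
I get the feeling that a post turns out rather light when I do not write an involved explanation. I did try to write various things about assembly, but then I started thinking "who would be made happy by my writing this?" and "everyone can just look it up themselves," and deleted it. I have listed the references, so I hope those with the motivation and interest will give it a go.
To begin with, I feel that cases where the compiler lowers the order for you are basically rare. Still, ignoring them just because they are rare seems like a good way to get tripped up. I also feel that when undefined behavior is being exploited, it is fairly common for optimization to kick in and turn the code into a constant-time computation.
Division by a constant, for instance, tends to be rewritten into a form that does not use division (because the division instruction is slow), and I feel that being able to read assembly helps with learning about that kind of thing as well.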
It was fun to see Clang eliminate the loop for $\sum_{i=0}^n i^2$ too, and on top of that optimize the division part nicely. I knew about optimizations of the $\lfloor n/3\rfloor$ type, but not about the $\tfrac23 n$ type, so I learned something. It might also be interesting to look at $\sum_{i=0}^n i^k$ for larger $k$.
The end
That's all.
*1: Because I have seen examples like this some 30,000 times over the past 5,000 years. In fact, this post was originally drafted about three weeks earlier.
*2: You can think of registers as named pieces of memory, such as `edi` and `ecx`. To find out which registers exist, the material listed under Related resources is a good place to look.