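The effect of __builtin_expect
I tried out __builtin_expect, which I learned about while tinkering with RLog. __builtin_expect is apparently a gcc builtin for cases where "an expression almost always evaluates to one particular constant": it gives the compiler hints, such as for branch prediction, so that the code can be made faster.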
RLog is optimized for log statements in the dormant state, and its selling point is that you can leave logging code in production code without much of a penalty. This is the trick behind that.
I'll write about RLog itself later.
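As a rough illustration of why a dormant log statement can be nearly free, here is a minimal sketch. It is not RLog's actual code or API: `log_enabled`, `UNLIKELY`, `LOG` and `process` are hypothetical names made up for this example, and the only real gcc feature used is `__builtin_expect`.

```c
#include <stdio.h>

/* Hypothetical names for illustration only; this is not RLog's real API. */
#if defined(__GNUC__)
#define UNLIKELY(cond) __builtin_expect(!!(cond), 0)
#else
#define UNLIKELY(cond) (cond)   /* fallback: no hint on other compilers */
#endif

/* Runtime switch; 0 means the log channel is dormant. */
static int log_enabled = 0;

/* The hint tells gcc that the logging branch is the rare one, so the
   fprintf call can be laid out away from the hot path. */
#define LOG(...)                              \
    do {                                      \
        if (UNLIKELY(log_enabled))            \
            fprintf(stderr, __VA_ARGS__);     \
    } while (0)

/* Sums 0..n-1 with a log statement left inside the loop. */
long process(int n)
{
    long sum = 0;
    for (int i = 0; i < n; i++) {
        LOG("i = %d\n", i);   /* skipped while the channel is dormant */
        sum += i;
    }
    return sum;
}
```

Calling `process(10)` with `log_enabled` left at 0 prints nothing and just returns the sum. Setting `log_enabled = 1` makes the same loop write each `i` to stderr.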
How does it work?
Specifically, writing
__builtin_expect(A,B)
gives the compiler the hint that A is expected to equal the constant B.
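One point worth spelling out is that the hint does not change the result: `__builtin_expect(A,B)` evaluates to the value of A, and B only describes what the compiler should expect. The sketch below illustrates this with two functions that differ only in the hint; the function names are hypothetical and gcc (or a compatible compiler) is assumed.

```c
/* Both functions return the same result for every input.
   Only the hint passed to the compiler differs. */

int is_negative_hint_rare(int x)
{
    /* "x < 0 is expected to be 0 (false)" */
    if (__builtin_expect(x < 0, 0))
        return 1;
    return 0;
}

int is_negative_hint_common(int x)
{
    /* "x < 0 is expected to be 1 (true)" */
    if (__builtin_expect(x < 0, 1))
        return 1;
    return 0;
}
```

A wrong hint therefore costs speed at worst, not correctness.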
For example, for a case where "the comparison almost never holds", I wrote it with __builtin_expect like this:
#include <stdio.h>

#ifdef EXPECT
#define EXP(foo, bar) __builtin_expect((foo), (bar))
#else
#define EXP(foo, bar) (foo)
#endif

#define N 0x7fffffff

int foo(int x)
{
    /* x > N-2 is true only on the very last call */
    if (EXP((x>N-2), 0) == 1)
        printf("%d\n", x);
    return x+1;
}

int main(void)
{
    int i;
    for (i = 0; i < N; ) {
        i = foo(i);
    }
    return 0;
}
The build looks like this.
base_exp: $(SRC)
	$(CC) -O2 -DEXPECT $^ -o $@
base_normal: $(SRC)
	$(CC) -O2 $^ -o $@
It turns out that unless -freorder-blocks is enabled so that code can be moved around, the expect information does not seem to help much, so I build with -O2.
Results
Then I tested it with time.
model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
tkuro@sawshark> time ./base_normal                ~/Exp/__builtin_expect
2147483646
./base_normal  7.18s user 0.00s system 99% cpu 7.188 total
tkuro@sawshark> time ./base_exp                   ~/Exp/__builtin_expect
2147483646
./base_exp  5.39s user 0.00s system 99% cpu 5.397 total
Hmm, it really did get faster.
The difference looks like this. Left is normal, right is exp.
cmpl $2147483645, %edi            cmpl $2147483645, %edi
.cfi_offset 3, -16                .cfi_offset 3, -16
jle .L2                         | jg .L5
                                > .L2:
                                > leal 1(%rbx), %eax
                                > popq %rbx
                                > ret
                                > .L5:
movl %edi, %edx                   movl %edi, %edx
movl $.LC0, %esi                  movl $.LC0, %esi
movl $1, %edi                     movl $1, %edi
xorl %eax, %eax                   xorl %eax, %eax
call __printf_chk                 call __printf_chk
.L2:                            | jmp .L2
leal 1(%rbx), %eax              <
popq %rbx                       <
ret                             <
.cfi_endproc                      .cfi_endproc
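In other words, with exp the code no longer branches in the case that holds almost every time, edi <= 0x7fffffff-2. In practice the branch predictor presumably gets the taken/not-taken outcome right either way, but I suppose it still makes a difference because the common path runs straight through without a gap, or something along those lines.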
While I was at it, I grabbed the evaluation version of VTune from Intel and tried it out.
It looks like normal spends almost exactly 2G extra clocks at the spot in question. The loop runs about 2G times in total, so that presumably means one extra clk per iteration.
I wonder whether Intel has anything like AMD's Analyst.
Incidentally, there is something far more ferocious than this: collecting a profile and optimizing according to it, that is, profile-based optimization. In the gcc world it is apparently called feedback based optimization.
Specifically, you first run a binary built with -fprofile-generate to collect the profile information (a gcda file), then compile once more with -fprofile-use, and gcc optimizes accordingly.
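The two-step build can be tried on any code with a data-dependent branch. The following is a minimal sketch under a few assumptions: the file name `pgo_demo.c` and the function `count_large` are hypothetical, and you would need to add your own `main()` that calls the function on representative input before the commands in the comment can produce a profile.

```c
#include <stddef.h>

/*
 * Assumed build steps (pgo_demo.c is a hypothetical file name, and a
 * main() that calls count_large() on typical input must be added):
 *   gcc -O2 -fprofile-generate pgo_demo.c -o pgo_demo
 *   ./pgo_demo        (the run writes the .gcda profile data)
 *   gcc -O2 -fprofile-use pgo_demo.c -o pgo_demo
 */

/* Counts values above a threshold. There is no __builtin_expect here:
   with -fprofile-use, gcc takes the branch probabilities from the
   recorded profile instead of from a hand-written hint. */
size_t count_large(const int *v, size_t n, int threshold)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] > threshold)
            count++;
    }
    return count;
}
```

The profile only reflects the input used during the training run, so that input should resemble real usage.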
And this is the scary part.
tkuro@sawshark> time ./base_profile               ~/Exp/__builtin_expect
2147483646
./base_profile  1.09s user 0.00s system 98% cpu 1.109 total
Whoa. Nearly 7 times faster than the original.
.L7:
	cmpl	$2147483646, %eax
	je	.L27
	cmpl	$2147483645, %eax
	je	.L29
	cmpl	$2147483644, %eax
	je	.L27
	cmpl	$2147483643, %eax
	.p2align 4,,5
	je	.L27
	cmpl	$2147483642, %eax
	.p2align 4,,5
	je	.L27
	cmpl	$2147483641, %eax
	.p2align 4,,5
	je	.L27
	cmpl	$2147483640, %eax
	.p2align 4,,5
	je	.L27
	cmpl	$2147483639, %eax
	.p2align 4,,5
	je	.L27
	addl	$9, %eax
	cmpl	$2147483647, %eax
	.p2align 4,,3
	je	.L27
	cmpl	$2147483646, %eax
	.p2align 4,,3
	jne	.L7
	movl	$2147483646, %edx
	movl	$.LC0, %esi
It went and unrolled the loop all on its own.
Conclusion
The world of binary hackers runs deep.
Bonus
Here is the Makefile as well.
ALL=base_profile base_profile.s base_exp base_exp.s base_normal base_normal.s
SRC=base.c

all: $(ALL)

base_profile: $(SRC) base.gcda
	$(CC) -fprofile-use $< -O2 -o $@
base_profile.s: $(SRC) base.gcda
	$(CC) -fprofile-use $< -O2 -S -o $@
base.gcda: base_tmp
	./base_tmp
	rm -f ./base_tmp
base_tmp: $(SRC)
	$(CC) -fprofile-generate $^ -O2 -o base_tmp
base_exp: $(SRC)
	$(CC) -O2 -DEXPECT $^ -o $@
base_exp.s: $(SRC)
	$(CC) -O2 -DEXPECT $^ -o $@ -S
base_normal: $(SRC)
	$(CC) -O2 $^ -o $@
base_normal.s: $(SRC)
	$(CC) -O2 $^ -o $@ -S
clean:
	rm $(ALL) base.gcda