Addendum
Also, it seems CUDA was not working in the 32-bit build.
That should be fixed in the newer build, so please see
http://d.hatena.ne.jp/w_o/20150619#1434643288
waifu2x speed-up build updated
I added a CUDA version. This is also a good stopping point, so I am putting up a 32-bit build as well.
http://int.main.jp/files/waifu2x-converter_x86_20150616_02.exe
http://int.main.jp/files/waifu2x-converter_x64_20150616_02.exe
On my GTX765M the caffe version is still about 3% faster, so this build has no real reason to exist. Well, it is for my own satisfaction...
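The reasons I made a CUDA version even though an OpenCL version already exists:
- With OpenCL you cannot change the size of shared memory. The local memory size affects performance quite a lot, so cuCtxSetCacheConfig is practically mandatory.
- For some reason OpenCL stalls on NVIDIA GPUs, and debugging why was too much trouble.
Those are the two points. I assumed the two would diverge anyway once I started fine tuning, so I had no reluctance about writing two versions.
In the end, though, both turned out nearly the same, so apart from the shared memory size there may have been no point in splitting them...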
Timings and other numbers for a 1920x1080 input:
Version and environment | Time [sec] | Performance [GFLOPS] | Efficiency [%] | Power [W] |
FMA version, i7-4700MQ @ Win8.1 | 38.23 | 161.29 | 52.5 | 39 |
CUDA version, GTX765M @ Win8.1 | 20.80 | 296.40 | 22.3 | 57 |
OpenCL version, A10-7850K @ Linux | 21.16 | 291.41 | 39.7 | 53 |
FMA version, A10-7850K @ Linux | 80.96 | 76.17 | 64.3 | 95 |
waifu2x-caffe, GTX765M (for reference) | 20.21 | 301.7 (probably) | 22.8 | 66 |
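The time should cover the same range that waifu2x-caffe measures: it excludes initialization and includes reading and writing the PNG.
Power consumption is the difference from idle, read by eye from a watt meter. It is only a rough guide, so do not trust it too much.
Turbo Boost is enabled, so the real efficiency of the FMA version is a little lower than shown.
The operation count includes the extra work added by block splitting. In other words, the FLOPS figure is higher than the true theoretical value. (That said, the cost of block splitting is larger than this gain, so it is not the case that splitting into blocks raises FLOPS.)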
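As a sanity check on the table, the relationship between the columns can be sketched in a few lines of C++. This assumes that performance is simply total operations divided by time and that efficiency is performance divided by the device's peak; the post does not spell this out. The names `Row`, `total_gflop`, `implied_peak_gflops` and `same_work` are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// One row of the table: elapsed time, measured throughput, efficiency.
struct Row {
    double seconds;
    double gflops;
    double efficiency_percent;
};

// Total work implied by a row [GFLOP]: throughput multiplied by elapsed time.
double total_gflop(const Row &r) {
    assert(r.seconds > 0.0);
    return r.gflops * r.seconds;
}

// Peak throughput implied by a row [GFLOPS]: measured throughput divided by efficiency.
double implied_peak_gflops(const Row &r) {
    assert(r.efficiency_percent > 0.0);
    return r.gflops * 100.0 / r.efficiency_percent;
}

// True when two rows imply the same amount of work within tol_gflop.
bool same_work(const Row &a, const Row &b, double tol_gflop) {
    return std::fabs(total_gflop(a) - total_gflop(b)) <= tol_gflop;
}
```

Under those assumptions, the four rows for this converter each work out to roughly 6166 GFLOP, and the i7-4700MQ row implies a peak of about 307 GFLOPS.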
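- CUDA, if CUDA is available
- OpenCL, if AMD OpenCL is available
- FMA, if FMA is available
- AVX, if AVX is available
- OpenCV's filter2D, if none of the above is available
It runs with that order of priority. I am not sure how it behaves when there are multiple CUDA devices or multiple AMD OpenCL GPUs (it uses the first one it finds).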
Intel GPUs are skipped even when one is found. The FMA version is faster, and an Intel GPU cannot output assembly, so fine tuning is impossible and there is nothing I can do about it. Maybe I will feel motivated once assembly output becomes available.
With --disable-gpu, the GPU is not used.
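The priority order and the effect of --disable-gpu can be sketched like this. It is only an illustration of the rules described above, not the converter's actual code. `Detected`, `Backend`, `choose_backend` and `backend_name` are hypothetical names, and how each feature is detected is left out.

```cpp
#include <cassert>

// Hypothetical summary of what was found on the machine.
struct Detected {
    bool cuda = false;
    bool amd_opencl = false;
    bool intel_opencl = false;  // found, but deliberately never selected
    bool fma = false;
    bool avx = false;
};

enum class Backend { Cuda, OpenCL, Fma, Avx, OpenCvFilter2D };

// Pick a backend in the priority order given in the post.
// disable_gpu models the --disable-gpu flag.
Backend choose_backend(const Detected &d, bool disable_gpu) {
    if (!disable_gpu) {
        if (d.cuda) return Backend::Cuda;
        if (d.amd_opencl) return Backend::OpenCL;
        // An Intel GPU alone falls through to the CPU paths below.
    }
    if (d.fma) return Backend::Fma;
    if (d.avx) return Backend::Avx;
    return Backend::OpenCvFilter2D;
}

// Human-readable name, for logging.
const char *backend_name(Backend b) {
    switch (b) {
    case Backend::Cuda: return "CUDA";
    case Backend::OpenCL: return "OpenCL";
    case Backend::Fma: return "FMA";
    case Backend::Avx: return "AVX";
    case Backend::OpenCvFilter2D: return "OpenCV filter2D";
    }
    assert(false && "unknown backend");
    return "";
}
```

For example, a machine with only an Intel GPU and an FMA-capable CPU ends up on the FMA path, with or without --disable-gpu.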
--block_size sets the processing block size. I have not tested other values, so only 0, 512 and 1024 are accepted. Specifying 0 disables block splitting.
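To get a feel for why block splitting adds operations, here is a rough model. It assumes each block has to be processed together with an overlap border (`halo`) on every side, and that cost is proportional to the padded area. The halo width and the cost model are both assumptions for illustration; the post does not give them. `split_overhead` is a hypothetical helper.

```cpp
#include <algorithm>
#include <cassert>

// Ratio of pixels touched with block splitting to pixels touched without it.
// block_size == 0 means "no splitting", matching --block_size 0.
// halo is an assumed per-side overlap in pixels.
double split_overhead(int width, int height, int block_size, int halo) {
    assert(width > 0 && height > 0 && block_size >= 0 && halo >= 0);
    if (block_size == 0) return 1.0;
    const double whole =
        double(width + 2 * halo) * double(height + 2 * halo);
    double split = 0.0;
    for (int y = 0; y < height; y += block_size) {
        const int bh = std::min(block_size, height - y);  // last row of blocks may be shorter
        for (int x = 0; x < width; x += block_size) {
            const int bw = std::min(block_size, width - x);  // last column may be narrower
            split += double(bw + 2 * halo) * double(bh + 2 * halo);
        }
    }
    return split / whole;
}
```

With an assumed halo of 7 pixels, a 1024x1024 image cut into 512x512 blocks touches about 2.7% more pixels than the unsplit image. The smaller the block, the larger this overhead becomes.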
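I have spent so much time on this that it is starting to affect my daily life and my work, so I think this is far enough for the speed-up. If you want me to go further, get in touch through my company's contact (probably about three person-months; the rate is (bleep) ten-thousand yen). Or give it a try yourself.
The effort was roughly 3 to 4 days for the FMA version, 5 to 6 days for the OpenCL version and 3 days for the CUDA version. The OpenCL version involved a lot of trial and error and the CUDA version simply reused the result, so this does not mean CUDA is easier.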
I do intend to get as far as turning it into a DLL, so please wait a little longer.
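For anyone who wants to take on the speed-up challenge in the future, here are a few notes.
This processing is called Multi-convolve and is apparently an important operation in Deep Neural Networks (something along those lines is written in http://on-demand.gputechconf.com/gtc/2014/webinar/gtc-express-sharan-chetlur-cudnn-webinar.pdf ; I do not really understand it myself).
So I think it is worth putting real effort into. It is also a moderately difficult problem, so it should make good practice.
That said, aiming for the world's fastest is probably quite hard.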
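For readers who have not seen this operation, a naive reference version may help as a starting point. The sketch below assumes 3x3 kernels, no padding and no kernel flip (cross-correlation, as is usual in neural network code), and it leaves out bias and activation. `Planes`, `multi_convolve_3x3`, `multi_convolve_flops` and the weight layout are choices made for this illustration, not taken from any of the implementations mentioned here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A stack of `planes` images, each `height` x `width`, stored plane by plane.
struct Planes {
    int planes, height, width;
    std::vector<float> data;
    Planes(int p, int h, int w)
        : planes(p), height(h), width(w),
          data(std::size_t(p) * h * w, 0.0f) {}
    float &at(int p, int y, int x) {
        return data[(std::size_t(p) * height + y) * width + x];
    }
    float at(int p, int y, int x) const {
        return data[(std::size_t(p) * height + y) * width + x];
    }
};

// Every output plane is the sum over all input planes of a 3x3 filter
// applied to that input plane. Assumed weight layout: [out][in][ky][kx].
Planes multi_convolve_3x3(const Planes &in, int out_planes,
                          const std::vector<float> &weights) {
    assert(in.height >= 3 && in.width >= 3 && out_planes > 0);
    assert(weights.size() == std::size_t(out_planes) * in.planes * 9);
    Planes out(out_planes, in.height - 2, in.width - 2);
    for (int o = 0; o < out_planes; ++o)
        for (int y = 0; y < out.height; ++y)
            for (int x = 0; x < out.width; ++x) {
                float acc = 0.0f;
                for (int i = 0; i < in.planes; ++i) {
                    const float *w =
                        &weights[(std::size_t(o) * in.planes + i) * 9];
                    for (int ky = 0; ky < 3; ++ky)
                        for (int kx = 0; kx < 3; ++kx)
                            acc += w[ky * 3 + kx] * in.at(i, y + ky, x + kx);
                }
                out.at(o, y, x) = acc;
            }
    return out;
}

// Operation count of the loop above, counting one multiply-add as 2 FLOP.
double multi_convolve_flops(int in_planes, int out_planes,
                            int out_h, int out_w) {
    return 2.0 * 9.0 * in_planes * out_planes * out_h * out_w;
}
```

The cost grows with the product of input planes, output planes and pixels, which is why this loop nest dominates the run time and why all the tuning effort goes into it.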
According to ultraist, the author of the original waifu2x ( https://twitter.com/ultraistter/status/601603101628366848 ), the cuDNN implementation is not the fastest, and faster implementations apparently exist.
In cuda-convnet2, for some reason, processing speed does not scale in proportion to image size. Perhaps a larger image size works in its favor?
The NervanaSystems implementation is apparently written by Scott Gray, a superhuman among superhumans who decided that ptxas was inefficient, analyzed SASS on his own, implemented the superior maxas, and built an sgemm faster than cublas (https://github.com/NervanaSystems/maxas/wiki/SGEMM ). So I would guess it reaches something like 90% efficiency or more.
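So if you are aiming for the world's fastest, you may need to do things like:
- Understand the algorithm and reduce the computational order
- On top of that, aim for over 90% efficiency on the GPU (someone like me, who is satisfied with 30% efficiency, cannot win)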
Addendum
http://int.main.jp/files/waifu2x-converter_x86_20150616_02.exe
http://int.main.jp/files/waifu2x-converter_x64_20150616_02.exe
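I had made various mistakes, so I updated the files:
- There was a mistake bad enough that I do not understand why the OpenCL version was working at all
- The 32-bit build was actually the 64-bit build
- I thought I was emitting PTX for sm30, but I was emitting it for sm20
This kind of thing is a pain because it cannot be checked without many test environments. Both CUDA and OpenCL should probably be a little better now, so please give it a try.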