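Take the parameters of a deep model and line them up into a single vector. For a large model, this vector has tens of billions of dimensions. At first glance it looks like a meaningless list of numbers, but it is becoming clear that it carries deep meaning as a vector. For example, if $\theta_1$ and $\theta_2$ are two different parameter vectors, a model whose parameters are an arithmetic combination of the two, such as the average $(\theta_1 + \theta_2)/2$, still works properly. This article explains methods that use this kind of arithmetic on model parameters, and the theory behind them.
Addendum: I have substantially expanded the content of this article in my book 『深層ニューラルネットワークの高速化』 (Speeding Up Deep Neural Networks). If this article interests you, I would be glad if you had a look at the book as well.
End of addendum.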
Contents
- Model soups
- Task vectors
- Model parameters and the neural tangent kernel
- Conclusion
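Model soups
Model soups [Wortsman+ ICML 2022] improve performance by averaging the parameters of multiple models. Starting from a pretrained model $\theta_0$, let $\theta_1, \ldots, \theta_n$ be the parameters obtained by training with various hyperparameters. The averaged vector
$$\theta_{\mathrm{soup}} = \frac{1}{n} \sum_{i=1}^{n} \theta_i$$
is known to perform better than each individual model and to be robust to distribution shift. Averaging and mixing parameters trained with various hyperparameters in this way is called a model soup. In a typical training procedure, you train with various hyperparameters, evaluate each result on validation data, keep the best model, and discard the rest. The advantage of a model soup is that it uses the models that would otherwise have been discarded as ingredients for improving performance. It needs no additional training time. It is also attractive that no validation dataset is required.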
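As a minimal sketch of the averaging step above, assume each model's parameters have already been flattened into a plain Python list of floats. The function name `uniform_soup` and the toy numbers are illustrative assumptions, not taken from the paper.

```python
def uniform_soup(param_vectors):
    """Average a list of equally sized parameter vectors element-wise."""
    if not param_vectors:
        raise ValueError("need at least one parameter vector")
    n = len(param_vectors)
    dim = len(param_vectors[0])
    if any(len(p) != dim for p in param_vectors):
        raise ValueError("all parameter vectors must have the same length")
    return [sum(p[j] for p in param_vectors) / n for j in range(dim)]


# Hypothetical fine-tuned parameters theta_1..theta_3 (toy values).
thetas = [
    [1.0, 2.0, 3.0],
    [2.0, 2.0, 5.0],
    [3.0, 2.0, 1.0],
]
theta_soup = uniform_soup(thetas)
print(theta_soup)  # [2.0, 2.0, 3.0]
```

In a real framework the same operation would be applied tensor by tensor over the models' parameter dictionaries, but the arithmetic is the same.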
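Model soups resemble ensembles, but differ greatly in inference time and in simplicity of deployment. Ensembling $n$ models multiplies inference time by $n$, whereas a model soup has the same inference time as an ordinary model. In addition, when deploying to production, an ensemble is cumbersome because several models must be managed, whereas a model soup is merged into a single model and can be deployed just like an ordinary model.
The method described above is the simplest version, called the uniform soup. In the version called the greedy soup, a validation dataset is used to sort the models in order of validation performance, and they are added to the soup (a set) one at a time. At first the soup contains only the model with the best validation performance. Models are then tried in order of validation performance: if the validation accuracy of the model obtained by averaging the parameters in the soup goes up, the addition is made final, and if it goes down, the addition is cancelled. After all models have been processed, the average of the parameters in the soup is the final model. The greedy soup requires validation data and is a little more involved, but it has been confirmed to perform better than the uniform soup.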
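The greedy procedure can be sketched as follows. This is an illustration under stated assumptions rather than the paper's reference implementation: `evaluate` is a hypothetical callable returning validation accuracy for a flattened parameter vector, and the toy accuracy function simply measures closeness to a made-up optimum. A candidate is kept only when the averaged model strictly improves, following the description above.

```python
def average(vectors):
    """Element-wise mean of equally sized parameter vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def greedy_soup(param_vectors, evaluate):
    """Greedy soup; `evaluate` (hypothetical) maps parameters to validation accuracy."""
    # Sort candidates by their individual validation accuracy, best first.
    ranked = sorted(param_vectors, key=evaluate, reverse=True)
    soup = [ranked[0]]
    best = evaluate(average(soup))
    for candidate in ranked[1:]:
        trial = evaluate(average(soup + [candidate]))
        if trial > best:  # keep the candidate only if the averaged model improves
            soup.append(candidate)
            best = trial
    return average(soup)


# Toy stand-in for validation accuracy: negative squared distance to a made-up optimum.
target = [1.0, 1.0]


def toy_accuracy(theta):
    return -sum((a - b) ** 2 for a, b in zip(theta, target))


candidates = [[0.8, 1.0], [1.2, 1.0], [3.0, 3.0]]
soup = greedy_soup(candidates, toy_accuracy)
print(soup)
```

In this toy setting the two candidates near the optimum are averaged, and the outlier `[3.0, 3.0]` is rejected because adding it would lower the score.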
Note that all the models added to a model soup are fine-tuned from the same pretrained model. In general, averaging models trained from different random initializations does not give a good model. Intuitively, this is because models from different random initializations have no consistency between their modes, so naive averaging produces nonsense. More concretely, the permutation invariance of neural networks is known to play an important role [Entezari+ ICLR 2022]. The neurons of a neural network can be reordered arbitrarily without changing the behavior of the model as a function (see the figure below). However, this reordering swaps the dimensions of the parameters, so as a vector the result is completely different. Models started from different random initializations do not agree on this ordering, so one dimension ends up being averaged with an unrelated dimension, and the result is nonsense. The method Git Re-Basin [Ainsworth+ ICLR 2023] aligns models whose orderings do not agree by solving a neuron matching problem. Even for models started from different random initializations, it has been confirmed that averaging after applying Git Re-Basin gives a good model.
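A small numerical illustration of permutation invariance may help. The sketch below uses a hypothetical one-hidden-layer ReLU network with two hidden neurons and made-up weights. Swapping the neurons leaves the function unchanged, while naively averaging the original and the swapped parameters changes it.

```python
def mlp(x, w1, b1, w2):
    """One-hidden-layer ReLU network with scalar input and scalar output."""
    hidden = [max(0.0, w * x + b) for w, b in zip(w1, b1)]
    return sum(v * h for v, h in zip(w2, hidden))


def avg(a, b):
    return [(u + v) / 2 for u, v in zip(a, b)]


# Hypothetical weights for a network with two hidden neurons.
w1, b1, w2 = [1.0, -2.0], [0.5, 1.0], [3.0, -1.0]

# Swap the two hidden neurons: same function, different parameter vector.
w1_p, b1_p, w2_p = w1[::-1], b1[::-1], w2[::-1]

# Naive averaging mixes coordinates that belong to different neurons.
w1_a, b1_a, w2_a = avg(w1, w1_p), avg(b1, b1_p), avg(w2, w2_p)

x = 0.25
original = mlp(x, w1, b1, w2)
permuted = mlp(x, w1_p, b1_p, w2_p)
averaged = mlp(x, w1_a, b1_a, w2_a)
print(original, permuted, averaged)  # 1.75 1.75 1.25
```

Matching methods such as Git Re-Basin aim to undo this kind of misalignment before averaging. The example only shows why alignment matters, not how the matching is solved.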
Stochastic weight averaging (SWA) [Izmailov+ UAI 2018], like model soups, improves performance by averaging parameters. In stochastic weight averaging, after training, SGD is run with a reasonably large learning rate to obtain a sequence of model parameters, which are then averaged. It can be regarded as an easy model soup that can be made in a single training run.
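The sketch below illustrates the idea on a toy problem: SGD with a constant learning rate on a noisy quadratic, keeping a running mean of the iterates. The objective, the noise model, and averaging after every step (rather than, say, once per epoch) are simplifying assumptions made for illustration.

```python
import random


def swa(theta, grad_fn, lr, steps, rng):
    """Run constant-learning-rate SGD and return (last iterate, averaged iterate)."""
    average = [0.0] * len(theta)
    for t in range(1, steps + 1):
        g = grad_fn(theta, rng)
        theta = [p - lr * gi for p, gi in zip(theta, g)]
        # Incremental mean: avg_t = avg_{t-1} + (theta_t - avg_{t-1}) / t
        average = [a + (p - a) / t for a, p in zip(average, theta)]
    return theta, average


def noisy_quadratic_grad(theta, rng):
    # Gradient of 0.5 * ||theta||^2 plus Gaussian noise (toy stand-in for minibatch noise).
    return [p + rng.gauss(0.0, 1.0) for p in theta]


rng = random.Random(0)
last, averaged = swa([0.0, 0.0], noisy_quadratic_grad, lr=0.1, steps=2000, rng=rng)
print(last, averaged)
```

The individual iterates keep bouncing around the minimum at the origin because of the gradient noise, while their average settles much closer to it.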
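A model soup can also be viewed as an approximation of an ensemble. Let $f(x; \theta)$ be the function represented by the model with parameters $\theta$. An ensemble
$$f_{\mathrm{ens}}(x) = \frac{1}{n} \sum_{i=1}^{n} f(x; \theta_i)$$
takes the average in function space. Expanding $f(x; \theta_i)$ as a Taylor series in $\theta$ around the soup parameters $\theta_{\mathrm{soup}} = \frac{1}{n} \sum_{i=1}^{n} \theta_i$ gives
$$f(x; \theta_i) = f(x; \theta_{\mathrm{soup}}) + \nabla_\theta f(x; \theta_{\mathrm{soup}})^\top (\theta_i - \theta_{\mathrm{soup}}) + O(\|\theta_i - \theta_{\mathrm{soup}}\|^2).$$
Substituting this into the ensemble formula gives
$$f_{\mathrm{ens}}(x) = f(x; \theta_{\mathrm{soup}}) + \nabla_\theta f(x; \theta_{\mathrm{soup}})^\top \left( \frac{1}{n} \sum_{i=1}^{n} (\theta_i - \theta_{\mathrm{soup}}) \right) + \frac{1}{n} \sum_{i=1}^{n} O(\|\theta_i - \theta_{\mathrm{soup}}\|^2).$$
Noting that $\sum_{i=1}^{n} (\theta_i - \theta_{\mathrm{soup}}) = 0$ by the definition of $\theta_{\mathrm{soup}}$, the difference between the model soup and the ensemble is
$$f_{\mathrm{ens}}(x) - f(x; \theta_{\mathrm{soup}}) = \frac{1}{n} \sum_{i=1}^{n} O(\|\theta_i - \theta_{\mathrm{soup}}\|^2).$$
That is, the difference between the model soup and the ensemble is of second order or smaller. When fine-tuning from a pretrained model, the parameters change only slightly, so $\|\theta_i - \theta_{\mathrm{soup}}\|$ is also small, and the uniform model soup and the ensemble can be expected to be close as functions. As noted earlier, an ensemble multiplies inference time by $n$ and complicates deployment, whereas a uniform model soup achieves a comparable effect with a single model.
Task vectors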
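A task vector [Ilharco+ ICLR 2023] is a vector that represents the learning of a task. Learning a task and forgetting one (unlearning) can both be realized by arithmetic on task vectors.
Starting from a pretrained model $\theta_0$, let $\theta_A$ be the result of fine-tuning on the dataset of task A. Then $\tau_A = \theta_A - \theta_0$ is called the task vector of task A.
It is known that learning and forgetting can be realized by task vector arithmetic. For example, the model with parameters $\theta_0 + \tau_A + \tau_B$ performs well on both task A and task B. Also, suppose the pretrained model is a language model and task C is the task of producing insults, and we obtain a task vector $\tau_C$ by training on a text corpus made up of insults. Then the language model with parameters $\theta_0 - \tau_C$ becomes a model that does not produce insults.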
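On flattened parameters, these operations reduce to element-wise addition and subtraction. The following is a minimal sketch with made-up toy vectors. The helper names `task_vector`, `negate`, and `apply_task_vectors` are illustrative assumptions, not an API from the paper.

```python
def task_vector(theta_finetuned, theta_pretrained):
    """tau = theta_finetuned - theta_pretrained."""
    return [f - p for f, p in zip(theta_finetuned, theta_pretrained)]


def negate(tau):
    return [-t for t in tau]


def apply_task_vectors(theta_pretrained, vectors):
    """Return theta_0 plus the sum of the given task vectors."""
    out = list(theta_pretrained)
    for tau in vectors:
        out = [o + t for o, t in zip(out, tau)]
    return out


# Toy parameters (hypothetical values).
theta_0 = [0.0, 1.0, 2.0]
theta_a = [0.5, 1.0, 2.0]
theta_b = [0.0, 1.0, 1.0]
theta_c = [0.0, 3.0, 2.0]

tau_a = task_vector(theta_a, theta_0)
tau_b = task_vector(theta_b, theta_0)
tau_c = task_vector(theta_c, theta_0)

learned_ab = apply_task_vectors(theta_0, [tau_a, tau_b])  # theta_0 + tau_A + tau_B
forgot_c = apply_task_vectors(theta_0, [negate(tau_c)])   # theta_0 - tau_C
print(learned_ab, forgot_c)  # [0.5, 1.0, 1.0] [0.0, -1.0, 2.0]
```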
Interestingly, task vectors also support analogies between tasks, just as word vectors do. With word vectors, it is known that king - man + woman ≈ queen, and the same kind of thing can be done with task vectors. For example, let task A be language modeling on Amazon reviews, task B be language modeling on Yelp reviews, and task C be sentiment analysis on Amazon reviews. Then the model with parameters $\theta_0 + \tau_C + (\tau_B - \tau_A)$ achieves good performance on sentiment analysis of Yelp reviews. As another example, let task A be photos of dogs, task B be photos of lions, and task C be illustrations of dogs. Then the model with parameters $\theta_0 + \tau_C + (\tau_B - \tau_A)$ can correctly classify illustrations of lions. In this way, even when you have no data, or only a little data, for the desired task, arithmetic on task vectors built from data in different domains lets the model solve the desired task.
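The analogy can be made concrete with a deliberately simplified toy example. Assume, purely for illustration, that parameter space has one "domain" coordinate (Amazon = +1, Yelp = -1) and one "task" coordinate (language modeling = 0, sentiment analysis = 1). Real parameter spaces are of course not organized this neatly.

```python
def analogy_parameters(theta_0, theta_a, theta_b, theta_c):
    """Return theta_0 + tau_C + (tau_B - tau_A), where tau_X = theta_X - theta_0."""
    return [
        p0 + (c - p0) + ((b - p0) - (a - p0))
        for p0, a, b, c in zip(theta_0, theta_a, theta_b, theta_c)
    ]


# Toy coordinates: [domain, task] (hypothetical encoding).
theta_0 = [0.0, 0.0]   # pretrained model
theta_a = [1.0, 0.0]   # task A: Amazon language modeling
theta_b = [-1.0, 0.0]  # task B: Yelp language modeling
theta_c = [1.0, 1.0]   # task C: Amazon sentiment analysis

theta_new = analogy_parameters(theta_0, theta_a, theta_b, theta_c)
print(theta_new)  # [-1.0, 1.0], i.e. "Yelp" on the domain axis, "sentiment" on the task axis
```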
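Model parameters and the neural tangent kernel
This kind of arithmetic on model parameters becomes easier to understand through the neural tangent kernel (NTK) [Ortiz-Jimenez+ NeurIPS 2023].
Let $\theta_0$ be the parameters of the pretrained model. The neural tangent kernel $k$ defined by this model is
$$k(x, x') = \nabla_\theta f(x; \theta_0)^\top \nabla_\theta f(x'; \theta_0),$$
that is, the inner product of the features $\phi(x) = \nabla_\theta f(x; \theta_0)$.
Suppose we fine-tune with stochastic gradient descent (SGD), updating the parameters as
$$\theta_{t+1} = \theta_t + \alpha_t \nabla_\theta f(x_t; \theta_t).$$
Here $\alpha_t$ is the product of the learning rate and the negative gradient of the loss with respect to the function output. (Without loss of generality, we assume the output of $f$ is one-dimensional.) The learning rate is a small positive value, and the sign of $\alpha_t$ depends on the loss function. If the output should be made larger for that training sample, $\alpha_t$ is positive, and if the output should be made smaller, $\alpha_t$ is negative. Suppose fine-tuning yields parameters $\theta_T$. Assuming the parameters did not move much during training, a first-order Taylor approximation in $\theta$ around $\theta_0$ shows that the function represented by $\theta_T$ is
$$f(x; \theta_T) \approx f(x; \theta_0) + \nabla_\theta f(x; \theta_0)^\top (\theta_T - \theta_0).$$
Unrolling the history of parameter updates gives
$$\theta_T = \theta_0 + \sum_{t=0}^{T-1} \alpha_t \nabla_\theta f(x_t; \theta_t) \approx \theta_0 + \sum_{t=0}^{T-1} \alpha_t \nabla_\theta f(x_t; \theta_0),$$
where the approximation again uses the assumption that the parameters stay close to $\theta_0$. Combining the two gives
$$f(x; \theta_T) \approx f(x; \theta_0) + \sum_{t=0}^{T-1} \alpha_t k(x_t, x).$$
Stated not for a particular test sample $x$ but as a function, this reads
$$f(\cdot\,; \theta_T) \approx f(\cdot\,; \theta_0) + \sum_{t=0}^{T-1} \alpha_t k(x_t, \cdot).$$
At this point, the relationship between (1) the training data, (2) the model parameters, and (3) the function the model represents becomes clear. The training data affects the model parameters through the terms $\alpha_t \nabla_\theta f(x_t; \theta_0)$. This corresponds to adding the neural tangent kernel feature $\phi(x_t)$ of each sample to the parameters, in proportion to the amount by which the function value should change. From the viewpoint of the function the model represents, it corresponds to raising or lowering the value according to the kernel value $k(x_t, x)$.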
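The kernel view can be checked numerically on a tiny model. The sketch below is an illustration under stated assumptions: a hypothetical two-parameter model $f(x; \theta) = \theta_2 \tanh(\theta_1 x)$, a squared loss, a made-up dataset, and a handful of SGD steps with a small learning rate. It compares the exact fine-tuned output at a test point with the prediction $f(x; \theta_0) + \sum_t \alpha_t k(x_t, x)$.

```python
import math


def f(x, theta):
    a, b = theta
    return b * math.tanh(a * x)


def grad_f(x, theta):
    """Gradient of f with respect to theta = [a, b]."""
    a, b = theta
    t = math.tanh(a * x)
    return [b * x * (1.0 - t * t), t]


def ntk(x1, x2, theta_0):
    """k(x1, x2) = <grad f(x1; theta_0), grad f(x2; theta_0)>."""
    g1, g2 = grad_f(x1, theta_0), grad_f(x2, theta_0)
    return sum(u * v for u, v in zip(g1, g2))


theta_0 = [0.5, 1.0]                                       # hypothetical "pretrained" parameters
data = [(-1.0, -1.0), (0.5, 0.8), (1.0, 1.0), (2.0, 1.5)]  # toy (x_t, y_t) pairs
lr = 0.05

theta, alphas = list(theta_0), []
for x_t, y_t in data:
    # Squared loss 0.5 * (f - y)^2: alpha_t = -lr * dloss/df
    alpha_t = -lr * (f(x_t, theta) - y_t)
    g = grad_f(x_t, theta)
    theta = [p + alpha_t * gi for p, gi in zip(theta, g)]
    alphas.append(alpha_t)

x_test = 1.5
exact = f(x_test, theta)
linearized = f(x_test, theta_0) + sum(
    a * ntk(x_t, x_test, theta_0) for a, (x_t, _) in zip(alphas, data)
)
print(exact, linearized, f(x_test, theta_0))
```

With these settings the kernel-based prediction tracks the exact fine-tuned output much more closely than the pretrained output does. The gap grows as the learning rate or the number of steps increases and the parameters move further from $\theta_0$.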
The model parameters $\theta_A$ trained on the training data $D_A$ of task A have the neural tangent kernel features of the training data in $D_A$ added to them. The task vector of task A is therefore a weighted sum of neural tangent kernel features:
$$\tau_A = \theta_A - \theta_0 \approx \sum_{x_t \in D_A} \alpha_t \phi(x_t), \qquad \phi(x) = \nabla_\theta f(x; \theta_0).$$
The sum of the task vector of task A and the task vector of task B then amounts to adding the neural tangent kernel features of the union of the task A dataset and the task B dataset. This is an intuitive explanation of why learning happens when task vectors are added.
Model soups can likewise be viewed as adding neural tangent kernel features to the soup. Since the training data is the same for every model, this alone does not fully explain the benefit. Intuitively, though, each hyperparameter setting adds the features from a slightly different angle, so that similarity can be measured more robustly.
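Conclusion
It is easy to think of the parameters of a deep model as a meaningless list of numbers, so I find it fascinating that, analyzed well, they turn out to carry deep meaning both theoretically and experimentally. This topic has developed rapidly over the past one to two years, and I expect many more interesting research results to appear.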
Contact: @joisino_ / https://joisino.net