- Motivation
- Barriers to overcome for business use
- What is PEFT?
- Token-addition approaches
- Adapter approaches
- LoRA approaches
- Summary
- References
Hello, this is Tsuji from the Analytics Services Department.
In this article I introduce PEFT (Parameter-Efficient Fine-Tuning), a family of techniques for retraining LLMs efficiently that is likely to become increasingly important.
Motivation
- Using an LLM for a specific purpose or business domain requires fine-tuning
- Retraining an LLM with billions of parameters through full fine-tuning (Full FT) is unrealistic in both time and cost, so more efficient retraining approaches need to be considered
Since the release of ChatGPT*1, large language models (LLMs) have been attracting attention around the world.
Models such as ChatGPT and BERT acquire general linguistic knowledge from vast amounts of text and can handle a wide variety of tasks. For example, LLMs such as ChatGPT and Alpaca*2 can perform many tasks, including summarization, translation, classification, and generation.
However, using an LLM as-is does not guarantee performance suited to a specific purpose or to your own business. LLMs are trained mainly on data available on the web and on digitized papers and conversations, so they are unlikely to have acquired accurate knowledge about a particular business. What is needed is a way to customize the model to fit your business and workflows by training it further on additional data and tasks. This is called fine-tuning.
Fine-tuning updates some or all of the parameters of a pretrained model such as an LLM and thereby changes the model's behavior.
Adapting a pretrained model to a specific downstream task*3, however, normally requires retraining all of the model's parameters. With some current pretrained models exceeding several billion parameters, this is no longer easy to do.
For services and businesses built on LLMs and other large pretrained models to spread widely through society, an efficient approach to fine-tuning that differs from conventional fine-tuning (Full FT) is needed, and PEFT (Parameter-Efficient Fine-Tuning) is now developing rapidly in response to that need.
In this article, conventional fine-tuning is called Full FT (Full Fine-Tuning) to distinguish it from PEFT (Parameter-Efficient Fine-Tuning), the efficient alternative.
Barriers to overcome for business use
Before introducing specific PEFT approaches, let us review the properties of Full FT.
Full FT can achieve high accuracy and good generalization, but several barriers must be overcome before it can be used in business or in the real world. If you use Full FT to adapt a model to company-specific tasks or a business domain, you are likely to run into the following problems.
Updating all parameters incurs a huge computational cost
- As the model grows, computational and memory costs increase and training becomes difficult to run
- GPT-3, for example, has 175 billion parameters and therefore requires enormous time and resources; only companies and organizations with large-scale compute can run Full FT on it at all
Risk of catastrophic forgetting
- While the parameters learn new concepts and tasks, the model may lose the general linguistic knowledge it acquired during pretraining, a phenomenon known as catastrophic forgetting
- Catastrophic forgetting leads to an unintended drop in the model's general-purpose ability and transfer performance
Prone to overfitting when training data is scarce
- Updating all parameters when only a small amount of data is available for the new task is likely to cause overfitting, leaving the model overly dependent on task-specific knowledge
Large models are costly to manage and operate
- Every time Full FT is run for a different task, the resulting model parameters must be saved, which requires a lot of storage space
- The sheer size of each model also makes models hard to manage and share
PEFT has been proposed to solve these problems.
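To put the first barrier in concrete terms, the rough sketch below estimates the memory needed just to hold a 175-billion-parameter model. The byte counts are assumptions, not figures from this article: 2 bytes per parameter for 16-bit weights, and about 16 bytes per parameter for mixed-precision Full FT with the Adam optimizer (16-bit weights and gradients plus 32-bit master weights and two optimizer states). Activations and batch size are ignored.

```python
GPT3_PARAMS = 175_000_000_000  # parameter count quoted above

# Assumed bytes per parameter (illustrative, not from this article):
BYTES_FP16_WEIGHTS = 2    # 16-bit weights only
BYTES_FULL_FT_ADAM = 16   # fp16 weights + fp16 grads + fp32 master weights + 2 Adam states


def memory_gb(num_params: int, bytes_per_param: int) -> float:
    """Return the memory needed for num_params parameters, in GB (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


weights_only_gb = memory_gb(GPT3_PARAMS, BYTES_FP16_WEIGHTS)
full_ft_gb = memory_gb(GPT3_PARAMS, BYTES_FULL_FT_ADAM)

print(f"weights only : {weights_only_gb:,.0f} GB")
print(f"Full FT state: {full_ft_gb:,.0f} GB")
```

Under these assumptions the weights alone take about 350 GB and the training state about 2.8 TB, which is why Full FT at this scale is out of reach without a large cluster.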
What is PEFT?
- PEFT is an approach that fine-tunes only a subset of the parameters
- It can reduce computational cost and achieve good generalization while keeping accuracy comparable to Full FT
- The PEFT approaches being developed today fall into three broad concepts
  - Token-addition approaches
  - Adapter approaches
  - LoRA approaches
PEFT is a set of techniques for efficiently adapting a pretrained model such as an LLM to a new task. By fine-tuning only a subset of the parameters rather than the whole model, it is attracting attention as a way to solve the problems of Full FT.
Many PEFT approaches have been proposed, but they all share certain characteristics, and both in performance and in day-to-day operation they have properties that are desirable for business use.
Lower compute and storage costs
- PEFT keeps most parameters of the pretrained model frozen and fine-tunes only a small number of additional parameters, which greatly reduces compute and storage costs
Less catastrophic forgetting
- With Full FT the model can forget what it learned during pretraining (catastrophic forgetting); PEFT updates only a limited number of parameters (0.1% to a few percent), which largely avoids this problem
Better generalization to unseen situations
- PEFT has been shown to outperform Full FT when the data available for training is limited, and to generalize better to unseen situations
Small models are cheap to manage and operate
- Full FT produces a large model (tens of GB), whereas PEFT often needs only a small one of a few MB
- Weights learned with PEFT can easily be applied to multiple tasks without replacing the whole model
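The storage argument above can be checked with simple arithmetic. The sketch below compares keeping one fully fine-tuned copy per task with keeping one shared base model plus a small set of PEFT weights per task. All sizes (a 7-billion-parameter base model, 4 million PEFT parameters, 10 tasks, 2 bytes per parameter) are hypothetical values chosen for illustration.

```python
BASE_PARAMS = 7_000_000_000  # hypothetical base model size
PEFT_PARAMS = 4_000_000      # hypothetical number of added trainable parameters
NUM_TASKS = 10               # hypothetical number of downstream tasks


def storage_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Checkpoint size in GB (10**9 bytes), assuming 16-bit parameters."""
    return num_params * bytes_per_param / 1e9


# Full FT: every task needs its own full copy of the model.
full_ft_total_gb = NUM_TASKS * storage_gb(BASE_PARAMS)

# PEFT: one frozen base model shared by all tasks, plus small per-task weights.
peft_total_gb = storage_gb(BASE_PARAMS) + NUM_TASKS * storage_gb(PEFT_PARAMS)

# Share of parameters that are actually trained for one task.
trainable_fraction = PEFT_PARAMS / (BASE_PARAMS + PEFT_PARAMS)

print(f"Full FT: {full_ft_total_gb:.2f} GB, PEFT: {peft_total_gb:.2f} GB")
print(f"trainable fraction: {trainable_fraction:.4%}")
```

With these numbers each set of PEFT weights is about 8 MB, so adding a task costs megabytes rather than another 14 GB copy.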
Classifying PEFT by concept
PEFT is still evolving and more approaches will no doubt be proposed, but the approaches published so far (as of May 2023) can, in my view, be classified into three groups according to their underlying concept.
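Token-addition approaches
- Add virtual tokens to the input layer to learn features specific to a given task
- The parameters of the pretrained model itself are frozen and not updated
- Applicable to language understanding and to text translation and summarization tasks, but not to tasks such as image classification
Adapter approaches
- Add special sub-modules to the pretrained model as separate components and update their parameters
- Only the Adapter parameters are updated; the parameters of the original pretrained model are frozen
- Applicable to language understanding and to text translation and summarization tasks
LoRA approaches
- The pretrained model itself is frozen and only low-rank matrices are updated
- The trained low-rank matrices are then added to the weights of the original pretrained model to obtain the updated parameters
- Applicable not only to language understanding and text translation and summarization tasks but also to image generation and image classification tasks
The sections below explain which approaches belong to each concept.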
Token-addition approaches
Approaches of this type do not fine-tune the weights of the pretrained model. Instead, they add virtual tokens to the input layer and learn features specific to a given task through them.
Hugging Face already implements three such approaches: Prefix Tuning, P-Tuning, and Prompt Tuning. All three leave the weights of the pretrained model untouched, but each has its own way of learning, so they are introduced briefly in turn.
Prefix Tuning
Overview of the approach
Prefix Tuning*4 freezes the parameters of the pretrained model, prepends k continuous task-specific vectors (the prefix) to the input, and trains only the prefix, which is supplied to every layer of the pretrained model (see the figure below). Because the prefix itself consists of trainable parameters, it is the prefix that ends up carrying the task-specific information.
Since the prefix is fine-tuned for each task, it can steer the model's output toward what the task requires. In the figure below, a prefix for translation and a prefix for summarization are attached, and each prefix has acquired task-specific information.
Excerpt from Figure 1 of the paper
Prefix Tuning can learn a task with far fewer parameters than the pretrained model has, which makes efficient fine-tuning possible.
Features
As with all token-addition approaches, freezing the parameters of the pretrained model greatly reduces the compute and time needed for training.
Another major feature is that the prefix consists of continuous vectors rather than actual tokens, so it is not restricted to entries in the vocabulary.
The prefix is also task-specific, so the same pretrained model can be reused across multiple tasks.
The paper reports that, by training only 0.1% of the parameters, Prefix Tuning achieves performance comparable to Full FT and exceeds the accuracy of Full FT when little data is available.
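One common way to picture how a prefix influences each layer is as extra key/value vectors that every attention computation can attend to. The sketch below is a minimal single-head illustration of that idea in plain Python, not the implementation from the paper: the vector sizes and all numeric values are made up, and the prefix values that would normally be learned are simply fixed by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]


# Frozen: keys and values the pretrained model computes from the real input tokens.
input_keys = [[1.0, 0.0], [0.0, 1.0]]
input_values = [[1.0, 2.0], [3.0, 4.0]]

# Trainable: a prefix of length k = 1 (hypothetical values; learned in practice).
prefix_keys = [[0.5, 0.5]]
prefix_values = [[10.0, 10.0]]

query = [1.0, 0.0]
plain = attention(query, input_keys, input_values)
with_prefix = attention(query, prefix_keys + input_keys, prefix_values + input_values)

print("without prefix:", plain)
print("with prefix   :", with_prefix)
```

The model weights that produce the queries, keys, and values are never touched; only the prefix vectors change the attention output, which is why a different prefix can be swapped in for each task.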
P-Tuning
Overview of the approach
P-Tuning*5 adds a parameterized prompt to the input of the pretrained model to supply task-specific information. As in Prefix Tuning, the prompt is fine-tuned for each task, so it can steer the language model's output toward what the task requires.
P-Tuning also learns a task with far fewer parameters than the pretrained model has, which makes efficient fine-tuning possible.
Features
The cost of training is largely independent of the number of parameters in the model, so even a very large LLM can be fine-tuned at low cost. Another major feature is that the prompt consists of continuous vectors rather than discrete tokens.
P-Tuning reduces the effort of prompt engineering, and on the SuperGLUE benchmark it outperformed state-of-the-art approaches at the time the paper was published.
Prompt Tuning
Overview of the approach
Prompt Tuning*6 freezes the parameters of the pretrained model and, instead of updating them, fine-tunes by learning a soft prompt from labeled examples. As the figure below shows, rather than tuning the model for each task, Prompt Tuning attaches a small prompt for each task and tunes that.
Excerpt from Figure 2 of the paper
Features
Because only the per-task soft prompt is learned and the model itself is left unchanged, a feature of this approach is that it is robust under domain transfer.
The paper shows that performance improves as model size grows, with large gains once the model exceeds several billion parameters.
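The mechanics of a soft prompt can be sketched in a few lines. In the illustration below, each task owns a small list of trainable vectors that is prepended to the frozen token embeddings before they enter the model. The embedding size, prompt length, token table, and zero initialization are all hypothetical choices made for brevity, and the training loop is omitted.

```python
EMBED_DIM = 4           # hypothetical embedding size
NUM_VIRTUAL_TOKENS = 3  # hypothetical soft-prompt length

# Frozen: embedding table of the pretrained model (toy values).
FROZEN_EMBEDDINGS = {
    "good": [0.1, 0.2, 0.3, 0.4],
    "morning": [0.5, 0.6, 0.7, 0.8],
}


def new_soft_prompt():
    """Create one trainable soft prompt: NUM_VIRTUAL_TOKENS vectors of size EMBED_DIM."""
    return [[0.0] * EMBED_DIM for _ in range(NUM_VIRTUAL_TOKENS)]


# One small prompt per task; the model itself is shared and never updated.
soft_prompts = {"translate": new_soft_prompt(), "summarize": new_soft_prompt()}


def build_model_input(task, tokens):
    """Prepend the task's soft prompt to the frozen embeddings of the real tokens."""
    token_embeddings = [FROZEN_EMBEDDINGS[token] for token in tokens]
    return soft_prompts[task] + token_embeddings


model_input = build_model_input("translate", ["good", "morning"])
trainable_per_task = NUM_VIRTUAL_TOKENS * EMBED_DIM

print(len(model_input), "input vectors,", trainable_per_task, "trainable values per task")
```

Switching tasks only means picking a different entry of `soft_prompts`, so many tasks can share one frozen model.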
Adapter approaches
Next are approaches that add special sub-modules called Adapters to the pretrained model as separate components. Because an Adapter works as an independent component attached to the pretrained model, the pretrained model can be reused simply by adding or removing Adapters.
Adapters are not included in Hugging Face's PEFT library, but they are introduced here to make the conceptual difference from LoRA clear.
Adapter
Overview of the approach
An Adapter*7 is a special sub-module added to the pretrained model as a separate component. As the figure below shows, Adapters are inserted after the Multi-Head Attention layer and the Feed-Forward layer of the Transformer; during fine-tuning the parameters of the pretrained model stay frozen and only the Adapter parameters are updated.
Excerpt from Figure 2 of the paper
Features
An Adapter takes the output of the Transformer's Multi-Head Attention layer or Feed-Forward layer as its input and produces an output of its own, so its structure does not directly alter what the Transformer itself does.
Adapters can be inserted and removed flexibly, but in exchange every inference pass has to send the output of the frozen Transformer layers through the Adapters, so inference tends to take longer.
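A typical Adapter is a small bottleneck network with a skip connection: project the hidden vector down, apply a nonlinearity, project back up, and add the result to the input. The sketch below illustrates this in plain Python under simplifying assumptions: bias terms are omitted, ReLU is used as the nonlinearity, and the sizes and weight values are made up for illustration.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def relu(vector):
    return [max(0.0, x) for x in vector]


def adapter(hidden, w_down, w_up):
    """Bottleneck adapter: down-project, nonlinearity, up-project, then skip connection."""
    bottleneck = relu(matvec(w_down, hidden))
    update = matvec(w_up, bottleneck)
    return [h + u for h, u in zip(hidden, update)]


# Output of a frozen Transformer sub-layer (hidden size 4, toy values).
hidden = [1.0, -2.0, 0.5, 3.0]

# Trainable adapter weights with bottleneck size 2 (hypothetical values).
w_down = [[0.5, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
w_up_zero = [[0.0, 0.0] for _ in range(4)]  # zero up-projection: adapter is the identity
w_up_trained = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 1.0]]

untouched = adapter(hidden, w_down, w_up_zero)
adapted = adapter(hidden, w_down, w_up_trained)

print("identity adapter:", untouched)
print("trained adapter :", adapted)
```

Because of the skip connection, an Adapter whose up-projection is zero passes the hidden vector through unchanged, which is why Adapters can be added or removed without disturbing the frozen model. The extra matrix multiplications at every layer are also where the added inference time comes from.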
LoRA approaches
Finally, here are approaches based on LoRA, which fine-tune parameters efficiently by updating only low-rank matrices.
Hugging Face's PEFT library implements LoRA and AdaLoRA. LoRA supports efficient fine-tuning not only for language tasks but also for image tasks, which makes it a very widely applicable approach.
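LoRA
Overview of the approach
LoRA (Low-Rank Adaptation)*8 applies low-rank matrices to the query and value projections in the Attention layers of the pretrained model, greatly reducing the number of trainable parameters while achieving performance on par with or better than Full FT.
Specifically, as the figure below shows, the parameters of the pretrained model are frozen, and the product of two low-rank matrices is added to each of the query and value weight matrices to form a new matrix. These low-rank matrices are the only parameters trained during fine-tuning.
Excerpt from Figure 1 of the paper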
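The update can be written compactly. The notation below follows the LoRA paper (*8) rather than this article, and the paper's scaling factor is omitted for brevity. For a frozen weight matrix $W_0 \in \mathbb{R}^{d \times k}$, such as a query or value projection, LoRA learns two small matrices $B$ and $A$ whose product is added to $W_0$:

$$
h = W_0 x + \Delta W x = W_0 x + B A x, \qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k},\; r \ll \min(d, k)
$$

Only $A$ and $B$ are trained, so the number of trainable parameters for this matrix drops from $dk$ to $r(d + k)$.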
Features
LoRA keeps the pretrained parameters, more than 99.9% of the total, frozen and updates only the low-rank matrices to learn a new task. This achieves performance on par with Full FT while greatly reducing the number of trainable parameters and the compute required.
With Adapters, trainable parameters are added to the pretrained model as extra layers, which can increase inference latency. With LoRA, trainable low-rank matrices are injected into each layer of the pretrained model, and the frozen weights and the learned update can be merged for inference, so extra latency is unlikely to arise.
LoRA therefore makes it easier to deploy LLMs even in production environments where compute and storage budgets are tight.
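The merge step is what removes the inference overhead, and it can be verified on a toy example. The sketch below uses plain Python lists and made-up 3x3 weights with rank 1. It computes the output once with the low-rank branch kept separate, as during training, and once with the update folded into the frozen weight, as for deployment. The scaling factor used in the paper is omitted here.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def matmul(left, right):
    """Multiply two matrices given as lists of rows."""
    columns = list(zip(*right))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns] for row in left]


def matadd(left, right):
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(left, right)]


def lora_trainable_params(d, k, r):
    """Trainable parameters for one d x k weight adapted with rank r."""
    return r * (d + k)


# Frozen pretrained weight (toy 3 x 3 values) and a rank-1 update B @ A.
W0 = [[1.0, 0.0, 2.0], [0.0, 1.0, 0.0], [3.0, 0.0, 1.0]]
B = [[1.0], [0.0], [2.0]]  # 3 x 1, trainable
A = [[0.5, 1.0, 0.0]]      # 1 x 3, trainable
x = [2.0, 1.0, -1.0]

# Training-time view: frozen path plus a separate low-rank path.
unmerged = [h + d for h, d in zip(matvec(W0, x), matvec(B, matvec(A, x)))]

# Deployment view: fold the update into the weight once, then use a single matmul.
W_merged = matadd(W0, matmul(B, A))
merged = matvec(W_merged, x)

print(unmerged, merged)
print(lora_trainable_params(4096, 4096, 8), "trainable vs", 4096 * 4096, "frozen")
```

Both paths give the same output, so after merging the deployed model has exactly the shape and cost of the original. For a hypothetical 4096 x 4096 projection with rank 8, the trainable parameters are 1/256 of the frozen ones.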
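Useful as LoRA is, some issues have also been pointed out.
One is that the low-rank matrices may reduce the expressive power of the original model to some extent. Another is that, because the parameter updates depend on the original pretrained model, the fine-tuned model inherits that model's performance and limitations.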
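AdaLoRA
Overview of the approach
AdaLoRA*9, short for Adaptive Low-Rank Adaptation, is an extension of LoRA. Rather than giving every layer low-rank matrices of the same rank, AdaLoRA adaptively allocates the rank, and with it the parameter budget, to each weight matrix according to its importance, so that a fixed budget of trainable parameters is spent where it matters most.
AdaLoRA can achieve higher performance than LoRA, but the extra work needed to score importance during training is expected to add some computation.
Features
As in LoRA, the parameters of the pretrained language model are frozen and only the added low-rank matrices are trained, which keeps the number of trainable parameters and the amount of computation down. The improvement is that capacity is varied according to the importance of each layer: important layers receive more of the budget and learn more, while less important layers receive less.
AdaLoRA can also determine the rank of each low-rank matrix automatically according to the dataset and task used for fine-tuning.
On the other hand, compared with LoRA there are more hyperparameters involved in deciding this allocation, and tuning them can be difficult.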
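The AdaLoRA paper (*9) parameterizes each update in a form similar to a singular value decomposition and prunes the least important singular values so that the total rank stays within a budget. The sketch below illustrates only that allocation step, under a strong simplification: importance is taken to be the magnitude of each singular value, whereas the paper uses a sensitivity-based score. The layer names, values, and budget are hypothetical.

```python
def allocate_rank_budget(singular_values, budget):
    """Keep the `budget` largest-magnitude singular values overall; zero out the rest."""
    scored = [
        (abs(value), name, index)
        for name, values in singular_values.items()
        for index, value in enumerate(values)
    ]
    kept = {(name, index) for _, name, index in sorted(scored, reverse=True)[:budget]}
    return {
        name: [value if (name, index) in kept else 0.0 for index, value in enumerate(values)]
        for name, values in singular_values.items()
    }


def effective_ranks(singular_values):
    """Count the non-zero singular values left in each matrix."""
    return {name: sum(1 for v in values if v != 0.0) for name, values in singular_values.items()}


# Hypothetical singular values of the updates for three weight matrices (rank 4 each).
updates = {
    "attn.query": [0.9, 0.4, 0.05, 0.01],
    "attn.value": [0.7, 0.6, 0.5, 0.3],
    "ffn": [0.2, 0.02, 0.01, 0.0],
}

pruned = allocate_rank_budget(updates, budget=6)
ranks = effective_ranks(pruned)
print(ranks)
```

With a total budget of 6, the value projection keeps all four of its directions, the query projection keeps two, and the feed-forward update is pruned away entirely, which is the kind of uneven, importance-driven allocation that distinguishes AdaLoRA from fixed-rank LoRA.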
Summary
This article introduced PEFT, a family of techniques that make it possible to adapt large pretrained models to new tasks efficiently. The focus here was on what kinds of PEFT approaches exist. In future articles I plan to report on how much PEFT actually reduces training cost and how much accuracy it retains.
References
*1: An AI chatbot released by OpenAI in November 2022
*2: An open-source LLM developed by a research team at Stanford University
*3: 下流タスク in the original Japanese; also written ダウンストリームタスク
*4: Prefix Tuning: https://aclanthology.org/2021.acl-long.353.pdf
*5: P-Tuning: https://arxiv.org/pdf/2103.10385.pdf
*6: Prompt Tuning: https://arxiv.org/pdf/2104.08691.pdf
*7: Adapter: https://arxiv.org/pdf/1902.00751.pdf
*8: LoRA: https://arxiv.org/pdf/2106.09685.pdf
*9: AdaLoRA: https://arxiv.org/pdf/2303.10512.pdf