[Reading LLM Papers] An infinite-context LLM with human-like episodic memory
- There are two broad directions for managing a language model's "memory": (1) updating the LLM's own knowledge through additional training or knowledge editing, and (2) retrieving the necessary information from external memory data on demand and passing it to the model.
- This paper belongs to the second line of work. It appears to take human episodic memory as its reference point and aims to organize and retrieve external data effectively.
- The title is "Human-like Episodic Memory for Infinite Context LLMs". It was posted in July 2024 by researchers from a Huawei research lab and UCL.
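Abstract
Large language models (LLMs) have shown remarkable capabilities, but they still struggle to process extensive contexts, which limits their ability to maintain coherence and accuracy over long sequences. In contrast, the human brain excels at organizing and retrieving episodic experiences across vast time scales spanning a lifetime. In this work we introduce EM-LLM, a new approach that integrates key aspects of human episodic memory and event cognition into LLMs, enabling them to handle practically infinite context lengths effectively while maintaining computational efficiency. EM-LLM organizes sequences of tokens into coherent episodic events in an online fashion, using a combination of Bayesian surprise and graph-theoretic boundary refinement. When needed, these events are retrieved through a two-stage memory process that combines similarity-based and temporally contiguous retrieval, giving efficient and human-like access to relevant information. Experiments on the LongBench dataset demonstrate EM-LLM's superior performance: across a variety of tasks it outperforms the state-of-the-art InfLLM model with a relative improvement of 4.3%. Furthermore, our analysis reveals a strong correlation between EM-LLM's event segmentation and human-perceived events, suggesting a bridge between this artificial system and its biological counterpart. This work not only advances LLM capabilities in processing extended contexts but also provides a computational framework for exploring human memory mechanisms, opening new avenues for interdisciplinary research in AI and cognitive science.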
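Explanation by Claude
Q. Explain the content in plain terms
This paper proposes "EM-LLM", a new approach that improves the long-context processing ability of large language models (LLMs). The main points are as follows:
1. Problem:
  - Current LLMs have difficulty processing long contexts.
  - Performance degrades once the input exceeds the context length used in training.
2. Proposed method (EM-LLM):
  - Draws on findings from human cognitive science, in particular episodic memory and event cognition.
  - Introduces a mechanism that splits the input into meaningful events and stores and retrieves them efficiently.
3. Main features:
  - Surprise-based segmentation: events are delimited at points that are hard to predict.
  - Boundary refinement: event boundaries are optimized using graph theory.
  - Two-stage retrieval: memory retrieval based on similarity and temporal contiguity.
4. Results:
  - Outperforms the existing best-performing model (InfLLM) on the LongBench dataset.
  - In particular, a 33% improvement on the PassageRetrieval task.
  - Shows a high correlation with event boundaries as perceived by humans.
5. Significance:
  - Demonstrates the possibility of LLMs that can handle contexts of effectively unlimited length.
  - Bridges cognitive science and AI and provides a computational model of human memory mechanisms.
6. Future outlook:
  - Exploring more advanced event segmentation algorithms.
  - Extension to multimodal tasks.
  - Application to reinforcement learning and continual learning.
This work matters not only because it substantially improves the long-context ability of LLMs, but also because it shows a new way of bringing human cognitive processes into AI.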
Q. More detail on how the context is processed
Here is a somewhat more detailed explanation of how EM-LLM processes context:
1. Splitting the context:
  - The context is split into meaningful segments called "events".
  - The split is first made on the basis of "surprise" (the model's prediction error) and is then optimized with a graph-theoretic method.
2. Storing events:
  - Each event is stored together with the key-value pairs of the tokens it contains.
3. Memory retrieval:
  - When generating a new token, relevant events are retrieved.
  - Retrieval happens in two stages:
   a) Similarity buffer: k_s events are selected based on the similarity between the current query and each event's representative tokens.
   b) Contiguity buffer: k_c events adjacent to the selected events are fetched to preserve temporal context.
4. Composing the context:
  - The selected events are added to the LLM's extended context window.
  - This context is composed as follows:
   - Initial tokens (128 tokens)
   - Events from the similarity buffer
   - Events from the contiguity buffer
   - Local context (the most recent tokens)
5. LLM processing:
  - The LLM generates new tokens using the extended context.
With this method, EM-LLM can process very long contexts efficiently, keep highly relevant information available, and recall information in a way that resembles human memory mechanisms.
Q. So it does not use a vector database?
Yes, that is correct. EM-LLM does not use an external data structure such as a vector database. Instead, information is held in the following form:
1. Text-based storage:
  - Information is stored in its original text form, specifically as tokenized text.
2. Key-value pairs:
  - Each event (text segment) is stored as the key-value pairs corresponding to the tokens in that event.
  - These key-value pairs are the same ones used by the Transformer's attention mechanism.
3. Event structure:
  - The text is split into meaningful segments called "events".
  - Each event is represented as the set of key-value pairs of the tokens it contains.
4. Internal memory:
  - The events and their key-value pairs are kept in the model's internal memory.
  - No external database or index is used; all information is managed inside the model.
5. Retrieval mechanism:
  - Relevant events are retrieved directly from the stored key-value pairs.
  - Similarity is computed between the query and the keys of each event's representative tokens.
The advantages of this method are that long contexts can be processed without depending on external data structures, and that the Transformer's existing mechanism (key-value attention) is used directly. This makes it possible to process long contexts efficiently while making full use of the model's existing capabilities.
Q. Tell me about the LongBench dataset used in this paper
Sure, here is an explanation of the LongBench dataset.
LongBench is a benchmark dataset for evaluating long-context processing ability. The paper mentions the following characteristics:
1. It contains a diverse set of tasks that require long-text processing. Specifically, the following tasks are listed:
- NarrativeQA: question answering over long narratives
- MultiNews: summarization of multiple news articles
- Qasper: question answering over academic papers
- TREC: question classification
- MultiFieldQA: question answering across multiple fields
- TriviaQA: trivia-style question answering
- HotpotQA: question answering that requires consulting multiple documents
- SAMSum: dialogue summarization
- 2WikiMQA: question answering that spans multiple Wikipedia articles
- PassageRetrieval: identifying the original passage from a summary
- Musique: question answering that requires multiple steps
- LCC: long code completion
- GovReport: summarization of government reports
- RepoBench-P: repository-level code completion
- QMSum: summarization of meeting transcripts
2. These tasks evaluate different aspects of long-text processing, such as understanding long contexts, relating multiple documents to one another, and summarizing long texts.
3. In the paper's experiments, EM-LLM is evaluated on LongBench and shows a particularly large improvement of 33% on the PassageRetrieval task.
4. LongBench is used as a standard benchmark for comprehensively evaluating long-context models, so EM-LLM's performance can be compared directly with other models such as InfLLM.
In short, LongBench is positioned as an important dataset for evaluating, from multiple angles, how well language models handle long contexts.
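Q. More detail on surprise-based segmentation
Surprise-based segmentation is one of the key components of EM-LLM. The method is inspired by human cognitive processes: it cuts the text at points that are hard to predict (that is, points of "surprise") and thereby divides it into meaningful events. The details are as follows:
1. Basic concept:
  - The LLM's predicted probabilities are used to compute the degree of "surprise" for each token.
  - Points where the surprise exceeds a certain threshold are treated as event boundaries.
2. Mathematical definition:
  - The surprise of a token $x_t$ is defined as its negative log-likelihood:
   - $-\log P(x_t \mid x_1, \ldots, x_{t-1}; \theta)$
  - Here $P$ is the LLM's predicted probability for the token and $\theta$ denotes the model parameters.
3. Determining boundaries:
  - A dynamic threshold $T$ is used: $T = \mu_{t-\tau} + \gamma \sigma_{t-\tau}$
  - $\mu_{t-\tau}$ and $\sigma_{t-\tau}$ are the mean and standard deviation of the surprise over the past $\tau$ tokens
  - $\gamma$ is a scaling factor
4. Algorithm:
  - Compute the surprise of each token
  - Mark tokens whose surprise exceeds the threshold $T$ as potential boundaries
  - This yields the initial event segmentation
5. Advantages:
  - Adapts to changes in the context
  - Correlates highly with event boundaries perceived by humans
  - Low computational cost (it reuses the LLM's existing outputs)
6. Complementary method:
  - Combining it with boundary refinement allows a more refined segmentation
7. Relation to cognitive science:
  - Consistent with the theory that humans segment experience at moments of large prediction error or surprise
  - Mimics the process by which episodic memories are formed
With this approach, EM-LLM efficiently splits the input text into meaningful units, which makes later retrieval and use easier. This is especially important when processing long contexts and contributes substantially to the LLM's performance.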
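The thresholding rule above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the authors' implementation: the per-token probabilities, the window size `window` (standing in for τ) and the value of `gamma` are all made up for the example.

```python
import math
from statistics import mean, pstdev

def surprise(probs):
    """Negative log-likelihood of each observed token given its probability."""
    return [-math.log(p) for p in probs]

def segment_by_surprise(probs, window=4, gamma=1.0):
    """Return indices t where surprise exceeds T = mu + gamma * sigma,
    with mu and sigma taken over the previous `window` surprise values."""
    s = surprise(probs)
    boundaries = []
    for t in range(window, len(s)):
        past = s[t - window:t]
        threshold = mean(past) + gamma * pstdev(past)
        if s[t] > threshold:
            boundaries.append(t)
    return boundaries

# Hypothetical values of P(x_t | x_<t) for ten tokens; two are very unlikely.
token_probs = [0.9, 0.8, 0.85, 0.9, 0.05, 0.8, 0.9, 0.85, 0.9, 0.02]
print(segment_by_surprise(token_probs))
```

With these toy numbers the two low-probability tokens (positions 4 and 9) are flagged as candidate boundaries, while the tokens right after a surprising one are not, because the recent spike raises the threshold.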
Q. More detail on boundary refinement
Boundary refinement is the process of further optimizing the initial event boundaries obtained from surprise-based segmentation. It applies concepts from graph theory and aims to increase coherence within events and sharpen the distinction between events. The details are as follows:
1. Goal:
  - Maximize intra-event similarity
  - Minimize inter-event similarity
2. Graph-theoretic approach:
  - The similarity matrix between attention key vectors is treated as an adjacency matrix
  - Each token is regarded as a node of the graph, and similarity as the edge weight
3. Main evaluation metrics:
  a. Modularity:
   - Measures the quality of the community (event) structure
   - High modularity indicates strong connections within events and weak connections between events
  b. Conductance:
   - Measures how few connections there are between communities
   - Low conductance indicates more clearly separated events
4. Flow of the algorithm:
  - Start from the initial boundaries obtained by surprise-based segmentation
  - Between adjacent boundaries, search for a new boundary position that optimizes the chosen metric (modularity or conductance)
  - Repeat this for every initial boundary
5. Computational complexity:
  - O(kn), where k is the number of initial boundaries and n is the sequence length
6. Advantages:
  - Produces more coherent, more meaningful event segmentation
  - Improves the efficiency and accuracy of the later retrieval process
  - Can draw on existing results from graph theory
7. Experimental results:
  - Performs better than plain surprise-based segmentation
  - Correlation with human-perceived event boundaries increases further
8. Future outlook:
  - Exploring more advanced graph clustering algorithms
  - Possible extension to forming hierarchical event structures
Boundary refinement contributes substantially to EM-LLM's performance and is particularly effective on tasks that require complex temporal reasoning. With this method, the LLM can understand and process long contexts in a more human-like way.
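As a rough illustration of the modularity-based search described above, the following sketch treats a small similarity matrix as a weighted graph and moves one boundary to the position that maximizes modularity. It is a brute-force toy under my own assumptions (the 6-token matrix is invented, and each candidate is scored from scratch), so it does not reproduce the O(kn) cost mentioned in the text.

```python
def modularity(adj, boundaries):
    """Newman modularity of a contiguous segmentation of the nodes.
    `boundaries` holds the start index of every segment after the first."""
    n = len(adj)
    cuts = set(boundaries)
    labels, label = [], 0
    for i in range(n):
        if i in cuts:
            label += 1
        labels.append(label)
    degree = [sum(row) for row in adj]
    two_m = sum(degree)
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q += adj[i][j] - degree[i] * degree[j] / two_m
    return q / two_m

def refine(adj, boundaries):
    """Move each boundary to the position between its neighbours
    that gives the highest modularity."""
    refined = list(boundaries)
    for idx in range(len(refined)):
        lo = refined[idx - 1] + 1 if idx > 0 else 1
        hi = refined[idx + 1] if idx + 1 < len(refined) else len(adj)
        def score(b):
            return modularity(adj, refined[:idx] + [b] + refined[idx + 1:])
        refined[idx] = max(range(lo, hi), key=score)
    return refined

# Toy key-similarity matrix: tokens 0-2 and tokens 3-5 form two tight groups.
adj = [[0.0 if i == j else (1.0 if (i < 3) == (j < 3) else 0.1)
        for j in range(6)] for i in range(6)]
print(refine(adj, [2]))  # initial boundary at 2, deliberately one token early
```

Here the initial boundary at position 2 is shifted to position 3, which is where the two groups actually separate.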
Q. More detail on two-stage retrieval
EM-LLM's two-stage retrieval process is an important feature inspired by human memory retrieval. Its main elements and characteristics are summarized below:
1. Goal:
- Efficiently select highly relevant episodic events and bring them into the LLM's context window
- Achieve retrieval that takes both similarity and temporal relationships into account
2. Stage 1: similarity-based retrieval
- Uses k-NN (k-nearest neighbors) search
- Searches by dot-product similarity between the current query and each event's representative tokens
- Fetches k_s events to form the "similarity buffer"
3. Stage 2: preserving temporal contiguity
- Introduces a "contiguity buffer" (a queue of size k_c)
- Enqueues the neighbors of the retrieved events (within ±n positions in the original sequence) into this buffer
- Preserves temporal context and allows the LLM's "induction" attention heads to exhibit contiguity and asymmetry effects
4. Overall composition:
- A total of k = k_s + k_c + 2 events are added to the context window
- Balances relevance against temporal relationships
5. Characteristics:
- Resembles human memory retrieval patterns (a combination of similarity-based and temporally contiguous retrieval)
- Thanks to the queue structure, temporal context decays naturally as new events are processed
6. Effects:
- The LLM can make effective use of long-range context
- Performance improves especially on tasks that require complex temporal reasoning
7. Implementation detail:
- For large memory stores, approximate k-NN is used to keep retrieval efficient
This two-stage retrieval process is a key element that lets EM-LLM handle practically infinite context lengths, and it is shown to improve LLM performance substantially while mimicking characteristics of the human memory system.
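The two stages can be sketched as follows. This is a simplified illustration, not the paper's code: the event keys are invented 2-dimensional vectors, each event is reduced to a single representative key, and the contiguity buffer is a plain list filled with the immediate previous and next events rather than a decaying queue.

```python
def retrieve(query, event_keys, k_s=2, k_c=2):
    """Stage 1: pick the k_s events whose key has the largest dot product
    with the query. Stage 2: add up to k_c temporal neighbours of those hits."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    ranked = sorted(range(len(event_keys)),
                    key=lambda i: dot(query, event_keys[i]), reverse=True)
    similarity_buffer = ranked[:k_s]

    contiguity_buffer = []
    for i in similarity_buffer:
        for j in (i - 1, i + 1):  # previous and next event in the sequence
            already = j in similarity_buffer or j in contiguity_buffer
            if 0 <= j < len(event_keys) and not already:
                if len(contiguity_buffer) < k_c:
                    contiguity_buffer.append(j)
    return similarity_buffer, contiguity_buffer

# Hypothetical representative keys for five consecutive events.
event_keys = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.5, 0.5]]
query = [1.0, 0.0]
print(retrieve(query, event_keys))
```

For this query, events 0 and 1 are chosen by similarity, and event 2 is pulled in only because it directly follows event 1, even though its own key is dissimilar to the query.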
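Notes
- As usual I did not really follow the technical parts, but my understanding is that this is essentially RAG with a lot of refinements layered on.
- Rather than reading in the entire context, the context is organized into events beforehand, and only the events that are highly similar to the input prompt, plus the events that are temporally contiguous with them, are kept and passed to the model.
- Unfortunately the implementation does not seem to be public, but "InfLLM", which this paper cites as the existing state of the art in long-context processing, is available on GitHub. If I were to try something, that would probably be the one.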
[Reading LLM Papers] Knowledge editing in LLMs with Knowledge Direct Preference Optimization (KDPO)
- Direct Preference Optimization (DPO) is an LLM training method used for what is usually called alignment.
- By presenting both a helpful and a harmful example answer to the same instruction, it efficiently adjusts the model toward the behavior its developers prefer.
- Because it is easier to run than other alignment methods, it seems to be widely adopted in the local LLM community as the final stage of additional training.
- I once wondered whether DPO could also be used to add knowledge to an LLM and actually tried it, but my amateur attempt had no effect and I left it at that.
- Recently, while looking into knowledge editing again after a long break, I found an arXiv paper on knowledge editing that applies DPO, so I skimmed through it.
- The title is "Knowledge Editing in Language Models via Adapted Direct Preference Optimization". It was posted in June 2024 by researchers from Tel Aviv University in Israel and other institutions.
Abstract
The knowledge held by large language models (LLMs) can become outdated over time, which can lead to factual errors and gaps. Knowledge editing (KE) aims to overcome this problem with weight updates that do not require expensive retraining. We propose treating knowledge editing as an LLM alignment problem. Toward this goal, we introduce Knowledge Direct Preference Optimization (KDPO), a variation of Direct Preference Optimization (DPO) that is more effective for modifying knowledge. Our method is based on an online approach that continually updates the knowledge stored in the model. In the process called DPO, we use the current knowledge as the negative sample and the new knowledge to be introduced as the positive sample. We also use teacher forcing to generate the negative sample and optimize with the positive sample, which helps keep the changes localized. We tested our knowledge editing method on a variety of datasets and models, performing 100 and 500 sequential edits, and compared it with several state-of-the-art methods. We further conducted an ablation study comparing our method with the standard DPO approach. Our experimental results show that our modified DPO method enables more refined knowledge editing and achieves performance comparable to or better than previous methods.
Notes
- Here DPO is applied to knowledge editing by treating the "preferred answer (chosen)" as the new knowledge and the "dispreferred answer (rejected)" as the old knowledge.
- On top of that, KDPO apparently tries to improve the quality of training by targeting the knowledge to be learned at the token level.
- The figure in the paper illustrates KDPO rewriting the fact "Messi was born in the city of Rosario, Argentina" into "Messi was born in the city of Milan, Italy".
- Instead of training on "Milan, Italy vs. Rosario, Argentina" as ordinary DPO would, it keeps "Italy" fixed and computes the loss only on "(Italy's) Milan vs. (Italy's) Rosario".
- With ordinary DPO, the model's knowledge would not only push "Messi" and "Argentina" apart but also push "Rosario" and "Argentina" apart, so unrelated knowledge might suffer collateral damage?
- Next, an overview of the KDPO results reported in the paper.
- The first table uses representative LLM benchmarks such as MMLU. The left side shows the results after 100 sequential knowledge edits and the right side the results after 500 sequential knowledge edits. The target model is Llama-2-7B.
- "Ours" is KDPO, FT-M is a variant of ordinary fine-tuning, and ROME and MEMIT are representative knowledge editing methods that are not fine-tuning based.
- First, the table shows that KDPO/DPO have a higher overall success rate than the other methods.
- That said, in knowledge editing research the reported scores for each method vary widely from paper to paper, so some caution is needed.
- Also, for ordinary knowledge editing, plain DPO has a higher success rate than KDPO, and the advantage of KDPO only appears once the number of edits is raised to around 500.
- The next table evaluates knowledge editing accuracy in a different way.
- This part compares only DPO and KDPO (Ours). The results are uneven, and again the advantage of KDPO seems hard to see unless the number of repeated edits is increased.
- The datasets are all designed for evaluating knowledge editing. As for the metrics, Edit Succ. appears to be the edit success rate, Portability evaluates whether the edited knowledge can be applied elsewhere, Locality evaluates whether the edit has harmed other neighboring knowledge, and Fluency evaluates whether the edited model falls into repetitive output.
- Finally, there is a table comparing DPO and KDPO (Ours) with fine-tuning variants (AdaLoRA, FT-L, FT-M). The target is a small language model, Qwen1.5-0.5B.
- Apparently the advantage of KDPO shows up more easily on very small language models.
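For reference, the chosen/rejected framing in the notes above maps onto the standard DPO loss as follows. This is a minimal sketch of plain DPO for a single preference pair, with invented log-probabilities; it does not implement KDPO's teacher-forced negative samples or its token-level targeting, and `beta=0.1` is an arbitrary illustrative value.

```python
import math

def dpo_loss(pol_chosen, pol_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for one pair, given sequence log-probabilities
    under the trained policy (pol_*) and the frozen reference model (ref_*)."""
    margin = (pol_chosen - ref_chosen) - (pol_rejected - ref_rejected)
    sigmoid = 1.0 / (1.0 + math.exp(-beta * margin))
    return -math.log(sigmoid)

# chosen = new fact ("... born in Milan"), rejected = old fact ("... born in Rosario").
# Before any update the policy equals the reference, so the loss is log(2).
print(dpo_loss(-5.0, -3.0, -5.0, -3.0))
# After the policy shifts probability toward the new fact, the loss drops.
print(dpo_loss(-2.0, -6.0, -5.0, -3.0))
```

The loss only depends on how much more the policy favors the chosen answer than the reference does, which is why restricting the compared tokens (as KDPO reportedly does) changes what the gradient pushes on.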
Thoughts
- According to this paper, DPO by itself is already quite effective for knowledge editing. I could not work out the specific DPO hyperparameters they used, but it has made me want to try knowledge editing with DPO again.
- As for KDPO, I get the impression that the results were probably not as strong as hoped.
- New knowledge editing methods keep appearing intermittently, both as fine-tuning variants and from other lines of work, but the local LLM community is not very interested in them, so comprehensive and practical evaluations are hard to come by...
The end of fine-tuning: everything is continued pretraining
- It is a little old, but I found the article "The End of Fine-tuning" interesting, so here are some brief notes.
- Jeremy Howard of Fast.ai, who appears in the article, argues that training language models by completely switching the type of dataset and the method at each phase, as in "pretraining → fine-tuning → RLHF", is "no longer right".
"I originally came up with the three-stage training approach that everybody now uses, but my view today is that it is actually wrong and should not be used. The right way is to throw away the idea of fine-tuning. There is no such thing. You just continue pretraining.
Pretraining means including, from the very start, all the kinds of data you care about and all the kinds of problems you care about: instructions, exercises, code, general-purpose document completion, and so on. Then, as you train, you gradually curate that data, gradually raise its quality, and specialize more and more toward the kind of task you want to perform. But you never throw away any data...
That is my view now. I think ULMFiT is the wrong approach. That is why we see so much of the so-called alignment tax. I think it is really because people are training in the wrong way."
- I take the "alignment tax" in this comment to mean the phenomenon where alignment, that is, fine-tuning an LLM so that it does not produce harmful output, ends up lowering the model's performance.
- He also mentions CodeLlama, which added coding-specific training to Llama-2 and lost performance on general tasks. The argument seems to be that, to prevent "catastrophic forgetting" (losing previously acquired knowledge during additional training), it is important to keep the training data diverse at all times.
- The share of data that matters for the final use case does need to increase, but if all other data is thrown away completely, training degrades general-purpose performance.
- Come to think of it, I noted down a similar post before. It suggested that even during fine-tuning, about a quarter of the data should be pretraining data.
[Reading and Trying LLM Papers] Pruning LLMs: the unreasonable ineffectiveness of the deeper layers
- The other day a Reddit post announcing "a model that shrinks Llama-3-70B to 42B by pruning" drew some attention.
- The poster was the familiar kindacognizant (kalomaze), but the model seems to have been made by someone else. The model has a repo on Hugging Face.
- I tried the model, but because it is a pruned Base model rather than an Instruct model, I honestly could not tell whether it was good or bad.
- According to the post, the pruned Llama 3 scores 76-77 on MMLU (tested in 4-bit). That is slightly below the original 79, but it still maintains a high score, roughly on par with Mixtral 8x22B(?).
- This looked interesting, so I skimmed the paper that the model's recipe is said to be based on and then tried pruning a model myself.
- It is an arXiv paper posted in March 2024 by researchers affiliated with Meta FAIR.
Abstract
We empirically study a simple layer-pruning strategy for open-weight pretrained LLMs and find that performance on various question-answering benchmarks degrades only minimally until a large fraction of the layers (up to half) has been removed. To prune these models, we identify the optimal block of layers to remove by considering the similarity between layers. We use a parameter-efficient fine-tuning (PEFT) method, QLoRA, and run each experiment on a single A100 GPU. From a practical perspective, these results suggest that layer pruning can complement other PEFT strategies, further reducing the compute needed for fine-tuning on the one hand and improving inference memory and latency on the other. From a scientific perspective, the robustness of these LLMs to layer removal suggests either that current pretraining methods do not properly use the parameters in the deeper layers of the network, or that the shallow layers play a critical role in storing knowledge.
Notes
- For a base model such as Llama-2-70B, up to about half of the layers can be removed while retaining most of its performance.
- More precisely, as more layers are pruned, performance does not decline monotonically; instead the MMLU score is observed to drop sharply at a certain point.
- This suggests that the parameters that matter for model performance are concentrated in a subset of the layers (that is, the other layers can be removed with little effect on performance).
- A closer look showed that similarity between layers is low in the shallow layers (little waste) and high in the deeper layers (more redundancy).
- So rather than pruning naively, a strategy that prioritizes pruning the deeper layers opens a path to shrinking the model while limiting the loss in performance.
- Even when layers are pruned selectively, applying additional training to repair the damage to the model ("healing") makes the post-operative(?) performance more stable.
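The layer-similarity idea in these notes can be sketched as follows. This is a toy illustration, assuming that similarity is measured as the angular distance between the representation entering a block and the representation leaving it; the 2-dimensional "hidden states" are invented, and a real measurement would average over many tokens from a dataset.

```python
import math

def angular_distance(u, v):
    """arccos(cosine similarity) / pi, a value in [0, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cos = max(-1.0, min(1.0, dot / norm))
    return math.acos(cos) / math.pi

def most_redundant_block(hidden_states, n_skip):
    """hidden_states[l] is the representation entering layer l (plus one final
    entry for the model output). Returns the start layer of the n_skip-layer
    block that changes the representation the least."""
    candidates = range(len(hidden_states) - n_skip)
    return min(candidates,
               key=lambda l: angular_distance(hidden_states[l],
                                              hidden_states[l + n_skip]))

# Hypothetical 5-layer model: the representation rotates a lot in early layers
# and barely moves across layers 2 and 3.
angles_deg = [0, 40, 70, 75, 78, 110]
states = [(math.cos(math.radians(a)), math.sin(math.radians(a)))
          for a in angles_deg]
print(most_redundant_block(states, n_skip=2))
```

In this toy setup the block starting at layer 2 is selected, because removing it would change the representation by only 8 degrees.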
PruneMe: an unofficial implementation of the paper
- arcee-ai, the team behind MergeKit, provides an unofficial implementation of this paper called "PruneMe", and the model mentioned at the top appears to have been built with it.
- The PruneMe repo includes a graph of the measured similarity between layers, using "Mistral-7B-instruct-v0.2" as the example.
- As the original paper claims, the graph shows low similarity in the shallow layers of the first half and increasing similarity in the deeper layers of the second half.
Trying model pruning with PruneMe
- Let's try it quickly on Google Colab. Clone the repo and install the dependencies (Transformers is preinstalled on Colab).
# Clone the repository
!git clone https://github.com/arcee-ai/PruneMe
# Dependencies
!pip install bitsandbytes datasets accelerate
- Here I measure layer similarity on a Japanese chat model based on Mistral 7B.
- For now I measure with the same English dataset as the sample (as long as layer similarity can be determined, I do not think Japanese text is necessary).
- "--layers_to_skip" specifies the number of layers to remove. Here I assume removing 8 of the 32 layers, that is, one quarter.
- To keep the computation time manageable, the dataset size is reduced substantially from the sample.
!python ./compute_block_similarity/layer_similarity.py --model_path "TFMC/Japanese-Starling-ChatV-7B" \
    --dataset "arcee-ai/sec-data-mini" \
    --dataset_column "text" \
    --batch_size 8 \
    --max_length 1024 \
    --layers_to_skip 8 \
    --dataset_size 200 \
    --dataset_subset "train"
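- When the computation finishes, the results are written to a CSV file (about 20 minutes on a standard GPU with the settings above). The measurements came out as follows.
- The results were very similar to the sample. The block of layers 21-29 appears to be the most redundant. Since this is also a Mistral 7B based model, that is perhaps only to be expected.
- In any case, it seems possible that per-layer redundancy does not change much with additional training (that is, it is the same for the same architecture).
- The remaining step is to install MergeKit in order to remove the layer block identified above.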
# Install MergeKit
!git clone https://github.com/cg123/mergekit.git
%cd mergekit
!pip install -e .
- Next, edit "./PruneMe/slice_with_mergekit/slice.yaml" as needed. It appears to reuse MergeKit's passthrough method (the method used to build so-called frankenmerges).
slice.yaml
--------------------------------------
slices:
  - sources:
      - model: TFMC/Japanese-Starling-ChatV-7B
        layer_range: [0, 21]
  - sources:
      - model: TFMC/Japanese-Starling-ChatV-7B
        layer_range: [29, 32]
merge_method: passthrough
dtype: bfloat16
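As a quick sanity check on the two `layer_range` entries, the following sketch lists which layers survive and which are dropped. It assumes that each range is half-open, i.e. `[start, end)` with the end index excluded; that reading is consistent with removing 8 of 32 layers, but it is an assumption on my part rather than something stated in the text.

```python
def kept_layers(layer_ranges):
    """Expand half-open [start, end) ranges into a flat list of layer indices."""
    kept = []
    for start, end in layer_ranges:
        kept.extend(range(start, end))
    return kept

total_layers = 32                      # Mistral 7B has 32 decoder layers
ranges = [(0, 21), (29, 32)]           # the two slices from slice.yaml
kept = kept_layers(ranges)
removed = [l for l in range(total_layers) if l not in kept]
print(len(kept), removed)
```

Under that assumption the pruned model keeps 24 layers and drops layers 21 through 28, which matches the 8-layer block chosen with `--layers_to_skip 8`.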
- Finally, running the slice produces the pruned model.
# Run the pruning
!python /content/PruneMe/slice_with_mergekit/merge_me.py
Testing the pruned model
- With the model pruned, let's actually generate some output. Below is one example.
<s> [INST] <<SYS>>\nYou are a helpful assistant.\n<</SYS>>\n\nWhy is the sky blue? [/INST] The sky is blue because, in general, ... (from this point on the output degenerates into a run of unrelated kanji and kana)
- Unfortunately, the model's output breaks down partway through.
- Thinking about it more carefully, both the model mentioned at the top and the examples in the original paper are all Base models; Instruct models are not covered...
- The claim that "performance hardly drops after pruning" is really about the raw capability of Base models as measured by multiple-choice MMLU scores. Instruct models appear to be much more delicate.
- Or could it be that the parameters that matter for instruction following actually live in the redundant deeper layers?
- So the basic procedure would have to be "prune the Base model rather than the Instruct model, then run instruction-following training again afterwards". That is not exactly convenient.
- If the goal is simply to compress an existing model, focusing on the quality of quantization looks more promising.
- Still, it is an interesting technique in various ways (for example, MoE models might leave even more room for pruning), so I hope an effective use for it is found.
llama.cpp: how does iMatrix quantization affect Japanese performance?
- This post is about importance matrix (iMatrix) computation, which reduces model degradation during quantization.
- Recently many of the GGUF files uploaded to HuggingFace are iMatrix versions. Two datasets seem to be commonly used for the iMatrix computation in these quantizations:
- wiki.train.raw: the training split of Wikitext (English). The llama.cpp samples also use wiki.train.raw. For iMatrix computation, processing about 10k tokens of it appears to be enough in practice.
- groups_merged.txt: a pseudo-random dataset created by kalomaze, who is also involved in koboldcpp. It is about 200KB of data consisting mainly of miscellaneous text generated by language models.
- That raises the question of which iMatrix version is actually better. A discussion in the llama.cpp repo makes roughly this point: "wikitext is basically fine, but for languages other than English it may be worth considering other options such as groups_merged".
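Differences in Japanese performance by iMatrix text
- So I took the 7B model built with ChatVector the other day and checked how Japanese performance changes with each iMatrix quantization. The benchmark is the familiar ELYZA-tasks-100. The results were as follows.
- Blue is quantization without iMatrix, orange is iMatrix quantization with wiki.train, and the remaining series is iMatrix quantization with groups_merged (note that non-iMatrix quantization is not possible at IQ3_XS and below).
- At "Q5_K_M" the presence or absence of iMatrix makes no difference, but at 4-bit quantization the effect of iMatrix shows up clearly.
- There are also differences between the iMatrix versions. Surprisingly, the groups_merged version of "Q4_K_M" (3.52) scores higher than even "Q8_0" (3.42).
- The gap is fairly large and does not look like mere noise. Let's look at one concrete example of an answer whose score went up.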
ELYZA-tasks-100, question 68
Q: My eyes are black, my hair is bright yellow, and my feet are firmly rooted in the ground. I like to feel the sunlight on my face. What am I?
Q8_0 quantization (× incorrect)
Judging from your characteristics, you are most likely Japanese. Many Japanese people have black eyes, and people with bright yellow hair are rare. The expression that your feet are firmly rooted in the ground also suggests a stable, down-to-earth person. Liking sunlight suggests that you have a bright and positive personality. However, these characteristics alone are not enough to identify exactly who you are. You might try expressing your own identity by including other elements as well.
Q4_K_M quantization by groups_merged (○ correct)
Judging from this information, you are most likely a "sunflower", for the following reasons.
1. Black eyes: a sunflower has a dark area at the center of the flower, which can look like an eye.
2. Bright yellow hair: sunflower petals are bright yellow, so bright yellow hair also matches the image of a sunflower.
3. Feet firmly rooted in the ground: a sunflower has a long stem and is firmly rooted in the ground. Being firmly rooted is another characteristic of sunflowers.
4. Likes to feel the sunlight on its face: sunflowers grow well when they bathe in sunlight. A fondness for sunlight fits a sunflower perfectly.
From these points, it can be inferred that you are most likely a sunflower. However, other flowers or plants are also possible, so more detailed information may be needed to be certain.
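- On this question the model has become able to understand a Japanese metaphor. Including the other questions, my overall impression is that Japanese reading comprehension improved slightly.
- Outside of "Q4_K_M", the difference between iMatrix datasets is small, and the groups_merged version does not always score better.
- Even so, it seems reasonable to hope that a well-chosen iMatrix dataset can have some effect on Japanese performance.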
The harm from quantization becomes pronounced at 3 bits and below
- The performance changes caused by quantization are largely unpredictable, but one thing can be said with confidence: low-bit quantization at 3 bits or below degrades performance substantially.
- In the graph above the benchmark score drops markedly at "IQ2_XS", but looking at the actual outputs, cases where the answer breaks down already appear at "IQ3_XS". Here is an example.
ELYZA-tasks-100, question 13
Please act as an IME. Your task is to convert a romaji-typed sequence of letters into hiragana word by word, and to present up to three kanji conversion candidates for each word. (omitted) I will now start typing, so please display the output as an IME. inputs: j, u, u, b, u, n, o, s, h, o, s, e, k, I
IQ2_XS quantization, groups_merged: answer
The following is an example of converting the typed sequence of letters into hiragana and presenting up to three kanji conversion candidates for each word. inputs: j, u, u, b, u, n, o, s, h, o, s, e, k, i outputs: (a list of nonsensical readings and kanji candidates follows, and the last entry degenerates into the character "・" repeated hundreds of times)
- A token repeating endlessly, or text becoming pointlessly verbose, is a common sight with low-bit quantization, and once this happens the model is hard to use in practice.
- I once used Japanese Wikitext to track how a model's perplexity changes with quantization, and there too perplexity tended to rise sharply once the quantization class reached 3 bits (lower perplexity is generally better).
- Perplexity of course worsens in English as well, but it seems likely that it increases more steeply in Japanese (see the reference article).
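For readers unfamiliar with the metric, perplexity is the exponential of the average negative log-likelihood per token. The sketch below shows the arithmetic on invented log-probabilities; a real measurement (for example with llama.cpp's perplexity tool) evaluates the model over a whole corpus.

```python
import math

def perplexity(token_logprobs):
    """exp of the mean negative log-likelihood (natural log) per token."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))

# A model that assigns probability 0.25 to every correct token
# is "as confused as" a uniform choice among 4 options.
uniform_four = [math.log(0.25)] * 8
print(perplexity(uniform_four))

# Lower probabilities on the correct tokens give a higher perplexity.
degraded = [math.log(0.05)] * 8
print(perplexity(degraded) > perplexity(uniform_four))
```

This is why a sharp rise in perplexity after low-bit quantization indicates that the model has become much less certain about the correct next token.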
What happens if Japanese text is used for the iMatrix computation?
- The obvious answer is to avoid quantization at 3 bits or below, but local LLMs have been getting larger recently, and in practice I (personally) often have no choice but to pick a 3-bit class quantization.
- So I considered whether including Japanese in the iMatrix calibration data could reduce the degradation of Japanese performance at low bit widths.
- The llama.cpp repo discussion mentioned earlier also includes a result showing that using mixed English and French text for the iMatrix computation improves perplexity in French (and also improves English performance in some cases).
- Based on this, in addition to Wikitext and groups_merged, I decided to build a mixed Japanese and English text for iMatrix computation from the c4 dataset (a web crawl corpus), which is used in the example above.
- For the new text, since groups_merged.txt at around 200KB already shows some effect, I roughly extracted about 400KB in total: 200KB of Japanese plus 200KB of English.
- I then ran IQ2_XS quantization with several iMatrix texts and measured the benchmark score of each quantized model, with the following results.
- wiki_text(en) is the same as wiki.train, wiki_text(en+ja) is wiki.train with Japanese wikitext added, c4_en_ja is Japanese plus English text from c4, and c4_ja_only is Japanese-only text from c4.
- The results show that including Japanese text improves the benchmark score.
- However, the score of wiki_text(en+ja) is only marginally higher than wiki_text(en), so the effect there is almost nil. It is also interesting that the mixed c4_en_ja scores higher than the Japanese-only c4_ja_only.
- Since this test suggested that computing the iMatrix on Japanese and English text from c4 gives good results, I dug a little deeper.
- In the example above, 200KB of c4 Japanese text was appended after 200KB of c4 English text, but the way the two languages are mixed might make a difference.
- I created one variant with the Japanese text first and the English text second, and another with English chunks and Japanese chunks mixed at random, then measured the scores after quantization again.
- As a result, the pattern used in the first test, "English text (first) + Japanese text (second)", scored highest.
- Both "random mix of English and Japanese chunks" and "Japanese text (first) + English text (second)" scored lower than Japanese only (does running the second half of the iMatrix computation on Japanese text alone have some effect?).
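The three mixing patterns compared above could be assembled along the following lines. This is a hypothetical helper, not the script actually used for the experiment: the function names, the chunk size and the fixed random seed are all assumptions made for illustration.

```python
import random

def chunk(text, size):
    """Split text into consecutive pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def build_calibration_text(en, ja, mode, chunk_size=1000, seed=0):
    """Combine English and Japanese text for iMatrix calibration."""
    if mode == "en_then_ja":
        return en + "\n" + ja
    if mode == "ja_then_en":
        return ja + "\n" + en
    if mode == "random_mix":
        chunks = chunk(en, chunk_size) + chunk(ja, chunk_size)
        random.Random(seed).shuffle(chunks)  # seeded for reproducibility
        return "\n".join(chunks)
    raise ValueError(f"unknown mode: {mode}")

# Tiny stand-ins for the 200KB English and 200KB Japanese extracts.
en_sample = "abcdef"
ja_sample = "あいうえお"
print(build_calibration_text(en_sample, ja_sample, "en_then_ja"))
print(build_calibration_text(en_sample, ja_sample, "random_mix", chunk_size=2))
```

Writing each variant to its own text file would then let the same quantization run be repeated with only the calibration file changed.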
Thoughts
- The post is getting long, so I will stop here for now. I hope to follow up by checking the effect on other models and other quantization sizes.
- For reference, I have uploaded to HuggingFace the mixed Japanese and English iMatrix text (c4_en_plus_ja) that showed the best effect in this test.
Mergoo: a library for efficiently merging and fine-tuning LLMs (MoE, Mixture of Adapters)
- A new library for merging LLMs called "Mergoo" has been released, and its developer has also put up a promotional post on Reddit.
- It might become an alternative to MergeKit. It supports things like training MoE routing and mixing LoRA adapters. It looks interesting and I would like to try it later, but first I want to get an overview. What follows is a translation of the developer's blog post.
Mergoo: efficiently merge, then fine-tune (MoE, Mixture of Adapters)
We recently developed mergoo, a library for easily merging multiple LLM experts and efficiently training the merged LLM. With this library you can efficiently integrate the knowledge of different generic or domain-based LLM experts.
🚀 With Mergoo you can:
- Easily merge multiple open-source LLMs
- Apply a variety of merging methods: mixture of experts, mixture of adapters, and layer-wise merging
- Train an MoE efficiently without starting from scratch
- Stay compatible with Hugging Face 🤗 models and trainers
Introduction
mergoo is designed to build a reliable and transparent pipeline for merging the knowledge of various LLM experts, from generic to domain-specific. It incorporates a range of merging techniques, such as mixture of experts, mixture of adapters, and layer-wise merging, giving LLM builders flexibility. The merged LLM can be further fine-tuned for a specific use case with Hugging Face 🤗 trainers such as SFTTrainer, PEFT, and Trainer.
The following sections walk through two examples: how to build a merged LLM from fully fine-tuned LLMs using MoE, and how to create a mixture-of-adapters LLM from experts fine-tuned with LoRA.
Mixture of fully fine-tuned LLMs
Following Branch-Train-Mix, domain-specific LLM experts can be merged by bringing their feed-forward parameters together as the experts of a Mixture-of-Experts (MoE) layer and averaging the remaining parameters. The MoE layers can then be fine-tuned to learn token-level routing.
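The merging rule in the paragraph above can be illustrated with a small sketch. This is not mergoo's code: plain floats stand in for weight tensors, the parameter names and the `.expert_N` naming scheme are hypothetical, and the set of feed-forward markers is an assumption for this toy example.

```python
def merge_btx(expert_state_dicts, ffn_markers=("gate_proj", "up_proj", "down_proj")):
    """Keep feed-forward weights separately per expert (they become the MoE
    experts) and average every other parameter across the models."""
    merged = {}
    n = len(expert_state_dicts)
    for name in expert_state_dicts[0]:
        if any(marker in name for marker in ffn_markers):
            for idx, state_dict in enumerate(expert_state_dicts):
                merged[f"{name}.expert_{idx}"] = state_dict[name]
        else:
            merged[name] = sum(sd[name] for sd in expert_state_dicts) / n
    return merged

# Two toy "models" with one attention weight and one feed-forward weight each.
experts = [
    {"attn.q_proj": 1.0, "mlp.up_proj": 10.0},
    {"attn.q_proj": 3.0, "mlp.up_proj": 20.0},
]
print(merge_btx(experts))
```

The attention weight is averaged to a single value, while both feed-forward weights are preserved side by side; a router trained afterwards decides, per token, which of them to use.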
As an example, we merge the following domain-specific LLM experts:
- Base model: meta-llama/Llama-2-7b-hf
- Code expert: codellama/CodeLlama-7b-Python-hf
- WikiChat expert: stanford-oval/Llama-2-7b-WikiChat-fused
Specify the configuration for the merge:
config = {
    "model_type": "llama",
    "num_experts_per_tok": 2,
    "experts": [
        {
            "expert_name": "base_expert",
            "model_id": "meta-llama/Llama-2-7b-hf"
        },
        {
            "expert_name": "expert_1",
            "model_id": "codellama/CodeLlama-7b-Python-hf"
        },
        {
            "expert_name": "expert_2",
            "model_id": "stanford-oval/Llama-2-7b-WikiChat-fused"
        }
    ],
    "router_layers": [
        "gate_proj",
        "up_proj",
        "down_proj"
    ],
}
Next, build and save the checkpoint of the merged experts.
import torch
from mergoo.compose_experts import ComposeExperts

model_id = "mergoo_llama_code_wikichat"
expertmerger = ComposeExperts(config, torch_dtype=torch.float16)
expertmerger.compose()
expertmerger.save_checkpoint(model_id)
Below, the checkpoint of the merged LLM is loaded and further fine-tuned on a Python code instruction dataset.
from mergoo.models.modeling_llama import LlamaForCausalLM
import torch
import datasets
import random
from trl import SFTTrainer
from transformers import TrainingArguments

# load the composed checkpoint
model = LlamaForCausalLM.from_pretrained(
    "mergoo_llama_code_wikichat",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)  # 'gate' / router layers are untrained, hence a loading warning will appear for them

# load the train dataset
dataset = datasets.load_dataset("iamtarun/python_code_instructions_18k_alpaca")['train']
dataset = dataset['prompt']
random.shuffle(dataset)
train_dataset = datasets.Dataset.from_dict(dict(prompt=dataset[:-1000]))
eval_dataset = datasets.Dataset.from_dict(dict(prompt=dataset[-1000:]))

# specify training arguments
trainer_args = TrainingArguments(
    output_dir="checkpoints/llama_moe",
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    learning_rate=1e-5,
    save_total_limit=1,
    num_train_epochs=1,
    eval_steps=5000,
    logging_strategy="steps",
    logging_steps=25,
    gradient_accumulation_steps=4,
    bf16=True
)

trainer = SFTTrainer(
    model,
    args=trainer_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    dataset_text_field="prompt",
)

# start training
trainer.train()
ãã®å¾ãã³ã¼ãã Huggingface Hub ã«ããã·ã¥ã§ãã¾ã (ãã²ããã·ã¥ãã¦ãã ãã):
model.push_to_hub("mergoo_llama_code_wikichat_trained")
mergoo also supports Mistral-based and BERT-based experts.
Mixture of Adapters
mergoo makes it easy to combine multiple adapters (LoRA) into a single MoE-style architecture. This is done by applying gating and routing layers on top of the fine-tuned LoRAs.
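To make the idea concrete, here is a minimal pure-Python sketch of one linear layer in which a router selects the top-k LoRA adapters for an input and adds their weighted low-rank updates to the base output. It is an illustration under stated assumptions, not mergoo's actual implementation: the names `moe_lora_forward`, `matvec`, and `softmax`, the top-k selection, and the softmax renormalization over the selected adapters are all choices made for this example.

```python
import math


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def moe_lora_forward(x, base_weight, adapters, gate_weight, top_k=2):
    """Base linear layer plus a routed, weighted sum of LoRA updates.

    adapters: list of (A, B) pairs; each LoRA update is B @ (A @ x).
    gate_weight: one row of router weights per adapter (hypothetical layout).
    """
    logits = matvec(gate_weight, x)
    # Keep the top_k adapters with the highest router logits.
    chosen = sorted(range(len(adapters)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Renormalize the router scores over the chosen adapters only.
    weights = softmax([logits[i] for i in chosen])
    out = matvec(base_weight, x)
    for weight, index in zip(weights, chosen):
        lora_a, lora_b = adapters[index]
        delta = matvec(lora_b, matvec(lora_a, x))
        out = [o + weight * d for o, d in zip(out, delta)]
    return out, dict(zip(chosen, weights))


# Toy usage: a 2-dimensional layer with three rank-1 adapters.
x = [1.0, 0.0]
base_weight = [[1.0, 0.0], [0.0, 1.0]]
adapters = [
    ([[1.0, 0.0]], [[1.0], [0.0]]),
    ([[1.0, 0.0]], [[0.0], [1.0]]),
    ([[1.0, 0.0]], [[2.0], [2.0]]),
]
gate_weight = [[5.0, 0.0], [5.0, 0.0], [-5.0, 0.0]]
output, routing = moe_lora_forward(x, base_weight, adapters, gate_weight, top_k=2)
print(output, routing)
```

In this toy setup the router ignores the third adapter, so only the first two contribute to the output. In a real merged model the router weights start untrained, which is why the downstream fine-tuning step matters.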
To build a mixture-of-adapters LLM:
- Collect a pool of adapters (LoRA) fine-tuned on the same base model
- Apply mergoo to create an MoE-style merged expert
- Fine-tune the merged expert on the downstream task
For example, in the customer support domain, the following experts can be merged:
- predibase/customer_support
- predibase/customer_support_accounts
- predibase/customer_support_orders
Specify the configuration as follows and build the merged checkpoint.
import torch
from mergoo.compose_experts import ComposeExperts

model_id = "mergoo_customer_suppoer_moe_lora"
config = {
    "model_type": "mistral",
    "num_experts_per_tok": 2,
    "base_model": "mistralai/Mistral-7B-v0.1",
    "experts": [
        {"expert_name": "adapter_1", "model_id": "predibase/customer_support"},
        {"expert_name": "adapter_2", "model_id": "predibase/customer_support_accounts"},
        {"expert_name": "adapter_3", "model_id": "predibase/customer_support_orders"},
    ],
}
expertmerger = ComposeExperts(config, torch_dtype=torch.bfloat16)
expertmerger.compose()
expertmerger.save_checkpoint(model_id)
Note: when the experts to be merged were fine-tuned with LoRA, each expert_name must start with adapter_.
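Since the naming rule is easy to overlook, a small pre-flight check can catch mistakes before the merge is started. The helper below is a hypothetical sketch and not part of mergoo; `check_adapter_names` is a name made up for this example, and it only inspects the `experts` list of a config dictionary shaped like the one above.

```python
def check_adapter_names(config):
    """Return the expert names that do not start with 'adapter_'.

    Hypothetical helper for illustration; it is not a mergoo API.
    """
    return [
        expert["expert_name"]
        for expert in config.get("experts", [])
        if not expert["expert_name"].startswith("adapter_")
    ]


# Same structure as the LoRA config above.
lora_config = {
    "model_type": "mistral",
    "num_experts_per_tok": 2,
    "base_model": "mistralai/Mistral-7B-v0.1",
    "experts": [
        {"expert_name": "adapter_1", "model_id": "predibase/customer_support"},
        {"expert_name": "adapter_2", "model_id": "predibase/customer_support_accounts"},
        {"expert_name": "adapter_3", "model_id": "predibase/customer_support_orders"},
    ],
}

invalid_names = check_adapter_names(lora_config)
if invalid_names:
    raise ValueError(f"LoRA expert names must start with 'adapter_': {invalid_names}")
```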
As described in the "Mixture of fully fine-tuned LLMs" section, the merged expert can then be fine-tuned further.
Conclusion
As shown above, mergoo lets you integrate the knowledge of multiple experts reliably and transparently. It supports several integration techniques, including mixture-of-experts, mixture-of-adapters (MoE-LoRA), and layer-wise merging. The merged LLM can be fine-tuned further on a downstream task to provide a reliable expert.
Effect of Chat Vector vs. effect of MoE merging
- This is a continuation of the previous post. Next, I look concretely at how much Chat Vector processing improves Japanese chat ability.
- In the article above, I confirmed that a certain performance gain can be observed just by MoE-merging two models, even without using a Chat Vector.
- So how does adding a Chat Vector affect a model's chat performance?
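As a reminder of what "adding a Chat Vector" means, the operation is usually described as weight arithmetic: take the difference between an instruction-tuned model and the base model it was tuned from, then add that difference, optionally scaled, to the target model. The sketch below shows only that arithmetic on toy data. The function name `apply_chat_vector`, the `ratio` parameter, and the representation of each model as a dictionary of flat float lists are assumptions for illustration; a real script would operate on tensors from the models' state dicts and may skip some layers.

```python
def apply_chat_vector(target, base, instruct, ratio=1.0):
    """Return target + ratio * (instruct - base), parameter by parameter.

    Each argument maps a parameter name to a flat list of floats
    (a stand-in for real weight tensors in this sketch).
    """
    merged = {}
    for name, target_values in target.items():
        if name not in base or name not in instruct:
            # Parameters missing from either donor model are copied unchanged.
            merged[name] = list(target_values)
            continue
        merged[name] = [
            t + ratio * (i - b)
            for t, b, i in zip(target_values, base[name], instruct[name])
        ]
    return merged


# Toy weights: one shared parameter and one that only the target model has.
target_model = {"layer.weight": [1.0, 2.0], "extra.weight": [9.0]}
base_model = {"layer.weight": [0.0, 1.0]}
instruct_model = {"layer.weight": [0.5, 3.0]}

merged_model = apply_chat_vector(target_model, base_model, instruct_model, ratio=0.8)
print(merged_model)
```

A ratio below 1.0 applies the chat difference more weakly, and 1.0 applies it in full.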
Effect of Chat Vector on a single 7B model
- First, I measure the performance of a model obtained while reproducing "LightChatAssistant 2x7B": ChatNTQ 7B with the Chat Vector of Mistral-7B-v0.2-Instruct applied.
Name | LightChat Assistant 2x7B | ChatNTQ 7B random MoE | ChatNTQ 7B + ChatVector | ChatNTQ 7B |
Parameters | 2x7B | 2x7B | 7B | 7B |
Quantization type | Q8_0 | Q8_0 | Q8_0 | Q8_0 |
ElyzaTasks100 average score | 3.31 | 3.25 | 3.23 | 3.06 |
Standard error | 0.14 | 0.14 | 0.15 | 0.15 |
Average character count | 273.61 | 261.27 | 280.04 | 240.46 |
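- Chat Vector processing alone raised the average score to 3.23, up from 3.06 for the original ChatNTQ 7B.
- The "x2 random merge of ChatNTQ 7B" I made earlier scored 3.25, so this Chat Vector appears to give a performance gain close to that of the random merge.
- The average answer length also increased in both cases, but more markedly for the Chat Vector version.
- Subjectively, the Chat Vector version's answers are generally polite, and the influence of Mistral 7B v0.2 Instruct can be felt.
- Let's look at actual answer examples.
- As shown below, an arithmetic problem that the original "ChatNTQ 7B" could not solve is now answered correctly by both the random-merge version and the Chat Vector version.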
ElyzaTasks100 Question 26 |
Question: Taro has 5 apples. He then bought 2 boxes of apples. Each box contains 3 apples. How many apples does Taro have? |
ChatNTQ-7B answer (incorrect ×) |
Taro initially had 5 apples. He then bought 2 boxes, each containing 3 apples, so Taro has 8 apples in total. Therefore, adding the 5 apples he had at first and the 3 apples in the 2 boxes he bought later, he has 8 apples in total. |
ChatNTQ 2x7B Random MoE answer (correct ○) |
We need to calculate the total of the 5 apples Taro had at first and the apples in the 2 boxes. The number of apples Taro had at first is 5. One box contains 3 apples, and assuming the other box also contains the same 3 apples, the 2 boxes contain 6 apples in total. Therefore, adding the 5 apples Taro had at first and the 6 apples in the 2 boxes, he has 11 apples. Therefore, Taro has 11 apples. |
ChatNTQ-7B + Chat Vector answer (correct ○) |
Taro has 5 apples at first. He then buys 2 boxes, each containing 3 apples, so 2 × 3 = 6 apples are added in total. Together with the first 5, Taro has 11 apples. |
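- MoE merging and Chat Vector are completely different operations, but my impression is that either method can be expected to improve overall performance.
- With random MoE merging, the increase in total parameter count probably contributes to performance, while with Chat Vector I think the skills the English model gained through additional training are what contribute.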
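The reported standard errors give a rough way to judge how large the score gaps in the table above are. The sketch below computes a mean and standard error from a list of per-task scores, then expresses the gap between two reported means in units of their combined standard error. It assumes the two score samples are independent, which is a simplification: both models were graded on the same 100 tasks, so a paired comparison on per-task scores (not given in this post) would be more appropriate. The function names are made up for this example.

```python
import math
import statistics


def mean_and_standard_error(scores):
    """Return the sample mean and its standard error (sample stdev / sqrt(n))."""
    mean = statistics.fmean(scores)
    sem = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, sem


def difference_in_standard_errors(mean_a, sem_a, mean_b, sem_b):
    """Gap between two means in units of the combined standard error.

    Assumes the two samples are independent (a simplification here).
    """
    return (mean_a - mean_b) / math.sqrt(sem_a ** 2 + sem_b ** 2)


# Values taken from the table above: ChatNTQ 7B + ChatVector vs. ChatNTQ 7B.
gap = difference_in_standard_errors(3.23, 0.15, 3.06, 0.15)
print(f"{gap:.2f}")
```

For these two rows the gap comes out at roughly 0.8 combined standard errors, so on this rough reading the averages are best treated as indicative rather than conclusive.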
Random MoE merge of the Chat Vector model
- Next, what happens if the "ChatNTQ 7B" with the Chat Vector applied is simply random-MoE-merged? The results are as follows.
Name | LightChat Assistant 2x7B | ChatNTQ 7B + ChatVector Random MoE | ChatNTQ 7B + ChatVector | ChatNTQ 7B |
Parameters | 2x7B | 2x7B | 7B | 7B |
Quantization type | Q8_0 | Q8_0 | Q8_0 | Q8_0 |
ElyzaTasks100 average score | 3.31 | 3.33 | 3.23 | 3.06 |
Standard error | 0.14 | 0.15 | 0.15 | 0.15 |
Average character count | 273.61 | 286.73 | 280.04 | 240.46 |
- In benchmark performance, the resulting model is at almost the same level as "LightChatAssistant 2x7B".
- So the high Japanese chat performance of "LightChatAssistant 2x7B" can apparently be explained by just two things: the Chat Vector effect plus the parameter increase from merging.
- That said, when I compare actual outputs, I find the output of "LightChatAssistant 2x7B", which is crossed with a novel-generation model, clearly more creative and interesting than that of the same-model merge "ChatNTQ-ChatVector 2x7B".
- The benefit of crossing differently specialized models may show up precisely in the parts that benchmarks cannot fully measure.
Extracting a Chat Vector from a different model
- While I'm at it, I also try extracting a Chat Vector from a different model.
- "Mistral 7B v0.2 Instruct", which the original paper and "LightChatAssistant 2x7B" used for extracting the difference, is Mistral's official fine-tuned model and, subjectively too, an excellent chat model.
- However, a number of Mistral-based models have been produced that beat "Mistral 7B v0.2 Instruct" on benchmarks. A representative example is "Starling-LM-7B-beta", released by researchers affiliated with UC Berkeley.
- I went over the Starling-7B series in the following article.
- "Starling-LM-7B-beta" currently has the highest score among 7B models on the "LMSYS Chatbot Arena Leaderboard", making it the SOTA of its class for English chat performance.
- It is a little surprising that a 7B open-weight model can earn human-evaluation ratings on par with commercial models from the major vendors, such as GPT-3.5-Turbo, Claude 2.1, and Gemini Pro.
- I have a hunch that extracting a Chat Vector from this model and applying it would produce a pretty good Japanese chat model (an amateur's idea).
- So I built a model by applying the Chat Vector of "Starling-LM-7B-beta" to "ChatNTQ 7B" and ran it through the benchmark.
- Note that the Chat Vector of "Starling-LM-7B-beta" did not cause English to leak into the output even when applied at 1.0, so I used a strength of 1.0 rather than 0.8.
- The results are as follows.
Name | Command R V01 | LightChat Assistant 2x7B | ChatNTQ + Starling ChatV | ChatNTQ 7B |
Parameters | 35B | 2x7B | 7B | 7B |
Quantization type | Q8_0 | Q8_0 | Q8_0 | Q8_0 |
ElyzaTasks100 average score | 3.42 | 3.31 | 3.42 | 3.06 |
Standard error | 0.15 | 0.14 | 0.15 | 0.15 |
Average character count | 194.02 | 273.61 | 340.82 | 240.46 |
- In terms of score, the resulting model beats "LightChatAssistant 2x7B" and matches "Command R 35B" under 8-bit quantization.
- Its long outputs (341 characters on average) make for polite, thorough chat responses. ElyzaTasks100 includes the "usefulness" of the answer in its evaluation criteria, so GPT-4, the evaluator, seems to rate it highly from that angle.
- That said, "Command R 35B" achieves the same score with far more concise answers (194 characters on average), so in terms of pure "smartness" I think "Command R 35B" is clearly superior.
MoE merge of the Starling LM Chat Vector model
- So, would MoE-merging this model push the score even higher? I took that naive idea and tried it. The results were as follows.
Name | LightChat Assistant 2x7B | ChatNTQ + Starling ChatV Random MoE | ChatNTQ + Starling ChatV | ChatNTQ 7B |
Parameters | 2x7B | 2x7B | 7B | 7B |
Quantization type | Q8_0 | Q8_0 | Q8_0 | Q8_0 |
ElyzaTasks100 average score | 3.31 | 3.18 | 3.42 | 3.06 |
Standard error | 0.14 | 0.15 | 0.15 | 0.15 |
Average character count | 273.61 | 351.42 | 340.82 | 240.46 |
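- The output length increased slightly, but GPT-4's rating went down instead.
- Subjectively too, the model ended up giving a disjointed impression, as if its thinking cannot keep up with how fast it talks.
- It has also been reported that the optimal Chat Vector ratio for a single model differs from the ratio that is optimal after MoE merging (see Aratako's model card), and this confirmed for me again that the effects do not necessarily add up.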
Miscellaneous thoughts
- So, for the time being, I have followed in the footsteps of "LightChatAssistant 2x7B" and done some trial and error with MoE and Chat Vector.
- The "Starling ChatV 7B" model has reasonably high benchmark performance and may be useful as a reference, so I uploaded it to Hugging Face.
- There is also talk that the release of small Llama-3 models is imminent. While looking forward to that, I would like to keep exploring what can be done with Chat Vector (CV) and MoE.