This article is the Day 12 entry of the LayerX Tech Advent Calendar 2024.
Hello, I'm Nohata (@isseinohata), a product manager in the AI/LLM division at LayerX.
In the AI/LLM division, we are developing the generative AI platform "Ai Workforce".
Building applications with LLMs comes with its own characteristics and challenges, and the Ai Workforce team faces a variety of them every day. In this article I will introduce "evaluation-driven development", an approach for addressing the development-process problems that stem from one of those challenges in particular: the uncertainty of LLM output.
Before getting to evaluation-driven development, let me briefly summarize the characteristics and challenges of building LLMs into an application.
Characteristics of application development with LLMs
1. Process design and richer domain knowledge to improve accuracy
Whether you are using AI to automate relatively complex processing in a business application, or building a chat-style general-purpose work assistant, a RAG system, or an AI agent, the design of the LLM's processing flow and the depth of domain knowledge you supply largely determine system performance.
Especially when complex business tasks must be solved reliably, you can reach production-level accuracy by splitting the task into problems of a granularity the LLM can handle well and by supplying all the domain knowledge and context it needs.
Reference: "Beyond PoC〜LLMを本番業務で適用するためにLayerXで取り組んでいること〜" (Beyond PoC: what LayerX is doing to apply LLMs to production business operations) - Speaker Deck
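As a rough illustration of the decomposition idea above, here is a minimal sketch that splits one document-processing job into small sub-tasks and attaches the same domain notes to every prompt. The sub-task names, the domain notes, and the `call_llm` callable are all hypothetical placeholders, not part of Ai Workforce; `fake_llm` merely stands in for a real model call so the sketch runs on its own.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical domain notes; in practice these would come from domain experts.
DOMAIN_NOTES: List[str] = [
    "Amounts are in JPY unless stated otherwise.",
    "The 'due date' is the payment deadline, not the issue date.",
]

# Each sub-task is (name, instruction), kept small so the model can solve it reliably.
SUBTASKS: List[Tuple[str, str]] = [
    ("vendor", "Extract the vendor name."),
    ("amount", "Extract the total amount."),
    ("due_date", "Extract the due date."),
]


def build_prompt(instruction: str, document: str) -> str:
    """Combine the domain notes, one sub-task instruction, and the document."""
    notes = "\n".join(f"- {note}" for note in DOMAIN_NOTES)
    return f"Domain notes:\n{notes}\n\nTask: {instruction}\n\nDocument:\n{document}"


def run_subtasks(document: str, call_llm: Callable[[str], str]) -> Dict[str, str]:
    """Run every sub-task as its own LLM call and collect the answers by name."""
    return {
        name: call_llm(build_prompt(instruction, document))
        for name, instruction in SUBTASKS
    }


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call: echoes back the task line of the prompt.
    task_line = next(line for line in prompt.splitlines() if line.startswith("Task: "))
    return f"(answer to: {task_line[len('Task: '):]})"


results = run_subtasks("ACME Corp invoice, total 1,200 JPY, due 2025-01-31", fake_llm)
for name, answer in results.items():
    print(name, "->", answer)
```

Keeping each sub-task behind its own prompt also makes it possible to evaluate the steps individually later on.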
On the other hand, for uses that call for more generality and interactivity, such as AI agents and RAG systems, what matters is not breaking down individual tasks but defining more abstract mechanisms, closer to basic human cognitive and reasoning abilities, and then building and optimizing the process as a combination of those mechanisms.
In either case, I think the central elements of putting an LLM into a system at a practical level are a process design that fits the use case and the way domain information is brought in (tuned in advance, or absorbed through the interactive experience).
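2. Balancing cost, time, and accuracy, and using UX to dissolve the trade-off itself
A more complex processing flow increases compute cost and processing time, so it is in a trade-off relationship with improved system performance. In LLM applications in particular, a fundamental overhaul of the processing flow or of individual steps can sometimes greatly increase the value delivered to users. It is therefore important not to get stuck in local optimization, and to explore options that could expand the outcome itself.
How users perceive processing time also changes considerably depending on how intermediate output is shown and how the interaction is designed. For example, clearly showing the progress of the process, or presenting partial results as they become available, can improve how fast the system feels to the user while they wait for the final output.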
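The progressive-display idea above can be sketched with a generator that emits an event after every step instead of returning only once at the end. This is a minimal illustration under assumptions: the three steps and the `ProgressEvent` type are hypothetical, and each lambda stands in for what would be an LLM call in a real pipeline.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class ProgressEvent:
    step: int      # 1-based index of the step that just finished
    total: int     # total number of steps in the pipeline
    label: str     # human-readable name of the step
    partial: str   # partial result available so far


def run_pipeline(document: str) -> Iterator[ProgressEvent]:
    """Yield an event after each step so the UI can show progress and partial results."""
    # Hypothetical steps; each lambda stands in for an LLM call.
    steps = [
        ("extract", lambda text: text.strip()),
        ("summarize", lambda text: text[:20]),
        ("format", lambda text: f"Summary: {text}"),
    ]
    result = document
    for index, (label, step_fn) in enumerate(steps, start=1):
        result = step_fn(result)
        yield ProgressEvent(index, len(steps), label, result)


events = list(run_pipeline("  Invoice total is 1,200 JPY, due on Jan 31.  "))
for event in events:
    print(f"[{event.step}/{event.total}] {event.label}: {event.partial}")
```

A UI would consume the generator lazily and render each event as it arrives; collecting the events into a list here is only for demonstration.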
So rather than simply optimizing within the trade-off between cost, time, and accuracy, we work on expanding the value we deliver while keeping the overall balance, through UX design and fundamental improvements to the architecture.
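Challenges of product development with LLMs
A commonly cited challenge specific to LLM-based product development is that LLM output is probabilistic and uncertain. When you follow the points above and build a complex architecture that combines many sub-tasks and abstract processing mechanisms, those probabilistic elements interact, and the uncertainty of the system as a whole grows even further.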
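A back-of-the-envelope calculation shows how quickly this compounds. The sketch below assumes, purely for illustration, that every step succeeds independently with the same probability; real pipelines are messier (errors can be correlated or caught downstream), and the 95% figure is made up.

```python
import math
from typing import Sequence


def chain_success_rate(step_success_rates: Sequence[float]) -> float:
    """Probability that every step succeeds, assuming the steps are independent."""
    return math.prod(step_success_rates)


# Illustrative numbers only: each step is right 95% of the time.
five_steps = chain_success_rate([0.95] * 5)
ten_steps = chain_success_rate([0.95] * 10)
print(f"5 steps: {five_steps:.3f}, 10 steps: {ten_steps:.3f}")
```

Under these assumptions a five-step chain is fully correct only about 77% of the time, and a ten-step chain falls just below 60%, even though each individual step looks reliable.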
As a result, it becomes impossible to fully grasp how any type of change, whether rearranging the process, introducing a new task or mechanism, or improving the prompt of an individual task, affects the performance of the system as a whole.
In developing Ai Workforce too, we kept making improvements in response to the issues we found every day, but we could not fully grasp what side effects those improvements had elsewhere. This was a major hindrance both to running the product improvement cycle aimed at expanding outcomes and optimizing trade-offs, and to quality assurance at production release.
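Introducing evaluation-driven development (Eval-driven development)
Evaluation-driven development (Eval-driven development) is attracting attention as a solution to these development-process problems caused by the uncertainty inherent in LLMs.
Evaluation-driven development is a method for building systems that use generative AI and LLMs in which the cycle of design, development, and improvement is run around the evaluation of the system's output.
The concept is basically the same as test-driven development (TDD) in conventional software development. But whereas TDD suits predictable systems, evaluation-driven development continuously evaluates the probabilistic behavior of LLMs and the quality of natural-language input and output, and builds that evaluation into the improvement cycle. In this way it addresses the LLM-specific product development challenges described above.
Concretely, you define evaluation metrics tied to the system's outcomes, along with datasets, and build a mechanism that evaluates output quality both for individual components and for the system as a whole. The metrics range from basic ones (answer accuracy and naturalness, faithfulness to the reference data, response speed, processing cost) to customized criteria specific to the use case. In the actual development process, you run evaluations against each implementation or improvement and keep iterating until you get the expected evaluation results.
This lets you grasp comprehensively and efficiently whether each change actually solves the problem it targets, whether something that used to work has stopped working, and whether cost or processing time has been affected to an unacceptable degree. That makes a more grounded improvement process possible.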
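To make the loop above concrete, here is a minimal sketch of an evaluation harness: a small dataset, two metrics (exact-match accuracy and average cost), and a regression check that compares a candidate change against the current baseline. Everything in it is a simplifying assumption for illustration: the `Example` and `Report` types, the toy "systems" that look answers up in a dictionary, the cost units, and the `max_cost_ratio` threshold. Real LLM output rarely allows exact-match scoring.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, List, Tuple


@dataclass
class Example:
    query: str
    expected: str


@dataclass
class Report:
    accuracy: float   # share of examples answered exactly as expected
    avg_cost: float   # mean cost per example (arbitrary units)


# A system under test maps a query to (answer, cost).
System = Callable[[str], Tuple[str, float]]


def evaluate(system: System, dataset: List[Example]) -> Report:
    """Run the system over the dataset and aggregate the two metrics."""
    outcomes = [(system(ex.query), ex.expected) for ex in dataset]
    accuracy = mean(1.0 if answer == expected else 0.0 for (answer, _), expected in outcomes)
    avg_cost = mean(cost for (_, cost), _ in outcomes)
    return Report(accuracy, avg_cost)


def regressions(baseline: Report, candidate: Report, max_cost_ratio: float = 1.5) -> List[str]:
    """List the ways the candidate is unacceptably worse than the baseline."""
    problems = []
    if candidate.accuracy < baseline.accuracy:
        problems.append("accuracy dropped")
    if candidate.avg_cost > baseline.avg_cost * max_cost_ratio:
        problems.append("cost increased beyond the allowed ratio")
    return problems


dataset = [
    Example("2 + 2", "4"),
    Example("capital of Japan", "Tokyo"),
    Example("3 * 3", "9"),
]


def baseline_system(query: str) -> Tuple[str, float]:
    # Toy stand-in for the current pipeline: cheap, but misses one case.
    answers = {"2 + 2": "4", "capital of Japan": "Tokyo"}
    return answers.get(query, "unknown"), 1.0


def candidate_system(query: str) -> Tuple[str, float]:
    # Toy stand-in for a proposed change: fixes the miss, but costs twice as much.
    answers = {"2 + 2": "4", "capital of Japan": "Tokyo", "3 * 3": "9"}
    return answers.get(query, "unknown"), 2.0


baseline_report = evaluate(baseline_system, dataset)
candidate_report = evaluate(candidate_system, dataset)
issues = regressions(baseline_report, candidate_report)
print(baseline_report, candidate_report, issues)
```

In this toy run the candidate fixes the failing example but doubles the cost, so the check flags the cost increase: the kind of side effect that is easy to miss without an evaluation in the loop.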
At OpenAI DevDay as well, evaluation-driven development was presented as an important framework for taking applications that contain non-deterministic technology such as LLMs from prototype to production.
The key here is to adopt evaluation-driven development. Good evaluations are the ones which are well correlated to the outcomes that you're trying to derive or the user metrics that you care about. They have really high end-to-end coverage in the case of RAG and they're scalable to compute. This is ...
Evaluation methods and process in evaluation-driven development
Evaluation in evaluation-driven development is generally carried out using a combination of the following three methods.
- Human evaluation:
  - Useful where automation is difficult, such as evaluating the final output across multiple tasks or judging nuance. Its drawback is that it scales poorly.
- AI evaluation:
  - The approach known as "LLM as a judge", in which an LLM itself evaluates the generated output. It scales well, but the LLM-based automatic evaluation must itself be evaluated and tuned, so ensuring the accuracy of the evaluation becomes important.
- Code-based evaluation:
  - An approach that evaluates AI output against concrete rules and criteria. It works when quality can be judged to a reasonable degree by rules, but its drawback is that it cannot be used in other cases, which is most of them.
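Two of the three methods above lend themselves to a short sketch (human evaluation has no code equivalent). The code-based check below uses regular expressions for a made-up rule, and the LLM-as-a-judge function takes the model call as a parameter so that it can be swapped out. The prompt wording, the 1-to-5 scale, and `fake_llm` are assumptions for illustration, not a recommended judge design.

```python
import re
from typing import Callable


def code_based_eval(output: str) -> bool:
    """Rule-based check: the output must contain an ISO date and an amount in JPY."""
    has_date = re.search(r"\b\d{4}-\d{2}-\d{2}\b", output) is not None
    has_amount = re.search(r"\b[\d,]+ JPY\b", output) is not None
    return has_date and has_amount


# Hypothetical judge prompt; real prompts need their own tuning and evaluation.
JUDGE_PROMPT = (
    "You are grading an answer for correctness and helpfulness.\n"
    "Reply with a single integer from 1 (poor) to 5 (excellent).\n"
    "Question: {question}\n"
    "Answer: {answer}"
)


def llm_judge_eval(question: str, answer: str, call_llm: Callable[[str], str]) -> int:
    """Ask a judge model for a score and parse the first digit from 1 to 5 in its reply."""
    reply = call_llm(JUDGE_PROMPT.format(question=question, answer=answer))
    match = re.search(r"[1-5]", reply)
    if match is None:
        raise ValueError(f"judge reply contains no score: {reply!r}")
    return int(match.group())


def fake_llm(prompt: str) -> str:
    # Stand-in for a real judge model so the sketch runs without network access.
    return "Score: 4"


good = "Payment of 1,200 JPY is due on 2025-01-31."
bad = "Please pay soon."
print(code_based_eval(good), code_based_eval(bad))
print(llm_judge_eval("When is the payment due?", good, fake_llm))
```

Because the judge is just a callable, its scores can be compared against a handful of human-labeled examples before it is trusted at scale.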
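We also started by building a small evaluation dataset and evaluating by hand. However, as the architecture grew more complex and the amount of evaluation data increased, manual evaluation became difficult fairly quickly, so we introduced RAGAS and Langfuse and are working on automating evaluation with AI and on monitoring.
Our evaluation process goes roughly as follows.
- Building the evaluation mechanism
  - Create a small representative dataset for each use case (roughly a dozen or so examples)
  - Introduce representative evaluation metrics, using tools such as RAGAS
- At release
  - Actually evaluate new changes and improvements that we are unsure whether to adopt
  - Looking at the evaluation results, decide whether to ship the feature and improve the evaluation itself
- After release
  - Evaluate the released feature itself while collecting user feedback and usage logs
  - To grow the evaluation platform itself, expand the evaluation dataset and improve the evaluation criteria based on usage data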
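The post-release step of growing the dataset from usage data could look something like the sketch below. It assumes a hypothetical log format with an optional thumbs-up/down flag, and it only queues negatively rated queries as new evaluation examples; the reference answer is left empty for a human reviewer to fill in. None of these names reflect how RAGAS, Langfuse, or Ai Workforce actually store data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UsageLog:
    query: str
    output: str
    thumbs_up: Optional[bool]   # None when the user gave no feedback


@dataclass
class EvalExample:
    query: str
    expected: Optional[str]     # None until a reviewer writes the reference answer
    source: str                 # where the example came from


def extend_dataset(dataset: List[EvalExample], logs: List[UsageLog]) -> List[EvalExample]:
    """Add one new example per negatively rated query that is not already covered."""
    known = {example.query for example in dataset}
    added = []
    for log in logs:
        if log.thumbs_up is False and log.query not in known:
            added.append(EvalExample(log.query, None, "usage_log"))
            known.add(log.query)
    return dataset + added


dataset = [
    EvalExample("How do I reset my password?", "Use the reset link on the login page.", "manual"),
]
logs = [
    UsageLog("How do I reset my password?", "Contact support.", False),   # already covered
    UsageLog("How do I export invoices?", "I am not sure.", False),       # new failure
    UsageLog("What is the expense deadline?", "The 5th.", True),          # positive feedback
    UsageLog("How do I export invoices?", "No idea.", False),             # duplicate query
]

extended = extend_dataset(dataset, logs)
pending_review = [example.query for example in extended if example.expected is None]
print(pending_review)
```

Starting from failures keeps the dataset small while still steering it toward the cases users actually hit.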
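Early on in particular, the test cases seemed to cover almost nothing, and because of the difficulty of creating reference data (described later) it was not unusual for a single dataset to take several hours to define, so I worried whether the effort meant anything at all. Realistically, though, it is hard to prepare a large dataset from the start, and getting the evaluation process itself running matters more than evaluating precisely. So I think the important thing is simply to start, even with a small number of examples.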
Case studies of evaluation-driven development (Eval-driven development)
We are still at an early stage ourselves and have few cases of our own to share, so I will mainly introduce some publicly available examples.
Vercel v0
A blog post by the team behind v0, which can automatically generate UI designs (code) from natural language, describes the evaluation-driven development that Vercel practices. They apparently run evaluation tests with automated scripts and integrate them with GitHub pull requests: every pull request that affects output triggers an evaluation, and the results are fed back to the developers. I find it an interesting example of evaluation being truly built into the development process.
Anaconda Assistant
Anaconda has published a blog post about the development process of Anaconda Assistant (a tool that helps with writing, analyzing, and debugging code in data science and AI projects), for which it adopted evaluation-driven development. The post covers topics such as setting evaluation criteria that focus on the performance most critical to users, and automated tuning based on evaluation results (Agentic Feedback Iteration), and there is a lot to learn from it.
Dosu
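Dosu is an AI tool that supports software developers on GitHub repositories. By reducing non-coding tasks in particular (answering questions, triaging issues, and so on), it lets developers concentrate on their core work. The Dosu team also started out by analyzing logs manually to identify and fix problems. But unlike code changes, adjustments to LLM prompts and processing flows have uncertain effects that became too hard to track fully, so they adopted evaluation-driven development.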
Difficulties in evaluation design
What I found especially hard in working on evaluation-driven development was creating reference (ground-truth) data. With LLM systems, it is often difficult to define the correct answer in the first place. There are probably several reasons for this, but here are the ones I felt most strongly.
- The nature of the tasks handed to the LLM
  - When applying LLMs to real business work, it is hard to produce a production-quality final output right away, so in many cases the aim is to make the overall work more efficient by having the LLM produce intermediate artifacts.
  - In such cases the work process itself can differ from person to person, and the intermediate artifact has often never been clearly recognized as an output in the existing workflow. You have to start by defining what the correct answer is, which is one source of the difficulty.
- Differences in user context
  - Even when different users send a question or instruction with exactly the same wording, the same output can be evaluated differently if what each user already knows, or the context each user is in, differs.
  - To evaluate accurately, a set of queries and reference answers is not enough: you would need separate reference answers for each user context and each state of prior knowledge. Building evaluation datasets that comprehensively at this level of detail is not realistic.
- Interactive experience design
  - Especially in interactive cases such as AI agents and chat-style RAG systems, what matters is not each individual exchange but whether the user's problem was solved over the interaction as a whole.
  - Creating evaluation datasets that cover the many patterns of interactive exchanges is unrealistic. And from the user's point of view, what gets judged is the outcome of the overall experience rather than any individual output. I find it very hard to see how to bridge this gap.
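I don't yet see how to overcome these difficulties. But for the latter two points, for example, it feels like the possibilities would really open up if we could build an AI agent for evaluation, give it a hypothetical goal the user wants to achieve along with the user's context, and automatically evaluate a wide range of scenarios.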
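As a thought experiment, the evaluation-agent idea might be sketched like this: a scenario carries the simulated user's goal and context, a loop alternates between the assistant under test and the simulated user, and the scenario passes if the goal is reached within a turn limit. Everything here is hypothetical, including the `done_marker` substring check (a deliberate simplification of "goal achieved") and the scripted stand-ins for both the assistant and the simulated user, which would be LLM-backed in practice.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Scenario:
    goal: str          # what the simulated user wants to achieve
    context: str       # what the simulated user already knows
    opening: str       # the user's first message
    done_marker: str   # substring whose presence in a reply counts as success (simplification)


Assistant = Callable[[List[str]], str]              # conversation history -> assistant reply
UserSimulator = Callable[[Scenario, List[str]], str]  # scenario + history -> next user message


def simulate(scenario: Scenario, assistant: Assistant,
             next_user_message: UserSimulator, max_turns: int = 3) -> bool:
    """Return True if the assistant reaches the goal within max_turns replies."""
    history = [scenario.opening]
    for _ in range(max_turns):
        reply = assistant(history)
        history.append(reply)
        if scenario.done_marker in reply:
            return True
        history.append(next_user_message(scenario, history))
    return False


def toy_assistant(history: List[str]) -> str:
    # Scripted stand-in for the system under test: asks for clarification first.
    if any("expense" in message.lower() for message in history):
        return "Expense reports are due on the 5th of each month."
    return "Which deadline do you mean?"


def scripted_user(scenario: Scenario, history: List[str]) -> str:
    # Scripted stand-in for an LLM-driven simulated user.
    return f"I mean the expense report deadline. ({scenario.context})"


scenario = Scenario(
    goal="Find out the expense report deadline",
    context="I work in the sales department",
    opening="When is the deadline?",
    done_marker="the 5th",
)

print(simulate(scenario, toy_assistant, scripted_user))               # goal reached on the second reply
print(simulate(scenario, toy_assistant, scripted_user, max_turns=1))  # not reached in a single reply
```

Varying the goal and context per scenario is what would let one query text be judged differently for different users.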
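In closing:
There are many problems still to overcome in practicing the evaluation-driven development (Eval-driven development) introduced here, such as choosing the right unit of evaluation, scaling the evaluation methods, and building datasets. Still, I believe that building an evaluation platform early and running a development process centered on appropriate evaluation offers very high leverage over the long term in LLM application development, and I expect it to become the standard for AI product development going forward.
We are still feeling our way forward every day, so if you are interested in product development in the AI era, or are already working on it, I would be glad to talk with you via the link below!
LayerX is also hiring designers, engineers, and PdMs who are interested in AI product development. If you are interested, please take a look!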