I recently had occasion to look into how RAG applications are evaluated and which tools help manage that evaluation. This post is a set of notes from trying out Phoenix, which can be used for experiment management in RAG applications.
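RAG applications and evaluation
Retrieval-Augmented Generation (RAG) is a type of application that supplies an LLM with additional information from external knowledge sources. This fills in knowledge the LLM itself lacks and produces answers that are more accurate and better grounded in context. Roughly, it works as follows.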
- Search for documents relevant to the user's query (Retrieve)
- Dynamically build a prompt that contains the user's query and the relevant documents (Augment)
- Have the LLM generate the answer (Generate)
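The three steps above can be sketched in a few lines of Python. This is only an illustration: the word-overlap retriever stands in for a real vector search, generate() is a placeholder rather than a real LLM call, and every name here (Document, retrieve, augment, generate, answer) is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    text: str


def retrieve(query: str, corpus: list[Document], k: int = 2) -> list[Document]:
    """Retrieve: rank documents by how many words they share with the query."""
    query_words = set(query.lower().split())

    def overlap(doc: Document) -> int:
        return len(query_words & set(doc.text.lower().split()))

    # sorted() is stable, so ties keep their original corpus order.
    return sorted(corpus, key=overlap, reverse=True)[:k]


def augment(query: str, docs: list[Document]) -> str:
    """Augment: build a prompt that contains the query and the retrieved text."""
    context = "\n".join(f"- {d.text}" for d in docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


def generate(prompt: str) -> str:
    """Generate: placeholder for an LLM call (assumption, no model is invoked)."""
    return f"[LLM answer based on a prompt of {len(prompt)} characters]"


def answer(query: str, corpus: list[Document], k: int = 2) -> tuple[str, list[Document]]:
    """Run the three steps and return the answer plus the documents used."""
    docs = retrieve(query, corpus, k)
    return generate(augment(query, docs)), docs
```

Returning the retrieved documents alongside the answer keeps the Retrieve step inspectable separately from the Generate step.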
RAG applications are assembled this way, but evaluating them and tracking their accuracy turns out to be one of the tricky parts.
Evaluating RAG applications
As shown above, RAG produces its final answer from a question by going through Retrieval and then Generation. The quality of a RAG application is therefore the combined result of these two stages.
To improve a RAG application, you need to determine which component's accuracy is the bottleneck. That means things must be managed so that, when a wrong answer is returned for a question, you can tell whether Retrieval or Generation was the cause.
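As a rough illustration of separating the two causes, the sketch below labels each evaluated question by where the failure appears to come from. It assumes you already have two judgements per question (whether the retrieved context contained the needed information, and whether the final answer was correct). The EvalRecord type and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EvalRecord:
    question: str
    retrieved_relevant: bool  # did the retrieved context contain what was needed?
    answer_correct: bool      # was the final answer judged correct?


def attribute_failure(record: EvalRecord) -> str:
    """Return "ok", "retrieval", or "generation" for one evaluated question."""
    if record.answer_correct:
        return "ok"
    if not record.retrieved_relevant:
        # The needed context never reached the LLM, so blame Retrieval first.
        return "retrieval"
    # The context was there but the answer was still wrong.
    return "generation"


def summarize(records: list[EvalRecord]) -> dict[str, int]:
    """Count how many questions fall into each bucket."""
    counts = {"ok": 0, "retrieval": 0, "generation": 0}
    for record in records:
        counts[attribute_failure(record)] += 1
    return counts
```

Note that a correct answer with irrelevant context is counted as "ok" here; a stricter setup might want to flag that case separately.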
RAGAS is currently a widely known library for evaluating RAG applications.
RAGAS defines evaluation metrics separately for Retrieval and for Generation.
- Generation
  - faithfulness: how factually consistent the answer is with the given context
  - answer relevancy: how relevant the generated answer is to the given prompt
- Retrieval
  - context precision
  - context recall
There are certainly things to consider beyond the metrics RAGAS defines, but to keep improving a RAG application over time it seems best to evaluate Retrieval and Generation separately.
Arize Phoenix
Phoenix is an open-source library designed for experimentation, evaluation, and troubleshooting. The official documentation describes it as follows:
"Phoenix is an open-source observability library designed for experimentation, evaluation, and troubleshooting. It allows AI Engineers and Data Scientists to quickly visualize their data, evaluate performance, track down issues, and export data to improve." (Arize Phoenix | Phoenix)
The official page is linked below.
For RAG, which is the focus of this post, the quickest way to get a feel for Phoenix is to try the demo site linked below.
Similar tools
Among LLMOps tools of this kind, the following can do roughly the same things as Arize Phoenix.
- SaaS
  - LangSmith
  - HoneyHive
  - PromptLayer
- OSS
  - Langfuse
  - TruLens
After a quick comparison with the other tools, these seem to be the strengths of Arize Phoenix.
- It is OSS and can be used locally
- Its monitoring and visualization features for retrieval are richer than those of the other tools (or so it seems to me)
Right now LangSmith feels like the most mainstream option, and Langfuse has also been getting attention recently. Phoenix may not be the first tool that comes to mind when choosing among these, but it could be a candidate if you mainly develop locally or if monitoring of retrieval is a pain point.
Trying it out
Tutorial
To start, I worked through the official tutorial to get a feel for the tool.
Following the tutorial, I asked:
"What did the author do growing up?"
and got this answer:
"The author focused on writing short stories and programming, particularly experimenting with early programming languages like Fortran on an IBM 1401 computer during their time in 9th grade."
The Phoenix screen at that point looks like the following.
What happened between Retrieve and Generation can be checked as a flow, as shown below.
The look and feel here is much the same as LangSmith and similar tools.
As for the tracking itself, running the code below starts a local tracking server. From then on, it appears to automatically capture calls to OpenAI and calls made through LangChain or LlamaIndex.
    import phoenix as px

    # Start the local Phoenix server that collects traces.
    session = px.launch_app()
Metrics computed afterwards, such as nDCG and precision, appear to be sent to the server later and then reflected in the dashboard.
    from phoenix.trace import DocumentEvaluations, SpanEvaluations

    # ndcg_at_2, precision_at_2, and retrieved_documents_relevance_df are
    # DataFrames computed earlier in the tutorial.
    px.Client().log_evaluations(
        SpanEvaluations(dataframe=ndcg_at_2, eval_name="ndcg@2"),
        SpanEvaluations(dataframe=precision_at_2, eval_name="precision@2"),
        DocumentEvaluations(dataframe=retrieved_documents_relevance_df, eval_name="relevance"),
    )
They show up in the dashboard like this.
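For reference, here is a minimal plain-Python sketch of what precision@k and nDCG@k measure for one query. It is not the tutorial's code, and the function names are hypothetical. It assumes `relevance` lists one relevance score per retrieved document in ranked order (1 for relevant, 0 for not), and that the ideal ordering is taken from the retrieved documents only.

```python
import math


def precision_at_k(relevance: list[int], k: int) -> float:
    """Fraction of the top-k retrieved documents that are relevant (k > 0)."""
    return sum(1 for rel in relevance[:k] if rel > 0) / k


def dcg_at_k(relevance: list[int], k: int) -> float:
    """Discounted cumulative gain: relevant hits count less at lower ranks."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevance[:k], start=1)
    )


def ndcg_at_k(relevance: list[int], k: int) -> float:
    """DCG divided by the DCG of the best possible ordering of the same list."""
    ideal = dcg_at_k(sorted(relevance, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant document was retrieved at all
    return dcg_at_k(relevance, k) / ideal
```

For example, with relevance [0, 1] and k = 2, precision@2 is 0.5, while nDCG@2 is below 1 because the only relevant document sits in second place.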
In short
- Basically, once launch_app has been called, events are tracked for you
- Metrics computed afterwards, such as nDCG and precision, are appended with log_evaluations
That seems to be the intended way to use it.
Managing it locally
The tutorial ran on Colab, so next I would like to be able to persist the tracking results. It looks like a Phoenix server can be set up in a local environment using this.
Note: for some reason Phoenix did not run on my M3 Mac, so I plan to try again on an Intel environment.
References
Impressions
I found that Phoenix is quite easy to use if all you need is to run it on Colab. So far I have only tried it out, so next time I would like to use it in a somewhat different setup.