Hello, I'm ML_Bear, an ML Engineer on Mercari's Generative AI team.
In a previous article [1] I wrote about improving item recommendations. This time I'll introduce a case in which we used large language models (LLMs) and related technologies to categorize more than 3 billion items.
Because ChatGPT's debut ignited the LLM boom, many people think of LLMs as something you use through conversation. However, the strong reasoning ability of LLMs also makes them very useful as a tool for solving a wide range of tasks. On the other hand, their slow processing speed and their cost can be barriers to using them in large-scale projects.
This article describes how we addressed these challenges with a variety of techniques, and how we drew as much as we could out of LLMs and related technologies to solve a category classification problem over large-scale item data.
Challenge
First, a brief explanation of the project's background and technical challenges.
In 2024 Mercari renewed its categories, revising the hierarchy and greatly increasing the number of item categories. When the number of categories and their hierarchy change, however, the data for the items tied to them has to be updated as well.
Normally, item categorization would use a machine learning model or a rule-based model. In this case, though, we did not know the "correct category in the new hierarchy" for past items, so we could not train a machine-learning classifier. The number of categories was also so large that building a rule-based model was impractical. That led to the idea of using an LLM for this problem.
Solution: A Two-Stage Prediction Algorithm Using an LLM and kNN
In short, we addressed the problem with the following two-stage algorithm.
1. Predict the correct category for a subset of past items with ChatGPT 3.5 turbo (OpenAI API [2])
2. Train a category prediction model for past items, using the output of step 1 as training data
It would have been easiest to predict everything with ChatGPT, but Mercari has more than 3 billion past items [3], so doing so was not feasible in terms of either processing time or API cost. After some trial and error we settled on this two-stage design. (Classifying every item with ChatGPT 3.5 turbo was estimated to cost about 1 million USD and take 1.9 years, which was unrealistic.)
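As a rough sanity check on those estimates, the sketch below works out the per-item cost and throughput they imply. It only rearranges the figures quoted above (3 billion items, about 1 million USD, about 1.9 years); the variable names are illustrative, and the token counts and rate limits behind the original estimate are not stated in this article.

```python
# Back-of-envelope check using only the figures quoted in the text.
TOTAL_ITEMS = 3_000_000_000       # "more than 3 billion" past items
ESTIMATED_COST_USD = 1_000_000    # approximate API cost estimate
ESTIMATED_YEARS = 1.9             # approximate processing-time estimate

SECONDS_PER_YEAR = 365 * 24 * 60 * 60

cost_per_item_usd = ESTIMATED_COST_USD / TOTAL_ITEMS
items_per_second = TOTAL_ITEMS / (ESTIMATED_YEARS * SECONDS_PER_YEAR)

print(f"implied cost per item: ${cost_per_item_usd:.5f}")
print(f"implied throughput: {items_per_second:.0f} items/second")
```

In other words, the estimate corresponds to roughly 0.03 cents per item and about 50 items per second sustained for almost two years.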
The model is outlined briefly below. Details are covered later, in the section on the techniques we used, so the explanation here is kept simple.
1. Predict the correct category for a subset of past items with ChatGPT 3.5 turbo (OpenAI API)
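First, we sampled several million previously listed items and had ChatGPT 3.5 turbo predict each item's "correct category in the new category structure". Specifically, we generated about 10 candidate new categories from each item's name, description, and original category name, and had the model pick the correct one from those candidates.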
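To make the "choose from about 10 candidates" step more concrete, here is a minimal sketch of how the prompt text for one item could be assembled and how the model's answer could be checked against the candidates. The function names, the text layout, and the sample data are all hypothetical. The article does not describe the actual prompt format or how the candidates were generated.

```python
from typing import Optional, Sequence, Tuple

Candidate = Tuple[int, str]  # (category_id, category_name); hypothetical shape


def build_item_info(name: str, description: str, old_category: str,
                    candidates: Sequence[Candidate]) -> str:
    """Format one item and its candidate categories as prompt text."""
    lines = [
        f"Item name: {name}",
        f"Description: {description}",
        f"Previous category: {old_category}",
        "Candidate categories:",
    ]
    lines += [f"- {cid}: {cname}" for cid, cname in candidates]
    return "\n".join(lines)


def accept_prediction(predicted_id: Optional[int],
                      candidates: Sequence[Candidate]) -> Optional[int]:
    """Keep the LLM's answer only if it is one of the offered candidates."""
    valid_ids = {cid for cid, _ in candidates}
    return predicted_id if predicted_id in valid_ids else None


# Made-up example data.
candidates = [(101, "Toys > Stuffed animals"), (202, "Interior > Cushions")]
item_info = build_item_info("Teddy bear", "Brown, 30 cm", "Toys", candidates)
print(item_info)
```

Rejecting answers that fall outside the candidate list is one simple way to keep hallucinated category IDs out of the training data for the second stage.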
2. Train a category prediction model for past items, using the output of step 1 as training data
Next, using the dataset created in step 1 as ground truth, we built a simple kNN model [4].
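Specifically, we first store the embeddings and the predicted categories of the items labeled in step 1 in a vector database. Then, for each item to be classified, we use its embedding to retrieve the X most similar items from the vector database and take the most frequent category among those X items as the predicted category.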
Embeddings were computed from a string concatenating each item's name, description, metadata, original category name, and so on. We considered more complex machine learning models too, but the simple model performed acceptably, so we went with it.
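The second stage can be sketched in a few lines: find the k most similar labeled items and take a majority vote over their categories. This is a toy, pure-Python illustration with made-up 2-dimensional vectors and a brute-force search. It stands in for the vector database and real embeddings, and the choice of cosine similarity and k=3 is an assumption, not something stated in the article.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def predict_category(query: Sequence[float],
                     labeled: List[Tuple[Sequence[float], int]],
                     k: int = 3) -> int:
    """Return the most frequent category among the k most similar labeled items."""
    ranked = sorted(labeled,
                    key=lambda pair: cosine_similarity(query, pair[0]),
                    reverse=True)
    top_categories = [category for _, category in ranked[:k]]
    return Counter(top_categories).most_common(1)[0][0]


# Toy "embeddings" paired with LLM-assigned category IDs (made-up data).
labeled_items = [
    ([1.0, 0.1], 10),
    ([0.9, 0.2], 10),
    ([0.1, 1.0], 20),
    ([0.2, 0.9], 20),
    ([0.8, 0.3], 10),
]
print(predict_category([1.0, 0.0], labeled_items, k=3))
```

Because the vote only needs neighbor labels, the expensive LLM is called once per sampled item, and every remaining item costs one embedding plus one nearest-neighbor lookup.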
Techniques We Used
From here on, I'll go over the techniques we used in this project. We worked on the following points, and I'll explain them one by one.
- Using OSS embedding models
- Multi-GPU processing with the Sentence Transformers library
- Fast nearest-neighbor search on CPU with Voyager Vector DB
- Speeding up LLM prediction with max_tokens and CoT
- Using Numba and cuDF
1. Using OSS Embedding Models
The second-stage model (kNN) requires computing embeddings for the items. We could have built our own neural network, but we confirmed that the OpenAI Embeddings API (text-embedding-ada-002) [5] gave sufficient accuracy, so our initial plan was to use that API.
However, a quick estimate showed that using the OpenAI Embeddings API for every item would be somewhat too demanding in both processing time and cost.
Meanwhile, looking through MTEB [6] and JapaneseEmbeddingEval [7], we noticed that there are many OSS models that rival the OpenAI Embeddings API even for languages other than English. We built our own evaluation dataset and tried them, and since they achieved accuracy on par with the OpenAI Embeddings API, we decided to use an OSS model.
As of October 2023, when we were working on this project, the following models showed high accuracy, and in the end we chose intfloat/multilingual-e5-base for its balance of compute cost and accuracy. (MTEB rankings change constantly, so there are probably stronger models as of April 2024.)
- intfloat/multilingual-e5-large [8]
- intfloat/multilingual-e5-base [9]
- intfloat/multilingual-e5-small [10]
- cl-nagoya/sup-simcse-ja-large [11]
As this shows, very capable OSS embedding models exist, so if you are starting a project that uses embeddings, I recommend setting up a simple task and checking whether an OSS model performs well enough.
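One simple way to set up such a check is sketched below: embed a small labeled set with each candidate model, then measure how often an item's nearest neighbor shares its label. This is a hypothetical illustration with made-up vectors and model names. The article does not describe the evaluation dataset or metric that was actually used.

```python
import math
from typing import List, Sequence


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def leave_one_out_accuracy(embeddings: List[Sequence[float]],
                           labels: List[str]) -> float:
    """Fraction of items whose nearest other item has the same label."""
    hits = 0
    for i, query in enumerate(embeddings):
        others = [j for j in range(len(embeddings)) if j != i]
        nearest = max(others, key=lambda j: _cosine(query, embeddings[j]))
        hits += labels[nearest] == labels[i]
    return hits / len(embeddings)


# Made-up embeddings of the same four items from two hypothetical models.
labels = ["toy", "toy", "book", "book"]
embeddings_by_model = {
    "model_a": [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]],
    "model_b": [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]],
}
scores = {name: leave_one_out_accuracy(vectors, labels)
          for name, vectors in embeddings_by_model.items()}
print(scores)
```

Running the same script over embeddings from an API model and from a few OSS models gives a quick, like-for-like comparison before committing to one of them.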
2. Multi-GPU Processing with the Sentence Transformers Library
Switching to an OSS model made processing dramatically faster than the OpenAI Embeddings API, but handling billions of items still needed further improvement.
Powerful GPUs such as the A100 would have made this easy, but because of the worldwide GPU shortage, it was quite hard to get hold of high-end GPUs in November-December 2023, when the project ran. (I suspect the situation has not changed much since.)
We therefore decided to run several GPUs such as V100s and L4s in parallel. Fortunately, the Sentence-Transformers [12] library let us parallelize across multiple GPUs with simple code like the following, which helped a great deal.
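from sentence_transformers import SentenceTransformer


def embed_multi_process(sentences, model_name="intfloat/multilingual-e5-base"):
    # E5 models expect a "query: " (or "passage: ") prefix on every input.
    if "intfloat" in model_name:
        sentences = ["query: " + s for s in sentences]
    model = SentenceTransformer(model_name)
    # By default this starts one worker process per available CUDA device.
    pool = model.start_multi_process_pool()
    embeddings = model.encode_multi_process(sentences, pool)
    model.stop_multi_process_pool(pool)
    return embeddings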
Having plenty of powerful GPUs would be ideal, but even when that is not possible, processing can still be sped up with some ingenuity. It reminded me how important it is to make the most of limited resources with libraries such as Sentence-Transformers.
3. Fast Nearest-Neighbor Search on CPU with Voyager Vector DB
kNN requires a vector database. Even after sampling, the training data came to several million items and did not fit in GPU memory. A GPU with large memory such as an A100 80GB might have held it, but as mentioned above, powerful GPUs were hard to secure, so we could not try.
Around then we heard that Voyager [13], made by Spotify, runs fast even on CPU. We tried it and easily reached a practical speed. Because nearest-neighbor search took much less time than embedding computation, we did not rigorously compare it with other products, but it was fast enough.
Voyager has no metadata management feature, so we had to write our own client, but overall I think it was a good choice.
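A client of that kind mostly needs to remember which category belongs to each vector ID. The sketch below is a hypothetical illustration, not the client we actually wrote: `CategoryIndexClient` wraps any index object that offers `add_item(vector)` returning an ID and `query(vector, k=...)` returning neighbor IDs and distances, which is my understanding of the shape of `voyager.Index` (please check its documentation). So that the example runs without Voyager installed, it uses a tiny exact-search stand-in called `BruteForceIndex`.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple


class BruteForceIndex:
    """Demo stand-in exposing add_item/query; exact search, not for real workloads."""

    def __init__(self) -> None:
        self._vectors: List[Sequence[float]] = []

    def add_item(self, vector: Sequence[float]) -> int:
        self._vectors.append(vector)
        return len(self._vectors) - 1  # ID assigned to the new vector

    def query(self, vector: Sequence[float], k: int = 1) -> Tuple[List[int], List[float]]:
        distances = [math.dist(vector, v) for v in self._vectors]
        ids = sorted(range(len(distances)), key=distances.__getitem__)[:k]
        return ids, [distances[i] for i in ids]


class CategoryIndexClient:
    """Keeps the ID -> category mapping that the vector index itself does not store."""

    def __init__(self, index) -> None:
        self._index = index
        self._category_by_id: Dict[int, int] = {}

    def add(self, vector: Sequence[float], category_id: int) -> None:
        item_id = self._index.add_item(vector)
        self._category_by_id[int(item_id)] = category_id

    def predict(self, vector: Sequence[float], k: int = 3) -> int:
        neighbor_ids, _ = self._index.query(vector, k=k)
        votes = Counter(self._category_by_id[int(i)] for i in neighbor_ids)
        return votes.most_common(1)[0][0]


# Made-up vectors and category IDs.
client = CategoryIndexClient(BruteForceIndex())
for vec, cat in [([1.0, 0.0], 10), ([0.9, 0.1], 10), ([0.0, 1.0], 20),
                 ([0.1, 0.9], 20), ([0.8, 0.2], 10)]:
    client.add(vec, cat)
print(client.predict([1.0, 0.0], k=3))
```

Keeping the mapping in a plain dictionary is the simplest option. At a scale of several million items, a persistent key-value store or an array indexed by ID may be a better fit.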
4. Speeding Up LLM Prediction with max_tokens and CoT
For cost reasons we could not use ChatGPT 4 in this project and had to use ChatGPT 3.5 turbo. ChatGPT 3.5 turbo is smart for its price, but we had some concerns about its accuracy. We therefore used Chain of Thought [14] prompting, having the model generate an explanation, to improve accuracy.
As you probably know, when asked to explain itself ChatGPT can go on talking at length, and processing time became a problem. So we used the max_tokens parameter to cut long answers off partway and shorten processing time.
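Cutting an answer off breaks the (Function Calling) JSON, so you need to either use LangChain's [15] llm.stream() or restore the JSON yourself before parsing it, which takes a little effort. We have not done a rigorous comparison, but I feel this approach struck a good balance between shorter processing time and better accuracy.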
Below is sample code using LangChain's llm.stream().
from typing import Optional

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_openai import ChatOpenAI


class ItemCategory(BaseModel):
    item_category_id: int = Field(None, description="Category ID predicted from the item description")
    reason: Optional[str] = Field(None, description="Explain in detail why you chose this category ID")


system_prompt = """
Predict the item's category based on the given item information.
Choose the category from the candidates. Also explain why you chose it.
"""

item_info = "(insert the item data, new category candidates, etc.)"

llm = ChatOpenAI(
    model_name="gpt-3.5-turbo",
    max_tokens=25,
)
structured_llm = llm.with_structured_output(ItemCategory)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{item_info}"),
    ]
)

chain = prompt | structured_llm

# Take only the last element of the stream.
# - Normally, cutting the answer off with max_tokens breaks the JSON, so extra parsing work is needed.
# - When run through LangChain's stream, the JSON is always completed, so there is
#   no need to parse the JSON yourself even when max_tokens cuts the answer off.
for res in chain.stream({"item_info": item_info}):
    pass

print(res.json(ensure_ascii=False))  # res: ItemCategory
# {"item_category_id": 1, "reason": "The item name contains 'stuffed toy'"}
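For the other option mentioned above, restoring the JSON yourself, a minimal sketch might look like the following. It is a hypothetical helper, not code from our project. It only handles the common case where the output is cut off inside a string value (such as a long `reason`) or right after a complete value, and it will still raise `json.JSONDecodeError` if the cut falls in the middle of a key or right after a comma or colon.

```python
import json
from typing import Any, Dict


def repair_truncated_json(text: str) -> Dict[str, Any]:
    """Close any string, object, or array left open by truncation, then parse."""
    closers = []        # stack of pending "}" / "]" in opening order
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            closers.append("}")
        elif ch == "[":
            closers.append("]")
        elif ch in "}]" and closers:
            closers.pop()
    if escaped:
        text = text[:-1]  # drop a dangling backslash at the cut point
    if in_string:
        text += '"'       # terminate the string that was cut off
    return json.loads(text + "".join(reversed(closers)))


truncated = '{"item_category_id": 1, "reason": "The item name contains'
print(repair_truncated_json(truncated))
```

Since the category ID comes before the explanation in the output, the ID is usually intact by the time max_tokens cuts the answer, and only the tail of the explanation is lost.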
5. Using Numba and cuDF
When processing billions of items, even trivial operations affect overall speed, so we accelerated as much of the processing as possible with cuDF [16] and Numba [17].
To be honest, I am not good at writing Numba code, but ChatGPT 4 rewrote plain Python code for me when I showed it, so I hardly had to write any myself and the coding effort dropped substantially.
Summary
ChatGPT draws attention mostly as a conversational tool, but putting its strong reasoning ability to work makes it possible to solve tasks that used to be tedious or infeasible. In our project, ChatGPT let us solve the tedious problem of reclassifying a huge volume of item data into new categories in a short time.
In addition, through techniques such as using OSS embedding models and multiple GPUs, adopting a vector database capable of fast nearest-neighbor search, speeding up ChatGPT prediction, and accelerating processing with Numba, we were able to get the most out of limited time and resources.
I hope this case gives a glimpse of what large language models such as ChatGPT can do and serves as a reference for your own projects. Please try using LLMs in a variety of situations and take on problems that used to be hard.
Refs
- An attempt to improve item recommendation accuracy using collaborative filtering and a vector search engine
- OpenAI API
- Cumulative listings on the flea market app "Mercari" surpass 3 billion
- k-nearest neighbors algorithm
- OpenAI Embeddings API
- Massive Text Embedding Benchmark (MTEB) Leaderboard
- JapaneseEmbeddingEval
- intfloat/multilingual-e5-large
- intfloat/multilingual-e5-base
- intfloat/multilingual-e5-small
- cl-nagoya/sup-simcse-ja-large
- Sentence-Transformers
- Voyager
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (Wei et al. 2022)
- LangChain
- rapidsai/cudf
- Numba: A High Performance Python Compiler