- What is this article?
- How multimodal search works
- Implementing multimodal search with Vespa
- Running multimodal search
- Conclusion
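What is this article?
This is the day 1 article of the 情報検索・検索技術 Advent Calendar 2023 - Adventar (Information Retrieval and Search Technology Advent Calendar 2023).
Multimodal search is a technique for retrieving information using several different types of data, such as text, images, and audio.
For example, the Google Cloud blog post "マルチモーダル検索とは何か: 「視覚を持った LLM」でビジネスが変わる | Google Cloud 公式ブログ" implements a search where:
- Input: search keywords (text)
- Search target: product images
What makes it interesting is that textual information such as product names and product descriptions is not part of the search target at all. Results are retrieved purely from how well the search keywords match the product images.
Beyond this example, you can also search text using an image as input, or use both text and an image as input*1.
This article shows how to implement a multimodal search system that combines text and images using the Vespa search engine. The code used for the implementation is available below.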
How multimodal search works
Multimodal search needs the following two components.
Embedding inference engine: a component that receives inputs such as text or images, generates an embedding (feature vector) with a machine learning model, and outputs it. You can implement it yourself with a web framework such as FastAPI, or use a system specialized for model inference such as Triton or TensorFlow Serving.
Vector search engine: a component that stores the embeddings of the searchable content and quickly retrieves content similar to a search query. Because demand for vector search is growing, many engines and libraries have been developed, including Faiss, Annoy, Vald, Vertex AI Matching Engine, Pinecone, and Vespa*2*3.
The figure below shows an example implementation of the text-to-image search introduced at the beginning.
- Item feeder: sends each product image to the image embedding inference engine and feeds the generated image embedding to the vector search engine.
- User search request: the user sends search keywords (text) to the search API. The search API obtains the embedding of the keywords, queries the vector search engine with it, and returns the search results to the user.
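The two paths above (feeding and searching) can be sketched in a few lines of Python. This is only an illustration of the data flow, not the article's implementation: the brute-force index stands in for a real vector search engine, and the hard-coded embedding tables, item IDs, and function names are all hypothetical stand-ins for a real model and service.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class BruteForceVectorIndex:
    """Toy stand-in for the vector search engine (exact search, no ANN)."""

    def __init__(self):
        self._embeddings = {}

    def feed(self, item_id, embedding):
        self._embeddings[item_id] = list(embedding)

    def search(self, query_embedding, k=10):
        scored = [
            (cosine_similarity(query_embedding, emb), item_id)
            for item_id, emb in self._embeddings.items()
        ]
        scored.sort(reverse=True)  # most similar first
        return [item_id for _, item_id in scored[:k]]


def feed_items(items, embed_image, index):
    """Item feeder: embed each product image, then feed it to the index."""
    for item_id, image in items:
        index.feed(item_id, embed_image(image))


def search_api(keyword, embed_text, index, k=10):
    """Search API: embed the keyword, then query the index."""
    return index.search(embed_text(keyword), k)


# Hypothetical hard-coded "models" that map images and text into one 2-d space.
fake_image_embeddings = {"cabinet.jpg": [0.9, 0.1], "tea.jpg": [0.1, 0.9]}
fake_text_embeddings = {"short modern cabinet": [1.0, 0.0]}

index = BruteForceVectorIndex()
feed_items(
    [("item-cabinet", "cabinet.jpg"), ("item-tea", "tea.jpg")],
    fake_image_embeddings.__getitem__,
    index,
)
results = search_api(
    "short modern cabinet", fake_text_embeddings.__getitem__, index, k=2
)
```

The point of the sketch is that the search API never looks at product text: it only compares the keyword embedding with the stored image embeddings.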
Implementing multimodal search with Vespa
From here on, this article explains how to implement a multimodal search system with Vespa.
Overview of the search system
The data source for the search system is the Amazon Berkeley Objects (ABO) Dataset. It contains the IDs, titles, descriptions, images, and other attributes of products sold on Amazon.
The implemented system supports the following two kinds of search*4:
- Input: search keywords (text), search target: product images
- Input: image, search target: product titles (text)
In other words, both text and images can be used as search input.
Managing product embeddings
Each product has two embeddings: one computed from its title (text) and one computed from its image. There are two possible approaches to managing them.
Approach | Description | Advantage | Disadvantage |
---|---|---|---|
Separate feed | Feed the title and image embeddings to Vespa separately | Pure "keyword-to-image" and "image-to-title" search is possible | Index size grows |
Merged feed | Combine the two embeddings into one and feed it to Vespa | Index size shrinks | Each product is represented by a single embedding, so strictly speaking the search becomes "keyword-to-(title & image)" |
Vespa can store multiple embeddings per document*5, so the title and image embeddings can be fed separately. The cost is that Vespa then holds two embeddings per product and two vector search indexes (HNSW), which increases the index size.
Instead of feeding the title and image embeddings separately, you can combine them into one embedding and feed only the combined one. The vector search engine then handles a single embedding, so the index is smaller, but note that the embedding representing a product now carries information from both the title and the image.
As one example of combining embeddings, the CIKM 2023 paper "Unsupervised multi-modal representation learning for high quality retrieval of similar products at e-commerce scale" proposes adding the text and image embeddings, applying L2 normalization, and treating the result as a single embedding.
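The sum-then-normalize merge described above can be sketched as follows. This is a minimal illustration using plain Python lists; the function names are hypothetical, and the sketch assumes both embeddings have the same dimension and are summed without any per-modality weighting.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [v / norm for v in vec]


def merge_embeddings(text_embedding, image_embedding):
    """Element-wise sum of two embeddings followed by L2 normalization."""
    if len(text_embedding) != len(image_embedding):
        raise ValueError("embeddings must have the same dimension")
    summed = [t + i for t, i in zip(text_embedding, image_embedding)]
    return l2_normalize(summed)


# Tiny 2-d example (real CLIP embeddings in this article have 768 dimensions).
merged = merge_embeddings([1.0, 0.0], [0.0, 1.0])
```

Because an angular distance depends only on the direction of the vectors, the normalization step should not change the ranking when such a metric is used; it mainly keeps the merged vectors on a common scale.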
When building a real product, you need to pick one of the two approaches based on the use case and cost. In this article both are implemented so that they can be compared later (see the figure below).
Embedding inference
The title and image embeddings are generated with CLIP ("openai/clip-vit-large-patch14").
CLIP contains two models, a Text Encoder and an Image Encoder, and the embeddings obtained from each are used for multimodal search. See OpenAI's introduction page for a detailed explanation of CLIP.
Vespa schema configuration
Next, define the schema (document structure) for Vespa, which acts as the vector search engine. The Schema Reference and Schemas pages describe Vespa schemas in detail.
The implementation in this article stores the following product information in Vespa.
- Product ID
- Product title
- Image path (the path within the ABO dataset)
- Embedding computed from the product title
- Embedding computed from the product image
- Embedding that combines the title and image embeddings (a simple sum)
Schema definition
Based on this product information, the schema is defined as follows.
    schema item {
        document item {
            field item_id type string {
                indexing: summary | attribute
            }
            field item_name_en_us type string {
                indexing: summary | index
                index: enable-bm25
            }
            field path type string {
                indexing: summary | attribute
            }
            field text_embedding type tensor<float>(x[768]) {
                indexing: attribute | index
                attribute {
                    distance-metric: angular
                }
                index {
                    hnsw {
                        max-links-per-node: 16
                        neighbors-to-explore-at-insert: 50
                    }
                }
            }
            field image_embedding type tensor<float>(x[768]) {
                indexing: attribute | index
                attribute {
                    distance-metric: angular
                }
                index {
                    hnsw {
                        max-links-per-node: 16
                        neighbors-to-explore-at-insert: 50
                    }
                }
            }
            field synthetic_embedding type tensor<float>(x[768]) {
                indexing: attribute | index
                attribute {
                    distance-metric: angular
                }
                index {
                    hnsw {
                        max-links-per-node: 16
                        neighbors-to-explore-at-insert: 50
                    }
                }
            }
        }
        ...
Some notes on the embedding definitions:
- text_embedding, image_embedding, and synthetic_embedding hold the embedding computed from the product text, the embedding computed from the product image, and the merged embedding, respectively.
- Each embedding field specifies distance-metric: angular in its attribute block. This means the distance is computed from the angle between vectors. Other available distance metrics are listed in the Schema Reference.
- The index block specifies that the vector in the field is managed by a vector search index (HNSW). HNSW parameters (max-links-per-node, neighbors-to-explore-at-insert) can be set in the field definition*6. See Approximate Nearest Neighbor Search using HNSW Index for details.
Ranking logic definition
In Vespa, the logic used for ranking is defined in a rank-profile.
Here, results are simply ranked by the similarity between embeddings*7.
    rank-profile text_embedding_closeness {
        match-features: distance(field, text_embedding)
        inputs {
            query(query_embedding) tensor<float>(x[768])
        }
        first-phase {
            expression: closeness(field, text_embedding)
        }
    }
    rank-profile image_embedding_closeness {
        match-features: distance(field, image_embedding)
        inputs {
            query(query_embedding) tensor<float>(x[768])
        }
        first-phase {
            expression: closeness(field, image_embedding)
        }
    }
    rank-profile synthetic_embedding_closeness {
        match-features: distance(field, synthetic_embedding)
        inputs {
            query(query_embedding) tensor<float>(x[768])
        }
        first-phase {
            expression: closeness(field, synthetic_embedding)
        }
    }
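To make the ranking score more concrete, the sketch below recomputes it by hand in Python. It assumes the definitions given in the Vespa documentation as I understand them: the angular distance is the angle (in radians) between the two vectors, and closeness is 1 / (1 + distance). The function names are illustrative only, and this is not how Vespa evaluates the expression internally.

```python
import math


def angular_distance(a, b):
    """Angle in radians between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("angular distance is undefined for zero vectors")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.acos(cosine)


def closeness(distance):
    """Map a distance in [0, inf) to a score in (0, 1]; smaller distance scores higher."""
    return 1.0 / (1.0 + distance)


same_direction = closeness(angular_distance([1.0, 0.0], [2.0, 0.0]))
orthogonal = closeness(angular_distance([1.0, 0.0], [0.0, 1.0]))
```

Under these assumptions, vectors pointing in the same direction get the maximum score of 1.0 regardless of their length, and the score decreases as the angle widens.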
Deploying to Vespa and feeding data
Once the schema is configured, deploy the Vespa application and feed the product information.
$ vespa deploy --wait 300
Product data is fed to Vespa as JSON. Here is an example.
    {
        "put": "id:item:item::B074J5TWYL",
        "fields": {
            "item_id": "B074J5TWYL",
            "item_name_en_us": "365 Everyday Value, Organic Black Tea (70 Tea Bags), 4.9 oz",
            "path": "03/03fde183.jpg",
            "text_embedding": [0.04524907, 0.00058629, ...
    }
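A feed operation like the one above could be assembled in Python roughly as follows. The field names and document ID format come from the schema and example in this article; the helper name and the short placeholder vectors are hypothetical (real embeddings have 768 dimensions), and how the resulting JSON is sent to Vespa is left out.

```python
import json


def build_put_operation(item_id, title, path,
                        text_embedding, image_embedding, synthetic_embedding):
    """Build one Vespa 'put' operation for the item schema as a dict."""
    return {
        "put": f"id:item:item::{item_id}",
        "fields": {
            "item_id": item_id,
            "item_name_en_us": title,
            "path": path,
            # Indexed tensors are written here in the short list form.
            "text_embedding": list(text_embedding),
            "image_embedding": list(image_embedding),
            "synthetic_embedding": list(synthetic_embedding),
        },
    }


# Placeholder 3-d vectors for illustration only.
operation = build_put_operation(
    "B074J5TWYL",
    "365 Everyday Value, Organic Black Tea (70 Tea Bags), 4.9 oz",
    "03/03fde183.jpg",
    [0.1, 0.2, 0.3],
    [0.3, 0.2, 0.1],
    [0.4, 0.4, 0.4],
)
feed_line = json.dumps(operation)  # one JSON document per operation
```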
Running multimodal search
With Vespa deployed and the product information fed, let's try multimodal search.
Vector search in Vespa
To run a vector search in Vespa, issue a query like the following.
    $ vespa query \
        'yql=select * from item where {targetHits:100, approximate:true}nearestNeighbor(image_embedding, query_embedding)' \
        'ranking=image_embedding_closeness' \
        'input.query(query_embedding)=[0.1, 0.2, ...]'
- In Vespa, search queries are written in a query language called YQL.
- nearestNeighbor in the where clause specifies that a vector search should be performed. See the documentation for details.
- input.query(query_embedding)=... specifies the input vector used for the search.
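When issuing the same query from application code instead of the CLI, the request parameters can be built as a plain dictionary. The sketch below only constructs the parameters; the function name is hypothetical, and sending the dictionary as a JSON body to Vespa's search endpoint is an assumption about the deployment that is not shown here.

```python
def build_nearest_neighbor_query(field, rank_profile, query_embedding,
                                 target_hits=100, hits=10):
    """Build query parameters for an approximate nearest neighbor search."""
    yql = (
        "select * from item where "
        f"{{targetHits:{target_hits}, approximate:true}}"
        f"nearestNeighbor({field}, query_embedding)"
    )
    return {
        "yql": yql,
        "ranking": rank_profile,
        "hits": hits,  # number of results to return
        "input.query(query_embedding)": list(query_embedding),
    }


# Placeholder 2-d query vector; a real one would have 768 dimensions.
params = build_nearest_neighbor_query(
    "image_embedding", "image_embedding_closeness", [0.1, 0.2]
)
```

Switching the search target is then a matter of changing the field and the rank profile together, for example text_embedding with text_embedding_closeness.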
Input: search keywords, search target: images
The top 10 image search results for the search keywords "short modern cabinet" are shown below.
Whether the black cabinets in 3rd and 4th place are really "short" is debatable, but overall the products match the intent of the search keywords. Note in particular that the product titles (item_name_en_us) do not necessarily contain the words "short", "modern", or "cabinet". Being able to search on information that is inferred from the product image, even though it is not explicitly present in the product's text fields, is a strength of multimodal search.
Next, let's add the word "black" and search for "black short modern cabinet".
The added word "black" is taken into account, and black cabinets now appear in the results. For fashion and furniture products, color is an important factor in choosing a product, so this may be an area where multimodal search works well.
So far the search target has been the image (image_embedding). Now let's search against the embedding obtained by adding the title and image embeddings (synthetic_embedding).
Search keywords "short modern cabinet"
Search keywords "black short modern cabinet"
The results differ considerably from those obtained when searching against the image (image_embedding). It is hard to say which results are better, but there appears to be room for tuning*8.
Input: image, search target: product titles
Next, let's try a multimodal search with an image as input and titles as the search target.
The image of the following product is used as input.
Searching against the embedding computed from the title (text_embedding) gives the following results.
As intended, the results contain cabinets and chests.
Next, let's search against the embedding that merges (sums) the title and image embeddings (synthetic_embedding).
Compared with the previous search, which targeted only the product titles, the cabinets in the results have more consistent shapes.
Application: combining vector search and keyword search
So far we have run simple vector searches that retrieve products whose vectors are similar to the input vector. As an applied example, this section combines vector search with ordinary search such as keyword search and filtering.
For example, to run a vector search while restricting results to products whose title (item_name_en_us) contains the word "black", build the query as follows.
    $ vespa query \
        'yql=select * from item where userQuery() and {targetHits:100, approximate:true}nearestNeighbor(image_embedding, query_embedding)' \
        'ranking=image_embedding_closeness' \
        'query=item_name_en_us:black' \
        'input.query(query_embedding)=[0.1, 0.2, ...]'
As shown above, Vespa supports hybrid search, that is, search that combines ordinary search with vector search*9. In product search, for example, it is convenient to be able to run a keyword search restricted by product attributes such as category. Hybrid search covers this kind of use case.
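The AND combination used above, and the OR variant mentioned in footnote 9, differ only in the operator that joins userQuery() and nearestNeighbor. A minimal sketch of a parameter builder that supports both is shown below; the function name is hypothetical, and the nearestNeighbor clause is wrapped in parentheses here for readability.

```python
def build_hybrid_query(field, rank_profile, keyword_query, query_embedding,
                       operator="and", target_hits=100):
    """Build query parameters that combine a keyword query with vector search."""
    if operator not in ("and", "or"):
        raise ValueError("operator must be 'and' or 'or'")
    nearest_neighbor = (
        f"({{targetHits:{target_hits}, approximate:true}}"
        f"nearestNeighbor({field}, query_embedding))"
    )
    return {
        "yql": f"select * from item where userQuery() {operator} {nearest_neighbor}",
        "ranking": rank_profile,
        "query": keyword_query,  # consumed by userQuery()
        "input.query(query_embedding)": list(query_embedding),
    }


# Placeholder 2-d query vector; a real one would have 768 dimensions.
and_params = build_hybrid_query(
    "image_embedding", "image_embedding_closeness",
    "item_name_en_us:black", [0.1, 0.2],
)
or_params = build_hybrid_query(
    "image_embedding", "image_embedding_closeness",
    "item_name_en_us:black", [0.1, 0.2], operator="or",
)
```

With OR, documents can match on keywords alone, so a rank profile that also scores text matches (for example with bm25) would likely be a better fit than the closeness-only profiles defined in this article.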
Let's actually run this search: input "short modern cabinet" as search keywords, images as the search target, restricted to products whose title contains "black".
As intended, the results are products whose title contains "black" and whose image matches the search keywords.
Conclusion
This article walked through implementing multimodal search with Vespa.
As shown in the article, Vespa lets you attach multiple vectors to a single document, which makes it possible to implement highly flexible multimodal search.
Vespa also supports hybrid search that combines vector search with keyword search and filtering, which makes you wonder, "Could we even build a search that does such-and-such?" The possibilities keep growing!
*1: For example, using an image of a "red dress" together with the text "for a wedding" as input might let you find formal red dresses.
*2: Because Vespa offers many features beyond vector search, it may be more accurate to call it a search engine rather than just a vector search engine.
*3: Vespa also has a feature for embedding inference inside Vespa itself. See Embedding for details.
*4: Of course, text-to-text and image-to-image search, such as "input: search keywords (text), search target: product titles (text)" and "input: image, search target: product images", come for free with this implementation.
*5:Revolutionizing Semantic Search with Multi-Vector HNSW Indexing in Vespa
*6: The number of graph layers is another major HNSW parameter, but at the time of writing it does not seem to be configurable in Vespa. I have not read through Vespa's HNSW implementation, but I assume a fixed value is used internally.
*7: rank-profile is quite flexible. For example, you can build a ranking expression that combines vector similarity with other attribute values, or use vector similarity as a feature in a machine-learned ranking model such as boosted trees.
*8: For example, the CLIP model could be fine-tuned on product data. CLIP's training data is crawled from websites and is not specialized for product information. Search keywords and product titles also have different characteristics from other web text, such as being shorter. Specializing the model for product information may therefore bring a large benefit.
*9: Here the keyword search and vector search results are combined with AND, but OR is also possible. When people say hybrid search, they may more commonly mean returning the OR of the keyword search and vector search results.