Introduce Hybrid Search API using SQLite FTS5 + Vector search #1158

@varshaprasad96

🚀 Describe the new functionality needed

Currently, Llama-Stack supports optimized chunked writes (PR #1094) for efficient SQLite-based storage. However, there is no built-in Hybrid Search API that combines FTS5 and sqlite-vec to enable both semantic and lexical retrieval.

This issue proposes the addition of a Hybrid Search API that allows users to:

  1. Store text documents with both a full-text index and vector embeddings.
  2. Perform hybrid search that ranks results by combining BM25-based text relevance and vector similarity.
  3. Utilize chunked writes (from PR #1094, "feat: Chunk sqlite-vec writes") to optimize insertions for large datasets.
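
To make the three points above concrete, here is a minimal sketch of the storage side and the two individual rankings, using only Python's standard `sqlite3` module. It is not the proposed Llama-Stack API. The table names (`docs_fts`, `docs_vec`), the helper names, and the `batch_size` parameter are hypothetical. Embeddings are stored as JSON and scanned with brute-force cosine similarity as a stand-in for sqlite-vec, and the sketch assumes the local SQLite build includes FTS5.

```python
import json
import math
import sqlite3


def chunked(rows, size):
    """Yield successive batches of at most `size` rows."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def build_store(docs, batch_size=2):
    """docs: list of (doc_id, text, embedding). Table names are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE VIRTUAL TABLE docs_fts USING fts5(content)")
    # Stand-in for a sqlite-vec table: embeddings kept as JSON text.
    conn.execute("CREATE TABLE docs_vec (id INTEGER PRIMARY KEY, embedding TEXT)")
    for batch in chunked(docs, batch_size):
        with conn:  # one transaction per batch (chunked write)
            conn.executemany(
                "INSERT INTO docs_fts(rowid, content) VALUES (?, ?)",
                [(doc_id, text) for doc_id, text, _ in batch],
            )
            conn.executemany(
                "INSERT INTO docs_vec(id, embedding) VALUES (?, ?)",
                [(doc_id, json.dumps(emb)) for doc_id, _, emb in batch],
            )
    return conn


def keyword_ranking(conn, query, k=10):
    """Doc ids ordered by FTS5 BM25 (lower bm25() value means a better match)."""
    rows = conn.execute(
        "SELECT rowid FROM docs_fts WHERE docs_fts MATCH ? "
        "ORDER BY bm25(docs_fts) LIMIT ?",
        (query, k),
    ).fetchall()
    return [row[0] for row in rows]


def vector_ranking(conn, query_vec, k=10):
    """Doc ids ordered by descending cosine similarity (brute-force scan)."""
    scored = [
        (cosine(query_vec, json.loads(emb)), doc_id)
        for doc_id, emb in conn.execute("SELECT id, embedding FROM docs_vec")
    ]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [doc_id for _, doc_id in scored[:k]]
```

The two ranked lists returned by `keyword_ranking` and `vector_ranking` are the inputs a fusion step would combine into the final hybrid ranking.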

Ref: https://github.com/liamca/sqlite-hybrid-search/tree/main. The idea is to apply Reciprocal Rank Fusion (RRF) to the FTS5 and vector-based search results, so that documents ranked highly across multiple lists are prioritized.
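
As a rough illustration of the fusion step, the sketch below implements standard Reciprocal Rank Fusion: each document's score is the sum of `1 / (k + rank)` over every list in which it appears, with ranks starting at 1. The function name and the default `k=60` (a commonly used constant) are assumptions for illustration, not part of the referenced repository or of Llama-Stack.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids into one list of (doc_id, score).

    rankings: iterable of lists, each ordered best-first.
    k: smoothing constant; 60 is a common default (an assumption here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


fts_results = ["a", "b", "c"]  # hypothetical BM25 ranking
vec_results = ["b", "d", "a"]  # hypothetical vector ranking
fused = reciprocal_rank_fusion([fts_results, vec_results])
print([doc_id for doc_id, _ in fused])
```

Because RRF uses only rank positions, the BM25 and vector-distance scores do not need to be normalized onto a common scale before they are combined.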

💡 Why is this needed? What if we don't build it?

Building Hybrid Search with RRF should improve retrieval accuracy and return more relevant results from Llama-Stack's current SQLite vector DB implementation.

Other thoughts

No response

Metadata

Labels

enhancement (New feature or request)
