docs: make document indices self-contained #1678
Merged
Changes from 1 commit
35 commits
caa00fd  chore: first pr  (jupyterjazz)
b45e3a6  docs: modify hnsw  (jupyterjazz)
cad4e60  Merge branch 'main' into docs-self-contained-indices  (jupyterjazz)
11bda62  docs: rough versions of inmemory and hnsw  (jupyterjazz)
96319ca  chore: update branch  (jupyterjazz)
f5825f8  docs: weaviate v1  (jupyterjazz)
8aaedbe  docs: elastic v1  (jupyterjazz)
4a3e25c  docs: introduction page  (jupyterjazz)
db77beb  docs: redis v1  (jupyterjazz)
82afb99  docs: qdrant v1  (jupyterjazz)
befc786  docs: validate intro inmemory and hnsw examples  (jupyterjazz)
9bdb0dc  docs: validate elastic and qdrant examples  (jupyterjazz)
64f83bf  docs: validate code examples for redis and weaviate  (jupyterjazz)
759900c  Merge branch 'main' into docs-self-contained-indices  (jupyterjazz)
60cd4d4  chore: merge recent updates  (jupyterjazz)
ca25feb  docs: milvus v1  (jupyterjazz)
7fef5d8  Merge branch 'main' into docs-self-contained-indices  (jupyterjazz)
fe572da  docs: validate milvus code  (jupyterjazz)
10bc14b  docs: make redis and milvus visible  (jupyterjazz)
6199a2a  docs: refine vol1  (jupyterjazz)
fa8f919  Merge branch 'main' into docs-self-contained-indices  (jupyterjazz)
c257a4e  docs: refine vol2  (jupyterjazz)
ccf17e1  chore: pull recent updates  (jupyterjazz)
f3ca77c  docs: update api reference  (jupyterjazz)
21e3ad2  Merge branch 'main' into docs-self-contained-indices  (jupyterjazz)
e6ef9c4  docs: apply suggestions  (jupyterjazz)
19045ec  docs: separate nested data section  (jupyterjazz)
5736334  Merge branch 'main' into docs-self-contained-indices  (jupyterjazz)
41c7307  docs: apply suggestions vol2  (jupyterjazz)
a32a1e5  fix: nested data imports  (jupyterjazz)
8a8aa33  Merge branch 'main' into docs-self-contained-indices  (jupyterjazz)
ef0b7ef  docs: apply johannes suggestions  (jupyterjazz)
6818688  chore: merge conflicts  (jupyterjazz)
9268161  docs: apply suggestions  (jupyterjazz)
b402802  docs: app sgg  (jupyterjazz)
docs: validate intro inmemory and hnsw examples
Signed-off-by: jupyterjazz <[email protected]>
commit befc786c43f5b61a9cb3edf835b5b7c27eb59306
@@ -38,12 +38,12 @@ Currently, DocArray supports the following vector databases:
 - [Qdrant](https://qdrant.tech/) | [Docs](index_qdrant.md)
 - [Elasticsearch](https://www.elastic.co/elasticsearch/) v7 and v8 | [Docs](index_elastic.md)
 - [HNSWlib](https://github.com/nmslib/hnswlib) | [Docs](index_hnswlib.md)
-- InMemoryExactNNSearch | [Docs](index_in_memory.md)
+- InMemoryExactNNIndex | [Docs](index_in_memory.md)


 ## Basic Usage

-For this user guide you will use the [InMemoryExactNNSearch][docarray.index.backends.in_memory.InMemoryExactNNSearch]
+For this user guide you will use the [InMemoryExactNNIndex][docarray.index.backends.in_memory.InMemoryExactNNIndex]
 because it doesn't require you to launch a database server. Instead, it will store your data locally.

 !!! note "Using a different vector database"
@@ -52,14 +52,13 @@ because it doesn't require you to launch a database server. Instead, it will sto

 !!! note "InMemory-specific settings"
     The following sections explain the general concept of Document Index by using
-    [InMemoryExactNNSearch][docarray.index.backends.in_memory.InMemoryExactNNSearch] as an example.
-    For InMemory-specific settings, check out the [InMemoryExactNNSearch][docarray.index.backends.in_memory.InMemoryExactNNSearch] documentation
+    `InMemoryExactNNIndex` as an example.
+    For InMemory-specific settings, check out the `InMemoryExactNNIndex` documentation
     [here](index_in_memory.md).


 ```python
 from docarray import BaseDoc, DocList
-from docarray.index import HnswDocumentIndex
+from docarray.index import InMemoryExactNNIndex
 from docarray.typing import NdArray
 import numpy as np

@@ -72,13 +71,13 @@ class MyDoc(BaseDoc):
 # Create documents (using dummy/random vectors)
 docs = DocList[MyDoc](MyDoc(title=f'title #{i}', price=i, embedding=np.random.rand(128)) for i in range(10))

-# Initialize a new HnswDocumentIndex instance and add the documents to the index.
-doc_index = HnswDocumentIndex[MyDoc](workdir='./my_index')
+# Initialize a new InMemoryExactNNIndex instance and add the documents to the index.
+doc_index = InMemoryExactNNIndex[MyDoc]()
 doc_index.index(docs)

 # Perform a vector search.
 query = np.ones(128)
-retrieved_docs = doc_index.find(query, search_field='embedding', limit=10)
+retrieved_docs, scores = doc_index.find(query, search_field='embedding', limit=10)

 # Perform filtering (price < 5)
 query = {'price': {'$lt': 5}}
@@ -87,9 +86,9 @@ filtered_docs = doc_index.filter(query, limit=10)
 # Perform a hybrid search - combining vector search with filtering
 query = (
     doc_index.build_query()  # get empty query object
-    .find(np.ones(128), search_field='embedding')  # add vector similarity search
+    .find(query=np.ones(128), search_field='embedding')  # add vector similarity search
     .filter(filter_query={'price': {'$gte': 2}})  # add filter search
     .build()  # build the query
 )
-results = doc_index.execute_query(query)
+retrieved_docs, scores = doc_index.execute_query(query)
 ```
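For readers following the changed example without DocArray installed, the search calls in the diff can be illustrated with a minimal, library-free sketch. It is not DocArray's implementation: the helper names `cosine_similarity`, `exact_nn_find`, and `filter_docs`, the plain-dict documents, the 2-dimensional vectors, and the choice of cosine similarity are all assumptions made for illustration. What it mirrors from the diff is the shape of the result, where a search returns the matching documents together with their scores.

```python
import math
from typing import Any, Callable, Dict, List, Sequence, Tuple

Doc = Dict[str, Any]  # hypothetical stand-in for a BaseDoc subclass


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def exact_nn_find(
    docs: List[Doc], query: Sequence[float], search_field: str, limit: int
) -> Tuple[List[Doc], List[float]]:
    """Brute-force search: score every document, keep the `limit` best."""
    scored = [(doc, cosine_similarity(doc[search_field], query)) for doc in docs]
    # Highest similarity first; Python's sort is stable, so ties keep input order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    top = scored[:limit]
    return [doc for doc, _ in top], [score for _, score in top]


def filter_docs(docs: List[Doc], predicate: Callable[[Doc], bool], limit: int) -> List[Doc]:
    """Keep at most `limit` documents for which `predicate` is true."""
    return [doc for doc in docs if predicate(doc)][:limit]


# Toy data (assumed for this sketch; the diff uses random 128-dimensional vectors).
docs = [
    {'title': 'title #0', 'price': 0, 'embedding': [1.0, 0.0]},
    {'title': 'title #1', 'price': 1, 'embedding': [1.0, 1.0]},
    {'title': 'title #2', 'price': 2, 'embedding': [0.0, 1.0]},
    {'title': 'title #3', 'price': 3, 'embedding': [-1.0, -1.0]},
]
query = [1.0, 1.0]

# Vector search: documents and scores come back as a pair.
retrieved_docs, scores = exact_nn_find(docs, query, 'embedding', limit=2)

# Hybrid search in this sketch: filter on price >= 2, then rank what remains.
candidates = filter_docs(docs, lambda d: d['price'] >= 2, limit=len(docs))
hybrid_docs, hybrid_scores = exact_nn_find(candidates, query, 'embedding', limit=2)
```

The hybrid step here filters first and then ranks the survivors. That ordering is a simplification chosen for this sketch, not a statement about how any backend executes a combined query.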
Member
I think we should here again add a big fat link to all the backend documentation pages and tell people that they can get more detailed information there