Commit 62bc8be

docs: add weaviate minimum working example (#206)
1 parent a3538ee commit 62bc8be

1 file changed: +66 -0 lines changed

docs/advanced/document-store/weaviate.md

Lines changed: 66 additions & 0 deletions
@@ -81,3 +81,69 @@ The following configs can be set:
| `protocol` | protocol to be used. Can be 'http' or 'https' | 'http' |
| `name` | Weaviate class name; the class name of the Weaviate object that represents this DocumentArray | None |
| `serialize_config` | [Serialization config of each Document](../../fundamentals/document/serialization.md) | None |

## Minimum Example

The following example shows how to use DocArray with the Weaviate Document Store to index and search text Documents.

First, let's create the `DocumentArray` instance (make sure a Weaviate server is up and running):

```python
from docarray import DocumentArray

# Connect to a running Weaviate server and bind this DocumentArray
# to the Weaviate class named "Persisted".
da = DocumentArray(
    storage="weaviate",
    config={
        "name": "Persisted",
        "host": "localhost",
        "port": 8080,
    },
)
```
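
Since the constructor needs a reachable server, it can help to probe the server first. The following is a minimal sketch, assuming Weaviate exposes its readiness probe at `/v1/.well-known/ready` and that the host and port match the config above. The helper names `readiness_url` and `weaviate_is_ready` are hypothetical and are not part of DocArray.

```python
import urllib.error
import urllib.request


def readiness_url(host: str, port: int, protocol: str = "http") -> str:
    """Build the URL of the readiness probe (path is an assumption)."""
    return f"{protocol}://{host}:{port}/v1/.well-known/ready"


def weaviate_is_ready(
    host: str = "localhost",
    port: int = 8080,
    protocol: str = "http",
    timeout: float = 2.0,
) -> bool:
    """Return True only if the probe answers with HTTP 200."""
    url = readiness_url(host, port, protocol)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, DNS failure, or an error status.
        return False
```

Calling `weaviate_is_ready()` before constructing the `DocumentArray` may give a clearer failure message than a connection error raised during setup.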

Then, we can index some Documents:

```python
from docarray import Document

da.extend([
    Document(text='Persist Documents with Weaviate.'),
    Document(text='And enjoy fast nearest neighbor search.'),
    Document(text='All while using DocArray API.'),
])
```

Now, we can generate embeddings for the Documents in the database using a BERT model:

```python
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
model = AutoModel.from_pretrained('bert-base-uncased')


def collate_fn(da):
    # Tokenize a batch of Documents into padded PyTorch tensors.
    return tokenizer(
        da.texts,
        return_tensors='pt',
        truncation=True,
        padding=True
    )


# Compute an embedding for every Document and store it on the Document.
da.embed(model, collate_fn=collate_fn)
```
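
To make the `truncation=True` and `padding=True` arguments concrete, here is a simplified sketch of what they do to a batch of token ID sequences. It is an illustration only, not the tokenizer's implementation: the function name `pad_and_truncate` is hypothetical, the token IDs are made up for the example, and a real tokenizer also handles special tokens when truncating.

```python
from typing import Dict, List


def pad_and_truncate(
    sequences: List[List[int]], max_length: int, pad_id: int = 0
) -> Dict[str, List[List[int]]]:
    """Cut each sequence to max_length, then right-pad to the longest one."""
    truncated = [seq[:max_length] for seq in sequences]
    width = max((len(seq) for seq in truncated), default=0)
    input_ids = [seq + [pad_id] * (width - len(seq)) for seq in truncated]
    # The mask marks real tokens with 1 and padding with 0.
    attention_mask = [
        [1] * len(seq) + [0] * (width - len(seq)) for seq in truncated
    ]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


batch = pad_and_truncate([[101, 7, 8, 102], [101, 9, 102]], max_length=512)
print(batch["input_ids"])       # [[101, 7, 8, 102], [101, 9, 102, 0]]
print(batch["attention_mask"])  # [[1, 1, 1, 1], [1, 1, 1, 0]]
```

Padding gives every sequence in the batch the same length so the batch can be stacked into one tensor, and the attention mask tells the model which positions to ignore.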

Finally, we can query the database and print the results:

```python
results = da.find(
    DocumentArray([Document(text='How to persist Documents')]).embed(model, collate_fn=collate_fn),
    limit=1
)

print(results[0].text)
```

```text
Persist Documents with Weaviate.
```
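
Conceptually, the query above is a nearest neighbor search over embedding vectors. The sketch below shows the idea with an exhaustive cosine-similarity scan in plain Python. It is an illustration under stated assumptions: the function names and the toy 2-dimensional vectors are hypothetical, the metric actually used depends on the store's configuration, and Weaviate relies on an approximate index rather than a full scan.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(
    query: Sequence[float], vectors: Sequence[Sequence[float]], limit: int = 1
) -> List[Tuple[int, float]]:
    """Return (index, similarity) pairs for the `limit` most similar vectors."""
    scored = [(i, cosine_similarity(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy "embeddings" standing in for the indexed Documents.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(nearest([0.9, 0.1], vectors, limit=1)[0][0])  # 0
```

An exhaustive scan like this costs time linear in the number of stored vectors, which is the main reason to hand the search to a vector database for larger collections.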
