Commit b3debde

chore(docs): support pydantic data model
1 parent abb332b commit b3debde

5 files changed: +21 -3 lines changed

README.md

Lines changed: 2 additions & 2 deletions
@@ -14,13 +14,13 @@

 DocArray is a library for nested, unstructured data such as text, image, audio, video, or 3D mesh. It allows deep-learning engineers to efficiently process, embed, search, recommend, store, and transfer the data with a Pythonic API.

-🌌 **All data types**: super-expressive data structure for representing complicated/mixed/nested text, image, video, audio, 3D mesh data.
+🌌 **Rich data types**: super-expressive data structure for representing complicated/mixed/nested text, image, video, audio, 3D mesh data.

 🐍 **Pythonic experience**: designed to be as easy as a Python list. If you know how to Python, you know how to DocArray. Intuitive idioms and type annotation simplify the code you write.

 🧑‍🔬 **Data science powerhouse**: greatly accelerate data scientists' work on embedding, matching, visualizing, evaluating via Torch/TensorFlow/ONNX/PaddlePaddle on CPU/GPU.

-🚡 **Portable**: ready-to-wire at anytime with efficient and compact serialization from/to Protobuf, bytes, base64, JSON, CSV, DataFrame.
+🚡 **Portable**: ready-to-wire at anytime with fast and compressed serialization from/to Protobuf, bytes, base64, JSON, CSV, DataFrame. Built-in data validation and JSON Schema (OpenAPI) help you build reliable webservices.

 <!-- end elevator-pitch -->

docs/fundamentals/document/serialization.md

Lines changed: 6 additions & 0 deletions
@@ -10,10 +10,16 @@ One should use {ref}`DocumentArray for serializing multiple Documents<docarray-s

 ## From/to JSON

+```{tip}
+If you are building a webservice and want to use JSON for passing DocArray objects, then data validation and field-filtering can be crucial. In this case, it is highly recommended to check out {ref}`fastapi-support` and follow the methods there.
+```
+
 ```{important}
 This feature requires `protobuf` dependency. You can do `pip install "docarray[full]"` to install it.
 ```

+
+
 You can serialize a Document as a JSON string via {meth}`~docarray.document.mixins.porting.PortingMixin.to_json`, and then read from it via {meth}`~docarray.document.mixins.porting.PortingMixin.from_json`.

 ```python

docs/fundamentals/documentarray/serialization.md

Lines changed: 9 additions & 1 deletion
@@ -3,7 +3,8 @@

 DocArray is designed to be "ready-to-wire" at anytime. Serialization is important. DocumentArray provides multiple serialization methods that allows one transfer DocumentArray object over network and across different microservices.

-- JSON string: `.from_json()`/`.to_json()`
+- JSON string: `.from_json()`/`.to_json()`
+- Pydantic model: `.from_pydantic_model()`/`.to_pydantic_model()`
 - Bytes (compressed): `.from_bytes()`/`.to_bytes()`
 - Base64 (compressed): `.from_base64()`/`.to_base64()`
 - Protobuf Message: `.from_protobuf()`/`.to_protobuf()`
@@ -13,10 +14,17 @@ DocArray is designed to be "ready-to-wire" at anytime. Serialization is importan

 ## From/to JSON

+
+```{tip}
+If you are building a webservice and want to use JSON for passing DocArray objects, then data validation and field-filtering can be crucial. In this case, it is highly recommended to check out {ref}`fastapi-support` and follow the methods there.
+```
+
 ```{important}
 This feature requires `protobuf` dependency. You can do `pip install "docarray[full]"` to install it.
 ```

+
+
 ```python
 from docarray import DocumentArray, Document

docs/fundamentals/fastapi-support/index.md

Lines changed: 1 addition & 0 deletions
@@ -1,3 +1,4 @@
+(fastapi-support)=
 # FastAPI/pydantic Support

 Long story short, DocArray supports [pydantic data model](https://pydantic-docs.helpmanual.io/) via {class}`~docarray.document.pydantic_model.PydanticDocument` and {class}`~docarray.document.pydantic_model.PydanticDocumentArray`.

docs/get-started/what-is.md

Lines changed: 3 additions & 0 deletions
@@ -37,6 +37,7 @@ DocArray is designed to maximize the local experience, with the requirement of c
 | Nested data ||||||
 | Mixed data of the above four ||||||
 | Easy to (de)serialize ||||||
+| Data validation (of the output) ||||||
 | Pythonic experience ||||✔️️||
 | IO support for filetypes ||||||
 | Deep learning framework support ||||||
@@ -118,6 +119,8 @@ Beside code refactoring and optimization, many features have been improved, incl
 - revised documentations and examples
 - ... and many more.

+When first using DocArray, some Jina 2.x user may realize the static typing seems missing. This is due to a deliberate decision of DocArray: DocArray guarantees the types and constraints of the wire data, not the input data. In other words, only the functions that are listed under {ref}`docarray-serialization` chapter will trigger the data validation.
+
 To learn DocArray, the recommendation here is to forget about everything in Jina 2.x, although some interfaces may look familiar. Read [the fundamental sections](../fundamentals/document/index.md) from beginning.

 ```{important}
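
Reviewer note on the paragraph added to `docs/get-started/what-is.md`: it says validation applies to wire data, not input data. The sketch below illustrates that design choice using only the Python standard library. `LooseDoc`, `WIRE_SCHEMA`, and `to_wire_json` are hypothetical names invented for this illustration; they are not DocArray's API, and DocArray's actual validation may work differently.

```python
import json

# Hypothetical schema: field name -> Python type expected on the wire.
WIRE_SCHEMA = {"id": str, "text": str, "weight": float}


class LooseDoc:
    """Hypothetical container that accepts any attribute values unchecked."""

    def __init__(self, **fields):
        # No validation at construction time: input data is not checked.
        self.__dict__.update(fields)

    def to_wire_json(self) -> str:
        """Validate fields against WIRE_SCHEMA, then serialize to JSON."""
        payload = {}
        for name, expected in WIRE_SCHEMA.items():
            value = getattr(self, name, None)
            if value is not None and not isinstance(value, expected):
                raise TypeError(
                    f"{name!r} must be {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
            payload[name] = value
        return json.dumps(payload)


good = LooseDoc(id="a1", text="hello", weight=0.5)
bad = LooseDoc(id="a2", text=123)  # accepted here: input is not validated

wire = good.to_wire_json()

try:
    bad.to_wire_json()  # rejected only at the serialization boundary
    bad_rejected = False
except TypeError:
    bad_rejected = True
```

In this sketch the malformed object can be built and mutated freely; the type error surfaces only when it is about to leave the process, which is the trade-off the added paragraph describes.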
