feat(v2): add audio url and predefined document #940

anna-charlotte · 2022-12-14T10:03:58Z

Add support for audio files to docarray v2

Goals:

AudioUrl
AudioTensor with AudioNdArray and AudioTorchTensor
Audio predefined doc (with optional attrs Audio.tensor, Audio.url and Audio.embedding)
tests
check and update documentation, if required. See guide
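A rough, standard-library-only sketch of how the goals above fit together: a URL type that can load audio, and a predefined document with optional `url`, `tensor` and `embedding` attributes. The names `AudioUrlSketch`, `AudioSketch`, `write_test_wav` and `demo` are hypothetical stand-ins, not the classes added in this PR, and the loader assumes a local 16-bit PCM WAV file.

```python
import array
import sys
import tempfile
import wave
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


class AudioUrlSketch(str):
    """Hypothetical stand-in for AudioUrl: a string that can load audio."""

    def load(self) -> List[float]:
        # Assumption: the URL is a local path to a 16-bit PCM WAV file.
        with wave.open(str(self), "rb") as wav:
            if wav.getsampwidth() != 2:
                raise ValueError("this sketch only handles 16-bit PCM")
            frames = wav.readframes(wav.getnframes())
        samples = array.array("h")
        samples.frombytes(frames)
        if sys.byteorder == "big":
            samples.byteswap()  # WAV sample data is little-endian
        # Scale the integers to floats in [-1.0, 1.0).
        return [s / 32768.0 for s in samples]


@dataclass
class AudioSketch:
    """Hypothetical stand-in for the predefined Audio document."""

    url: Optional[AudioUrlSketch] = None
    tensor: Optional[List[float]] = None
    embedding: Optional[List[float]] = None


def write_test_wav(path: Path, samples: List[int], rate: int = 16000) -> None:
    """Write 16-bit mono PCM samples to a WAV file (demo helper only)."""
    data = array.array("h", samples)
    if sys.byteorder == "big":
        data.byteswap()
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(data.tobytes())


def demo() -> AudioSketch:
    """Create a tiny WAV file, then fill the document's tensor from its URL."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "tone.wav"
        write_test_wav(path, [0, 16384, -16384, 32767])
        doc = AudioSketch(url=AudioUrlSketch(str(path)))
        doc.tensor = doc.url.load()
    return doc
```

All three attributes default to `None`, so a document can start with only a URL and have its tensor filled in later, which is the usage pattern the goals describe.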

anna-charlotte · 2022-12-29T15:41:55Z

@JoanFM @alaeddine-13 I checked all the comments and made corresponding changes. It's ready for re-review now.

anna-charlotte · 2023-01-02T09:19:17Z

docarray/typing/url/audio_url.py

+            )
+        return cls(str(url), scheme=None)
+
+    def load(self: T) -> np.ndarray:


@samsja Should I return an np.ndarray here, or rather an AudioNdArray or AudioTorchTensor?

I would say AudioNdArray.
Normally we just load the np.ndarray, because the framework-specific conversion can happen when putting it back into the document, and having it as np.ndarray makes the fewest assumptions about the surrounding code.
In this case, since there are actual audio features that come with the array, I would go with AudioNdArray. It can still be treated like a normal np.ndarray, but brings the aforementioned features.

But just verify in a test that setting an AudioNdArray to a field with type AudioTorchTensor actually works without issue?

> But just verify in a test that setting an AudioNdArray to a field with type AudioTorchTensor actually works without issue

Do you mean like this?

    def test_load_audio_url_to_audio_torch_tensor(file_url):
        class MyAudioDoc(Document):
            audio_url: AudioUrl
            tensor: Optional[AudioTorchTensor]

        doc = MyAudioDoc(audio_url=file_url)
        doc.tensor = doc.audio_url.load()

        assert isinstance(doc.tensor, np.ndarray)
        assert isinstance(doc.tensor, AudioNdArray)

Ok I moved it to the computational backends

great I like it better like this
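The argument in this thread is that an ndarray subclass can carry audio-specific features while still behaving like a plain np.ndarray. A minimal sketch of that idea follows. `AudioArraySketch`, its `duration_seconds` helper and `load_sketch` are hypothetical names for illustration only; they are not the AudioNdArray API from this PR.

```python
import numpy as np


class AudioArraySketch(np.ndarray):
    """Hypothetical ndarray subclass carrying an audio-specific helper."""

    def duration_seconds(self, sample_rate: int) -> float:
        # Number of samples along the first axis divided by the sample rate.
        return self.shape[0] / sample_rate


def load_sketch(samples) -> AudioArraySketch:
    """Stand-in for a load() that returns the subclass, not a plain array."""
    # view() reinterprets the same buffer as the subclass without copying.
    return np.asarray(samples, dtype=np.float32).view(AudioArraySketch)


audio = load_sketch([0.0, 0.5, -0.5, 0.25])

# The result is still an np.ndarray, so ordinary NumPy code keeps working.
peak = float(np.abs(audio).max())

# Code that insists on the base class can strip the subclass explicitly.
plain = np.asarray(audio)
```

Because the subclass passes `isinstance(..., np.ndarray)` checks, returning it from a loader places no extra requirements on callers that only expect a plain array.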

samsja

This PR looks really nice. I like how the idea of having different tensor types for the different modalities works :) I added some comments.

docarray/typing/tensor/audio/audio_tensor.py

docarray/typing/tensor/abstract_tensor.py

tests/units/typing/tensor/test_audio_tensor.py

JohannesMessner

Great PR, love the audio tensor stuff; this is a good pattern to use moving forward. Just some things to consider.

docarray/typing/tensor/abstract_tensor.py

docarray/typing/tensor/audio/audio_torch_tensor.py

docarray/typing/tensor/abstract_tensor.py

docarray/typing/tensor/audio/audio_ndarray.py

JohannesMessner · 2023-01-02T10:24:32Z

docarray/typing/url/audio_url.py

JohannesMessner

Just resolve the conflicts and we're good to go, great work!

github-actions · 2023-01-03T09:41:08Z

📝 Docs are deployed on https://ft-feat-add-audio-v2--jina-docs.netlify.app 🎉

anna-charlotte linked an issue Dec 14, 2022 that may be closed by this pull request

Add audio predefined document to v2 #914

Closed

github-actions bot added size/s area/core area/typing labels Dec 14, 2022

anna-charlotte mentioned this pull request Dec 14, 2022

Meta: DocArray v2 Roadmap #780

Closed

47 tasks

anna-charlotte changed the title ~~feat: add audio url and predefined document~~ feat(v2): add audio url and predefined document Dec 14, 2022

anna-charlotte added 2 commits December 14, 2022 15:25

feat: add audio url class

bebc9d4

Signed-off-by: anna-charlotte <[email protected]>

fix: typos

6025c2f

Signed-off-by: anna-charlotte <[email protected]>

anna-charlotte force-pushed the feat-add-audio-v2 branch from 1285a1c to 6025c2f Compare December 14, 2022 14:26

anna-charlotte added 2 commits December 15, 2022 11:34

test: add tests for audio and audio url

9a599e5

Signed-off-by: anna-charlotte <[email protected]>

feat: add audio url and audio predefined class

04abdae

Signed-off-by: anna-charlotte <[email protected]>

github-actions bot added size/m area/entrypoint area/testing component/proto and removed size/s labels Dec 15, 2022

anna-charlotte added 3 commits December 21, 2022 21:58

Merge remote-tracking branch 'origin/feat-rewrite-v2' into feat-add-a…

f8d700d

…udio-v2 Signed-off-by: anna-charlotte <[email protected]>

chore: add types-request

d58f804

Signed-off-by: anna-charlotte <[email protected]>

feat: add audio tensors torch and ndarray

bdf8e88

Signed-off-by: anna-charlotte <[email protected]>

github-actions bot added the area/setup label Dec 22, 2022

anna-charlotte added 2 commits December 22, 2022 10:07

fix: mypy type hints

6572df8

Signed-off-by: anna-charlotte <[email protected]>

test: empty test file

9cd4baa

Signed-off-by: anna-charlotte <[email protected]>

github-actions bot added size/l and removed size/m labels Dec 22, 2022

anna-charlotte added 6 commits December 28, 2022 09:09

test: add more unit and integration tests

b3c1948

Signed-off-by: anna-charlotte <[email protected]>

fix: update audio tensors and audio url

7774181

Signed-off-by: anna-charlotte <[email protected]>

fix: remove print statements

af840d4

Signed-off-by: anna-charlotte <[email protected]>

docs: add documentation

797f488

Signed-off-by: anna-charlotte <[email protected]>

refactor: rename test audio py to test audio tensor py

8b48a77

Signed-off-by: anna-charlotte <[email protected]>

fix: typo in torch tensor py

e135438

Signed-off-by: anna-charlotte <[email protected]>

anna-charlotte added 6 commits December 29, 2022 15:03

fix: revert ndim in abstract tensor and torch tensor and ndarray

83ef649

Signed-off-by: anna-charlotte <[email protected]>

fix: mypy checks

eecca41

Signed-off-by: anna-charlotte <[email protected]>

docs: add docstring to n dim

4762c3c

Signed-off-by: anna-charlotte <[email protected]>

refactor: move n dim to abstract tensor and subclasses

6948122

Signed-off-by: anna-charlotte <[email protected]>

refactor: make to protobuf abstract, change node to protobuf signature

d174087

Signed-off-by: anna-charlotte <[email protected]>

fix: remove not needed methods

3a52303

Signed-off-by: anna-charlotte <[email protected]>

alaeddine-13 reviewed Dec 30, 2022

docarray/predefined_document/audio.py (outdated)

tests/integrations/predefined_document/test_audio.py (outdated)

docarray/typing/tensor/abstract_tensor.py (outdated)

anna-charlotte added 3 commits December 30, 2022 10:12

fix: change remote audio file to file from github

a0be12e

Signed-off-by: anna-charlotte <[email protected]>

fix: raw content from remote file

9623d29

Signed-off-by: anna-charlotte <[email protected]>

fix: path to github remote file

6efdcf2

Signed-off-by: anna-charlotte <[email protected]>

anna-charlotte commented Jan 2, 2023

samsja reviewed Jan 2, 2023

JohannesMessner requested changes Jan 2, 2023

anna-charlotte added 6 commits January 2, 2023 14:44

refactor: tensor field name to proto field name

5026543

Signed-off-by: anna-charlotte <[email protected]>

test: remove redundant test in test audio tensor

703de43

Signed-off-by: anna-charlotte <[email protected]>

fix: load audio url to audio ndarray instead of np ndarray

83ece31

Signed-off-by: anna-charlotte <[email protected]>

refactor: move n dim to computational backend

de079e2

Signed-off-by: anna-charlotte <[email protected]>

docs: update docstrings for audio tensors

2ef1350

Signed-off-by: anna-charlotte <[email protected]>

feat: make dtype in audiourl load optional

d51d38e

Signed-off-by: anna-charlotte <[email protected]>

JohannesMessner approved these changes Jan 3, 2023

samsja approved these changes Jan 3, 2023

anna-charlotte added 3 commits January 3, 2023 10:10

Merge branch 'feat-rewrite-v2' into feat-add-audio-v2

3901cfa

Signed-off-by: anna-charlotte <[email protected]>

test: fix document refactor and ndarray import

a571898

Signed-off-by: anna-charlotte <[email protected]>

fix: fix mypy check

71af630

Signed-off-by: anna-charlotte <[email protected]>

anna-charlotte requested a review from JoanFM January 3, 2023 09:43

JoanFM approved these changes Jan 3, 2023

JoanFM merged commit da3b7f0 into feat-rewrite-v2 Jan 3, 2023

JoanFM deleted the feat-add-audio-v2 branch January 3, 2023 09:45

6 participants
