Skip to content

Conversation

@anna-charlotte
Copy link
Contributor

@anna-charlotte anna-charlotte commented Dec 14, 2022

Add support for audio files to docarray v2

Goals:

  • AudioUrl
  • AudioTensor with AudioNdarray and AudioTorchTensor
  • Audio predefined doc (with optional attrs Audio.tensor, Audio.url and Audio.embedding)
  • tests
  • check and update documentation, if required. See guide

@anna-charlotte anna-charlotte linked an issue Dec 14, 2022 that may be closed by this pull request
@anna-charlotte anna-charlotte mentioned this pull request Dec 14, 2022
47 tasks
@anna-charlotte anna-charlotte changed the title feat: add audio url and predefined document feat(v2): add audio url and predefined document Dec 14, 2022
anna-charlotte added 2 commits December 14, 2022 15:25
Signed-off-by: anna-charlotte <[email protected]>
Signed-off-by: anna-charlotte <[email protected]>
anna-charlotte added 2 commits December 15, 2022 11:34
anna-charlotte added 3 commits December 21, 2022 21:58
anna-charlotte added 2 commits December 22, 2022 10:07
Signed-off-by: anna-charlotte <[email protected]>
Signed-off-by: anna-charlotte <[email protected]>
@github-actions github-actions bot added size/l and removed size/m labels Dec 22, 2022
anna-charlotte added 6 commits December 28, 2022 09:09
@anna-charlotte
Copy link
Contributor Author

anna-charlotte commented Dec 29, 2022

@JoanFM @alaeddine-13 I checked all the comments and made corresponding changes. It's ready for re-review now.

anna-charlotte added 3 commits December 30, 2022 10:12
)
return cls(str(url), scheme=None)

def load(self: T) -> np.ndarray:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@samsja Should I return a np.ndarray here, or rather AudioNdArray or AudioTorchTensor?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say AudioNdArray.
Normally we just load the np.ndarray, bc the framework specific conversion can happen when putting it back into the document, and having it as np.ndarray makes the least assumptions about the sorrounding code.
In this case, since there are actual audio features that come with the array, I would go with AudioNdArray. It can still be treated like a normal np.ndarray, but bring the aforementioned features.

But just verify in a test that setting a AudioNdArray to a field with type AudioTorchArray actually works without issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But just verify in a test that setting a AudioNdArray to a field with type AudioTorchArray actually works without issue

Do you mean like this?

def test_load_audio_url_to_audio_torch_tensor(file_url):
    class MyAudioDoc(Document):
        audio_url: AudioUrl
        tensor: Optional[AudioTorchTensor]

    doc = MyAudioDoc(audio_url=file_url)
    doc.tensor = doc.audio_url.load()

    assert isinstance(doc.tensor, np.ndarray)
    assert isinstance(doc.tensor, AudioNdArray)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I moved it to the computational backends

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great I like it better like this

Copy link
Member

@samsja samsja left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pr looks really nice. I like how the idea of having different tensor types for the different modality work :) I added some comments

Copy link
Member

@JohannesMessner JohannesMessner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great PR, love the audio tensor stuff, this is a good pattern to use moving forward. just some things to consider

)
return cls(str(url), scheme=None)

def load(self: T) -> np.ndarray:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say AudioNdArray.
Normally we just load the np.ndarray, bc the framework specific conversion can happen when putting it back into the document, and having it as np.ndarray makes the least assumptions about the sorrounding code.
In this case, since there are actual audio features that come with the array, I would go with AudioNdArray. It can still be treated like a normal np.ndarray, but bring the aforementioned features.

But just verify in a test that setting a AudioNdArray to a field with type AudioTorchArray actually works without issue?

Copy link
Member

@JohannesMessner JohannesMessner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just resolve the conflicts and we're good to go, great work!

anna-charlotte added 3 commits January 3, 2023 10:10
@github-actions
Copy link

github-actions bot commented Jan 3, 2023

📝 Docs are deployed on https://ft-feat-add-audio-v2--jina-docs.netlify.app 🎉

@anna-charlotte anna-charlotte requested a review from JoanFM January 3, 2023 09:43
@JoanFM JoanFM merged commit da3b7f0 into feat-rewrite-v2 Jan 3, 2023
@JoanFM JoanFM deleted the feat-add-audio-v2 branch January 3, 2023 09:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add audio predefined document to v2

6 participants