refactor(framework): pytorch splitting refactor #2317
Conversation
Hello @larme, Thanks for updating this PR.
Comment last updated at 2022-02-28 11:25:55 UTC
Codecov Report
@@ Coverage Diff @@
## main #2317 +/- ##
==========================================
+ Coverage 45.24% 48.81% +3.56%
==========================================
Files 106 134 +28
Lines 8281 8420 +139
==========================================
+ Hits 3747 4110 +363
+ Misses 4534 4310 -224
There are still some small and quick fixes needed.
@@ -43,8 +33,6 @@
    """ # noqa
)
I don't think we need this block anymore since you already do an import check under .common.pytorch
I'm thinking about just keeping these blocks in case one day we remove the torch import from .common.pytorch. What's your opinion? @aarnphm
Hmm, why would we do so? I don't think we would ever do such a thing, right?
I made a change to import torch from .common.pytorch directly @aarnphm
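For context, a minimal sketch of the guarded-import pattern being discussed, assuming the check lives in bentoml/_internal/frameworks/common/pytorch.py (the exception message is illustrative, not the exact BentoML source):

```python
# Hedged sketch: common/pytorch.py performs the torch import check once,
# and pytorch.py / torchscript.py then import torch from it directly,
# which is why the per-module check blocks above may be redundant.
from bentoml.exceptions import MissingDependencyException

try:
    import torch
except ImportError:
    raise MissingDependencyException(
        '"torch" is required to use "bentoml.pytorch". '
        'Install it with "pip install torch".'
    )
```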
You also have to add flags for
The PyTorchContainer is failing the test, and for some reason the torchscript integration test is not running?
Let's fix this after
Co-authored-by: Aaron Pham <[email protected]>
e0f5a52 to 6b153d5
def _mapping(item: t.Union["ext.NpNDArray", torch.Tensor]) -> torch.Tensor:
    if LazyType["ext.NpNDArray"]("numpy.ndarray").isinstance(item):
        item = torch.Tensor(item, device=self._device_id)
    return item
If it's already a tensor, we need to call to(device) on it as well, right?
It's initialized with self._device_id, so there's no need to call to(device).
@aarnphm @bojiang I think you mean that in self._configure() we set the default tensor storage type? But if the tensor is deserialized from a DataContainer payload that was serialized on another machine, and the two machines have mismatched devices, then the tensor won't be converted to the proper device automatically. Also, in my simple benchmark, to(device) doesn't cost much if from_device and to_device are the same.
Here's my simple demo: https://gist.github.com/larme/7640f2d97d45c92946f1573b3eaaa0be
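For readers who skip the gist, a minimal sketch of this kind of timing check (an illustration under my own assumptions, not the linked demo):

```python
# Time .to(device) when source and target devices already match;
# in that case PyTorch returns the tensor itself, so the call is cheap.
import time

import torch

t = torch.randn(1024, 1024)  # a CPU tensor

start = time.perf_counter()
for _ in range(10_000):
    t = t.to("cpu")  # same-device move: effectively a no-op
elapsed = time.perf_counter() - start
print(f"10,000 same-device .to() calls took {elapsed:.4f}s")
```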
I think calling to shouldn't affect perf that much. We just need to make sure that the device is consistent throughout, so that #2230 won't happen again.
I think this change will make the device more consistent?
Sounds good then, let's do that.
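A hedged sketch of what the agreed change could look like (the PR's actual fix landed as "fix: pytorch tensor to device"; this assumes the same closure context as the excerpt above, where self._device_id, t, ext, and LazyType are in scope):

```python
def _mapping(item: t.Union["ext.NpNDArray", torch.Tensor]) -> torch.Tensor:
    # Convert numpy arrays to tensors on the target device, and move
    # tensors that already exist onto that device as well, so a payload
    # deserialized from a machine with a different device still lands
    # on the right one.
    if LazyType["ext.NpNDArray"]("numpy.ndarray").isinstance(item):
        item = torch.Tensor(item, device=self._device_id)
    else:
        item = item.to(self._device_id)
    return item
```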
The paddlepaddle test will be addressed in a separate PR; it seems to fail randomly whenever the paddlehub module is used.
* refactor(framework): split pytorch impl to pytorch and torchscript
* style: reformat
* style: minor fix
* doc: fix pytorch/torchscript changes
* chore: add torchscript github workflow
* style: minor fix
* Update bentoml/_internal/frameworks/common/pytorch.py
* Update bentoml/_internal/frameworks/common/pytorch.py
* Update bentoml/_internal/frameworks/common/pytorch.py
* Update bentoml/_internal/frameworks/common/pytorch.py
* Update bentoml/_internal/frameworks/pytorch.py
* Update bentoml/_internal/frameworks/common/pytorch.py
* Update bentoml/_internal/frameworks/common/pytorch.py
* Update bentoml/_internal/frameworks/common/pytorch.py
* style: minor fix
* test: pytorch frameworks test save format
* chore: update codecov.yml for torchscript
* chore: github workflow add torchscript
* test(framework): disable pytorch container batch_axis=1 test
* chore: remove double import
* fix: pytorch tensor to device

Co-authored-by: Aaron Pham <[email protected]>
Description
Motivation and Context
How Has This Been Tested?
Checklist:
- make format and make lint scripts have passed (instructions).