Open
Description
Summary
It seems that the following line in `def broadcast_inputs(x, y)` triggered a tensor storage copy that caused a CUDA memory overflow when I tried to run a small bundle adjustment dataset with 31843 pixel observations. Both `reshape` and `contiguous` can trigger a memory copy. If we can avoid the memory copy in `broadcast_inputs`, we can avoid overflowing CUDA memory at this step.
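For illustration, here is a minimal sketch of why these two calls can allocate memory. It is not the pypose code; the tensor shapes are assumptions chosen only to mirror the observation count above, and it runs on CPU so it can be reproduced without a GPU.

```python
import torch

# A small tensor broadcast against a large batch (sizes are illustrative only).
x = torch.randn(1, 7)
n = 31843

# expand() returns a view: the broadcast dimension gets stride 0, no new storage.
view = x.expand(n, 7)
shares_storage = view.data_ptr() == x.data_ptr()
view_stride = view.stride()

# contiguous() on a non-contiguous expanded view must materialize all n rows.
copied = view.contiguous()
copy_is_new = copied.data_ptr() != x.data_ptr()

# reshape() returns a view when the strides allow it; otherwise it copies too.
flat = view.reshape(-1)
reshape_is_new = flat.data_ptr() != x.data_ptr()

print(shares_storage, view_stride, copy_is_new, reshape_is_new)
```

The expanded view costs nothing extra, while `contiguous()` and the `reshape(-1)` here each allocate a full `n * 7` buffer.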
pypose/pypose/lietensor/operation.py
Line 914 in 6598a84
![image](https://private-user-images.githubusercontent.com/61036578/248678558-3bd6c109-dfb7-4e84-9413-41d1fe2637bb.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3Mzk3MjA4NTcsIm5iZiI6MTczOTcyMDU1NywicGF0aCI6Ii82MTAzNjU3OC8yNDg2Nzg1NTgtM2JkNmMxMDktZGZiNy00ZTg0LTk0MTMtNDFkMWZlMjYzN2JiLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNTAyMTYlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjUwMjE2VDE1NDIzN1omWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPWVjMjgwZGI4MGY4NWI5NDU3MTY0YzBhZDhkODBjN2ZjYTVjNTliZTlmNzQ4MjVmZmZlMDEyOGQ4NTk3MmViMGUmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0In0.p5ocJTY8deilg6aGNyQj2tuBbW-dR2md91pstWng1DU)
Improvements
Refactor `broadcast_inputs` so that it does not use `reshape` or `contiguous`.
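One possible direction, sketched under assumptions: the helper name `broadcast_views` is hypothetical, and it assumes the last dimension holds the per-element components while all leading dimensions are batch dimensions. It is not the existing `broadcast_inputs` and does not reproduce its return values; it only shows broadcasting through `expand`, which creates views instead of copies.

```python
import torch

def broadcast_views(x, y):
    """Hypothetical helper: broadcast leading (batch) dims without copying."""
    # Common batch shape over all but the last dimension.
    batch = torch.broadcast_shapes(x.shape[:-1], y.shape[:-1])
    # expand() only adjusts sizes and strides; no storage is allocated.
    return x.expand(*batch, x.shape[-1]), y.expand(*batch, y.shape[-1])

poses = torch.randn(1, 7)        # illustrative: one 7-component element
points = torch.randn(31843, 3)   # illustrative: one 3-vector per observation
px, py = broadcast_views(poses, points)
```

Whether this removes the copy end to end depends on the downstream operators: any kernel that requires contiguous input would still materialize the expanded tensor, so that would need to be checked separately.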
Risks
TBD
Involved components
Optional: Intended side effects
TBD
Optional: Missing test coverage
TBD