There are two ways to set up the environment: conda on your desktop, or an isolated Docker container.
If you want to build the Docker image and compile everything inside it, a few things must first be set up on your own desktop:
- NVIDIA driver: most people probably already have it. Run `nvidia-smi` to check.
- Docker:
  # Add Docker's official GPG key:
  sudo apt-get update
  sudo apt-get install ca-certificates curl
  sudo install -m 0755 -d /etc/apt/keyrings
  sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
  sudo chmod a+r /etc/apt/keyrings/docker.asc
  # Add the repository to Apt sources:
  echo \
    "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
    $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
    sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
  sudo apt-get update
  # Install the Docker packages (the next step in Docker's official instructions):
  sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
- nvidia-container-toolkit:
  sudo apt update && sudo apt install nvidia-container-toolkit
Then follow this Stack Overflow answer:
- Edit or create /etc/docker/daemon.json with the following content:
  {
    "runtimes": {
      "nvidia": {
        "path": "/usr/bin/nvidia-container-runtime",
        "runtimeArgs": []
      }
    },
    "default-runtime": "nvidia"
  }
- Restart the Docker daemon:
  sudo systemctl restart docker
- Then you can build the Docker image:
  cd OpenSceneFlow && docker build -f Dockerfile -t zhangkin/opensf .
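If /etc/docker/daemon.json already exists with other settings, overwriting it with the snippet above would drop them. A minimal sketch of merging the NVIDIA runtime entry into an existing config instead, assuming Python 3 is available; the helper name `add_nvidia_runtime` is hypothetical and not part of Docker or this repository:

```python
import copy
import json

# Runtime entry taken from the daemon.json content shown above.
NVIDIA_RUNTIME = {"path": "/usr/bin/nvidia-container-runtime", "runtimeArgs": []}


def add_nvidia_runtime(config: dict) -> dict:
    """Return a copy of a daemon.json config with nvidia as the default runtime.

    Hypothetical helper: existing keys and other runtimes are preserved.
    """
    merged = copy.deepcopy(config)
    runtimes = merged.setdefault("runtimes", {})
    runtimes["nvidia"] = copy.deepcopy(NVIDIA_RUNTIME)
    merged["default-runtime"] = "nvidia"
    return merged


if __name__ == "__main__":
    # Example input: an existing config that already sets a log driver.
    existing = {"log-driver": "json-file"}
    print(json.dumps(add_nvidia_runtime(existing), indent=2))
```

The result would still need to be written back to /etc/docker/daemon.json with root privileges, followed by the daemon restart described above.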
We use conda to manage the environment, with mamba for faster package installation.
First install Miniforge, which provides conda together with mamba:
curl -L -O "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh
Create the base environment (5 to 15 minutes, depending on your network speed and CPU):
git clone https://github.com/KTH-RPL/OpenSceneFlow.git
cd OpenSceneFlow
mamba env create -f assets/environment.yml
Check the important packages in the environment:
mamba activate opensf
python -c "import torch; print(torch.__version__); print(torch.cuda.is_available()); print(torch.version.cuda)"
python -c "import lightning.pytorch as pl; print(pl.__version__)"
python -c "from assets.cuda.mmcv import Voxelization, DynamicScatter;print('successfully import on our lite mmcv package')"
python -c "from assets.cuda.chamfer3D import nnChamferDis;print('successfully import on our chamfer3D package')"
python -c "from av2.utils.io import read_feather; print('av2 package ok')"
If something fails, check these known issues:
- ImportError: libtorch_cuda.so: undefined symbol: cudaGraphInstantiateWithFlags, version libcudart.so.11.0
  The CUDA versions of pytorch::pytorch-cuda and nvidia::cudatoolkit need to be the same. Reference link
- On a cluster, pandas may fail with:
  ImportError: /lib64/libstdc++.so.6: version 'GLIBCXX_3.4.29' not found
  Solved by adding the mambaforge lib directory to the library path (adjust the path to your own installation):
  export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/proj/berzelius-2023-154/users/x_qinzh/mambaforge/lib
- torch_scatter problem:
  OSError: /home/kin/mambaforge/envs/opensf-v2/lib/python3.10/site-packages/torch_scatter/_version_cpu.so: undefined symbol: _ZN5torch3jit17parseSchemaOrNameERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
  Solved by installing the CUDA build of torch_scatter:
  pip install https://data.pyg.org/whl/torch-2.0.0%2Bcu118/torch_scatter-2.1.2%2Bpt20cu118-cp310-cp310-linux_x86_64.whl
- CUDA package problem:
  ValueError(f"Unknown CUDA arch ({arch}) or GPU not supported")
  Solved by checking the GPU's compute capability and setting it manually:
  export TORCH_CUDA_ARCH_LIST=8.6
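The compute capability needed for the last fix can be read from the driver instead of being looked up by hand. A minimal sketch, assuming a driver recent enough that `nvidia-smi --query-gpu=compute_cap --format=csv,noheader` is supported; both helper names are hypothetical and not part of this repository:

```python
import subprocess


def arch_list_from_smi(output: str) -> str:
    """Turn nvidia-smi compute_cap output (one value per GPU) into an arch list.

    Duplicates are dropped and values are joined with ';', one of the
    separators TORCH_CUDA_ARCH_LIST accepts.
    """
    caps = []
    for line in output.splitlines():
        cap = line.strip()
        if cap and cap not in caps:
            caps.append(cap)
    return ";".join(caps)


def query_compute_caps() -> str:
    """Ask nvidia-smi for the compute capability of every visible GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


if __name__ == "__main__":
    try:
        arch = arch_list_from_smi(query_compute_caps())
        # Quoted because ';' would otherwise end the shell command.
        print(f'export TORCH_CUDA_ARCH_LIST="{arch}"')
    except (FileNotFoundError, subprocess.CalledProcessError) as err:
        print(f"could not query nvidia-smi: {err}")
```

On a machine with mixed GPUs this prints every distinct capability, so the CUDA extensions would be compiled for all of them.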