Please use the caffe-users list for usage, installation, or modeling questions, or other requests for help.
Do not post such requests to Issues. Doing so interferes with the development of Caffe.
Please read the guidelines for contributing before submitting this issue.
Issue summary
I proceeded as follows.
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
  && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
     sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
     sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
$ sudo apt-get update
$ sudo apt-get install -y nvidia-docker2
$ sudo systemctl restart docker
$ docker pull bvlc/caffe:gpu
3. GPU test within Docker
$ docker run --gpus all -it --rm bvlc/caffe:gpu
Inside the container I ran make, make test, and make runtest, and then the MNIST dataset example in Caffe.
I confirmed that training and validation were performed on the GPU (RTX 3070).
After that, I followed what I saw here to run SSD.
Executing this command produces the following error. (The full log is attached as a file.)
python examples/ssd/ssd_pascal.py
I0503 12:11:42.210090 59284 solver.cpp:295] Learning Rate Policy: multistep
I0503 12:11:42.216208 59284 blocking_queue.cpp:50] Data layer prefetch queue empty
F0503 12:11:42.328768 59284 im2col.cu:61] Check failed: error == cudaSuccess (8 vs. 0) invalid device function
*** Check failure stack trace: ***
@ 0x7f15e23a15cd google::LogMessage::Fail()
@ 0x7f15e23a3433 google::LogMessage::SendToLog()
@ 0x7f15e23a115b google::LogMessage::Flush()
@ 0x7f15e23a3e1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7f15e2cd9b1a caffe::im2col_gpu<>()
@ 0x7f15e2baa829 caffe::BaseConvolutionLayer<>::conv_im2col_gpu()
@ 0x7f15e2baa926 caffe::BaseConvolutionLayer<>::forward_gpu_gemm()
@ 0x7f15e2caad96 caffe::ConvolutionLayer<>::Forward_gpu()
@ 0x7f15e2c64642 caffe::Net<>::ForwardFromTo()
@ 0x7f15e2c64767 caffe::Net<>::Forward()
@ 0x7f15e2bcaff0 caffe::Solver<>::Step()
@ 0x7f15e2bcba7e caffe::Solver<>::Solve()
@ 0x40b9c4 train()
@ 0x407590 main
@ 0x7f15e1311840 __libc_start_main
@ 0x407db9 _start
@ (nil) (unknown)
Aborted
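For anyone triaging the attached full log, the fatal line above follows glog's usual prefix (severity letter, month and day, time, thread id, file:line, then the message). The following is a minimal sketch, not part of Caffe, that pulls those fields out of a log so the failing source location can be found quickly. The names parse_glog_line and first_fatal are hypothetical helpers introduced only for this illustration.

```python
import re

# glog prefix: severity letter, mmdd, hh:mm:ss.uuuuuu, thread id, file:line], message.
GLOG_LINE = re.compile(
    r"^(?P<severity>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<thread>\d+) "
    r"(?P<file>[^:\s]+):(?P<line>\d+)\] (?P<message>.*)$"
)


def parse_glog_line(line):
    """Return a dict of fields for one glog line, or None if it is not a glog line."""
    match = GLOG_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])
    return fields


def first_fatal(log_text):
    """Return the parsed fields of the first FATAL (F) line in a log, or None."""
    for raw in log_text.splitlines():
        record = parse_glog_line(raw)
        if record is not None and record["severity"] == "F":
            return record
    return None


if __name__ == "__main__":
    sample = (
        "I0503 12:11:42.216208 59284 blocking_queue.cpp:50] Data layer prefetch queue empty\n"
        "F0503 12:11:42.328768 59284 im2col.cu:61] Check failed: error == cudaSuccess "
        "(8 vs. 0) invalid device function\n"
        "*** Check failure stack trace: ***\n"
    )
    fatal = first_fatal(sample)
    print(fatal["file"], fatal["line"], fatal["message"])
```

Run against the log in this issue, the sketch points at im2col.cu line 61, which is where the CUDA kernel launch check in the GPU convolution path failed.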
At first, gpu1, gpu2, and gpu3 were all enabled, which caused a different CUDA error. After checking existing Issues and modifying the Python code as follows, that error disappeared.
However, the error shown above still occurs.
Steps to reproduce
If you are having difficulty building Caffe or training a model, please ask the caffe-users mailing list. If you are reporting a build error that seems to be due to a bug in Caffe, please attach your build configuration (either Makefile.config or CMakeCache.txt) and the output of the make (or cmake) command.
Your system configuration
Operating system: Ubuntu 18.04.6 LTS
(I used nvidia-docker 2. Please refer to the above work procedure.)
Compiler: I haven't changed any compiler-related settings. (Makefile.config)
Refer to http://caffe.berkeleyvision.org/installation.html
Contributions simplifying and improving our build system are welcome!
cuDNN acceleration switch (uncomment to build with cuDNN).
USE_CUDNN := 1
CPU-only switch (uncomment to build without GPU support).
#CPU_ONLY := 1
uncomment to disable IO dependencies and corresponding data layers
USE_OPENCV := 0
USE_LEVELDB := 0
USE_LMDB := 0
uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary)
You should not set this flag if you will be reading LMDBs with any
possibility of simultaneous read and write
ALLOW_LMDB_NOLOCK := 1
Uncomment if you're using OpenCV 3
OPENCV_VERSION := 3
To customize your choice of compiler, uncomment and set the following.
N.B. the default for Linux is g++ and the default for OSX is clang++
CUSTOM_CXX := g++
CUDA directory contains bin/ and lib/ directories that we need.
CUDA_DIR := /usr/local/cuda
On Ubuntu 14.04, if cuda tools are installed via
"sudo apt-get install nvidia-cuda-toolkit" then use this instead:
CUDA_DIR := /usr
CUDA architecture setting: going with all of them.
For CUDA < 6.0, comment the lines after *_35 for compatibility.
CUDA_ARCH := -gencode arch=compute_20,code=sm_21 \
             -gencode arch=compute_30,code=sm_30 \
             -gencode arch=compute_35,code=sm_35 \
             -gencode arch=compute_50,code=sm_50 \
             -gencode arch=compute_52,code=sm_52 \
             -gencode arch=compute_61,code=sm_61 \
             -gencode arch=compute_61,code=compute_61 \
             -gencode arch=compute_86,code=sm_86 \
             -gencode arch=compute_86,code=compute_86
DCUDA_ARCH_NAME="Manual" -DCUDA_ARCH_BIN="52 60" -DCUDA_ARCH_PTX="60"
BLAS choice:
atlas for ATLAS (default)
mkl for MKL
open for OpenBlas
BLAS := atlas
#BLAS := open
Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
Leave commented to accept the defaults for your choice of BLAS
(which should work)!
BLAS_INCLUDE := /path/to/your/blas
BLAS_LIB := /path/to/your/blas
Homebrew puts openblas in a directory that is not on the standard search path
BLAS_INCLUDE := $(shell brew --prefix openblas)/include
BLAS_LIB := $(shell brew --prefix openblas)/lib
This is required only if you will compile the matlab interface.
MATLAB directory should contain the mex binary in /bin.
MATLAB_DIR := /usr/local
MATLAB_DIR := /Applications/MATLAB_R2012b.app
NOTE: this is required only if you will compile the python interface.
We need to be able to find Python.h and numpy/arrayobject.h.
PYTHON_INCLUDE := /usr/include/python2.7 \
                  /usr/lib/python2.7/dist-packages/numpy/core/include
Anaconda Python distribution is quite popular. Include path:
Verify anaconda location, sometimes it's in root.
ANACONDA_HOME := $(HOME)/anaconda2
PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
Uncomment to use Python 3 (default is Python 2)
PYTHON_LIBRARIES := boost_python3 python3.5m
PYTHON_INCLUDE := /usr/include/python3.5m \
/usr/lib/python3.5/dist-packages/numpy/core/include
We need to be able to find libpythonX.X.so or .dylib.
PYTHON_LIB := /usr/lib
PYTHON_LIB := $(ANACONDA_HOME)/lib
Homebrew installs numpy in a non standard path (keg only)
PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include
PYTHON_LIB += $(shell brew --prefix numpy)/lib
Uncomment to support layers written in Python (will link against Python libs)
WITH_PYTHON_LAYER := 1
Whatever else you find you need goes here.
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include \
                /usr/include/hdf5/serial
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu/hdf5/serial
If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies
INCLUDE_DIRS += $(shell brew --prefix)/include
LIBRARY_DIRS += $(shell brew --prefix)/lib
Uncomment to use pkg-config to specify OpenCV library paths.
(Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.)
USE_PKG_CONFIG := 1
N.B. both build and distribute dirs are cleared on make clean
BUILD_DIR := build
DISTRIBUTE_DIR := distribute
Uncomment for debugging. Does not work on OSX due to BVLC#171
DEBUG := 1
The ID of the GPU that 'make runtest' will use to run unit tests.
TEST_GPUID := 0
enable pretty build (comment to see full commands)
Q ?= @
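The CUDA_ARCH setting in the Makefile.config above requests compute_86 (the RTX 3070's architecture), while nvcc in the container reports release 8.0 further down. If the toolkit cannot target the GPU's architecture, the built library contains no kernels for it, which is one common cause of "invalid device function". Whether that is what happened here is not confirmed by the issue. The following is a minimal sketch for checking such a mismatch. The helper names and the MIN_TOOLKIT table are assumptions introduced for illustration, and the table is deliberately partial (Pascal and newer only).

```python
import re

# Minimum CUDA toolkit release that can compile for each compute capability.
# Partial table for illustration; architectures older than Pascal are omitted.
MIN_TOOLKIT = {
    60: (8, 0),
    61: (8, 0),
    62: (8, 0),
    70: (9, 0),
    75: (10, 0),
    80: (11, 0),
    86: (11, 1),
}


def gencode_archs(cuda_arch):
    """Return the sorted compute capabilities named in a string of -gencode flags."""
    found = re.findall(r"arch=compute_(\d+)", cuda_arch)
    return sorted({int(arch) for arch in found})


def unsupported_archs(cuda_arch, toolkit):
    """Return requested architectures that need a newer toolkit than `toolkit`.

    `toolkit` is a (major, minor) tuple such as (8, 0). Architectures missing
    from MIN_TOOLKIT are skipped rather than guessed.
    """
    return [
        arch
        for arch in gencode_archs(cuda_arch)
        if arch in MIN_TOOLKIT and MIN_TOOLKIT[arch] > toolkit
    ]


if __name__ == "__main__":
    flags = (
        "-gencode arch=compute_61,code=sm_61 "
        "-gencode arch=compute_86,code=sm_86 "
        "-gencode arch=compute_86,code=compute_86"
    )
    print(unsupported_archs(flags, (8, 0)))
```

With the flags and toolkit release from this issue, the sketch flags 86 as needing a newer toolkit than 8.0, which suggests checking which nvcc the build actually used.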
CUDA version (if applicable):
I still don't understand this.
The nvidia-smi and nvcc -V commands report different CUDA versions.
I don't know whether this is because of caffe-docker.
Because of this, I changed the CUDA configuration part of the Makefile several times, but the result was the same. (See the Makefile above. For reference, I upgraded the CUDA version from 11 to 12.)
This is the result of 'nvidia-smi':
root@a50f950a8134:/opt/caffe# nvidia-smi
Wed May 3 12:30:35 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.47                 Driver Version: 531.68       CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3070         On | 00000000:01:00.0  On |                  N/A |
| 33%   33C    P8               16W / 220W|   1231MiB /  8192MiB |      5%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|  No running processes found                                                           |
+---------------------------------------------------------------------------------------+
This is the result of 'nvcc -V':
root@a50f950a8134:/opt/caffe# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61
root@a50f950a8134:/opt/caffe#
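The two outputs above measure different things: nvidia-smi reports the highest CUDA version the installed driver supports, while nvcc -V reports the CUDA toolkit installed inside the container. The following is a minimal sketch that extracts both numbers from text shaped like the outputs pasted above so they can be compared side by side. The function names are hypothetical, and the regular expressions assume the exact wording shown in this issue.

```python
import re


def smi_cuda_version(smi_output):
    """Return (major, minor) from the 'CUDA Version: X.Y' field of nvidia-smi output."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def nvcc_release(nvcc_output):
    """Return (major, minor) from the 'release X.Y' field of nvcc -V output."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


if __name__ == "__main__":
    smi_text = "| NVIDIA-SMI 530.47 Driver Version: 531.68 CUDA Version: 12.1 |"
    nvcc_text = "Cuda compilation tools, release 8.0, V8.0.61"
    driver_max = smi_cuda_version(smi_text)
    toolkit = nvcc_release(nvcc_text)
    print("driver supports up to CUDA", driver_max)
    print("toolkit in container is CUDA", toolkit)
    if driver_max and toolkit and toolkit < driver_max:
        print("the container's toolkit is older than what the driver supports")
```

A difference between the two is expected in a container that ships its own toolkit. Upgrading CUDA on the host changes only the driver-side number and leaves the nvcc inside the image untouched.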
CUDNN version (if applicable):
BLAS: An error occurred even after installing OpenBLAS, so ATLAS was used instead. (Refer to the Makefile above.)
Python or MATLAB version (for pycaffe and matcaffe respectively):