Installed CUDA 10.1, but nvidia-smi reports CUDA 10.2

I am using nvidia-docker on a remote Ubuntu 16.04 server.

When I run nvidia-smi, it shows CUDA 10.1. But under /usr/local there is no /usr/local/cuda, only /usr/local/cuda-10.0 (strange, not cuda-10.1).

I am using PyTorch 1.0.0 (CUDA 10). When I ran the code from the GitHub repository jwyang/faster-rcnn.pytorch (a faster PyTorch implementation of Faster R-CNN), branch pytorch-1.0, I got:

THCudaCheck FAIL file=/home/rizhao/projects/SMITHS/code/faster-rcnn.pytorch/lib/model/csrc/cuda/ROIAlign_cuda.cu line=297 error=98 : unrecognized error code
Traceback (most recent call last):
  File "trainval_net.py", line 321, in <module>
    rois_label = fasterRCNN(im_data, im_info, gt_boxes, num_boxes)
  File "/home/rizhao/anaconda3/envs/smith/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/rizhao/projects/SMITHS/code/faster-rcnn.pytorch/lib/model/faster_rcnn/faster_rcnn.py", line 77, in forward
    pooled_feat = self.RCNN_roi_align(base_feat, rois.view(-1, 5))
  File "/home/rizhao/anaconda3/envs/smith/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/rizhao/projects/SMITHS/code/faster-rcnn.pytorch/lib/model/roi_layers/roi_align.py", line 58, in forward
    input, rois, self.output_size, self.spatial_scale, self.sampling_ratio
  File "/home/rizhao/projects/SMITHS/code/faster-rcnn.pytorch/lib/model/roi_layers/roi_align.py", line 20, in forward
    output = _C.roi_align_forward(input, roi, spatial_scale, output_size[0], output_size[1], sampling_ratio)
RuntimeError: cuda runtime error (98) : unrecognized error code at /home/rizhao/projects/SMITHS/code/faster-rcnn.pytorch/lib/model/csrc/cuda/ROIAlign_cuda.cu:297
Segmentation fault (core dumped)

So I thought I should install CUDA 10.1, and I followed the instructions on the CUDA Toolkit 10.1 Original Archive page (NVIDIA Developer) to install it with the deb (local) installer.

Installation Instructions:
sudo dpkg -i cuda-repo-ubuntu1604-10-1-local-10.1.105-418.39_1.0-1_amd64.deb
sudo apt-key add /var/cuda-repo-<version>/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda

However, I found that this installs CUDA 10.2, so I changed sudo apt-get install cuda to sudo apt-get install cuda-10.1.

Although /usr/local/cuda-10.1 now exists, nvidia-smi showed CUDA 10.2 when I ran it.
Then I ran the program again, and the problems remained.

Then I ran sudo apt-get purge '*nvidia*' followed by sudo apt-get install cuda-10.1, and something looked wrong. The output was:

sudo apt-get install cuda-10.1
Reading package lists... Done
Building dependency tree
Reading state information... Done
Note, selecting 'cuda-10-1' for regex 'cuda-10.1'
The following additional packages will be installed:
accountsservice acpid apg aptdaemon avahi-daemon avahi-utils bbswitch-dkms bind9-host bluez bluez-obexd cheese-common cracklib-runtime crda
cuda-command-line-tools-10-1 cuda-compiler-10-1 cuda-cudart-10-1 cuda-cudart-dev-10-1 cuda-cufft-10-1 cuda-cufft-dev-10-1 cuda-cuobjdump-10-1
cuda-cupti-10-1 cuda-curand-10-1 cuda-curand-dev-10-1 cuda-cusolver-10-1 cuda-cusolver-dev-10-1 cuda-cusparse-10-1 cuda-cusparse-dev-10-1
cuda-demo-suite-10-1 cuda-documentation-10-1 cuda-driver-dev-10-1 cuda-drivers cuda-gdb-10-1 cuda-gpu-library-advisor-10-1 cuda-libraries-10-1
cuda-libraries-dev-10-1 cuda-license-10-1 cuda-license-10-2

cuda-memcheck-10-1 cuda-misc-headers-10-1 cuda-npp-10-1 cuda-npp-dev-10-1
cuda-nsight-10-1 cuda-nsight-compute-10-1 cuda-nsight-systems-10-1 cuda-nvcc-10-1 cuda-nvdisasm-10-1 cuda-nvgraph-10-1 cuda-nvgraph-dev-10-1
cuda-nvjpeg-10-1 cuda-nvjpeg-dev-10-1 cuda-nvml-dev-10-1 cuda-nvprof-10-1 cuda-nvprune-10-1 cuda-nvrtc-10-1 cuda-nvrtc-dev-10-1 cuda-nvtx-10-1

The cuda-license-10-2 package confuses me. I don't know how to install the correct version of CUDA and run my program.

nvidia-smi only reports driver information: the CUDA version it shows is the highest CUDA version your driver supports. It tells you nothing about which CUDA toolkit you actually have installed. Yes, it's confusing.
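
To see the difference for yourself, something like the following might help. It is only a rough sketch: it assumes nvidia-smi prints a "CUDA Version:" field in its header (older drivers may not) and that nvcc --version prints a "release X.Y" line. The helper names (parse_smi_cuda_version, parse_nvcc_release, run) are made up for this example. It prints the three numbers that tend to get mixed up: what the driver supports, which toolkit's nvcc is on PATH, and which CUDA version PyTorch was built against.

```python
import re
import shutil
import subprocess


def parse_smi_cuda_version(text):
    """Return the 'CUDA Version' from nvidia-smi's header, or None if absent."""
    match = re.search(r"CUDA Version:\s*([\d.]+)", text)
    return match.group(1) if match else None


def parse_nvcc_release(text):
    """Return the toolkit release from `nvcc --version` output, or None if absent."""
    match = re.search(r"release\s+([\d.]+)", text)
    return match.group(1) if match else None


def run(cmd):
    """Run a command and return its output, or '' if the program is not on PATH."""
    if shutil.which(cmd[0]) is None:
        return ""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # text mode; works on Python 3.6
    )
    return result.stdout


if __name__ == "__main__":
    # Highest CUDA version the installed driver supports (not the toolkit).
    print("driver supports up to CUDA:", parse_smi_cuda_version(run(["nvidia-smi"])))
    # Toolkit whose nvcc is first on PATH; None may only mean nvcc is not on PATH.
    print("toolkit on PATH (nvcc):", parse_nvcc_release(run(["nvcc", "--version"])))
    try:
        import torch
        # CUDA version the PyTorch binary was compiled against.
        print("PyTorch built against CUDA:", torch.version.cuda)
    except ImportError:
        print("PyTorch built against CUDA: torch is not installed")
```

If the nvcc line prints None, that does not necessarily mean no toolkit is installed. It may just be that a directory such as /usr/local/cuda-10.1/bin is not on your PATH.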