RuntimeError: operator torchvision::nms does not exist on RTX 4090

I’m working on Win11, WSL 2, Ubuntu 22.04.

The problem is the one in the title. It appeared while I was setting up the environment for Mamba.

To reproduce:

conda create -n cmamba2 python=3.10
conda activate cmamba2
conda install -c pytorch -c conda-forge -c nvidia timm==0.6.5 pytorch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 pytorch-cuda=11.8
conda install -c conda-forge triton

Then I checked the Python, PyTorch and CUDA versions, because they determine which .whl files I should download.

The commands and their outputs are shown below, in order.

python -V
Python 3.10.16
nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
conda list cuda
# packages in environment at /root/anaconda3/envs/cmamba2:
#
# Name                    Version                   Build  Channel
cuda-cudart               11.8.89                       0    nvidia
cuda-cupti                11.8.87                       0    nvidia
cuda-libraries            11.8.0                        0    nvidia
cuda-nvrtc                11.8.89                       0    nvidia
cuda-nvtx                 11.8.86                       0    nvidia
cuda-runtime              11.8.0                        0    nvidia
cuda-version              11.8                 h70ddcb2_3    conda-forge
cudatoolkit               11.8.0              h4ba93d1_13    conda-forge
pytorch-cuda              11.8                 h7e8668a_6    pytorch
conda list pytorch
# packages in environment at /root/anaconda3/envs/cmamba2:
#
# Name                    Version                   Build  Channel
pytorch                   2.3.0           cuda118_py310h954aa82_301    conda-forge
pytorch-cuda              11.8                 h7e8668a_6    pytorch
pytorch-mutex             1.0                        cuda    pytorch
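
For reference, the same version check can also be done from inside Python. This is only a minimal sketch: the helper name installed_versions is hypothetical, and I am assuming the conda packages expose standard Python package metadata under the names torch, torchvision, torchaudio and timm. It reads the metadata without importing the packages, so it should still run when the import itself fails.

```python
import sys
from importlib import metadata


def installed_versions(names):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            # Reads the installed package metadata; the package is not imported.
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    print("python", sys.version.split()[0])
    # Assumption: these are the distribution names pip sees. The conda package
    # is called "pytorch", but its Python distribution is normally "torch".
    wanted = ["torch", "torchvision", "torchaudio", "timm"]
    for name, version in installed_versions(wanted).items():
        print(name, version or "not installed")
```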

I also ran a command to check whether GLIBCXX_USE_CXX11_ABI is true. I think it indicates which C++ ABI the packages were compiled with, but I don't fully understand it.

python -c 'import torch; print(torch._C._GLIBCXX_USE_CXX11_ABI); print(torch.compiled_with_cxx11_abi())'

Its output is

True
True

So I downloaded the two .whl files below

causal_conv1d-1.5.0.post8+cu11torch2.3cxx11abiTRUE-cp310-cp310-linux_x86_64.whl
mamba_ssm-2.2.2+cu118torch2.3cxx11abiTRUE-cp310-cp310-linux_x86_64.whl

from the Releases pages of Dao-AILab/causal-conv1d and state-spaces/mamba on GitHub, respectively.
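
To show how I matched the versions to the file names, here is a small sketch that rebuilds the two names from the values checked earlier. The function wheel_filename is hypothetical, and the naming pattern is only what I inferred from these two files, not an official rule. The CUDA tag is passed in as-is because the two projects spell it differently (cu11 vs. cu118).

```python
def wheel_filename(package, version, cuda_tag, torch_version, cxx11_abi, python_tag):
    """Build a wheel file name following the pattern of the two files above.

    The pattern is inferred from those file names only (an assumption).
    """
    abi = "TRUE" if cxx11_abi else "FALSE"
    # Keep only major.minor of the PyTorch version, e.g. "2.3.0" -> "2.3".
    torch_mm = ".".join(torch_version.split("+")[0].split(".")[:2])
    local = f"{cuda_tag}torch{torch_mm}cxx11abi{abi}"
    return f"{package}-{version}+{local}-{python_tag}-{python_tag}-linux_x86_64.whl"


if __name__ == "__main__":
    print(wheel_filename("causal_conv1d", "1.5.0.post8", "cu11", "2.3.0", True, "cp310"))
    print(wheel_filename("mamba_ssm", "2.2.2", "cu118", "2.3.0", True, "cp310"))
```

In an environment where torch imports correctly, the torch_version and cxx11_abi arguments could be taken from torch.__version__ and torch.compiled_with_cxx11_abi() instead of being typed by hand.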

These two .whl files are stored in a folder called whl, which is a subfolder of the main project folder.

Then I changed into the whl folder with cd and ran the following two commands to install them:

pip install causal_conv1d-1.5.0.post8+cu11torch2.3cxx11abiTRUE-cp310-cp310-linux_x86_64.whl
pip install mamba_ssm-2.2.2+cu118torch2.3cxx11abiTRUE-cp310-cp310-linux_x86_64.whl

Then

python
import torchvision

The problem appears. Full output:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/root/anaconda3/envs/cmamba2/lib/python3.10/site-packages/torchvision/__init__.py", line 6, in <module>
    from torchvision import _meta_registrations, datasets, io, models, ops, transforms, utils
  File "/root/anaconda3/envs/cmamba2/lib/python3.10/site-packages/torchvision/_meta_registrations.py", line 164, in <module>
    def meta_nms(dets, scores, iou_threshold):
  File "/root/anaconda3/envs/cmamba2/lib/python3.10/site-packages/torch/library.py", line 467, in inner
    handle = entry.abstract_impl.register(func_to_register, source)
  File "/root/anaconda3/envs/cmamba2/lib/python3.10/site-packages/torch/_library/abstract_impl.py", line 30, in register
    if torch._C._dispatch_has_kernel_for_dispatch_key(self.qualname, "Meta"):
RuntimeError: operator torchvision::nms does not exist
nvidia-smi
Tue Apr 15 17:30:07 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.133.07             Driver Version: 572.83         CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4070 ...    On  |   00000000:01:00.0  On |                  N/A |
|  0%   40C    P8             18W /  285W |     788MiB /  16376MiB |      5%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A              28      G   /Xwayland                             N/A      |
+----------------------------------------------------------------------------------