Unable to access pynvml methods

cs20s020 · September 3, 2022, 8:41am

I am using a GTX 1080 Ti on Ubuntu 16.04 with CUDA 10.2 and NVIDIA driver 440.59.

I am trying to profile my PyTorch code using scalene. When I run it as `scalene main.py`, I get the following error:

Error in program being profiled:
 Function Not Found
Traceback (most recent call last):
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 782, in _nvmlGetFunctionPointer
    _nvmlGetFunctionPointer_cache[name] = getattr(nvmlLib, name)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/ctypes/__init__.py", line 386, in __getattr__
    func = self.__getitem__(name)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/ctypes/__init__.py", line 391, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1: undefined symbol: nvmlDeviceGetComputeRunningProcesses_v2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/scalene/scalene_profiler.py", line 1612, in profile_code
    exec(code, the_globals, the_locals)
  File "./code/main.py", line 1, in <module>
    import numpy as np
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/numpy/__init__.py", line 140, in <module>
    from . import core
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/numpy/core/__init__.py", line 22, in <module>
    from . import multiarray
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/numpy/core/multiarray.py", line 12, in <module>
    from . import overrides
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/numpy/core/overrides.py", line 9, in <module>
    from numpy.compat._inspect import getargspec
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/numpy/compat/__init__.py", line 14, in <module>
    from .py3k import *
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/scalene/scalene_profiler.py", line 719, in cpu_signal_handler
    (gpu_load, gpu_mem_used) = Scalene.__gpu.get_stats()
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/scalene/scalene_gpu.py", line 110, in get_stats
    mem_used = self.gpu_memory_usage(self.__pid)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/scalene/scalene_gpu.py", line 101, in gpu_memory_usage
    for proc in pynvml.nvmlDeviceGetComputeRunningProcesses(handle):
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 2223, in nvmlDeviceGetComputeRunningProcesses
    return nvmlDeviceGetComputeRunningProcesses_v2(handle);
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 2191, in nvmlDeviceGetComputeRunningProcesses_v2
    fn = _nvmlGetFunctionPointer("nvmlDeviceGetComputeRunningProcesses_v2")
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 785, in _nvmlGetFunctionPointer
    raise NVMLError(NVML_ERROR_FUNCTION_NOT_FOUND)
pynvml.nvml.NVMLError_FunctionNotFound: Function Not Found

To check whether this issue comes from the `scalene` library itself, I ran the following commands in a Python interpreter:

>>> from pynvml import *
>>> nvmlInit()
>>> nvmlSystemGetDriverVersion()
b'440.59'
>>> handle = nvmlDeviceGetHandleByIndex(0)
>>> nvmlDeviceGetComputeRunningProcesses(handle)
Traceback (most recent call last):
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 782, in _nvmlGetFunctionPointer
    _nvmlGetFunctionPointer_cache[name] = getattr(nvmlLib, name)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/ctypes/__init__.py", line 386, in __getattr__
    func = self.__getitem__(name)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/ctypes/__init__.py", line 391, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1: undefined symbol: nvmlDeviceGetComputeRunningProcesses_v2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 2223, in nvmlDeviceGetComputeRunningProcesses
    return nvmlDeviceGetComputeRunningProcesses_v2(handle);
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 2191, in nvmlDeviceGetComputeRunningProcesses_v2
    fn = _nvmlGetFunctionPointer("nvmlDeviceGetComputeRunningProcesses_v2")
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 785, in _nvmlGetFunctionPointer
    raise NVMLError(NVML_ERROR_FUNCTION_NOT_FOUND)
pynvml.nvml.NVMLError_FunctionNotFound: Function Not Found
>>> nvmlDeviceGetGraphicsRunningProcesses(handle)
Traceback (most recent call last):
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 782, in _nvmlGetFunctionPointer
    _nvmlGetFunctionPointer_cache[name] = getattr(nvmlLib, name)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/ctypes/__init__.py", line 386, in __getattr__
    func = self.__getitem__(name)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/ctypes/__init__.py", line 391, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1: undefined symbol: nvmlDeviceGetGraphicsRunningProcesses_v2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 2260, in nvmlDeviceGetGraphicsRunningProcesses
    return nvmlDeviceGetGraphicsRunningProcesses_v2(handle)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 2228, in nvmlDeviceGetGraphicsRunningProcesses_v2
    fn = _nvmlGetFunctionPointer("nvmlDeviceGetGraphicsRunningProcesses_v2")
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 785, in _nvmlGetFunctionPointer
    raise NVMLError(NVML_ERROR_FUNCTION_NOT_FOUND)
pynvml.nvml.NVMLError_FunctionNotFound: Function Not Found
>>> list(map(str, nvmlDeviceGetGraphicsRunningProcesses(handle)))
Traceback (most recent call last):
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 782, in _nvmlGetFunctionPointer
    _nvmlGetFunctionPointer_cache[name] = getattr(nvmlLib, name)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/ctypes/__init__.py", line 386, in __getattr__
    func = self.__getitem__(name)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/ctypes/__init__.py", line 391, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1: undefined symbol: nvmlDeviceGetGraphicsRunningProcesses_v2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 2260, in nvmlDeviceGetGraphicsRunningProcesses
    return nvmlDeviceGetGraphicsRunningProcesses_v2(handle)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 2228, in nvmlDeviceGetGraphicsRunningProcesses_v2
    fn = _nvmlGetFunctionPointer("nvmlDeviceGetGraphicsRunningProcesses_v2")
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 785, in _nvmlGetFunctionPointer
    raise NVMLError(NVML_ERROR_FUNCTION_NOT_FOUND)
pynvml.nvml.NVMLError_FunctionNotFound: Function Not Found
>>> nvmlDeviceGetComputeRunningProcesses_v2(handle)
Traceback (most recent call last):
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 782, in _nvmlGetFunctionPointer
    _nvmlGetFunctionPointer_cache[name] = getattr(nvmlLib, name)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/ctypes/__init__.py", line 386, in __getattr__
    func = self.__getitem__(name)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/ctypes/__init__.py", line 391, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1: undefined symbol: nvmlDeviceGetComputeRunningProcesses_v2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 2191, in nvmlDeviceGetComputeRunningProcesses_v2
    fn = _nvmlGetFunctionPointer("nvmlDeviceGetComputeRunningProcesses_v2")
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 785, in _nvmlGetFunctionPointer
    raise NVMLError(NVML_ERROR_FUNCTION_NOT_FOUND)
pynvml.nvml.NVMLError_FunctionNotFound: Function Not Found

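Since the error reproduces without scalene, it seems to come from `pynvml` itself: `libnvidia-ml.so.1` does not export the `_v2` functions that `pynvml` tries to load. I am not sure why this is the case.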
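The `undefined symbol` lines above suggest a direct check: ask the installed NVML shared library whether it exports a given function at all, independent of `pynvml`. The following is a minimal sketch using only `ctypes`; the helper name `library_exports` is hypothetical, and the default library name is the one that appears in the traceback.

```python
import ctypes


def library_exports(symbol, library="libnvidia-ml.so.1"):
    """Report whether a shared library exports a symbol.

    Returns True or False when the library loads, and None when it
    cannot be loaded at all (for example, no NVIDIA driver installed).
    """
    try:
        lib = ctypes.CDLL(library)
    except OSError:
        return None
    # ctypes resolves symbols lazily on attribute access and raises
    # AttributeError for a missing one, so hasattr() is enough here.
    return hasattr(lib, symbol)


# Compare the unversioned entry point with the _v2 one named in the traceback.
for name in (
    "nvmlDeviceGetComputeRunningProcesses",
    "nvmlDeviceGetComputeRunningProcesses_v2",
):
    print(name, library_exports(name))
```

If the second line prints `False` while the first prints `True`, the installed driver's NVML library simply predates the `_v2` entry point that this `pynvml` release calls.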

Robert_Crovella · September 4, 2022, 9:10pm

https://stackoverflow.com/questions/73591281/nvml-cannot-load-methods-nvmlerror-functionnotfound?noredirect=1#comment129972360_73591281

GWeatherby · March 1, 2024, 5:21pm

Unfortunately, that link now returns a 404.

Robert_Crovella · March 1, 2024, 7:15pm

The main suggestion was to update your GPU driver to the latest version available. This is what was there:

NVML cannot load methods “NVMLError_FunctionNotFound”

Asked 1 year, 6 months ago · Modified 5 months ago · Viewed 2k times · Score: -1

This post is hidden. It was automatically deleted 5 months ago by CommunityBot.

I have a 1080Ti GPU with CUDA 10.2, NVIDIA driver 440.59 and pynvml version 11.4.1 running on Ubuntu 16.04.

I am trying to profile my PyTorch code using scalene. When I run it as `scalene main.py`, I get the following error:

Error in program being profiled:
 Function Not Found
Traceback (most recent call last):
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 782, in _nvmlGetFunctionPointer
    _nvmlGetFunctionPointer_cache[name] = getattr(nvmlLib, name)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/ctypes/__init__.py", line 386, in __getattr__
    func = self.__getitem__(name)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/ctypes/__init__.py", line 391, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1: undefined symbol: nvmlDeviceGetComputeRunningProcesses_v2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/scalene/scalene_profiler.py", line 1612, in profile_code
    exec(code, the_globals, the_locals)
  File "./code-exp/main.py", line 1, in <module>
    import numpy as np
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/numpy/__init__.py", line 140, in <module>
    from . import core
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/numpy/core/__init__.py", line 22, in <module>
    from . import multiarray
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/numpy/core/multiarray.py", line 12, in <module>
    from . import overrides
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/numpy/core/overrides.py", line 9, in <module>
    from numpy.compat._inspect import getargspec
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/numpy/compat/__init__.py", line 14, in <module>
    from .py3k import *
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/scalene/scalene_profiler.py", line 719, in cpu_signal_handler
    (gpu_load, gpu_mem_used) = Scalene.__gpu.get_stats()
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/scalene/scalene_gpu.py", line 110, in get_stats
    mem_used = self.gpu_memory_usage(self.__pid)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/scalene/scalene_gpu.py", line 101, in gpu_memory_usage
    for proc in pynvml.nvmlDeviceGetComputeRunningProcesses(handle):
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 2223, in nvmlDeviceGetComputeRunningProcesses
    return nvmlDeviceGetComputeRunningProcesses_v2(handle);
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 2191, in nvmlDeviceGetComputeRunningProcesses_v2
    fn = _nvmlGetFunctionPointer("nvmlDeviceGetComputeRunningProcesses_v2")
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 785, in _nvmlGetFunctionPointer
    raise NVMLError(NVML_ERROR_FUNCTION_NOT_FOUND)
pynvml.nvml.NVMLError_FunctionNotFound: Function Not Found

To check whether this issue comes from the `scalene` library itself, I ran the following commands in a Python interpreter:

>>> from pynvml import *
>>> nvmlInit()
>>> nvmlSystemGetDriverVersion()
b'440.59'
>>> handle = nvmlDeviceGetHandleByIndex(0)
>>> nvmlDeviceGetComputeRunningProcesses(handle)
Traceback (most recent call last):
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 782, in _nvmlGetFunctionPointer
    _nvmlGetFunctionPointer_cache[name] = getattr(nvmlLib, name)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/ctypes/__init__.py", line 386, in __getattr__
    func = self.__getitem__(name)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/ctypes/__init__.py", line 391, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1: undefined symbol: nvmlDeviceGetComputeRunningProcesses_v2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 2223, in nvmlDeviceGetComputeRunningProcesses
    return nvmlDeviceGetComputeRunningProcesses_v2(handle);
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 2191, in nvmlDeviceGetComputeRunningProcesses_v2
    fn = _nvmlGetFunctionPointer("nvmlDeviceGetComputeRunningProcesses_v2")
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 785, in _nvmlGetFunctionPointer
    raise NVMLError(NVML_ERROR_FUNCTION_NOT_FOUND)
pynvml.nvml.NVMLError_FunctionNotFound: Function Not Found
>>> nvmlDeviceGetGraphicsRunningProcesses(handle)
Traceback (most recent call last):
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 782, in _nvmlGetFunctionPointer
    _nvmlGetFunctionPointer_cache[name] = getattr(nvmlLib, name)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/ctypes/__init__.py", line 386, in __getattr__
    func = self.__getitem__(name)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/ctypes/__init__.py", line 391, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1: undefined symbol: nvmlDeviceGetGraphicsRunningProcesses_v2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 2260, in nvmlDeviceGetGraphicsRunningProcesses
    return nvmlDeviceGetGraphicsRunningProcesses_v2(handle)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 2228, in nvmlDeviceGetGraphicsRunningProcesses_v2
    fn = _nvmlGetFunctionPointer("nvmlDeviceGetGraphicsRunningProcesses_v2")
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 785, in _nvmlGetFunctionPointer
    raise NVMLError(NVML_ERROR_FUNCTION_NOT_FOUND)
pynvml.nvml.NVMLError_FunctionNotFound: Function Not Found
>>> list(map(str, nvmlDeviceGetGraphicsRunningProcesses(handle)))
Traceback (most recent call last):
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 782, in _nvmlGetFunctionPointer
    _nvmlGetFunctionPointer_cache[name] = getattr(nvmlLib, name)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/ctypes/__init__.py", line 386, in __getattr__
    func = self.__getitem__(name)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/ctypes/__init__.py", line 391, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1: undefined symbol: nvmlDeviceGetGraphicsRunningProcesses_v2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 2260, in nvmlDeviceGetGraphicsRunningProcesses
    return nvmlDeviceGetGraphicsRunningProcesses_v2(handle)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 2228, in nvmlDeviceGetGraphicsRunningProcesses_v2
    fn = _nvmlGetFunctionPointer("nvmlDeviceGetGraphicsRunningProcesses_v2")
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 785, in _nvmlGetFunctionPointer
    raise NVMLError(NVML_ERROR_FUNCTION_NOT_FOUND)
pynvml.nvml.NVMLError_FunctionNotFound: Function Not Found
>>> nvmlDeviceGetComputeRunningProcesses_v2(handle)
Traceback (most recent call last):
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 782, in _nvmlGetFunctionPointer
    _nvmlGetFunctionPointer_cache[name] = getattr(nvmlLib, name)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/ctypes/__init__.py", line 386, in __getattr__
    func = self.__getitem__(name)
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/ctypes/__init__.py", line 391, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1: undefined symbol: nvmlDeviceGetComputeRunningProcesses_v2

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 2191, in nvmlDeviceGetComputeRunningProcesses_v2
    fn = _nvmlGetFunctionPointer("nvmlDeviceGetComputeRunningProcesses_v2")
  File "/home/kube-admin/miniconda3/envs/temporl/lib/python3.8/site-packages/pynvml/nvml.py", line 785, in _nvmlGetFunctionPointer
    raise NVMLError(NVML_ERROR_FUNCTION_NOT_FOUND)
pynvml.nvml.NVMLError_FunctionNotFound: Function Not Found

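Since the error reproduces without scalene, it seems to come from `pynvml` itself: `libnvidia-ml.so.1` does not export the `_v2` functions that `pynvml` tries to load. I am not sure why this is the case.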

Following is the nvidia-smi output:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.59       Driver Version: 440.59       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:02:00.0 Off |                  N/A |
| 23%   60C    P0    64W / 250W |      0MiB / 11177MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 108...  Off  | 00000000:03:00.0 Off |                  N/A |
| 24%   60C    P0    63W / 250W |      0MiB / 11178MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   2  GeForce GTX 108...  Off  | 00000000:83:00.0 Off |                  N/A |
| 23%   58C    P0    57W / 250W |      0MiB / 11178MiB |      3%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+


edited Sep 11, 2023 at 9:37 by Sam Mason

asked Sep 3, 2022 at 9:57 by user529295

Update your GPU driver to the latest version. pynvml 11.4.1 expects a driver install that is consistent with CUDA 11.4.

– Robert Crovella

Sep 3, 2022 at 15:13
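Where updating the driver is not immediately possible, calling code can at least tolerate the missing function instead of crashing. The sketch below is one way to do that and is not what scalene or pynvml do themselves; the helper name `running_compute_pids` and its `nvml` parameter (which lets a stand-in module be passed for testing) are hypothetical. It relies on `pynvml.NVMLError` carrying the NVML status code in its `value` attribute.

```python
def running_compute_pids(handle, nvml=None):
    """Return PIDs of compute processes on a device, or None if the
    installed driver's NVML library lacks the function pynvml calls.

    `nvml` defaults to the real pynvml module; it is a parameter only
    so that the logic can be exercised without a GPU.
    """
    if nvml is None:
        import pynvml as nvml  # imported lazily so the module loads without it

    try:
        procs = nvml.nvmlDeviceGetComputeRunningProcesses(handle)
    except nvml.NVMLError as err:
        # Only swallow the "Function Not Found" case seen in the traceback;
        # every other NVML error is re-raised unchanged.
        if err.value == nvml.NVML_ERROR_FUNCTION_NOT_FOUND:
            return None
        raise
    return [proc.pid for proc in procs]
```

With a real GPU, this would be used after `pynvml.nvmlInit()` with a handle from `pynvml.nvmlDeviceGetHandleByIndex(0)`; a `None` result then signals a driver that is too old for the installed pynvml, rather than an empty process list.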

Is there a mapping between pynvml versions and CUDA driver versions that I can refer to? To reproduce the PyTorch model results, I am constrained to CUDA 10.2.

– user529295

Sep 4, 2022 at 17:45

Is there a problem with just updating your GPU driver to the latest one? In any case, the relationship between the minimum driver version and the CUDA version is listed in the CUDA toolkit release notes; refer to table 3, not table 2. You may also want to study this so that you can see that CUDA 10.2 can run with any driver that advertises 10.2 or newer.

– Robert Crovella

Sep 4, 2022 at 21:12
