I am trying to run the RAPIDS libraries on my H100 GPU.
I installed RAPIDS in my environment by following the instructions on the RAPIDS site ("RAPIDS | GPU Accelerated Data Science").
However, when I import the cudf library, I get the following error:
In [1]: import cudf
/home/nvidia/.local/lib/python3.10/site-packages/cudf/utils/_ptxcompiler.py:61: UserWarning: Error getting driver and runtime versions:
stdout:
stderr:
Traceback (most recent call last):
File "/home/nvidia/.local/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 258, in ensure_initialized
self.cuInit(0)
File "/home/nvidia/.local/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 331, in safe_cuda_api_call
self._check_ctypes_error(fname, retcode)
File "/home/nvidia/.local/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 399, in _check_ctypes_error
raise CudaAPIError(retcode, msg)
numba.cuda.cudadrv.driver.CudaAPIError: [101] Call to cuInit results in CUDA_ERROR_INVALID_DEVICE
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 4, in <module>
File "/home/nvidia/.local/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 296, in __getattr__
self.ensure_initialized()
File "/home/nvidia/.local/lib/python3.10/site-packages/numba/cuda/cudadrv/driver.py", line 262, in ensure_initialized
raise CudaSupportError(f"Error at driver init: {description}")
numba.cuda.cudadrv.error.CudaSupportError: Error at driver init: Call to cuInit results in CUDA_ERROR_INVALID_DEVICE (101)
Not patching Numba
warnings.warn(msg, UserWarning)
---------------------------------------------------------------------------
CUDARuntimeError Traceback (most recent call last)
<ipython-input-1-e13365c50bc4> in <module>
----> 1 import cudf
~/.local/lib/python3.10/site-packages/cudf/__init__.py in <module>
8
9 _setup_numba()
---> 10 validate_setup()
11
12 import cupy
~/.local/lib/python3.10/site-packages/cudf/utils/gpu_utils.py in validate_setup()
53 except CUDARuntimeError as e:
54 if e.status in notify_caller_errors:
---> 55 raise e
56 # If there is no GPU detected, set `gpus_count` to -1
57 gpus_count = -1
~/.local/lib/python3.10/site-packages/cudf/utils/gpu_utils.py in validate_setup()
50
51 try:
---> 52 gpus_count = getDeviceCount()
53 except CUDARuntimeError as e:
54 if e.status in notify_caller_errors:
~/.local/lib/python3.10/site-packages/rmm/_cuda/gpu.py in getDeviceCount()
100 status, count = cudart.cudaGetDeviceCount()
101 if status != cudart.cudaError_t.cudaSuccess:
--> 102 raise CUDARuntimeError(status)
103 return count
104
CUDARuntimeError: cudaErrorInvalidDevice: invalid device ordinal
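
Since the first failure in the traceback is Numba's call to cuInit, the same call can be reproduced outside of cudf and Numba to check whether the problem lies in the driver setup rather than in RAPIDS. Below is a minimal sketch using only the Python standard library. It assumes a Linux system where the CUDA driver library is named libcuda.so.1; the helper name cuda_driver_init_status is hypothetical and not part of any library.

```python
import ctypes


def cuda_driver_init_status(lib_name="libcuda.so.1"):
    """Call cuInit(0) from the CUDA driver library directly.

    Returns the integer CUresult code (0 means CUDA_SUCCESS), or None
    if the driver library cannot be loaded at all.
    lib_name is an assumption: libcuda.so.1 is the usual name on Linux.
    """
    try:
        libcuda = ctypes.CDLL(lib_name)
    except OSError:
        # The driver library is not installed or not on the loader path.
        return None

    # CUresult cuInit(unsigned int Flags)
    libcuda.cuInit.argtypes = [ctypes.c_uint]
    libcuda.cuInit.restype = ctypes.c_int
    return libcuda.cuInit(0)


status = cuda_driver_init_status()
if status is None:
    print("Could not load the CUDA driver library")
elif status == 0:
    print("cuInit succeeded")
else:
    # 101 is CUDA_ERROR_INVALID_DEVICE, the code reported in the traceback.
    print(f"cuInit failed with CUresult {status}")
```

If this also reports 101, the error comes from the driver level and is independent of the cudf, rmm, and Numba packages shown in the traceback.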