Hello,
I’ve run into “CUDA error: an illegal memory access was encountered” multiple times, both with PyTorch and with Ollama. I’ve cut my code down to the simplest form that still reproduces the error. Here’s my code:
import torch
param = torch.randn((150000, 1000), dtype=torch.bfloat16, device='cuda:0')
param = param.to(torch.float16)
param = param.cpu()
print("success")
And the error is the following:
(torch) root@Htzr:~/code# python ./repro_error_2.py
Traceback (most recent call last):
File "/root/code/./repro_error_2.py", line 5, in <module>
param = param.cpu()
torch.AcceleratorError: CUDA error: an illegal memory access was encountered
Search for `cudaErrorIllegalAddress' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
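Following the hint in the message above, one way to narrow this down is to run the same three operations with `CUDA_LAUNCH_BLOCKING=1` and an explicit synchronization after each step, so that an asynchronous failure is reported at the step that caused it rather than at a later call. This is only a diagnostic sketch, not part of the original script: the `run_steps` helper, the `STEPS` names, and the `rows`/`cols` parameters are hypothetical and exist only for illustration.

```python
import os

# Assumption: this runs before CUDA is initialised, so the setting takes effect.
os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")

# Hypothetical step names, one per line of the original repro.
STEPS = ("allocate_bf16", "cast_to_fp16", "copy_to_cpu")


def run_steps(device="cuda:0", rows=150000, cols=1000):
    """Run the repro step by step and return the names of the completed steps."""
    import torch  # imported late, after the environment variable is set

    dev = torch.device(device)
    completed = []

    def checkpoint(name):
        # Wait for pending GPU work so a failure surfaces at the step that caused it.
        if dev.type == "cuda":
            torch.cuda.synchronize(dev)
        completed.append(name)
        print(f"ok: {name}", flush=True)

    param = torch.randn((rows, cols), dtype=torch.bfloat16, device=dev)
    checkpoint(STEPS[0])
    param = param.to(torch.float16)
    checkpoint(STEPS[1])
    param = param.cpu()
    checkpoint(STEPS[2])
    return completed
```

Calling `run_steps()` with the defaults repeats the original repro; the last `ok:` line printed before the traceback shows which step completed last. Calling `run_steps("cpu", rows=8, cols=4)` exercises the same code path without a GPU.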
When I run more complicated code in PyTorch or deploy models with Ollama, I run into essentially the same error. The CUDA-related packages in my environment are the following:
(torch) root@Htzr:~/code# conda list
# packages in environment at /root/miniconda3/envs/torch:
#
# Name Version Build Channel
_libgcc_mutex 0.1 main
_openmp_mutex 5.1 1_gnu
accelerate 1.12.0 pypi_0 pypi
bitsandbytes 0.49.0 pypi_0 pypi
bzip2 1.0.8 h5eee18b_6
ca-certificates 2025.12.2 h06a4308_0
certifi 2025.11.12 pypi_0 pypi
charset-normalizer 3.4.4 pypi_0 pypi
filelock 3.20.0 pypi_0 pypi
fsspec 2025.12.0 pypi_0 pypi
hf-xet 1.2.0 pypi_0 pypi
huggingface-hub 0.36.0 pypi_0 pypi
idna 3.11 pypi_0 pypi
jinja2 3.1.6 pypi_0 pypi
ld_impl_linux-64 2.44 h153f514_2
libexpat 2.7.3 h7354ed3_4
libffi 3.4.4 h6a678d5_1
libgcc 15.2.0 h69a1729_7
libgcc-ng 15.2.0 h166f726_7
libgomp 15.2.0 h4751f2c_7
libmpdec 4.0.0 h5eee18b_0
libstdcxx 15.2.0 h39759b7_7
libstdcxx-ng 15.2.0 hc03a8fd_7
libuuid 1.41.5 h5eee18b_0
libxcb 1.17.0 h9b100fa_0
libzlib 1.3.1 hb25bd0a_0
markupsafe 3.0.2 pypi_0 pypi
modelscope 1.33.0 pypi_0 pypi
mpmath 1.3.0 pypi_0 pypi
ncurses 6.5 h7934f7d_0
networkx 3.6.1 pypi_0 pypi
numpy 2.3.5 pypi_0 pypi
nvidia-cublas-cu12 12.6.4.1 pypi_0 pypi
nvidia-cuda-cupti-cu12 12.6.80 pypi_0 pypi
nvidia-cuda-nvrtc-cu12 12.6.77 pypi_0 pypi
nvidia-cuda-runtime-cu12 12.6.77 pypi_0 pypi
nvidia-cudnn-cu12 9.10.2.21 pypi_0 pypi
nvidia-cufft-cu12 11.3.0.4 pypi_0 pypi
nvidia-cufile-cu12 1.11.1.6 pypi_0 pypi
nvidia-curand-cu12 10.3.7.77 pypi_0 pypi
nvidia-cusolver-cu12 11.7.1.2 pypi_0 pypi
nvidia-cusparse-cu12 12.5.4.2 pypi_0 pypi
nvidia-cusparselt-cu12 0.7.1 pypi_0 pypi
nvidia-nccl-cu12 2.27.5 pypi_0 pypi
nvidia-nvjitlink-cu12 12.6.85 pypi_0 pypi
nvidia-nvshmem-cu12 3.3.20 pypi_0 pypi
nvidia-nvtx-cu12 12.6.77 pypi_0 pypi
openssl 3.0.18 hd6dcaed_0
packaging 25.0 pypi_0 pypi
pillow 12.0.0 pypi_0 pypi
pip 25.3 pyhc872135_0
psutil 7.2.1 pypi_0 pypi
pthread-stubs 0.3 h0ce48e5_1
python 3.13.11 hcf712cf_100_cp313
python_abi 3.13 3_cp313
pyyaml 6.0.3 pypi_0 pypi
readline 8.3 hc2a1206_0
regex 2025.11.3 pypi_0 pypi
requests 2.32.5 pypi_0 pypi
safetensors 0.7.0 pypi_0 pypi
setuptools 80.9.0 py313h06a4308_0
sqlite 3.51.0 h2a70700_0
sympy 1.14.0 pypi_0 pypi
tk 8.6.15 h54e0aa7_0
tokenizers 0.22.1 pypi_0 pypi
torch 2.9.1+cu126 pypi_0 pypi
torchvision 0.24.1+cu126 pypi_0 pypi
tqdm 4.67.1 pypi_0 pypi
transformers 4.57.3 pypi_0 pypi
triton 3.5.1 pypi_0 pypi
typing-extensions 4.15.0 pypi_0 pypi
tzdata 2025b h04d1e81_0
urllib3 2.6.2 pypi_0 pypi
wheel 0.45.1 py313h06a4308_0
xorg-libx11 1.8.12 h9b100fa_1
xorg-libxau 1.0.12 h9b100fa_0
xorg-libxdmcp 1.1.5 h9b100fa_0
xorg-xorgproto 2024.1 h5eee18b_1
xz 5.6.4 h5eee18b_1
zlib 1.3.1 hb25bd0a_0
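The package list above covers the CUDA user-space libraries installed through pip, but not the GPU model or the installed driver version. A minimal sketch for collecting those, assuming `nvidia-smi` is installed and on `PATH`, could look like the following; the `query_gpu_info` helper and its `timeout_s` parameter are hypothetical names for illustration only.

```python
import shutil
import subprocess


def query_gpu_info(timeout_s=30):
    """Return nvidia-smi's GPU name, driver version and memory as CSV text, or None."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None  # tool not installed or not on PATH
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name,driver_version,memory.total", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=timeout_s,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None  # the tool ran but reported a failure
    return result.stdout.strip()


# Prints one CSV line per GPU, or None if the information could not be collected.
print(query_gpu_info())
```

Posting this output alongside the package list makes it easier to tell a driver or hardware problem apart from a library mismatch.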
Any ideas? Thanks.
