See below: I have installed and configured CUDA, TensorRT and PyTorch, and the output further down shows them working.
However, when I ran "ollama run qwen2", only the CPU was used while interacting with the LLM. There was no GPU acceleration, even though CUDA, TensorRT and PyTorch are all installed and configured.
Why is the GPU not being used?
PS C:\Users\patta\WSL> wsl -d Ubuntu-22.04
pattang56892@PTWin11P01:/mnt/c/Users/patta/WSL$ nvidia-smi
Sat Jun 22 22:32:59 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.52.01              Driver Version: 555.99         CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3060        On  |   00000000:09:00.0  On |                  N/A |
| 56%   53C    P0             51W /  170W |    4566MiB /  12288MiB |     56%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
pattang56892@PTWin11P01:/mnt/c/Users/patta/WSL$ dpkg -l | grep nvinfer
ii libnvinfer-dev 10.1.0.27-1+cuda11.8 amd64 TensorRT development libraries
ii libnvinfer-headers-dev 10.1.0.27-1+cuda11.8 amd64 TensorRT development headers
ii libnvinfer-headers-plugin-dev 10.1.0.27-1+cuda11.8 amd64 TensorRT plugin headers
ii libnvinfer-plugin-dev 10.1.0.27-1+cuda11.8 amd64 TensorRT plugin libraries
ii libnvinfer-plugin10 10.1.0.27-1+cuda11.8 amd64 TensorRT plugin libraries
ii libnvinfer-vc-plugin10 10.1.0.27-1+cuda11.8 amd64 TensorRT vc-plugin library
ii libnvinfer10 10.1.0.27-1+cuda11.8 amd64 TensorRT runtime libraries
ii python3-libnvinfer 10.1.0.27-1+cuda11.8 amd64 Python 3 bindings for TensorRT standard runtime
pattang56892@PTWin11P01:/mnt/c/Users/patta/WSL$ python3 -c "import torch; print(torch.__version__); print(torch.cuda.is_available()); print(torch.cuda.get_device_name(0))"
2.3.1+cu118
True
NVIDIA GeForce RTX 3060
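
As a side check on the output above: the version strings carry CUDA build tags (`+cuda11.8` on the TensorRT packages, `+cu118` on the PyTorch wheel), while nvidia-smi reports CUDA 12.5, which is the highest CUDA version the driver supports. A minimal sketch of comparing them follows. The function names `cuda_from_tag` and `driver_covers` are hypothetical, and the parser assumes the tag formats look exactly like the ones in this post, with a single-digit minor version in the `+cuXYZ` form. It only covers the PyTorch and TensorRT builds and says nothing about which backend Ollama selected.

```python
import re


def cuda_from_tag(version):
    """Return the (major, minor) CUDA build tag found in a version string, or None."""
    # Debian-style tag as in the dpkg output, e.g. "10.1.0.27-1+cuda11.8"
    m = re.search(r"\+cuda(\d+)\.(\d+)", version)
    if m:
        return int(m.group(1)), int(m.group(2))
    # PyTorch wheel tag, e.g. "2.3.1+cu118" -> (11, 8).
    # Assumption: the last digit is the minor version.
    m = re.search(r"\+cu(\d+)(\d)$", version)
    if m:
        return int(m.group(1)), int(m.group(2))
    return None


def driver_covers(build, driver):
    """True if the driver's maximum CUDA version is at least the build's version."""
    return build <= driver


# Values copied from the output in this post.
driver_cuda = (12, 5)  # "CUDA Version: 12.5" from nvidia-smi
packages = {
    "torch": "2.3.1+cu118",
    "libnvinfer10": "10.1.0.27-1+cuda11.8",
}

for name, version in packages.items():
    build = cuda_from_tag(version)
    print(name, build, driver_covers(build, driver_cuda))
```

With the values from this post, both packages are built for CUDA 11.8, which is within what a driver reporting 12.5 supports, so this particular comparison does not point to a version mismatch.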