Unable to Install CUDA-Enabled PyTorch for NVIDIA GB10 GPU (Only CPU Version Installed)

Hi,
I am currently using an NVIDIA GB10 GPU and trying to install PyTorch, Torchvision, and Torchaudio with CUDA support. However, every time I install PyTorch, it only installs the CPU-only version, and CUDA is not detected.

I need the GPU-enabled version because I’m using Unsloth for fine-tuning models, and it requires CUDA acceleration.

Could someone guide me on how to properly install the CUDA-enabled PyTorch stack for the GB10 GPU?
Are there specific CUDA toolkit, PyTorch versions, or wheels that I need to use for this GPU?

My driver currently reports CUDA version 13.0.


I’m facing the same problem. If you got the solution, please let me know as well.

Try the steps below with Docker and let me know whether they work for you:

  1. sudo docker run --gpus all -it --name unsloth-pytorch-24.12 -v "$PWD":/workspace nvcr.io/nvidia/pytorch:24.12-py3

  2. pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128

  3. python -c "import torch; print('PyTorch:', torch.__version__); print('CUDA available:', torch.cuda.is_available()); print('Device:', torch.cuda.get_device_name(0)); print('Compute capability:', torch.cuda.get_device_capability(0)); print('Inductor config exists:', hasattr(torch._inductor, 'config'))"

The build indicated above targets CUDA 12.8.

Your GB10 has compute capability 12.1, which CUDA did not support until 12.9, so if no 12.9 build is available, you'll have to compile PyTorch yourself.
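If it helps, here is a minimal sketch for checking whether an installed wheel lists your GPU's architecture. It relies on torch.cuda.get_device_capability and torch.cuda.get_arch_list; the helper names sm_name, build_targets_device, and report are my own, not part of PyTorch. It only looks for an exact match (sm_121 for the GB10), so treat a "no" as a hint rather than proof, since a build for a lower minor version of the same major architecture may still run.

```python
def sm_name(capability):
    """Turn a (major, minor) capability tuple into an arch tag, e.g. (12, 1) -> 'sm_121'."""
    major, minor = capability
    return f"sm_{major}{minor}"


def build_targets_device(capability, arch_list):
    """Return True if the wheel's arch list has an exact entry for this device."""
    return sm_name(capability) in arch_list


def report():
    """Summarize whether the installed torch build lists the GPU's architecture."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    if not torch.cuda.is_available():
        # torch.version.cuda is None for CPU-only wheels.
        return f"torch {torch.__version__}: CUDA not available (built for CUDA {torch.version.cuda})"
    capability = torch.cuda.get_device_capability(0)
    arch_list = torch.cuda.get_arch_list()
    verdict = "listed" if build_targets_device(capability, arch_list) else "NOT listed"
    return f"{sm_name(capability)} is {verdict} in this build: {arch_list}"


if __name__ == "__main__":
    print(report())
```

Run it inside whichever container or environment you are testing.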

Thanks for helping. I did the following, and it's working for me. I have written up the full procedure; please find it below.

Background: The GB10 (Blackwell architecture) isn’t officially supported in standard PyTorch containers yet, so I had to build a custom environment from scratch using NVIDIA’s CUDA base container.

Here’s what I did:

Initial Setup

Since the standard PyTorch containers weren’t recognizing our GPU, I started with NVIDIA’s basic CUDA container:

sudo docker run --gpus all -it \
  --name gb10-pytorch \
  --ipc=host \
  --ulimit memlock=-1 \
  --ulimit stack=67108864 \
  -v "$PWD":/workspace \
  nvcr.io/nvidia/cuda:12.9.0-devel-ubuntu22.04

Building the Environment

Once inside the container, I installed everything manually:

# Basic Python setup
apt update
apt install -y python3 python3-pip git build-essential cmake ninja-build

# PyTorch nightly build (has CUDA 12.9 support for newer GPUs)
pip3 install --pre torch torchvision torchaudio \
  --index-url https://download.pytorch.org/whl/nightly/cu129

# Core ML libraries
pip3 install numpy scipy pandas matplotlib seaborn scikit-learn

# Transformers stack for LLM work
pip3 install transformers datasets accelerate peft trl bitsandbytes sentencepiece protobuf

# Unsloth for efficient training
pip3 install unsloth

I verified everything was working with:

python3 -c "import torch; print('CUDA available:', torch.cuda.is_available()); print('GPU:', torch.cuda.get_device_name(0))"
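For a slightly more thorough check than the one-liner, something like the sketch below may be useful. It separates a CPU-only wheel from a CUDA wheel (torch.version.cuda is None on CPU-only builds) and runs a small matrix multiply, because torch.cuda.is_available() can return True even when no kernels for the GPU's architecture are present. The names classify_build and check_gpu are my own, and the 256x256 size is arbitrary.

```python
def classify_build(version, cuda_version):
    """Label a PyTorch build from torch.__version__ and torch.version.cuda."""
    if cuda_version is None or version.endswith("+cpu"):
        return "cpu-only"
    return f"cuda-{cuda_version}"


def check_gpu():
    """Collect basic facts about the torch install and try one GPU operation."""
    try:
        import torch
    except ImportError:
        return {"installed": False}

    info = {
        "installed": True,
        "version": torch.__version__,
        "build": classify_build(torch.__version__, torch.version.cuda),
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        info["device"] = torch.cuda.get_device_name(0)
        try:
            # A real kernel launch; fails if the build has no code for this GPU.
            x = torch.randn(256, 256, device="cuda")
            y = x @ x
            torch.cuda.synchronize()
            info["matmul_ok"] = bool(torch.isfinite(y).all())
        except RuntimeError as err:
            info["matmul_ok"] = False
            info["error"] = str(err)
    return info


if __name__ == "__main__":
    for key, value in check_gpu().items():
        print(f"{key}: {value}")
```

If build comes back as cpu-only, pip picked a CPU wheel and the index URL is the first thing to recheck.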

Making it Persistent

After getting everything working, I needed to save this setup so we don’t have to rebuild every time:

# Exit container first
exit

# Save the container as an image
sudo docker commit gb10-pytorch my-gb10-ml:latest

# Remove the temporary container
sudo docker stop gb10-pytorch
sudo docker rm gb10-pytorch

# Create a persistent container that auto-starts on reboot
sudo docker run -d \
  --name gb10-pytorch \
  --restart unless-stopped \
  --gpus all \
  --ipc=host \
  --ulimit memlock=-1 \
  --ulimit stack=67108864 \
  -v "$PWD":/workspace \
  my-gb10-ml:latest \
  sleep infinity

Important Notes:

  • The container now runs in the background.

  • The host directory you ran docker run from ($PWD; ~/home/project in my case) is mounted to /workspace inside the container. This is where all our project files are accessible.

  • It will auto-restart after server reboots

Daily Usage

I’ve set up some aliases to make things easier. Add these to your ~/.bashrc:

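# -w /workspace starts each command in the mounted project folder,
# so relative paths such as script.py resolve there
alias ml-env='sudo docker exec -it -w /workspace gb10-pytorch bash'
alias ml-python='sudo docker exec -it -w /workspace gb10-pytorch python3'
alias ml-pip='sudo docker exec -it -w /workspace gb10-pytorch pip3'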

Then just run source ~/.bashrc to activate them.

Now you can:

  • Jump into the environment: ml-env

  • Run Python directly: ml-python script.py

  • Install packages: ml-pip install package-name

Updating the Image

Whenever you install new packages and want to save them:

# 1. Install whatever you need
ml-env
pip3 install new-package
exit

# 2. Save the updated state
sudo docker commit gb10-pytorch my-gb10-ml:latest

# 3. Restart container with updated image
sudo docker stop gb10-pytorch
sudo docker rm gb10-pytorch
sudo docker run -d \
  --name gb10-pytorch \
  --restart unless-stopped \
  --gpus all \
  --ipc=host \
  --ulimit memlock=-1 \
  --ulimit stack=67108864 \
  -v "$PWD":/workspace \
  my-gb10-ml:latest \
  sleep infinity
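The same docker run invocation appears both here and under "Making it Persistent", so it is easy for the two copies to drift apart. As a rough sketch, the flags could be kept in one place with a small Python helper that builds the argument list. The function name docker_run_command and its parameters are hypothetical; the flags themselves are copied from the commands above.

```python
import os
import shlex


def docker_run_command(image, name="gb10-pytorch", host_dir=None):
    """Build the persistent 'docker run' argument list used in this post."""
    if host_dir is None:
        host_dir = os.getcwd()  # same role as "$PWD" in the shell version
    return [
        "sudo", "docker", "run", "-d",
        "--name", name,
        "--restart", "unless-stopped",
        "--gpus", "all",
        "--ipc=host",
        "--ulimit", "memlock=-1",
        "--ulimit", "stack=67108864",
        "-v", f"{host_dir}:/workspace",
        image,
        "sleep", "infinity",
    ]


if __name__ == "__main__":
    # Print the command for review instead of running it.
    print(shlex.join(docker_run_command("my-gb10-ml:latest")))
```

Printing the command rather than executing it keeps the helper safe to run anywhere; pass the list to subprocess.run if you want it to start the container directly.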

Current Status

The container is now running and ready to use. You can check it with:

sudo docker ps

You should see gb10-pytorch listed with a status beginning with "Up".
