Which PyTorch base image to use in AI Workbench with DGX Spark?

I have been using 1.0.6 of the PyTorch base image (with CUDA 12.6.3), but it gives me the following warning. I’m ignoring this warning and things seem to work, but I’m curious what it means and how I can upgrade to something that PyTorch believes is better suited. Thanks!

/usr/local/lib/python3.12/dist-packages/torch/cuda/__init__.py:235: UserWarning: 
NVIDIA GB10 with CUDA capability sm_121 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_80 sm_86 sm_90 compute_90.
If you want to use the NVIDIA GB10 GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

  warnings.warn(

Recommend checking out our latest PyTorch image from NGC: nvcr.io/nvidia/pytorch:25.10-py3
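For context on the warning itself: the sm_121 line means the wheel in the 1.0.6 base ships compiled kernels only for compute-capability majors 8 and 9 (sm_80/86/90), while the GB10 reports capability 12.1. Roughly, the import-time check behaves like this (a simplified illustration, not torch’s actual code):

```python
def device_covered(cap_major, arch_list):
    """Simplified sketch of the import-time check PyTorch performs.

    A build 'covers' a GPU here if it shipped binary kernels (sm_XY)
    for the same compute-capability major version. Rough illustration
    only, not torch's real logic.
    """
    supported_sm = [int(a.split("_")[1]) for a in arch_list if a.startswith("sm_")]
    return any(sm // 10 == cap_major for sm in supported_sm)

# Arch list from the warning: what the 1.0.6 base's wheel was built with.
build = ["sm_80", "sm_86", "sm_90", "compute_90"]
print(device_covered(9, build))   # e.g. an sm_90 GPU -> True
print(device_covered(12, build))  # GB10 (sm_121) -> False, hence the warning
```

The 25.10 NGC image is built with Blackwell-generation kernels, which is why it doesn’t trip this check.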

I get a 401 on that link? And even on just nvcr.io.

If you want to open it in a browser, try this instead: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch?version=25.10-py3

In Docker, you could try: docker pull nvcr.io/nvidia/pytorch:25.10-py3

Thanks, it’s not clear to me just yet how I would use this in AI Workbench, which I have thus far been trying to use for the container stuff. It gives me this error trying to use it as a custom container:

Connection Error: No data returned for operation `createProjectMutation`, got error(s): problem creating project files; required base environment label is not set: 'com.nvidia.workbench.schema-version'. See the error `source` property for more information.

The AI Workbench bases must be updated. Trying to create a project based on nvcr.io/nvidia/pytorch:25.10-py3 fails because com.nvidia.workbench.schema-version is not set, and I can’t figure out a way around it. Maybe @aniculescu can guide us.

(nvwb:local) elsaco@spark:~$ nvwb create project --base-url nvcr.io/nvidia/pytorch:25.10-py3
? Enter a unique name for the project: PyTorch1
? Enter a description: PyTorch test
? Choose a base environment: Custom
? Enter the URL for the custom base image: nvcr.io/nvidia/pytorch:25.10-py3

  input: createProject problem creating project files; required base environment label is not set:
  'com.nvidia.workbench.schema-version'

(nvwb:local) elsaco@spark:~$

These are all the current bases that can be used for custom projects on my device. Importing the Docker container didn’t change anything.

nvcr.io/nvidia/ai-workbench/python-basic:1.0.8
nvcr.io/nvidia/ai-workbench/python-cuda117:1.0.6
nvcr.io/nvidia/ai-workbench/python-cuda120:1.0.8
nvcr.io/nvidia/ai-workbench/python-cuda122:1.0.8
nvcr.io/nvidia/ai-workbench/python-cuda124:1.0.2
nvcr.io/nvidia/ai-workbench/python-cuda126:1.0.2
nvcr.io/nvidia/ai-workbench/python-cuda128:1.0.2
nvcr.io/nvidia/ai-workbench/python-cuda129:1.0.1
nvcr.io/nvidia/ai-workbench/pytorch:1.0.6
nvcr.io/nvidia/rapidsai/notebooks:25.06-cuda12.8-py3.11

Run nvwb list bases --wide to show what’s available on your Spark. There’s more output than what’s posted above.

Using Python with CUDA 12.9 as the base, creating a custom container works okay:

(nvwb:local) elsaco@spark:~$ nvwb create project
? Enter a unique name for the project: ptTest
? Enter a description: PyTorch test
? Choose a base environment: Python with CUDA 12.9

  Created new project 'ptTest' (/home/elsaco/nvidia-workbench/ptTest)


  ✓ Container build complete (2m5.387084856s)

and docker inspect shows "com.nvidia.workbench.schema-version": "v2"
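To see exactly what a working base carries that the NGC image lacks, you can dump the com.nvidia.workbench.* labels from docker inspect on both and diff them. A small sketch of the filtering step (sample inspect output is inlined so it runs without docker; the non-Workbench label is illustrative):

```python
import json

# Stand-in for `docker inspect <image>` output (values illustrative).
inspect_output = json.dumps([{
    "Config": {"Labels": {
        "com.nvidia.workbench.schema-version": "v2",
        "org.opencontainers.image.ref.name": "ubuntu",
    }}
}])

# docker inspect returns a JSON array; labels live under .Config.Labels.
labels = json.loads(inspect_output)[0]["Config"]["Labels"]
wb_labels = {k: v for k, v in labels.items()
             if k.startswith("com.nvidia.workbench.")}
print(wb_labels)  # {'com.nvidia.workbench.schema-version': 'v2'}
```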

Yes, that is one of the standard ones; I am hoping to get a recent PyTorch base.

Seems like we need to add a bunch of labels to these newer Docker containers to make them compatible with AI Workbench.
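One possible route (a sketch, not verified on a Spark): build a thin wrapper image on top of the NGC container that adds the labels Workbench checks. Only schema-version is confirmed in this thread (docker inspect on a working base shows v2); the rest of the required label set would have to come from the “Use Your Own Container” docs.

```dockerfile
# Hypothetical wrapper image; label names beyond schema-version must be
# taken from the AI Workbench "Use Your Own Container" documentation.
FROM nvcr.io/nvidia/pytorch:25.10-py3
LABEL com.nvidia.workbench.schema-version="v2"
# ...plus the remaining com.nvidia.workbench.* labels the docs require
```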

I am starting to get the impression that AI Workbench isn’t a highly supported workflow? I quite like the JupyterLab and VS Code integrations via sync, which are handy, but manually updating Dockerfiles with the special labels it needs kinda spoils it. What do professionals do for this stuff when they want something that can be used easily on, e.g., a DGX Spark, while the resulting projects are also easy to push into a cloud for actual training-cluster time?

I’m not even convinced this “PyTorch 2.6” image is 2.6; when I click through, it seems to still be 2.4??

EDIT: the docs are just wrong, but I do feel like I need to get an AI Workbench base image with PyTorch 2.9 working, so I guess I gotta start from the 2.9 Docker container from nvcr.io and then add those labels? Or would it be easier to start from the latest Python with CUDA 12.9 base and just add PyTorch 2.9 to it?
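If the second route pans out, the PyTorch part could be as small as a pip install inside a project built on the Python with CUDA 12.9 base. This is an untested sketch; take the exact --index-url for your CUDA version from https://pytorch.org/get-started/locally/ (cu128 is shown only as an example of a wheel series that includes Blackwell-generation sm_12x kernels):

```shell
# Hypothetical: run inside a project built on python-cuda129.
# Verify the correct index URL at pytorch.org/get-started/locally first.
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128
```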

Is there a guide to converting one of these NGC containers into an AI Workbench base image, @aniculescu ?

For AI Workbench specific questions, you can reach out to their forum: NVIDIA AI Workbench - NVIDIA Developer Forums
In the meantime, you can start by looking at their documentation for using your own container: Use Your Own Container — NVIDIA AI Workbench User Guide

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.