Error with tensorflow on DGX Spark / GB10

I’ve got a DGX Spark/GB10.
I created a conda environment with TensorFlow, then tested what TensorFlow could do.
When importing TensorFlow I got an error message containing FAILED_PRECONDITION.
TensorFlow was also unable to detect the GPU.

Details are below.

  1. Is it possible to use TensorFlow (and Keras) on DGX Spark/GB10 in a conda environment?
  2. If yes, is my install correct? If not, could you please give me a procedure?
  3. Is my testing procedure correct? If not, could you please give me one?

Thank you in advance for your help.

OS information

:~$ uname -a
Linux DLB001 6.14.0-1015-nvidia #15-Ubuntu SMP PREEMPT_DYNAMIC Tue Nov 25 18:02:16 UTC 2025 aarch64 aarch64 aarch64 GNU/Linux

GPU information

:~$ nvidia-smi
Tue Jan 6 13:06:53 2026
+----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.05 Driver Version: 580.95.05 CUDA Version: 13.0 |
+----------------------------------------+-----------------------+---------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GB10 On | 0000000F:01:00.0 Off | N/A |
| N/A 38C P0 11W / N/A | Not Supported | 5% Default |
| | | N/A |
+----------------------------------------+-----------------------+---------------------+

+----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 2504 G /usr/lib/xorg/Xorg 18MiB |
| 0 N/A N/A 2634 G /usr/bin/gnome-shell 6MiB |
| 0 N/A N/A 3815 C+G …c/gnome-remote-desktop-daemon 176MiB |
| 0 N/A N/A 3907 G /usr/bin/gnome-shell 225MiB |
| 0 N/A N/A 4189 C+G …c/gnome-remote-desktop-daemon 241MiB |
| 0 N/A N/A 4292 G /usr/bin/Xwayland 11MiB |
| 0 N/A N/A 9668 C python 227MiB |
+----------------------------------------------------------------------------------------+

Creation of a conda environment

:~$ conda create -n ai_env python=3.10
:~$ conda activate ai_env
:~$ pip install nvidia-tensorflow[horovod] --extra-index-url=https://pypi.ngc.nvidia.com/
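Before testing, it may help to confirm which TensorFlow-related packages pip actually placed in the environment. The following is a minimal sketch using only the Python standard library; the helper name `installed_version` is made up for illustration, and the package names in the loop are assumptions about what might be present, not a list of what the install command is guaranteed to provide.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Package names below are assumptions about what may be in the environment.
for name in ("nvidia-tensorflow", "tensorflow", "keras", "horovod"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

Run it inside the activated environment (here `ai_env`) so that it inspects the same interpreter the test uses.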

Testing

:~$ conda activate ai_env
(ai_env):~$ python3
Python 3.10.19 (main, Oct 21 2025, 16:38:01) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

>>> import tensorflow as tf
2026-01-06 13:08:09.936250: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT: INTERNAL: Cannot dlopen all TensorRT libraries: FAILED_PRECONDITION: Could not load dynamic library 'libnvinfer.so.10.5.0'; dlerror: libnvinfer.so.10.5.0: cannot open shared object file: No such file or directory

>>> print(tf.config.list_physical_devices('GPU'))
2026-01-06 13:11:25.203933: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:998] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero. See more at linux/Documentation/ABI/testing/sysfs-bus-pci at v6.0 · torvalds/linux · GitHub
2026-01-06 13:11:25.270606: W tensorflow/core/common_runtime/gpu/gpu_device.cc:2251] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices…
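The two messages above both come down to shared libraries that the dynamic loader could not open. As a rough way to repeat the test non-interactively, here is a hedged sketch that checks whether a named library can be loaded and collects what TensorFlow reports about the GPU. The helper names `can_dlopen` and `tensorflow_gpu_report` are hypothetical; the library name is the one quoted in the TF-TRT warning. This only gathers information and does not fix the installation.

```python
import ctypes


def can_dlopen(library_name):
    """Return True if the dynamic loader can open the named shared library."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        return False


def tensorflow_gpu_report():
    """Collect basic facts about TensorFlow's view of the GPU."""
    report = {"tensorflow_version": None, "built_with_cuda": None, "gpus": []}
    try:
        import tensorflow as tf
    except ImportError:
        return report  # TensorFlow is not installed in this environment
    report["tensorflow_version"] = tf.__version__
    report["built_with_cuda"] = tf.test.is_built_with_cuda()
    report["gpus"] = [d.name for d in tf.config.list_physical_devices("GPU")]
    return report


# libnvinfer.so.10.5.0 is the library named in the TF-TRT warning.
print("libnvinfer loadable:", can_dlopen("libnvinfer.so.10.5.0"))
print(tensorflow_gpu_report())
```

An empty `gpus` list together with `built_with_cuda` being true would match the log above: the build expects CUDA, but some GPU libraries could not be loaded at runtime.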

Thank you so much Raphael.

So I imagine it is better to use PyTorch?

Yes, definitely
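
For anyone following that route, the equivalent GPU check in PyTorch might look like the sketch below. It assumes a CUDA-enabled PyTorch build is already installed in the environment, which this thread does not cover; the helper name `pytorch_gpu_report` is hypothetical.

```python
def pytorch_gpu_report():
    """Collect basic facts about PyTorch's view of the GPU."""
    report = {"torch_version": None, "cuda_available": False, "device_name": None}
    try:
        import torch
    except ImportError:
        return report  # PyTorch is not installed in this environment
    report["torch_version"] = torch.__version__
    report["cuda_available"] = torch.cuda.is_available()
    if report["cuda_available"]:
        # Index 0 is the first visible GPU.
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


print(pytorch_gpu_report())
```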