I am trying to build JAX 0.5.3 on a Jetson Thor device, but the build does not recognize sm_110.
Hi,
Please find the build script below:
Thanks.
When I use this script to compile JAX 0.5.3, the following error comes out:
ERROR: /root/.cache/bazel/_bazel_root/cfd1b2cc6fe180f3eb424db6004de364/external/xla/third_party/gpus/cuda/hermetic/cuda_json_init_repository.bzl:50:13: An error occurred during the fetch of repository 'cuda_redist_json':
Traceback (most recent call last):
File "/root/.cache/bazel/_bazel_root/cfd1b2cc6fe180f3eb424db6004de364/external/xla/third_party/gpus/cuda/hermetic/cuda_json_init_repository.bzl", line 50, column 13, in _cuda_redist_json_impl
fail(
Error in fail: The supported CUDA versions are ["11.8", "12.1.1", "12.2.0", "12.3.1", "12.3.2", "12.4.0", "12.4.1", "12.5.0", "12.5.1", "12.6.0", "12.6.1", "12.6.2", "12.6.3", "12.8.0"]. Please provide a supported version in HERMETIC_CUDA_VERSION environment variable or add JSON URL for CUDA version=13.1.1.
ERROR: Error computing the main repository mapping: no such package '@@cuda_redist_json//': The supported CUDA versions are ["11.8", "12.1.1", "12.2.0", "12.3.1", "12.3.2", "12.4.0", "12.4.1", "12.5.0", "12.5.1", "12.6.0", "12.6.1", "12.6.2", "12.6.3", "12.8.0"]. Please provide a supported version in HERMETIC_CUDA_VERSION environment variable or add JSON URL for CUDA version=13.1.1.
Computing main repo mapping:
Traceback (most recent call last):
File "/opt/jax/build/build.py", line 700, in <module>
asyncio.run(main())
File "/usr/lib/python3.12/asyncio/runners.py", line 194, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/opt/jax/build/build.py", line 693, in main
raise RuntimeError(f"Command failed with return code {result.return_code}")
RuntimeError: Command failed with return code 1
It seems that JAX 0.5.3 does not support CUDA 13.x.
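The error text itself points at one possible workaround: pinning the hermetic CUDA toolkit to a version the Bazel rules know about. A sketch (the particular version pinned here is an assumption; pick one from the supported list in the error message):

```shell
# Pin the hermetic CUDA toolkit to a version the Bazel rules support
# (chosen from the supported-versions list in the error message above).
export HERMETIC_CUDA_VERSION=12.8.0
# Then re-run the same build script as before.
```

Whether a CUDA 12.x toolchain actually produces working binaries for Thor's architecture is a separate question.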
You might be able to run the script with this set in the environment beforehand:
export JAX_SKIP_CUDA_CONSTRAINTS_CHECK=1
However, expect significant potential functionality issues, since JAX 0.5.3 seems to want CUDA 12.x and its supporting libraries.
For JAX on Thor with CUDA 13.x, you could install it with:
NVIDIA GPU: pip install -U "jax[cuda13]"
I created a new conda env and ran pip install -U "jax[cuda13]", then ran my JAX test script and hit this error:
(jax_test_py312) user@user:~$ python pi0/check_jax.py
JAX version: 0.9.1
W0313 10:15:53.265163 659646 cuda_executor.cc:1818] Memory clock rate or bus width is 0
W0313 10:15:53.265191 659646 cuda_executor.cc:1820] Using hardcoded values for Thor
W0313 10:15:53.278200 659583 cuda_executor.cc:1818] Memory clock rate or bus width is 0
W0313 10:15:53.278216 659583 cuda_executor.cc:1820] Using hardcoded values for Thor
Devices: [CudaDevice(id=0)]
Warming up (XLA compile)…
F0313 10:15:54.070059 659663 stream_executor_util.cc:519] Could not load RepeatBufferKernel: INTERNAL: CUDA Runtime error: [0] Failed call to cudaGetFuncBySymbol: cudaErrorNoKernelImageForDevice: no kernel image is available for execution on the device
*** Check failure stack trace: ***
@ 0xfffd03b4a968 absl::lts_20250814::log_internal::LogMessage::SendToLog()
@ 0xfffd03b4a8e4 absl::lts_20250814::log_internal::LogMessage::Flush()
@ 0xfffd029ac11c xla::gpu::InitializeTypedBuffer<>()
@ 0xfffd029a7fb8 xla::primitive_util::FloatingPointTypeSwitch<>()
@ 0xfffd029a71e8 xla::gpu::InitializeBuffer()
@ 0xfffcfbbbba28 stream_executor::RedzoneAllocator::CreateBuffer()
@ 0xfffcfbbbaa34 xla::gpu::RedzoneBuffers::CreateInputs()
@ 0xfffcfbbba744 xla::gpu::RedzoneBuffers::FromProgramShape()
@ 0xfffcfb988da0 xla::gpu::GpuProfiler::CreateInputBuffers()
@ 0xfffcfb97fea0 xla::Autotuner::ProfileAll()
@ 0xfffcfb9834d8 xla::Autotuner::TuneBestConfig()::$_0::operator()()
@ 0xfffcfb984290 tsl::internal::FutureBase<>::AndThen<>()::{lambda()#1}::operator()()
@ 0xfffcfb984498 tsl::AsyncValue::EnqueueWaiter<>()::Node::RunWaiterAndDeleteWaiterNode()
@ 0xfffcfb9885d0 tsl::internal::PromiseBase<>::emplace<>()
@ 0xfffcfb987f64 tsl::internal::JoinFutures<>::Update<>()
@ 0xfffcfb987dc0 tsl::JoinFutures<>()::{lambda()#1}::operator()()
@ 0xfffcfb987ca0 tsl::internal::FutureBase<>::AndThen<>()::{lambda()#1}::operator()()
@ 0xfffcfb9887b8 tsl::AsyncValue::EnqueueWaiter<>()::Node::RunWaiterAndDeleteWaiterNode()
@ 0xfffcfb984ddc tsl::internal::PromiseBase<>::emplace<>()
@ 0xfffcfb984c10 absl::lts_20250814::internal_any_invocable::RemoteInvoker<>()
@ 0xfffd039cd8c0 std::_Function_handler<>::_M_invoke()
@ 0xfffd039cc3a0 Eigen::ThreadPoolTempl<>::WorkerLoop()
@ 0xfffd039cc23c std::__invoke_impl<>()
@ 0xfffd039bbe24 tsl::(anonymous namespace)::PThread::ThreadFn()
@ 0xffffb8fc595c (unknown)
Aborted (core dumped)
Try installing these too.
pip install jaxlib jax-cuda13-plugin opt-einsum
Then run JAX's install test:
python3 -c 'import jax; print(f"JAX version: {jax.__version__}"); print(f"CUDA devices: {jax.devices()}");'
If that works then try your check_jax.py
I did this, but it does not work:
root@5f8d964fc522:/workspace# python3 -c 'import jax; print(f"JAX version: {jax.__version__}"); print(f"CUDA devices: {jax.devices()}");'
JAX version: 0.9.1
W0316 03:17:13.948120 49624 cuda_executor.cc:1818] Memory clock rate or bus width is 0
W0316 03:17:13.948166 49624 cuda_executor.cc:1820] Using hardcoded values for Thor
W0316 03:17:13.960939 49572 cuda_executor.cc:1818] Memory clock rate or bus width is 0
W0316 03:17:13.960969 49572 cuda_executor.cc:1820] Using hardcoded values for Thor
CUDA devices: [CudaDevice(id=0)]
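Note that printing jax.devices() does not compile or execute any GPU kernels, so it can succeed even when jitted code still aborts, as in the earlier cudaErrorNoKernelImageForDevice crash. A fuller smoke test (a sketch, since the original check_jax.py is not shown) would force XLA to compile and run something:

```python
# Minimal JAX smoke test: listing devices alone does not exercise XLA,
# so also run a jitted op to force kernel compilation and execution.
import jax
import jax.numpy as jnp

print("JAX version:", jax.__version__)
print("Devices:", jax.devices())

@jax.jit
def f(x):
    # Matmul plus reduction; triggers XLA compilation and autotuning.
    return (x @ x.T).sum()

x = jnp.ones((128, 128))
result = float(f(x))  # forces compile + execute on the default device
print("Result:", result)  # ones(128,128) @ ones(128,128).T sums to 128**3
```

If this runs to completion on the GPU, the installed wheels carry a kernel image for the device; if it aborts, the failure is in the CUDA plugin / architecture support rather than in the user's script.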
Hi,
Based on the config info below, we built JAX 0.10.0 for Thor.
Does that version work for your use case?
Thanks.
This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.