Problems about running tinycudann on Jetson AGX Orin

Platform info:

Model: NVIDIA Orin Jetson-Small Developer Kit

CUDA Arch BIN: 8.7

System: Ubuntu 20.04 focal

Jetpack: 5.0.1 DP

cuDNN: 8.3.2.49

CUDA: release 11.4, V11.4.239

Problem1

I successfully compiled tinycudann (GitHub - NVlabs/tiny-cuda-nn: Lightning fast C++/CUDA neural network framework).

But when I run the tinycudann demo

[./build/mlp_learning_an_image data/images/albert.jpg data/config_hash.json],

it raises an out-of-memory error.

So I reduced the batch size in mlp_learning_an_image.cu

(tiny-cuda-nn/mlp_learning_an_image.cu at master · NVlabs/tiny-cuda-nn · GitHub),

but it still raises the same error. This is strange, since the Jetson AGX Orin has 32 GB of GPU memory.

Problem2

Also, when running the tinycudann demo

[./build/mlp_learning_an_image data/images/albert.jpg data/config_hash.json],

I find it is extremely slow when loading the image (tiny-cuda-nn/mlp_learning_an_image.cu at master · NVlabs/tiny-cuda-nn · GitHub), and the slowest part is the cudaMalloc call (tiny-cuda-nn/gpu_memory.h at master · NVlabs/tiny-cuda-nn · GitHub).

However, when I try cudaMalloc in a fresh CUDA project, it works fine. Only in the tinycudann project is cudaMalloc extremely slow.
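For reference, the isolation step described above can be sketched with a minimal wall-clock timing wrapper. This is a Python stand-in for the measurement idea only (in the C++ demo one would use std::chrono or cudaEvent timers around the actual cudaMalloc); the bytearray allocation is just a placeholder for the suspect call.

```python
import time

def timed(label, fn, *args, **kwargs):
    """Call fn and print how long it took, to pinpoint the slow step."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1e3
    print(f"{label}: {elapsed_ms:.2f} ms")
    return result

# Placeholder for the suspect call (cudaMalloc in the real demo):
buf = timed("alloc 64 MiB", bytearray, 64 * 1024 * 1024)
```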

Hi,

Since JetPack 5.0.1 is a Developer Preview (DP) release, would you mind upgrading to JetPack 5.1 first?
Thanks.

Hi, I have tried updating the JetPack version, but the problem still exists.
The versions are now:
Release: 5.10.104-tegra
CUDA: 11.4.315
TensorRT: 8.5.2.2

Hi,

Thanks for the testing.

Problem1

When you hit the OOM error, could you confirm it with the output of tegrastats?

$ sudo tegrastats
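On Jetson the CPU and GPU share the same physical memory, so the RAM field of the tegrastats output covers both. A small sketch of reading that field (the sample line below is illustrative only, not captured from the board; the exact tegrastats format varies between JetPack releases):

```python
import re

# Illustrative tegrastats-style line (format varies by JetPack release):
line = "RAM 4722/30536MB (lfb 6000x4MB) SWAP 0/15268MB GR3D_FREQ 99%"

m = re.search(r"RAM (\d+)/(\d+)MB", line)
used_mb, total_mb = map(int, m.groups())
print(f"RAM used: {used_mb} / {total_mb} MB ({100 * used_mb / total_mb:.0f}%)")
```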

Problem2

Could you try enabling maximum device performance to see if it helps?

$ sudo nvpmodel -m 0
$ sudo jetson_clocks

Thanks

Hi, I looked at tegrastats and the used memory is not over the limit. I also tried nvpmodel -m 0 and jetson_clocks, but it still doesn't work.

Hi,

Thanks for the testing.

We are checking this issue internally.
Will give you an update later.

Hi,

It looks like the library doesn't include the Orin GPU architecture (87).
Please change it to 87 and build it again.

	elif cuda_version < parse_version("11.8"):
		return 87
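For context, here is a self-contained sketch of the compute-capability selection that snippet belongs to. The other branch boundaries shown are assumptions for illustration, not quoted from the repository, and version parsing is done with plain tuples here so the example runs standalone (the real setup script uses packaging's parse_version):

```python
def parse_version(v):
    """Turn a version string like '11.4' into a tuple (11, 4) for comparison."""
    return tuple(int(part) for part in v.split("."))

def max_supported_compute_capability(cuda_version):
    """Sketch: pick the highest SM architecture to compile for.

    With the one-line change above, the CUDA 11.x branch returns 87,
    so the build also emits native code for Orin's SM 8.7 GPU.
    """
    if cuda_version < parse_version("11.0"):   # assumed boundary
        return 75
    elif cuda_version < parse_version("11.8"):
        return 87  # was 86 before the fix; Orin is SM 8.7
    else:
        return 90

print(max_supported_compute_capability(parse_version("11.4")))  # prints 87
```

This would also explain Problem 2: when a binary lacks native SASS for the running GPU, the CUDA driver JIT-compiles the embedded PTX on first use, which commonly makes the very first CUDA calls (such as the initial cudaMalloc) appear extremely slow.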

Thanks.

Thanks for your reply. I updated CUDA to 11.8, changed some code in tinycudann, and now I can run the tinycudann demo successfully. But I find the C++ demo is much slower than the Python demo.

Hi,

Please share the GPU utilization ratio when running the demo script.

$ sudo tegrastats

Thanks.

These are the results when running the C++ demo:

Hi,

Based on the picture, the GPU utilization is already at 99%.
This indicates that the GPU is fully occupied.

I'm not sure why the C++ demo is slower than the Python sample.
Are they the same use case, or a different scenario?

Thanks.

Thanks for your help. Maybe I need to look into it in more detail.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.