Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU): GeForce RTX 3090
• DeepStream Version: 6.1
• TensorRT Version: 8.2
• NVIDIA GPU Driver Version (valid for GPU only): 510
• Issue Type (questions, new requirements, bugs): question
I did some network benchmarking using trtexec and noticed that setting the `--useCudaGraph` flag increases engine performance significantly. The results are shown below:
| Flags | Throughput (qps) | Enqueue Time (ms) | H2D Latency (ms) | GPU Compute Time (ms) |
|---|---|---|---|---|
| No flag | 352.52 | 2.24 | 1.62 | 2.65 |
| `--noDataTransfers` | 407.44 | 2.16 | 0.00 | 2.45 |
| `--useCudaGraph` | 615.66 | 0.07 | 1.59 | 1.21 |
| Both flags | 848.53 | 0.09 | 0.00 | 1.17 |
All experiments ran 1000 queries and averaged the measurements every 10 queries (`--warmUp=0 --duration=0 --iterations=1000 --avgRuns=10`). Throughput is measured in queries per second; the other columns are mean latencies in milliseconds.
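For reference, the invocations looked roughly like this (the engine filename `model.engine` is a placeholder for my actual engine; only the two flags under test vary between runs):

```shell
# Baseline: 1000 iterations, averages over every 10 runs, no warm-up
trtexec --loadEngine=model.engine \
        --warmUp=0 --duration=0 --iterations=1000 --avgRuns=10

# Same run with CUDA Graph capture of the enqueue calls
trtexec --loadEngine=model.engine \
        --warmUp=0 --duration=0 --iterations=1000 --avgRuns=10 \
        --useCudaGraph

# Both flags: CUDA Graph plus no host<->device data transfers
trtexec --loadEngine=model.engine \
        --warmUp=0 --duration=0 --iterations=1000 --avgRuns=10 \
        --useCudaGraph --noDataTransfers
```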
I figure I could improve pipeline performance by enabling CUDA Graphs in DeepStream 6.1 somehow, but I see no mention of it in the nvinfer plugin documentation. Does DeepStream support CUDA Graphs?
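For context, my understanding of what `--useCudaGraph` does is roughly the following: capture one inference enqueue into a CUDA graph, then replay the instantiated graph each iteration instead of re-enqueueing. This is only a sketch fragment, not a complete program; `ctx` (an `nvinfer1::IExecutionContext*`), `bindings` (pre-allocated device buffers), and `stream` are assumed to already exist.

```cpp
// One ordinary enqueue first, so TensorRT's lazy initialization
// happens outside the capture region.
ctx->enqueueV2(bindings, stream, nullptr);
cudaStreamSynchronize(stream);

// Capture a single inference into a CUDA graph.
cudaGraph_t graph;
cudaStreamBeginCapture(stream, cudaStreamCaptureModeThreadLocal);
ctx->enqueueV2(bindings, stream, nullptr);
cudaStreamEndCapture(stream, &graph);

// Instantiate once, then replay cheaply. Replaying the graph is what
// drops the enqueue time in the table above (2.24 ms -> 0.07 ms).
cudaGraphExec_t graphExec;
cudaGraphInstantiate(&graphExec, graph, nullptr, nullptr, 0);
for (int i = 0; i < 1000; ++i) {
    cudaGraphLaunch(graphExec, stream);
}
cudaStreamSynchronize(stream);
```

If this is what trtexec does internally, is there a way to get the same capture/replay behavior out of the nvinfer plugin, or would it require a custom plugin?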