Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU): GeForce RTX 3090
• DeepStream Version: 6.1
• TensorRT Version: 8.2
• NVIDIA GPU Driver Version (valid for GPU only): 510
• Issue Type( questions, new requirements, bugs): question
I did some network benchmarking using trtexec and noticed that setting the --useCudaGraph flag increases engine performance significantly. The results are shown below:
| Throughput | Enqueue Time | H2D Latency | GPU Compute Time |
All experiments ran 1000 queries and computed averages every 10 queries (--warmUp=0 --duration=0 --iterations=1000 --avgRuns=10). Throughput is measured in queries per second; all other measurements are means in milliseconds.
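A minimal sketch of how the two runs could be scripted for comparison, assuming a serialized engine loaded with trtexec's --loadEngine option. The engine path model.engine and the helper name build_trtexec_cmd are placeholders for illustration and are not taken from the setup above; only the timing flags and --useCudaGraph come from the post.

```python
import shutil
import subprocess

# Timing flags quoted in the post; they are shared by both runs.
COMMON_FLAGS = ["--warmUp=0", "--duration=0", "--iterations=1000", "--avgRuns=10"]


def build_trtexec_cmd(engine_path, use_cuda_graph):
    """Return the trtexec argument list for one benchmark run."""
    cmd = ["trtexec", f"--loadEngine={engine_path}", *COMMON_FLAGS]
    if use_cuda_graph:
        # The only difference between the baseline and the CUDA Graph run.
        cmd.append("--useCudaGraph")
    return cmd


if __name__ == "__main__":
    engine = "model.engine"  # hypothetical path, replace with a real engine file
    for use_graph in (False, True):
        cmd = build_trtexec_cmd(engine, use_graph)
        print(" ".join(cmd))
        # Only launch the benchmark when trtexec is actually installed.
        if shutil.which("trtexec"):
            subprocess.run(cmd, check=False)
```

Keeping every flag except --useCudaGraph identical between the two runs makes it easier to attribute any difference in throughput or enqueue time to graph launch alone.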
I figure I can improve pipeline performance by enabling CUDA Graphs in DeepStream 6.1 somehow, but I see no mention of it in the nvinfer plugin documentation. Does DeepStream support CUDA Graphs?