• Hardware Platform (Jetson / GPU)
Jetson NX on Floyd FLD-BB01 carrier board.
deviceQuery gives the following output:
./deviceQuery Starting...
CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "Xavier"
CUDA Driver Version / Runtime Version 10.2 / 10.2
CUDA Capability Major/Minor version number: 7.2
<…some more stuff, then at the end…>
Compute Mode:
< Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.2, NumDevs = 1
Result = PASS
• DeepStream Version
Followed the Jetson Setup instructions on this page: Quickstart Guide — DeepStream 6.3 Release documentation
…and installed the DeepStream SDK according to Method 4, using the container: nvcr.io/nvidia/deepstream-l4t:6.0-samples
• JetPack Version (valid for Jetson only)
JetPack 4.6 installed from instructions on this page: How to Install JetPack :: NVIDIA JetPack Documentation
Due to limited disk space (16 GB), the method mentioned on that page was used:
If disk space is limited (for example, when using a 16GB microSD card with a Jetson Nano or Jetson Xavier NX developer kit), use these commands:
sudo apt update
apt depends nvidia-jetpack | awk '{print $2}' | xargs -I {} sudo apt install -y {}
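(As a side note, a minimal illustration of what that pipeline does, using simulated input rather than the real `apt depends` output: awk keeps only the second whitespace-separated field of each `Depends:` line, i.e. the package name, which xargs then hands to `sudo apt install -y` one at a time.)

```shell
# Simulated `apt depends nvidia-jetpack` output (the real output lists many
# more packages); awk '{print $2}' strips the "Depends:" prefix and keeps
# only the package names.
printf '  Depends: nvidia-cuda\n  Depends: nvidia-tensorrt\n' | awk '{print $2}'
# prints:
# nvidia-cuda
# nvidia-tensorrt
```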
• TensorRT Version
The command:
dpkg -l | grep TensorRT
gives:
ii graphsurgeon-tf 8.0.1-1+cuda10.2 arm64 GraphSurgeon for TensorRT package
ii libnvinfer-bin 8.0.1-1+cuda10.2 arm64 TensorRT binaries
ii libnvinfer-dev 8.0.1-1+cuda10.2 arm64 TensorRT development libraries and headers
ii libnvinfer-doc 8.0.1-1+cuda10.2 all TensorRT documentation
ii libnvinfer-plugin-dev 8.0.1-1+cuda10.2 arm64 TensorRT plugin libraries
ii libnvinfer-plugin8 8.0.1-1+cuda10.2 arm64 TensorRT plugin libraries
ii libnvinfer-samples 8.0.1-1+cuda10.2 all TensorRT samples
ii libnvinfer8 8.0.1-1+cuda10.2 arm64 TensorRT runtime libraries
ii libnvonnxparsers-dev 8.0.1-1+cuda10.2 arm64 TensorRT ONNX libraries
ii libnvonnxparsers8 8.0.1-1+cuda10.2 arm64 TensorRT ONNX libraries
ii libnvparsers-dev 8.0.1-1+cuda10.2 arm64 TensorRT parsers libraries
ii libnvparsers8 8.0.1-1+cuda10.2 arm64 TensorRT parsers libraries
ii nvidia-container-csv-tensorrt 8.0.1.6-1+cuda10.2 arm64 Jetpack TensorRT CSV file
ii nvidia-tensorrt 4.6-b199 arm64 NVIDIA TensorRT Meta Package
ii python3-libnvinfer 8.0.1-1+cuda10.2 arm64 Python 3 bindings for TensorRT
ii python3-libnvinfer-dev 8.0.1-1+cuda10.2 arm64 Python 3 development package for TensorRT
ii tensorrt 8.0.1.6-1+cuda10.2 arm64 Meta package of TensorRT
ii uff-converter-tf 8.0.1-1+cuda10.2 arm64 UFF converter for TensorRT package
• NVIDIA GPU Driver Version (valid for GPU only)
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.2, NumDevs = 1
• Issue Type( questions, new requirements, bugs)
When running the PeopleNet sample, the speed is not even close to the NVIDIA claim of 157 fps.
Only about 30 fps is achieved.
Also, the 'out-of-the-box' sample setup seems to have some issue, as a huge number of warnings are produced.
See the attached file for the full run-time output.
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
Set up the PeopleNet sample according to the instructions in the README.md file inside the directory: /opt/nvidia/deepstream/deepstream-6.0/samples/configs/tao_pretrained_models
All config files are used as-is, without any modifications. Then run the demo with:
deepstream-app -c deepstream_app_source1_peoplenet.txt
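For reference, the ~30 fps figure above is the number periodically printed by deepstream-app's built-in perf measurement, which is controlled by the [application] group of the config file (shown here as it typically appears in the shipped configs; the exact interval value is an assumption if it differs from the stock file):

```ini
[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
```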
Thank you for your support.
peoplenet_run_output.txt (130.3 KB)