Hi all,
I have a program that detects and tracks moving vehicles and visualizes some statistics. It is well optimized and runs in real time on a Jetson Nano (using a TensorRT engine for detection). I moved the same code and models from the Nano to a TX2 and regenerated the engine files, but the program runs 5-6 times slower than on the Nano. Normally the TX2, with its 2 additional Denver CPU cores, should be faster. I measure the processing time for detection and tracking separately: detection time is similar on both boards, but tracking with Dlib (I am using the correlation tracker) is much slower on the TX2.
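For reference, this is roughly how I time the two stages per frame. The `detect` and `track` functions below are placeholders standing in for my actual TensorRT detector and Dlib correlation-tracker update, not real APIs:

```python
import time

def time_stage(fn, *args):
    """Run one pipeline stage and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

# Placeholder stages; in the real program these call the TensorRT
# engine and dlib.correlation_tracker().update(frame) respectively.
def detect(frame):
    return ["car"]

def track(frame):
    return ["car@(10,20)"]

frame = object()  # stands in for a captured video frame
detections, det_t = time_stage(detect, frame)
tracks, trk_t = time_stage(track, frame)
print(f"detection: {det_t * 1000:.2f} ms, tracking: {trk_t * 1000:.2f} ms")
```

With this measurement the detection numbers match between the boards; only the tracking time blows up on the TX2.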
Environment (for both TX2 and Nano):
- Jetpack L4T 32.2.1
- GPU compute capability: 6.2 (TX2) / 5.3 (Nano)
- OpenCV 4.1.1, compiled from source with CUDA enabled (built on a 128 GB SD card on the TX2 / on internal storage on the Nano)
- TensorRT 5.1.6.1
- Dlib 19.17 (compiled from source on internal storage)
- Matplotlib 2.1.1 (installed with pip3)
- CUDA 10.0.326
- cuDNN 7.5.0.56
- VisionWorks 1.6.0.500n
- Python 3.6.8
Things I tried without success:
- all the NVP models on the TX2
- compiled OpenCV with TBB support
- used the latest version of Dlib with CUDA/LAPACK/BLAS support (19.18, released 2-3 weeks ago): https://github.com/davisking/dlib/releases
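To be precise about the first point, I cycled through the power modes and locked the clocks with the standard L4T tools (shown here for the MAXN mode; the mode index is just one of the values I tried):

```shell
# Query the current NVP model
sudo nvpmodel -q

# Switch to MAXN (mode 0) -- I also tried the other modes
sudo nvpmodel -m 0

# Lock CPU/GPU/EMC clocks to their maximum for the current mode
sudo jetson_clocks
```

None of these changed the Dlib tracking time noticeably.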
Are there any ideas I can try to fix this issue?
Thanks in advance