Why is the frame rate lower on Nano than on TX1?

Hi,
We use a USB 3.0 camera of 400M pixel (16-bit data), the TUCSEN Dhyana 401DS. It is the same as the 400D except for the heat sink. http://www.tucsen.com/Home/Product/parameter/dataid/15/id/18.html
We found that the frame rate differs between the TX1 module and the Nano module: 35 fps on TX1 but only 20 fps on Nano, with the same camera and test software. Why is the frame rate lower on the Nano? Do you have any suggestions to improve it? Thanks.

Justin

Hi,
Could you run ‘sudo jetson_clocks’ and ‘sudo tegrastats’ on both TX1 and Nano and compare the results? TX1 has better performance than Nano. The tegrastats output should give some clues.

Where can we find the Nano's CPU and GPU mode, and how do we set it? Thanks.

Hi,

Please run ‘sudo jetson_clocks’ and ‘sudo tegrastats’ and check the console output. You will see the status of the CPU, GPU and other hardware engines.
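
For anyone who wants to collect the tegrastats output for a side-by-side comparison rather than reading it off the console, here is a minimal Python sketch. The helper name `capture_lines` is hypothetical, and the `--interval` option (in milliseconds) is assumed to be available in your tegrastats version. tegrastats keeps printing until it is stopped, so the helper reads a fixed number of lines and then terminates the process.

```python
import subprocess


def capture_lines(cmd, max_lines):
    """Run cmd and return up to max_lines lines of its stdout, then stop it."""
    lines = []
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        try:
            for line in proc.stdout:
                lines.append(line.rstrip("\n"))
                if len(lines) >= max_lines:
                    break
        finally:
            # tegrastats never exits on its own, so stop it explicitly.
            proc.terminate()
    return lines


# On the Jetson, run this script itself with sudo (so the child process can
# be terminated) after 'sudo jetson_clocks' has been applied:
# samples = capture_lines(["tegrastats", "--interval", "1000"], 10)
# print("\n".join(samples))
```

Running the same capture on both modules while the camera application is active gives two logs that can be compared line by line.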

Hi,
We tested the TX1 and Nano with tegrastats; the results are below. It seems nothing is at maximum load on the Nano, so why is the frame rate still lower?

Nano test result:
RAM 736/3964MB (lfb 496x4MB) SWAP 0/1982MB (cached 0MB) IRAM 0/252kB(lfb 252kB) CPU [48%@1428,57%@1428,52%@1428,45%@1428] EMC_FREQ 15%@1600 GR3D_FREQ 2%@921 APE 25 PLL@27.5C CPU@30.5C PMIC@100C GPU@29C AO@36.5C thermal@29.75C
RAM 736/3964MB (lfb 496x4MB) SWAP 0/1982MB (cached 0MB) IRAM 0/252kB(lfb 252kB) CPU [56%@1428,43%@1428,41%@1428,57%@1428] EMC_FREQ 15%@1600 GR3D_FREQ 9%@921 APE 25 PLL@27.5C CPU@31C PMIC@100C GPU@29C AO@37C thermal@29.75C
RAM 736/3964MB (lfb 496x4MB) SWAP 0/1982MB (cached 0MB) IRAM 0/252kB(lfb 252kB) CPU [73%@1428,56%@1428,41%@1428,31%@1428] EMC_FREQ 15%@1600 GR3D_FREQ 7%@921 APE 25 PLL@27.5C CPU@31.5C PMIC@100C GPU@29C AO@36.5C thermal@30C
RAM 736/3964MB (lfb 496x4MB) SWAP 0/1982MB (cached 0MB) IRAM 0/252kB(lfb 252kB) CPU [48%@1428,55%@1428,42%@1428,52%@1428] EMC_FREQ 15%@1600 GR3D_FREQ 6%@921 APE 25 PLL@28C CPU@30.5C PMIC@100C GPU@29C AO@36C thermal@30.25C

TX1 test result:
RAM 377/3999MB (lfb 791x4MB) cpu [45%,36%,7%,100%]@1734 EMC 12%@1600 AVP 4%@12 NVDEC 192 MSENC 192 GR3D 28%@76 EDP limit 1734
RAM 377/3999MB (lfb 791x4MB) cpu [20%,34%,29%,100%]@1734 EMC 12%@1600 AVP 4%@12 NVDEC 192 MSENC 192 GR3D 0%@76 EDP limit 1734
RAM 376/3999MB (lfb 791x4MB) cpu [11%,35%,32%,100%]@1734 EMC 12%@1600 AVP 4%@12 NVDEC 192 MSENC 192 GR3D 0%@76 EDP limit 1734
RAM 376/3999MB (lfb 791x4MB) cpu [15%,35%,29%,100%]@1734 EMC 12%@1600 AVP 4%@12 NVDEC 192 MSENC 192 GR3D 43%@76 EDP limit 1734
RAM 376/3999MB (lfb 791x4MB) cpu [13%,34%,34%,100%]@1734 EMC 12%@1600 AVP 4%@12 NVDEC 192 MSENC 192 GR3D 0%@76 EDP limit 1734
RAM 376/3999MB (lfb 791x4MB) cpu [11%,35%,31%,100%]@1734 EMC 12%@1600 AVP 4%@12 NVDEC 192 MSENC 192 GR3D 0%@76 EDP limit 1734
RAM 376/3999MB (lfb 791x4MB) cpu [48%,41%,0%,99%]@1734 EMC 12%@1600 AVP 4%@12 NVDEC 192 MSENC 192 GR3D 0%@76 EDP limit 1734
RAM 376/3999MB (lfb 791x4MB) cpu [27%,36%,18%,100%]@1734 EMC 12%@1600 AVP 4%@12 NVDEC 192 MSENC 192 GR3D 56%@76 EDP limit 1734
RAM 376/3999MB (lfb 791x4MB) cpu [50%,65%,9%,71%]@1734 EMC 12%@1600 AVP 4%@12 NVDEC 192 MSENC 192 GR3D 0%@76 EDP limit 1734
RAM 376/3999MB (lfb 791x4MB) cpu [12%,100%,29%,34%]@1734 EMC 12%@1600 AVP 4%@12 NVDEC 192 MSENC 192 GR3D 0%@76 EDP limit 1734
RAM 376/3999MB (lfb 791x4MB) cpu [13%,100%,31%,33%]@1734 EMC 12%@1600 AVP 4%@12 NVDEC 192 MSENC 192 GR3D 50%@76 EDP limit 1734
RAM 376/3999MB (lfb 791x4MB) cpu [11%,100%,31%,33%]@1734 EMC 12%@1600 AVP 4%@12 NVDEC 192 MSENC 192 GR3D 0%@76 EDP limit 1734
RAM 376/3999MB (lfb 791x4MB) cpu [11%,100%,30%,35%]@1734 EMC 12%@1600 AVP 4%@12 NVDEC 192 MSENC 192 GR3D 0%@76 EDP limit 1734
RAM 376/3999MB (lfb 791x4MB) cpu [13%,100%,35%,33%]@1734 EMC 12%@1600 AVP 4%@12 NVDEC 192 MSENC 192 GR3D 32%@76 EDP limit 1734

Hi,
On TX1, we can see one CPU core at 1.7GHz and 100% load. On Nano, all four cores are at 1.4GHz and ~50% load. We are not sure, but the application probably runs better with one core at full load.

Please try upgrading. On r32 releases, thermal information is printed and may provide more clues.
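
The per-core comparison can be checked against the pasted logs with a short script. This is only a sketch: the function names `parse_cpu` and `summarize` are hypothetical, and the regular expression covers just the two CPU field layouts that appear in this thread (`CPU [48%@1428,...]` on the Nano and `cpu [45%,...]@1734` on the TX1).

```python
import re

# Matches the bracketed per-core list in both layouts seen in this thread:
#   Nano: CPU [48%@1428,57%@1428,...]
#   TX1:  cpu [45%,36%,7%,100%]@1734
CPU_FIELD = re.compile(r"cpu \[([^\]]*)\](?:@(\d+))?", re.IGNORECASE)


def parse_cpu(line):
    """Return a list of (load_percent, freq_mhz) tuples, one per active core."""
    match = CPU_FIELD.search(line)
    if match is None:
        return []
    shared_freq = int(match.group(2)) if match.group(2) else None
    cores = []
    for entry in match.group(1).split(","):
        entry = entry.strip()
        if "%" not in entry:  # skip entries without a load value
            continue
        load, _, freq = entry.partition("%")
        freq = freq.lstrip("@")
        cores.append((int(load), int(freq) if freq else shared_freq))
    return cores


def summarize(lines):
    """Return (average busiest-core load, average mean load) over all samples."""
    samples = [s for s in (parse_cpu(line) for line in lines) if s]
    if not samples:
        raise ValueError("no CPU fields found")
    busiest = sum(max(load for load, _ in s) for s in samples) / len(samples)
    mean = sum(sum(load for load, _ in s) / len(s) for s in samples) / len(samples)
    return busiest, mean
```

For the first sample of each log, this gives a busiest core of 57% and a mean of 50.5% on the Nano, against a busiest core of 100% and a mean of 47% on the TX1. The total CPU load is similar; the difference is how it is distributed across the cores.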

Is there any better solution to improve the frame rate on the Nano? Thanks.

Hi,
On Jetson platforms, the optimal software frameworks are gstreamer and tegra_multimedia_api.
https://developer.nvidia.com/embedded/dlc/l4t-accelerated-gstreamer-guide-32-2
https://docs.nvidia.com/jetson/archives/l4t-multimedia-archived/l4t-multimedia-3231/index.html

Both utilize hardware DMA buffers (named NVMM in gstreamer, NvBuffer in tegra_multimedia_api) to get optimized performance. It would be great if you could take a look and see whether your use case can be adapted.
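
To illustrate what an NVMM pipeline looks like, here is a rough Python sketch that only builds the gst-launch-1.0 argument list. It rests on several assumptions that may not hold for this camera: that the camera is exposed as a V4L2 device (`/dev/video0` is a placeholder), that it delivers a format nvvidconv accepts (16-bit data from a vendor SDK may not qualify), and that the resolution and frame rate shown are placeholders. The helper name `build_nvmm_pipeline` is hypothetical.

```python
def build_nvmm_pipeline(device="/dev/video0", width=1920, height=1080, fps=30):
    """Build gst-launch-1.0 arguments that move frames into NVMM buffers.

    All parameter defaults are placeholders, not values for a real camera.
    """
    elements = [
        # Capture from a V4L2 device (assumption: the camera provides one).
        ["v4l2src", f"device={device}"],
        [f"video/x-raw,width={width},height={height},framerate={fps}/1"],
        # nvvidconv copies/converts the frame into a hardware DMA buffer.
        ["nvvidconv"],
        ["video/x-raw(memory:NVMM),format=NV12"],
        # Display sink that consumes NVMM buffers directly.
        ["nvoverlaysink"],
    ]
    args = ["gst-launch-1.0"]
    for index, element in enumerate(elements):
        if index > 0:
            args.append("!")  # gst-launch link operator between elements
        args.extend(element)
    return args


# Passing the list to subprocess.run() avoids shell quoting problems with
# the parentheses in the NVMM caps string:
# import subprocess
# subprocess.run(build_nvmm_pipeline(), check=True)
```

Once frames are in NVMM memory, downstream hardware elements can work on them without extra CPU copies, which is where the performance benefit comes from.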

We use the R32.2.3 package. Would R32.3.1 be better for this issue?