I did enable the two Denver cores several months ago with your kind assistance.
We tested the performance then, but it did not improve. The reason may be that the 4 ARM cores are enough for our algorithm, while the GPU is the limiting factor. Thus, the two additional Denver cores are redundant even though they took the majority of the CPU calculations.
We will try to add more load on the CPU to see whether the CPU scheduling is the problem or whether power management is limiting the performance of the two Denver cores.
Finally, thank you for your patience and for the NVIDIA team's support.
Actually, it seems that lots of bugs appeared in JetPack 4.4 DP, but I found that the performance of CUDA 10.2 is much better than on JetPack 4.3. cuSPARSE became much faster.
If you have found any other features that improve calculation performance, please let me know. Thank you all.
I know this will be a bit of a pain, but if you get this setup working again and then go to do an update, perhaps you could log the update. Then, if the problem comes back, there will be a list of packages that may have triggered the issue. For example:
sudo apt update
sudo apt-get upgrade 2>&1 | tee log_upgrade.txt
I have the same problem and want to enable the Denver cores in JetPack 4.4 DP.
I am new to Jetson and I am a little confused. Would you please let me know where the Linux_for_Tegra folder is?
It's on the host machine that you flashed 4.4 with. However, I would not bother, as it will return to the 4-core state after an update. Wait for a fix or flash 4.3.
Hi,
Please check section 5.16, "Increased Kernel Launch Latency on Denver 2 Cores", in the release notes of r32.4.3.
Because of this concern, we disable the Denver cores by default. To leverage the Denver cores, please execute sudo nvpmodel -m 0 and then use taskset: https://man7.org/linux/man-pages/man1/taskset.1.html
To use core #1, set the affinity mask to 0x2:
$ taskset 0x2 <_USER_APPLICATION_> &
To use core #2, set the affinity mask to 0x4:
$ taskset 0x4 <_USER_APPLICATION_> &
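The masks above follow the usual taskset convention: bit N of the mask stands for CPU N, so 0x2 selects core #1 and 0x4 selects core #2. As a rough sketch of how such a mask could be built for other combinations (the helper names cpus_to_mask and mask_to_cpus are made up for illustration and are not part of any NVIDIA tool):

```python
def cpus_to_mask(cpus):
    """Build a taskset-style bitmask: bit N set means CPU N is allowed.

    Hypothetical helper for illustration only.
    """
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def mask_to_cpus(mask):
    """List the CPU indices whose bits are set in the mask."""
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


print(hex(cpus_to_mask([1])))     # mask for core #1 only
print(hex(cpus_to_mask([2])))     # mask for core #2 only
print(hex(cpus_to_mask([1, 2])))  # mask allowing both cores #1 and #2
print(mask_to_cpus(0x6))          # which cores a given mask allows
```

Under the same convention, allowing a process to run on both cores #1 and #2 would combine the two bits into a single mask.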
We have worked out a solution. Please check this comment.
Is NVIDIA going to fix the latency issue with the Denver cores?
OR
Is the plan to leave them disabled by default as it is for all future JetPack releases and force customers to make the described change if they want to run processes on the Denver cores?
We have seen a performance deviation between the Denver cores and the ARM cores in MAXN mode, which triggers certain issues in benchmarks. After further investigation, the better solution for now is to let tasks be scheduled on the ARM cores. To leverage the Denver cores, you would need to schedule tasks on them manually through taskset. From r32.4.3, this is the default mode for TX2.
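Before pinning anything with taskset, it may be worth checking which cores are currently online. On Linux this is normally exposed in /sys/devices/system/cpu/online as a list such as "0,3-5". The following is only a sketch: the function names are hypothetical, and the sample string is an illustrative assumption rather than output captured from a TX2.

```python
def parse_cpu_list(text):
    """Parse a kernel CPU list such as "0,3-5" into [0, 3, 4, 5]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate an empty file or trailing comma
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def online_cpus(path="/sys/devices/system/cpu/online"):
    """Read the list of online CPUs from sysfs (Linux only)."""
    with open(path) as f:
        return parse_cpu_list(f.read())


# Illustrative input only; on a real board, call online_cpus() instead.
sample = "0,3-5"
cpus = parse_cpu_list(sample)
print(cpus)
print("cores 1 and 2 online:", 1 in cpus and 2 in cpus)
```

If cores 1 and 2 are missing from the list, a taskset mask of 0x2 or 0x4 has no online core to run on, so the power mode would need to be changed first.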