I want to write code for the Denver2 cores on the TX2 but can't find any documentation on how to go about it. Does anyone know of any resources, such as example code or guides on how to compile VLIW code from C/C++?
I’m interested in this too, as well as in any white paper discussing the differences between the two architectures. OS messages indicate that it thinks all 6 cores are identical. To use the Denver cores versus the standard(?) armhf cores, do we establish a core lock/assignment in our source code? How can one tell which CPU is which among 0-5?
I have noticed that the Denver cores always seem to be allocated as CPU 1 and CPU 2, while the A57 cores are CPUs 0, 3, 4, and 5. /proc/cpuinfo shows only the A57 cores, and dmesg shows CPU1 and CPU2 being shut down during boot around the 16.8-second mark.
It appears that by default the TX2 boots with the four A57 cores running at the full 2GHz and the Denver2 cores disabled. If I had to guess why, I would say it’s because Linux can’t take advantage of these cores.
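One way to check which CPU number is which core type is to read the `CPU implementer` and `CPU part` fields that arm64 kernels print in /proc/cpuinfo. The Python sketch below assumes the ID values found in the Linux arm64 headers (implementer 0x4e with part 0x003 for Denver 2, implementer 0x41 with part 0xd07 for Cortex-A57); the function name is made up for illustration, and cores that are shut down will not appear in the file at all.

```python
# Sketch: label each listed core by the IDs reported in /proc/cpuinfo.
# The (implementer, part) values are assumptions taken from the Linux arm64
# headers, not from NVIDIA documentation; verify them on your own board.
KNOWN_CORES = {
    (0x4E, 0x003): "Denver 2",
    (0x41, 0xD07): "Cortex-A57",
}

def classify_cores(cpuinfo_text):
    """Return {cpu_number: core_name} for every core listed in the text."""
    cores = {}
    cpu = implementer = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor" and value.isdigit():
            # Start of a new per-CPU block.
            cpu, implementer = int(value), None
        elif key == "CPU implementer":
            implementer = int(value, 16)
        elif key == "CPU part" and cpu is not None and implementer is not None:
            part = int(value, 16)
            cores[cpu] = KNOWN_CORES.get((implementer, part), "unknown")
    return cores

# On the board: print(classify_cores(open("/proc/cpuinfo").read()))
```

If CPUs 1 and 2 are missing from the result, that matches the observation that they are offline rather than absent.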
I guess the first thing is to actually enable the cores.
The Denver cores run the same ARM user-space code as the A57 cores, but allegedly do so slightly faster.
If you want to run on a particular core, you can set CPU affinity for the processes or threads of interest.
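As a rough illustration of setting CPU affinity from inside a program, here is a minimal Python sketch using `os.sched_setaffinity` (Linux only). The helper name `pin_to_cpus` is hypothetical, and the CPU numbers 1 and 2 are only what people in this thread have observed for the Denver cores, so check them on your own board.

```python
import os

# Assumption from this thread: CPUs 1 and 2 are the Denver cores on the TX2.
DENVER_CPUS = {1, 2}

def pin_to_cpus(wanted, pid=0):
    """Restrict a process (0 = the caller) to the wanted CPUs that are available.

    Returns the affinity set in effect afterwards. If none of the wanted CPUs
    is currently allowed, the affinity is left unchanged.
    """
    allowed = os.sched_getaffinity(pid)
    target = allowed & set(wanted)
    if target:
        os.sched_setaffinity(pid, target)
    return os.sched_getaffinity(pid)

# Typical use at program start: pin_to_cpus(DENVER_CPUS)
```

For a program you cannot modify, `taskset -c 1,2 ./your_program` from the shell has the same effect. Either way the cores have to be online first, otherwise there is nothing to pin to.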
You also need to use nvpmodel to actually turn on the Denver cores:
sudo nvpmodel -m 0
You can find the different power models described in /etc/nvpmodel.conf.
Note that you cannot turn off core 0, which is one of the A57 cores, so the best you can do is run with CPUs 0, 1, and 2 online and lock your threads to CPUs 1 and 2 to make sure your code runs on Denver.
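To confirm that the mode switch actually brought the Denver cores up, you can read the kernel's list of online CPUs from /sys/devices/system/cpu/online, which uses a compact format such as `0,3-5`. The Python sketch below parses that format; the function names are made up for illustration, and the default of CPUs 1 and 2 for Denver is the assumption reported in this thread.

```python
def parse_cpu_list(text):
    """Expand a kernel CPU list such as "0,3-5" into [0, 3, 4, 5]."""
    cpus = []
    for chunk in text.strip().split(","):
        if not chunk:
            continue
        first, _, last = chunk.partition("-")
        # A single number like "0" has no "-", so it is its own range end.
        cpus.extend(range(int(first), int(last or first) + 1))
    return cpus

def denver_cores_online(online_text, denver_cpus=(1, 2)):
    """True if every CPU assumed to be a Denver core is in the online list."""
    online = set(parse_cpu_list(online_text))
    return all(cpu in online for cpu in denver_cpus)

# On the board:
#   denver_cores_online(open("/sys/devices/system/cpu/online").read())
```

With only the A57 cores up, the file would be expected to read `0,3-5`; with all six cores enabled, `0-5`.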
Thanks for that! The Denver cores are showing up now.
I noticed that nvpmodel mode 0 has all the cores enabled and clocked at 2GHz, with the GPU set to 1.3GHz. I assume this falls outside the allowed TDP of ~7.5W with no fan (but stays under the ~15W maximum package TDP), so I guess I will just run with the fan always on to stay safe.
Am I correct in assuming that if I run regular ARM code specifically on the Denver cores, the hardware will translate the instructions to VLIW before executing them? Is there any way I can write VLIW-specific code rather than relying on this?
The NVIDIA developer board has a fan controller that starts the fan based on actual temperature, so if you’re using that, you’ll be fine.
If you have a board where the fan control is manual, and you expect to actually be using the GPU and many CPU cores at the same time, yes, turning on the fan is a good idea!
Modern cores of high-performance architectures (x86, x86/64, POWER, ARM, and so forth) are not strictly “VLIW” or “RISC” or “CISC” or any one implementation mechanism. They mix and match concepts for different parts of the execution engine, so as to get the best bang for the buck. In front of all that sit the instruction fetch, decode, and retirement functions, which make sure that, to the programmer, the CPU looks like the architecture manual says it looks.
No, you cannot program the internals of how the chip is designed to fetch, decode, execute, and retire instructions directly. That’s largely determined by hard-wired logic, and sometimes also by a bit of microcode, and microcode is not publicly available/documented.
If you want an interesting execution model to bite into, I think the modern CUDA model is quite exotic and powerful enough :-) And it’s reasonably well supported on the Jetson, with sm62 level support.
I have a related question. If one wants to take advantage of the Denver cores specific processing, does one use a different compiler?
As far as I know, Denver cores behave as an A57 once their microcode is loaded. I doubt the OS cares whether the core was created purely in hardware or updated via microcode. What makes it different is that microcode can change the core.
No. Denver looks like an ARM core. It follows the ARM programming model. It is ARM. You use it by programming in ARM instructions, and setting the power model to make sure they are turned on.