TX2 always runs on one CPU core, no matter how many threads are launched.

Hi experts,

Here I encountered a strange problem which is mentioned as the Title.

First, I ran the ‘lscpu’ command, and it shows the following:
https://pan.baidu.com/s/1nuDHWAt

It shows that there are 4 CPU cores online.

But when I launch 32 threads in a test program, the system seems to put all of them on only one CPU core, which is very strange. The test program, along with what the ‘top’ command and the System Monitor show, is below:

  • 1. Test program:

    ```
    #include <iostream>
    #include <stdio.h>
    #include <omp.h>
    using namespace std;

    int main(int argc, char *argv[]){
        long result = 0;

        // 32 OpenMP threads share the iterations of this loop
        #pragma omp parallel for num_threads(32)
        for(long i = 0; i < 100000000; ++i){
            printf("This is thread %d\n", omp_get_thread_num());
        }

        cout << "result:" << result << endl;
        return 0;
    }
    ```

    2. What the ‘top’ command and System Monitor show: https://pan.baidu.com/s/1qYRzpX6
    
    Please help me with this problem.
    Thanks!
  • Please, is somebody here?

    Hi oprell,

    Please check with below command.

    $ ps -o spid,psr -T -p 3140

    Hi vickyy,

    I ran the command you suggested, but it only shows the following.

    nvidia@tegra-ubuntu:~$ ps -o spid,psr 3140
     SPID PSR
    

    There’s nothing useful.

    You have to adjust the “-p 3140” to be the PID of the process…which is a moving target.

    Hi, linuxdev

    Thanks for your hint.

    I ran the command ‘ps -o spid,psr -T -p 3221’, and I got the output below.

    SPID PSR
     3221   0
     3222   0
     3223   0
     3224   3
     3225   3
     3226   0
     3227   0
     3228   0
     3229   3
     3230   0
     3231   0
     3232   3
     3233   3
     3234   3
     3235   0
     3236   0
     3237   0
     3238   0
     3239   0
     3240   0
     3241   0
     3242   0
     3243   0
     3244   0
     3245   0
     3246   3
     3247   0
     3248   0
     3249   3
     3250   0
     3251   3
     3252   0
    

    I launched 32 threads in my test program.
    It seems that the program executes on only one CPU core. What’s wrong with it?

    My guess is that OpenMP is not working the way you assume it will.
    This can be because of compiler options (for example, GCC ignores the OpenMP pragmas entirely unless the program is compiled with -fopenmp), because of runtime options, or because of other problems.

    What happens if you use pthreads instead of the OpenMP pragmas?
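
    In case it helps, here is a minimal pthreads sketch of that idea. It is not the original poster’s code; the worker body (a busy loop that reports its core via sched_getcpu()) is just a placeholder to make thread placement visible.

    ```
    // Minimal pthreads sketch (assumed example, not from the original post):
    // each worker spins on a private counter, then reports the core it ended on.
    // Compile with: g++ -O2 -pthread pthread_test.cpp -o pthread_test
    #include <pthread.h>
    #include <sched.h>
    #include <cstdio>

    static void *worker(void *arg){
        long id = (long)arg;
        volatile long local = 0;                 // busy work so the thread stays runnable
        for (long i = 0; i < 100000000; ++i){
            ++local;
        }
        // sched_getcpu() reports the core this thread happens to be on right now
        printf("thread %ld finished on core %d\n", id, sched_getcpu());
        return nullptr;
    }

    int main(){
        const int kThreads = 32;                 // same count as the OpenMP test
        pthread_t threads[kThreads];

        for (long t = 0; t < kThreads; ++t){
            pthread_create(&threads[t], nullptr, worker, (void *)t);
        }
        for (int t = 0; t < kThreads; ++t){
            pthread_join(threads[t], nullptr);
        }
        return 0;
    }
    ```

    If this version spreads across cores while the OpenMP version does not, that would point at the OpenMP build or runtime settings rather than at the scheduler.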

    I’m not familiar with OpenMP, but on my side it looks different from yours. FYI.

    nvidia@tegra-ubuntu:~$ ps -o spid,psr -T -p 30550
     SPID PSR
    30550   4
    30551   5
    30552   3
    30553   5
    30554   4
    30555   4
    30556   3
    30557   3
    30558   5
    30559   3
    30560   5
    30561   5
    30562   4
    30563   5
    30564   3
    30565   5
    30566   4
    30567   3
    30568   4
    30569   0
    30570   5
    30571   5
    30572   3
    30573   5
    30574   3
    30575   3
    30576   4
    30577   0
    30578   4
    30579   3
    30580   5
    30581   4
    

    Your earlier ‘ps -o spid,psr -T -p 3221’ output shows two CPU cores, not one (core 0 and core 3). Performance mode may modify which cores are available (except core 0, which is always available).

    Prior to testing, try maximizing performance mode:

    # To see what is enabled:
    sudo cat /sys/devices/system/cpu/online
    # To see available modes:
    sudo nvpmodel -p --verbose
    # Set performance:
    sudo nvpmodel -m0
    # Also:
    sudo /home/ubuntu/jetson_clocks.sh
    

    Try your test when all cores are guaranteed online.
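
    As a quick sanity check from inside a program, something like the following sketch (my own example, not from this thread) reports how many cores the kernel currently has online, which should match what /sys/devices/system/cpu/online shows.

    ```
    // Sketch: compare how many cores are currently online with how many exist.
    #include <unistd.h>
    #include <cstdio>

    int main(){
        long online = sysconf(_SC_NPROCESSORS_ONLN);  // cores online right now
        long total  = sysconf(_SC_NPROCESSORS_CONF);  // cores the system knows about
        printf("online cores: %ld of %ld\n", online, total);
        return 0;
    }
    ```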

    Consider also that unless you specifically force a given core, the scheduler may be picking the best core. The obvious answer of always using all cores may not actually be the correct answer. The problem is that cache has a lot to do with performance, and a cache miss is very expensive relative to a cache hit. If the threads share data, it may be a case of the scheduler trying to take advantage of cache. Whether you would want to override this might depend on whether you believe the speed bottleneck is compute bound or data bound. Once all cores are enabled, you probably need to profile before you try to outguess the scheduler.

    I tried these commands, but they don’t seem to have any effect. Out of 6 cores, only 4 are activated on my TX2. Any suggestions, please!

    If you run “sudo nvpmodel -m 0”, then all cores should be active. This doesn’t mean all cores will have software running on them, but it does mean all cores are available.

    For a more intuitive test, you can install htop (“sudo apt-get install htop”), then run “htop”. You will see cores listed as bar charts at the top. See if each core varies a bit as you do different things on the system…web browsing is probably a good way to make things jump around.

    I do not know about OpenMP, but there are things to consider even with ordinary thread models. The system has a scheduler, and unless something has intervened, the core a process or thread runs on is entirely up to the scheduler. Core affinity can be used to put a process on a single core, but spreading threads out from a single process is a lot more complicated.

    The scheduler is aware of cache. Any time you migrate from one core to another you will tend to have a cache miss instead of a cache hit. In cases where the scheduler is keeping things on one core it may in fact be because this is faster due to fewer cache misses.

    If you are interested in forcing certain threads or processes to a given core, then you will want to understand process/thread affinity. Here is a good reference to consider:
    https://www.linuxjournal.com/article/6799

    There are variations on this, e.g., affinity for kernel threads (kthreads), which applies to driver modules even though they are not user-space applications.
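
    As a rough illustration of the thread-level side of this (my own sketch, not code from the article or this thread), a single thread can be pinned to one core with the GNU affinity API:

    ```
    // Sketch: pin the calling thread to core 3 (an arbitrary choice; the core must be online).
    // Compile with: g++ -pthread affinity_test.cpp -o affinity_test
    #include <pthread.h>
    #include <sched.h>
    #include <cstdio>

    int main(){
        cpu_set_t set;
        CPU_ZERO(&set);
        CPU_SET(3, &set);                      // allow only core 3

        // Restrict the calling thread to the cores in 'set'
        int rc = pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
        if (rc != 0){
            fprintf(stderr, "pthread_setaffinity_np failed: %d\n", rc);
            return 1;
        }
        printf("now running on core %d\n", sched_getcpu());
        return 0;
    }
    ```

    For a whole process rather than a single thread, the sched_setaffinity() call covered in the article above does the corresponding job.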

    Not having your threads distributed to all cores is not the same as having inactive cores, and it can in no way be considered a lack of a working core (cache hits and misses, the scheduler, threading models, and so on make this a very complicated topic).