Parallel computation using the pgcc compiler on a dual-core machine

I am using a dual-core workstation. It has 8 processors. Can I control each processor myself to do parallel computation?

Hi Sitha,

Which OS are you using? What type of parallel computation paradigm are you using: MPI, OpenMP, auto-parallelization, or threads?

I’m guessing you’re asking how to bind an OpenMP thread to a processor on Linux. If this is the case, then you have a variety of options. First you need to set the environment variable “NCPUS” (or “OMP_NUM_THREADS”) to the number of processors your application should use; this must be set no matter how you bind your threads.
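For example, in csh you would set the thread count before launching your program (“NCPUS” works the same way; “a.out” is just a placeholder executable):

 % setenv OMP_NUM_THREADS 8
 % ./a.out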

The first option, which most systems support, is “taskset” (see “man taskset” for more info), where you give it a hexadecimal bitmask (or a numerical list if using “-c”) corresponding to the CPUs your processes are allowed to use. However, it’s not needed if you’re using all the processors, unless you need them bound in a particular order.
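For example, to restrict a run to the first two CPUs, either form works (executable name is a placeholder):

 % taskset 0x3 ./a.out
 % taskset -c 0,1 ./a.out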

On NUMA-enabled systems, you should also investigate the use of “numactl”. Like taskset, it lets you bind a set of threads to a set of processors. However, numactl generally does a better job of memory management. It doesn’t have as fine-grained control as taskset, so you can only assign threads by socket, not by CPU. This only matters for multi-core chips, since with single-core chips the socket and CPU are analogous. For multi-core chips, you can use both “numactl” and “taskset” together to get finer control.
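As a sketch (the node number is illustrative and depends on your topology; older numactl versions spell the CPU option “--cpubind”), you can bind both the CPUs and the memory of a run to one NUMA node:

 % numactl --cpunodebind=0 --membind=0 ./a.out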

With the recent releases of the PGI compilers (6.0 and newer) on both Linux and Windows, when linking with the “-mp” option, the NUMA libraries are linked in with your application on systems which support NUMA. (With 6.0 you need to use “-mp=numa”.) So instead of using “numactl” or “taskset”, when linked with “-mp” you can do the same thing by simply setting the environment variables “MP_BIND” and “MP_BLIST”. Setting “MP_BIND” to “yes” tells the runtime to bind your threads to a set of processors. “MP_BLIST” is the list of processors to bind to and the order in which they are bound. For example, “setenv MP_BLIST 7,5,3,1,6,4,2,0” will bind your threads starting at CPU #7 (the 8th processor) and interleave them across the rest of the CPUs. Unlike “numactl”, the granularity is by CPU, not socket.
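Putting it together in csh, a typical 8-thread run might look like this (the binding order is just the example from above; the executable name is a placeholder):

 % setenv OMP_NUM_THREADS 8
 % setenv MP_BIND yes
 % setenv MP_BLIST 7,5,3,1,6,4,2,0
 % ./a.out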

Binding MPI threads to particular CPUs is a bit more complex, but “doable”. Let me know if you need help with this.

  • Mat

Thank you very much for your useful information, but I need more help from you. I am using CentOS on my workstation, and the PGI compiler is release 6.1; I hope this is the latest version. My machine is called a dual-core machine, but when I checked the CPU info it shows 8 CPUs, which I don’t understand well. When I run an MPI program it shows only one node.
I am quite confused. Please help me find where the problem is in my machine.
Thank you.

Sitha


Hi Sitha,

Multi-core chips now being produced by AMD and Intel contain 2 or more CPUs on the same chip and connect to the motherboard via a single socket. Although the CPUs on these chips do share some components, they can logically be thought of as distinct. So while you may have a 4-socket, dual-core system, you can logically think of it as having 8 distinct and separate CPUs.
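That is also why your CPU info shows 8 CPUs: Linux reports each core as its own logical processor. A quick way to count them (this should print 8 on your system):

 % grep -c "^processor" /proc/cpuinfo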

“When I run an MPI program it shows only one node.”

Do you have your system listed only once in the machines.LINUX file? It should be listed 8 times in this file, or have a “:8” after its name, i.e. “systemname:8”.
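For example (“systemname” is a placeholder for your actual hostname), the file could contain either the hostname repeated on 8 separate lines, or the single line:

 systemname:8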

  • Mat

Hi Mat,

I am interested in knowing how to bind MPI threads to CPUs. Would you please show us how to do it? Thanks.

-Winston

Hi Winston,

I was afraid someone would ask ;-) First, the caveat is that the process is highly dependent upon the flavor of MPI you’re working with. You will need to modify the following script, and it may not work with all MPI implementations. Also, I’m assuming you’re using a 4-CPU SMP system.

The basic idea is that you create a wrapper script to launch your application and use “taskset” to bind each process to an individual processor. The processor used is dependent upon an MPI environment variable. Using LAM/MPI as an example, we would start up lamboot and use mpirun to launch our script.

 % lamboot -v hostfile
 % mpirun -np 4 run_script
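The hostfile is whatever you normally pass to lamboot; as a hypothetical example, a single 4-CPU node could be listed as:

 node01 cpu=4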

“run_script” uses the LAM environment variable “LAMRANK” to determine which processor each rank is bound to.

#!/bin/csh -f
# Wrapper script: bind each LAM/MPI rank to its own CPU via taskset.
# Adjust the bitmasks (and the executable name) to match your own system.

if ("$LAMRANK" == "0") then
 echo "RANK 0...."
 taskset 0x1 a.out    # CPU 0
else if ("$LAMRANK" == "1") then
 echo "RANK 1...."
 taskset 0x2 a.out    # CPU 1
else if ("$LAMRANK" == "2") then
 echo "RANK 2...."
 taskset 0x4 a.out    # CPU 2
else
 echo "OTHER RANKS...."
 taskset 0x8 a.out    # CPU 3
endif
exit 0

Again, you’ll need to modify this for your individual needs and it may not work for all MPI versions.

Have Fun!
Mat

Hi Mat,

Thanks for the information.

Our application uses MPICH (1.2.6) on a dual dual-core Opteron cluster. I am wondering if you can provide some pointers on how to do this with an MPICH set-up.

I was searching on the web and found Per Ekman’s work (Linux NUMA stuff). He posted a code patch for MPICH/MPICH2 for the ch_shmem device, but I think what I need is a patch for the ch_p4 device. I did not pursue that thread further.

Thanks again.

-Winston