Balancing GPU vs. CPU on Ubuntu Linux: CUDA task starved by CPU-bound tasks

Hi folks,

I’m participating in a couple of distributed computing projects that run on the BOINC framework. One - Rosetta - is CPU-bound and the other - GPUGRID - uses the CPU only to keep the GPU busy. This is running on a 4-core AMD system with a GTX 460 adapter. I find that with Rosetta using 4 cores at 100%, the GPUGRID application uses about 2% CPU and drives the GPU at about 10%:

hbarta@olive:~$ top -n 1

top - 11:59:54 up 18:29,  5 users,  load average: 5.15, 5.46, 5.40

Tasks: 251 total,   7 running, 244 sleeping,   0 stopped,   0 zombie

Cpu(s):  1.4%us,  0.6%sy, 97.2%ni,  0.5%id,  0.0%wa,  0.1%hi,  0.1%si,  0.0%st

Mem:   4057932k total,  4020360k used,    37572k free,    52612k buffers

Swap:  8000288k total,    20184k used,  7980104k free,  1078336k cached

PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND            

17736 boinc     39  19  447m 441m  400 R   98 11.1  24:18.51 minirosetta_2.1                 <<<<<<<<<<  

17737 boinc     39  19  440m 433m  624 R   98 10.9  24:20.91 minirosetta_2.1                 <<<<<<<<<<    

17738 boinc     39  19  453m 445m  304 R   98 11.2  24:25.14 minirosetta_2.1                 <<<<<<<<<<    

17735 boinc     39  19  437m 430m  368 R   89 10.9  24:13.27 minirosetta_2.1                 <<<<<<<<<< 

   41 root      20   0     0    0    0 S    2  0.0   0:03.56 ata/3              

 1908 root      20   0  156m  46m  15m S    2  1.2  20:03.41 Xorg               

 2265 hbarta    20   0  294m  39m  13m S    2  1.0   5:49.05 compiz             

 2398 hbarta    20   0  299m  25m  14m S    2  0.6  16:20.79 boincmgr           

 8431 hbarta    20   0  208m  22m  11m S    2  0.6   0:07.46 gnome-terminal     

 8886 boinc     30  10  166m  90m  32m R    2  2.3   1:54.67 acemd2_6.13_x86                 <<<<<<<<<<

13624 hbarta    20   0  537m  92m  32m S    2  2.3   1:30.03 chromium-browse    

13967 hbarta    25   5  876m  79m  22m S    2  2.0   1:04.08 chromium-browse    

16389 hbarta    25   5  906m  99m  20m S    2  2.5   1:00.22 chromium-browse    

18396 hbarta    20   0  862m  66m  26m S    2  1.7   0:17.94 chromium-browse    

{zero CPU users deleted}

hbarta@olive:~$ nvidia-smi -a 

==============NVSMI LOG==============

Timestamp			: Fri Apr 15 12:00:35 2011

Driver Version			: 260.19.29

GPU 0:

	Product Name		: GeForce GTX 460

	PCI Device/Vendor ID	: e2210de

	PCI Location ID		: 0:1:0

	Board Serial		: 650381377

	Display			: Connected

	Temperature		: 53 C

	Fan Speed		: 26%

	Utilization

	    GPU			: 8%

	    Memory		: 1%

hbarta@olive:~$

If I suspend Rosetta, the GPUGRID task uses about 8% CPU and keeps the GPU closer to 100%:

hbarta@olive:~$ top -n 1

top - 12:01:19 up 18:31,  5 users,  load average: 5.69, 5.55, 5.44

Tasks: 235 total,   1 running, 234 sleeping,   0 stopped,   0 zombie

Cpu(s):  1.4%us,  0.7%sy, 97.2%ni,  0.5%id,  0.0%wa,  0.1%hi,  0.1%si,  0.0%st

Mem:   4057932k total,  2219820k used,  1838112k free,    48924k buffers

Swap:  8000288k total,    20288k used,  7980000k free,  1074920k cached

PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                   

 8886 boinc     30  10  166m  90m  32m S    8  2.3   1:56.26 acemd2_6.13_x86                 <<<<<<<<<<           

 1908 root      20   0  156m  46m  15m S    6  1.2  20:07.18 Xorg                      

 2265 hbarta    20   0  294m  39m  13m S    2  1.0   5:50.29 compiz                    

 2334 hbarta    20   0  215m 9380 7096 S    2  0.2   2:55.87 multiload-apple           

 2398 hbarta    20   0  299m  25m  14m S    2  0.6  16:22.21 boincmgr                  

 8431 hbarta    20   0  208m  22m  11m S    2  0.6   0:08.33 gnome-terminal            

13624 hbarta    20   0  537m  92m  32m S    2  2.3   1:31.34 chromium-browse           

13721 hbarta    20   0  270m  87m  20m S    2  2.2   2:25.03 npviewer.bin              

16389 hbarta    25   5  906m 100m  20m S    2  2.5   1:01.81 chromium-browse           

20289 hbarta    20   0 19352 1372  928 R    2  0.0   0:00.03 top                                            

{zero CPU users deleted}

hbarta@olive:~$ nvidia-smi -a 

==============NVSMI LOG==============

Timestamp			: Fri Apr 15 12:01:32 2011

Driver Version			: 260.19.29

GPU 0:

	Product Name		: GeForce GTX 460

	PCI Device/Vendor ID	: e2210de

	PCI Location ID		: 0:1:0

	Board Serial		: 650381377

	Display			: Connected

	Temperature		: 57 C

	Fan Speed		: 34%

	Utilization

	    GPU			: 91%

	    Memory		: 12%

hbarta@olive:~$

Are there any tweaks I can apply to get the GPUGRID task more CPU without sacrificing too much for Rosetta? I know I can configure Rosetta to use only three cores and that will help, but I have to give up nearly 100% of one core in order to gain the 8% required to feed the GPU. I’ve tried bumping the nice value of the CUDA task but that seems not to help (and it is already less nice than the CPU-bound tasks).
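For reference, the kind of nice-value adjustment described above can be sketched as follows. This is a minimal illustration using Python's standard `os` module, not the exact commands used in this thread; the helper name `set_nice` is made up for the example. Lowering a nice value (raising priority) normally needs root or a suitable RLIMIT_NICE, while raising it is allowed for your own processes.

```python
import os


def set_nice(pid, nice_value):
    """Set the nice value of `pid` and return what the kernel reports afterwards.

    `set_nice` is a hypothetical helper name for this sketch.
    """
    os.setpriority(os.PRIO_PROCESS, pid, nice_value)
    return os.getpriority(os.PRIO_PROCESS, pid)


if __name__ == "__main__":
    # Demonstrate on the current process: make it one step nicer (capped at 19).
    current = os.getpriority(os.PRIO_PROCESS, 0)
    print(set_nice(os.getpid(), min(current + 1, 19)))
```

Run as root, a call such as `set_nice(8886, 0)` would target the acemd2 process from the listing above (PID 8886, currently NI 10).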

Maybe you could try resetting the priority for the processes.

Hi hyqneuron,

Thanks for the suggestion. That’s what I thought too. However, the GPUGRID task is already a higher priority than the Rosetta task (30 vs. 39). I did raise it even further and that had no effect. I believe there are other aspects of the Linux task scheduler besides priority that try to provide balance between compute-bound and I/O-bound (and interactive) tasks, and it seems like something there is not providing sufficient CPU cycles for the GPUGRID task.
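One scheduler knob besides the nice value is the scheduling policy itself. Linux offers SCHED_BATCH and SCHED_IDLE for background work, and a process under SCHED_IDLE runs at an even lower weight than nice 19. The sketch below only shows how a policy can be inspected and changed from Python; whether demoting the Rosetta tasks this way actually helps the GPU feeder is an untested assumption, and the helper names are invented for the example.

```python
import os

# Human-readable names for the policies exposed by Python's os module on Linux.
POLICY_NAMES = {
    os.SCHED_OTHER: "SCHED_OTHER",
    os.SCHED_BATCH: "SCHED_BATCH",
    os.SCHED_IDLE: "SCHED_IDLE",
    os.SCHED_FIFO: "SCHED_FIFO",
    os.SCHED_RR: "SCHED_RR",
}


def policy_name(pid=0):
    """Return the scheduling policy of `pid` (0 means the calling process)."""
    return POLICY_NAMES.get(os.sched_getscheduler(pid), "unknown")


def demote_to_idle(pid):
    """Move `pid` to SCHED_IDLE; the static priority must be 0 for this policy."""
    os.sched_setscheduler(pid, os.SCHED_IDLE, os.sched_param(0))


if __name__ == "__main__":
    print(policy_name())
```

To try it on the machine above, `demote_to_idle` would be called once for each minirosetta PID (17735 to 17738 in the first listing), as the boinc user or root.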

thanks,

hank

Is there any way you can modify the context creation of GPUGRID? Flags such as CU_CTX_SCHED_SPIN or CU_CTX_SCHED_BLOCKING_SYNC could be used to improve GPU thread performance.
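To show where those flags go, here is a rough sketch that creates a context through the CUDA driver API using `ctypes`. It assumes the driver library is installed as `libcuda.so.1` and exports `cuCtxCreate_v2` and `cuCtxDestroy_v2`; the function name `create_context` is made up for the example. It does not change anything in GPUGRID, whose binary creates its own context, so applying a flag there would need the project's source or some form of interposition that is not shown here.

```python
import ctypes

# Scheduling flags as defined in the CUDA driver API header (cuda.h).
CU_CTX_SCHED_AUTO = 0x00           # driver chooses between spinning and yielding
CU_CTX_SCHED_SPIN = 0x01           # host thread busy-waits for the GPU
CU_CTX_SCHED_YIELD = 0x02          # host thread yields while waiting
CU_CTX_SCHED_BLOCKING_SYNC = 0x04  # host thread sleeps until the GPU signals


def create_context(flags, device_ordinal=0):
    """Create and destroy a context with `flags`.

    Returns the CUresult code of the first failing call (0 on success),
    or None when no CUDA driver library can be loaded.
    """
    try:
        cuda = ctypes.CDLL("libcuda.so.1")  # assumed library name
    except OSError:
        return None
    status = cuda.cuInit(0)
    if status != 0:
        return status
    device = ctypes.c_int()
    status = cuda.cuDeviceGet(ctypes.byref(device), device_ordinal)
    if status != 0:
        return status
    context = ctypes.c_void_p()
    status = cuda.cuCtxCreate_v2(ctypes.byref(context), ctypes.c_uint(flags), device)
    if status == 0:
        cuda.cuCtxDestroy_v2(context)
    return status


if __name__ == "__main__":
    print(create_context(CU_CTX_SCHED_SPIN))
```

Roughly speaking, spinning trades host CPU time for lower latency when waiting on the GPU, while blocking sync frees the CPU at the cost of a wake-up delay.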

Still, did you go for the extreme? Such as setting the GPUGRID process priority to -20 and at the same time setting the priority for rosetta to 20?

I think that would involve modification of BOINC, no? Or perhaps I could code a trojan for the GPUGRID executables that would set the flags and then exec the correct binary. (If that wouldn’t interfere with normal communication between BOINC and the programs it kicks off.)

I did WRT the GPUGRID task. It made no difference. The Rosetta tasks are already at nice=19. I don’t imagine one more step would matter.

I did make some progress by switching to a real-time kernel (usually used to provide low latency in Ubuntu Studio). That results in about 44% GPU utilization with Rosetta using near 100% of 4 cores.

My plan at the moment is to allow it to run that way for at least several days to establish performance levels. Following that I may try running two instances of GPUGRID. Normally that does not provide a performance boost, but perhaps it will gain some performance in this situation.
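For tracking utilization over a multi-day run like this, the `nvidia-smi -a` report can be parsed instead of read by eye. The sketch below is written against the report layout pasted earlier in this thread (driver 260.19.29); other driver versions format the report differently, so treat the pattern as an assumption. The function name and the sample text are only for illustration, and with several GPUs the last one reported wins.

```python
import re

# Excerpt of the report format shown earlier in the thread.
SAMPLE = """
GPU 0:
    Product Name        : GeForce GTX 460
    Temperature         : 53 C
    Fan Speed           : 26%
    Utilization
        GPU             : 8%
        Memory          : 1%
"""


def parse_utilization(report):
    """Return {'gpu': int, 'memory': int} from the Utilization section."""
    values = {}
    in_block = False
    for line in report.splitlines():
        stripped = line.strip()
        if stripped == "Utilization":
            in_block = True
            continue
        if in_block:
            match = re.match(r"(GPU|Memory)\s*:\s*(\d+)\s*%", stripped)
            if match:
                values[match.group(1).lower()] = int(match.group(2))
    return values


if __name__ == "__main__":
    print(parse_utilization(SAMPLE))  # {'gpu': 8, 'memory': 1}
```

Feeding it the output of `nvidia-smi -a` at a fixed interval and appending the result to a file gives a simple utilization log for comparing kernels or task counts.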

thanks,

hank