High Xorg CPU usage during kernel execution

I’ve been noticing Xorg processor usage spikes sometimes when running my CUDA programs. I only really see it when the kernel takes a while to finish. I’m including some code that attempts to recreate the problem.

The kernel is meant to do some busy work. The usleep is meant to reduce the busy wait from cudaThreadSynchronize so my process’s CPU usage stays low.

I’m running

OS: Fedora Core 7 x86_64

Linux Kernel: 2.6.21-1.3194.fc7

NVIDIA Driver: 177.67

CUDA: NVIDIA_CUDA_Toolkit_2.0_rhel5.1_x86_64

CUDA SDK: NVIDIA_CUDA_SDK_2.02.0807.1535_linux

Server Version Number: 11.0

Server Vendor String: The X.Org Foundation

Server Vendor Version: 1.3.0 (10300000)

NV-Control Version: 1.17

[codebox]#include <stdio.h>
#include <unistd.h>

__global__ void kernel(float *x)
{
    // Just do something that will take time
    *x = 2.0;
    float a = *x;
    for (int i = 0; i < 10000; i++)
    {
        a *= cos(a);
    }
    *x = a;
}

int main(int argc, char **argv)
{
    cudaSetDevice(0);

    float *deviceMem;
    cudaMalloc((void**)&deviceMem, sizeof(float));

    for (int i = 0; i < 30; i++)
    {
        kernel<<<dim3(1024,1,1), dim3(512,1,1)>>>(deviceMem);

        // Reduce CUDA busy wait on thread synchronize
        usleep(500000);
        cudaThreadSynchronize();
    }

    return 0;
}
[/codebox]

So, does that code recreate the problem? I was seeing VERY similar behavior a while back. It turned out to be a problem with accessing out-of-bounds memory in my kernel. The strange thing was that it sometimes produced errors and sometimes it didn’t. If there are problems in your code, they can be difficult to pinpoint because the behavior becomes erratic.

That being said, I’ve found the following few lines to be EXTREMELY useful in troubleshooting. Just place them after each and every call you make to a CUDA function:

[codebox]cudaError_t err;

err = cudaGetLastError();
if (cudaSuccess != err)
{
    fprintf(stderr, "Cuda error: %s: %s.\n", "FunctionName()", cudaGetErrorString(err));
    exit(EXIT_FAILURE);
}
[/codebox]

Of course, the ‘err’ variable only needs to be declared once. The above will (hopefully) catch any error immediately after it occurs, so you can pinpoint what’s going on.
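
If you end up pasting that after every call, one option is to fold it into a macro. This is just a convenience sketch of the same check (the CHECK_CUDA name is made up, it’s not part of the toolkit):

[codebox]#include <stdio.h>
#include <stdlib.h>
#include <cuda_runtime.h>

// Hypothetical convenience macro wrapping the check above; CHECK_CUDA is
// not part of the CUDA toolkit, just a local helper name.
#define CHECK_CUDA(where)                                              \
    do {                                                               \
        cudaError_t err = cudaGetLastError();                          \
        if (cudaSuccess != err)                                        \
        {                                                              \
            fprintf(stderr, "Cuda error: %s: %s.\n", (where),          \
                    cudaGetErrorString(err));                          \
            exit(EXIT_FAILURE);                                        \
        }                                                              \
    } while (0)

// Usage, e.g. right after a call:
//   cudaMalloc((void**)&deviceMem, sizeof(float));
//   CHECK_CUDA("cudaMalloc()");
[/codebox]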

Another thing I just noticed is that there is an updated driver out for 64-bit Linux (version 177.80). Try updating to see if that helps with anything.

My code recreates the problem for me, but I don’t know if it will for someone else.
I did not include error checking in the posted code for brevity, but I did run with error checking and there are no errors. The included code doesn’t do much anyway, and only writes to one memory location.
I’m wondering if the NVIDIA Xorg module is also doing a busy wait while the kernel is running.
I will update the driver, but I don’t think that will solve the issue, because I’ve seen Xorg go crazy on an updated system.
Again, the included code is just something I wrote to recreate the problem by executing a kernel that takes almost a second to run.
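
For what it’s worth, the fixed usleep could probably be replaced by polling an event so the wait adapts to however long the kernel actually takes. A rough sketch against the CUDA 2.0 runtime (the 1 ms polling interval is arbitrary); I haven’t verified that it changes the Xorg behavior:

[codebox]#include <unistd.h>
#include <cuda_runtime.h>

// Sketch: poll for kernel completion instead of sleeping a fixed 500 ms.
// Record an event right after the launch, then sleep in short intervals
// until the event reports that the preceding work is done.
void waitForKernel(void)
{
    cudaEvent_t done;
    cudaEventCreate(&done);
    cudaEventRecord(done, 0);            // enqueued after the kernel launch
    while (cudaEventQuery(done) == cudaErrorNotReady)
        usleep(1000);                    // yield the CPU while the GPU works
    cudaEventDestroy(done);
}

// In the loop from the first post:
//   kernel<<<dim3(1024,1,1), dim3(512,1,1)>>>(deviceMem);
//   waitForKernel();   // instead of usleep + cudaThreadSynchronize
[/codebox]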

Yeah, I had checked out that code before posting and it looked fine. I guess I was kind of hoping that the problem was with an earlier call (cudaMalloc or whatever) failing.

Anyway, here’s the thread I started about the problems I was having. There are some suggestions from NVIDIA folks at the beginning, but my questions were largely unanswered and I ended up just replying to myself towards the end. It may be worth at least looking at their suggestions:
http://forums.nvidia.com/index.php?showtopic=77612&hl=

In the meantime, I’ll run your code and see what happens.

I’m seeing the same results: Xorg is eating up CPU. You ARE reading and writing the same memory location from multiple threads at the same time, though. Looking back at the thread I created about this problem, I was doing the same thing. I wonder if that’s what’s causing the problem.
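
If you want to rule that race out, a quick variant where each thread writes to its own slot might be worth a shot. Just a sketch based on your repro code (the allocation in main would need to grow to one float per thread):

[codebox]// Sketch: same busy loop, but every thread gets its own output slot,
// so there is no concurrent read/write of a single float.
__global__ void kernel(float *x)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    float a = 2.0f;
    for (int i = 0; i < 10000; i++)
    {
        a *= cos(a);
    }
    x[idx] = a;
}

// In main(), allocate one float per thread to match the launch config:
//   cudaMalloc((void**)&deviceMem, 1024 * 512 * sizeof(float));
[/codebox]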

For those experiencing this problem, is the GPU driving X the same GPU as you’re using for CUDA, or do you have multiple GPUs in the system?

Something interesting I found out.

I ran the program without X running, and it ran fine.

I ran the program under X, logged in from another computer, ran top, and the Xorg process wasn’t doing much at all.

I then ran top in a gnome-terminal under Xorg on the computer running the CUDA program, and Xorg started taking up more of the processor.

While running top in that gnome-terminal, I logged in from another computer again, attached gdb to Xorg, and saw that it was consistently in:

#0 0x00002aaaab8a986b in _nv001232X () from /usr/lib64/xorg/modules/drivers//nvidia_drv.so
#1 0x00002aaaab8a9b8c in _nv001462X () from /usr/lib64/xorg/modules/drivers//nvidia_drv.so
#2 0x00002aaaaba29987 in BitOrderInvert () from /usr/lib64/xorg/modules/drivers//nvidia_drv.so
#3 0x00002aaaaba40d04 in BitOrderInvert () from /usr/lib64/xorg/modules/drivers//nvidia_drv.so
#4 0x00002aaaaba41214 in BitOrderInvert () from /usr/lib64/xorg/modules/drivers//nvidia_drv.so
#5 0x00002aaaab9f364a in BitOrderInvert () from /usr/lib64/xorg/modules/drivers//nvidia_drv.so
#6 0x000000000051c5c4 in ?? ()
#7 0x00000000004bf76d in ?? ()
#8 0x000000000044f7e0 in BlockHandler ()
#9 0x0000000000568395 in WaitForSomething ()
#10 0x000000000044b96a in Dispatch ()
#11 0x000000000043480d in main ()

or

#0 0x00002aaaab8a986b in _nv001232X () from /usr/lib64/xorg/modules/drivers//nvidia_drv.so
#1 0x00002aaaab8a9b8c in _nv001462X () from /usr/lib64/xorg/modules/drivers//nvidia_drv.so
#2 0x00002aaaaba29987 in BitOrderInvert () from /usr/lib64/xorg/modules/drivers//nvidia_drv.so
#3 0x00002aaaaba39f80 in BitOrderInvert () from /usr/lib64/xorg/modules/drivers//nvidia_drv.so
#4 0x00002aaaaba3aa25 in BitOrderInvert () from /usr/lib64/xorg/modules/drivers//nvidia_drv.so
#5 0x00002aaaaba3abdf in BitOrderInvert () from /usr/lib64/xorg/modules/drivers//nvidia_drv.so
#6 0x0000000000529e75 in ?? ()
#7 0x0000000000449f38 in ProcCopyArea ()
#8 0x000000000044baab in Dispatch ()
#9 0x000000000043480d in main ()

So my theory is that X has a bunch of drawing events queued up while CUDA is running a long kernel. When the kernel finishes, those events get dispatched and Xorg executes them in a burst, spiking its CPU usage. But what do I know?
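
If that theory is right, one thing that might be worth trying is splitting the long kernel into several shorter launches so X can get serviced in between. Just a rough sketch based on the code in my first post (the split into 10 chunks of 1000 iterations is arbitrary):

[codebox]// Sketch: run the same total amount of busy work as several shorter
// launches instead of one long one, so the driver/X can run in between.
__global__ void kernelChunk(float *x, int iters)
{
    float a = 2.0f;
    for (int i = 0; i < iters; i++)
    {
        a *= cos(a);
    }
    *x = a;
}

// In the host loop, replacing the single long launch:
//   for (int chunk = 0; chunk < 10; chunk++)
//   {
//       kernelChunk<<<dim3(1024,1,1), dim3(512,1,1)>>>(deviceMem, 1000);
//       cudaThreadSynchronize();
//   }
[/codebox]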

Only 1 card, doing both Xorg and CUDA.

In my case, I have one GPU. It runs X and does my CUDA computations.

When I first monitored the CPU usage with top, I had it in a separate tab in my konsole. I noticed that when I ran the program and then switched tabs, Xorg always seemed to be at a peak and slowly dropped just after switching. I don’t know as much as I should about this stuff, but switching tabs would cause more events to get queued up, right? That would explain why I always saw a peak when switching over (switching tabs created the extra events). It would also explain why the Xorg CPU usage seems much less “peaky” when running top in a different terminal altogether, since I wasn’t doing as much on-screen during execution.

I think you may be onto something here.

Can you give me full code? I’ve got some debugging tools I can use to test this.

I posted the full code in the first post.

My assumption is that while the kernel is doing its work, X events generated by other programs are queued up for drawing until the kernel is done, at which point the events are quickly processed by X, spiking its usage.

Or X is actually trying to process an event, and in order to complete it the NVIDIA driver busy-waits for the kernel to finish.

Not sure where the spike in usage comes from, but I’m pretty sure X events are getting queued up.

Damnit, code boxes scroll now. Sorry about that.

This is probably related: I use NX (www.nomachine.com) to access (CUDA) boxes remotely. All of these machines use the CUDA device as the primary display, and X is always running because I don’t shut it down when I travel home… (there is no fancy GL screensaver involved, btw)

One of my benchmarks gave me on my trusty G80:

66 GB/s when sitting in front of the box at university
48 GB/s (same binary, same everything) when NX’ing to that box from home
66 GB/s when NX’ing to another box and ssh’ing to the CUDA machine
66 GB/s when NX’ing to the CUDA box, opening up another shell, ssh’ing to localhost and running the app

Back then, I thought NX was to blame because it forks Xorg, but now, I am no longer sure. It was surely a pain in the lower end of the back to track this down…

I’m trying to figure out what’s different about the last case. Is X forwarding not enabled, so sshing to localhost effectively unsets $DISPLAY?

Nope, and setting / unsetting $DISPLAY does not affect this behaviour. The last observation is indeed weird…

Any luck with the debugging tools, tmurray?

Haven’t had a chance yet, will play with this as soon as I have time. (which might not be for another week, sorry)