I've got a problem with my second CUDA project. (The first program runs and produces results.)
First I hit the nvcc “ran out of registers” bug/feature. Attempting to work around
it, I made minor changes to the code: I turned the following into a for loop:
ENCCYCLE (0);
ENCCYCLE (1);
…
ENCCYCLE (7);
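The rewrite being described, replacing the eight explicit ENCCYCLE invocations with a loop, looks like this in plain C. The ENCCYCLE body below is a made-up stand-in, since the real macro is not shown in the thread:

```c
/* Illustration only: the real ENCCYCLE body is not shown in the
 * thread, so a simple stand-in round function is used here. */
#define ENCCYCLE(i) (s ^= (s << 3) + (unsigned)(i))

/* the original, manually unrolled form */
unsigned run_unrolled(void)
{
    unsigned s = 1u;
    ENCCYCLE(0); ENCCYCLE(1); ENCCYCLE(2); ENCCYCLE(3);
    ENCCYCLE(4); ENCCYCLE(5); ENCCYCLE(6); ENCCYCLE(7);
    return s;
}

/* the rewritten form: the same eight rounds as a for loop */
unsigned run_loop(void)
{
    unsigned s = 1u;
    for (int i = 0; i < 8; ++i)
        ENCCYCLE(i);
    return s;
}
```

Both functions perform the same eight rounds and return the same value; the difference matters only to the compiler's unroller and register allocator.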
The new code compiles successfully and runs, but never terminates. When run under
X Windows there are no error messages. However, outside X Windows it says
I am 99% sure this is a bug in the toolkit, not in my program. The aim of my project is
to estimate whether the CUDA toolkit + video card is suitable for a certain purpose, and the current status is “CUDA cannot do it due to a bug”.
I’m on 64-bit Linux with NVIDIA driver 169.09 and toolkit version 1.1, if this
matters.
I enclose the full source code and Makefile. The Makefile builds 3 executables:
compiles, terminates within 0.2 seconds
does not compile, with
compiles, does not terminate
The 3 executables are built from a single source file with different preprocessor
directives. The 1st executable is a trimmed-down version of the 2nd. The 3rd
differs from the 2nd only in the for loop mentioned above.
My questions are:
How do I solve the “ran out of registers in integer64” problem without changing
the C source code?
How do I change the code of executable 3 to make it compile and work
properly?
Other than the fact that you’re using an unsupported Linux distribution, I don’t see anything unusual in the bug report. I tried to reproduce this with the code that you attached on a supported Linux distribution, but it failed to build, and I found your build instructions unclear.
Please clarify the build command(s) required to build your test app, or update the Makefile so that it can be built by running ‘make’.
> Please clarify the build command(s) required to build your test app, or update the Makefile so that it can be built by running ‘make’.
Oops? Did you read the file ReAd.It? The build process is described there. I will briefly repeat it here.
The build process involves 2 steps. First make an executable script called CUDA and place it somewhere in $PATH. Then go to the directory containing the Makefile and type make. This should attempt to build and run the 3 executables. The output of each run is redirected into a separate file per executable.
Yes, I read ReAd.It. Your instructions on how to make an executable script called CUDA were unclear. If building this app requires more than just running ‘make’ using the Makefile you provided, then please provide any additional requisite script(s) or build commands.
A separate script (setting up some environment variables) is needed by the Makefile because I don’t know where you installed the CUDA toolkit. To run my code, do the following:
Go to /usr/local/bin.
Open an empty file in your favorite text editor.
Type in the 5 lines found between
==== /usr/local/bin/CUDA start ====
and
==== /usr/local/bin/CUDA end ====
inside the file ReAd.It.
Go to the 1st line of the file and replace < path to toolkit > with the directory where you installed the CUDA toolkit. This directory should contain the 5 subdirectories bin, doc, include, lib, open64; and the directory bin/ should contain the file nvcc among others.
Save the file as CUDA.
Leave the text editor.
Type < ls -l > to check that the file CUDA is present in the current directory /usr/local/bin.
Make the file executable by issuing the command < chmod +x ./CUDA >.
Now go to the directory containing ReAd.It and the Makefile.
Type < CUDA nvcc --version >. This should run the nvcc executable; you should see 4 lines of nvcc introduction.
If you want to build all 3 executables yourself, type < make clean >. This will erase the 2 executable files exe/*.
Type < make >, then watch the executables compile and/or run. You may want to open another window to view the files
rezult.*.
If the 1st executable did not build, this could be because the file common/inc/cutil.h was not located by my Makefile. cutil.h is part of the CUDA SDK.
The standard output and standard error of the 1st executable will be in rezult.1.0.cout and rezult.1.0.cerr respectively.
The second executable won’t build.
The results of the 3rd executable will be in rezult.2.1.cout and rezult.2.1.cerr.
The 3rd executable won’t terminate, yet it stops loading the CPU after a fraction of a second. Type < ps | grep cuda > or < ps | grep make > to check what’s going on.
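For reference, the environment-setup wrapper that these steps build might look roughly like the following. The real five lines live in ReAd.It, so everything here is an illustrative assumption: the install path, and the use of a shell function instead of a standalone /usr/local/bin/CUDA file, just so the example is self-contained.

```shell
# Illustrative sketch only: the real /usr/local/bin/CUDA script's five
# lines are given in ReAd.It. The install path below is an assumption.
CUDA_ROOT=${CUDA_ROOT:-/usr/local/cuda}   # replace with your toolkit path

cuda_env() {
    # put the toolkit's compiler and runtime library on the search paths,
    # then run the wrapped command, e.g. `cuda_env nvcc --version`
    PATH="$CUDA_ROOT/bin:$PATH"
    LD_LIBRARY_PATH="$CUDA_ROOT/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export PATH LD_LIBRARY_PATH
    "$@"
}
```

The point of the wrapper is only to make nvcc and libcudart findable without hard-coding the toolkit location into the Makefile.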
Since usage of the external script /usr/local/bin/CUDA became a problem, I changed the Makefile to automagically find the toolkit, so the script is no longer needed.
The new Makefile is attached to this message.
The new version is shorter, produces more verbose output, and has correct dependencies. Prior to running an executable it prints a message.
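One plausible way to do the auto-detection (a sketch, not necessarily what the attached Makefile does) is to derive the toolkit root from the location of nvcc on $PATH:

```make
# Sketch: locate nvcc on $PATH and derive the toolkit root from it.
# The attached Makefile may do this differently.
NVCC      := $(shell which nvcc)
CUDA_ROOT := $(patsubst %/bin/nvcc,%,$(NVCC))
CFLAGS    += -I$(CUDA_ROOT)/include
LDFLAGS   += -L$(CUDA_ROOT)/lib -lcudart
```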
I managed to work around the ran-out-of-registers and kernel-loops-forever bugs by changing the source code. The new code compiles, runs, and terminates, slightly outperforming the central processor (see my signature for details).
My code heavily and randomly accesses constant memory, hence the GPU is only slightly faster than the CPU for now. I hope to speed up the CUDA code by moving the constants into shared memory.
Hence I should report the intermediate result of the project:
the NVIDIA compiler is BUGGY,
but sometimes it is worth spending time programming for the GPU
I believe constant memory is as fast as it gets because it is cached, as long as all threads access the same index (if it is an array); otherwise it is indeed smart to put the data into shared memory, or even a texture might do the trick.
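The staging pattern under discussion, copying a constant table into shared memory once per block so that divergent per-thread indexing no longer serializes on the constant cache, might look roughly like this (the table name, its 256-byte size, and the kernel are made up for illustration):

```cuda
// Sketch only: c_table's name and size are assumptions,
// not the thread author's actual code.
__constant__ unsigned char c_table[256];

__global__ void substitute(unsigned char *data, int n)
{
    // Stage the constant table into shared memory once per block.
    __shared__ unsigned char s_table[256];
    for (int i = threadIdx.x; i < 256; i += blockDim.x)
        s_table[i] = c_table[i];
    __syncthreads();  // make the staged copy visible to all threads

    // Random indexing into s_table does not serialize the way
    // divergent constant-cache reads do (bank conflicts aside).
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n)
        data[idx] = s_table[data[idx]];
}
```

The constant cache broadcasts only when all threads of a half-warp read the same address, which is why a randomly indexed table is a poor fit for constant memory.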
Buggy is a strong statement, I think. It does not generate wrong code; it crashes under certain circumstances. I have had that happen once too, but to be honest my code was crappy (in hindsight), and the compiler does not trip over my cleaner code.
FWIW, I’d rather have it crash than generate wrong code; I already have enough trouble debugging my own bugs :P
I prefer to take a working CUDA-unaware program and convert it to CUDA code with a perl/bash script. This ensures the absence of bugs. And very often the compiler-ran-out-of-registers bug stops me (setting Olimit appears to have no effect). I failed several times before creating a variant which compiles and fits entirely in registers. And the code is not optimal: if the compiler worked properly, I could make it better.
Moving the hard-coded tables from constant to shared device memory more than doubled the speed, so now my G84-based card is more than 3 times better than a two-core Athlon for this project (which means that a $250 video card should be >12 times better). I am changing my signature accordingly.
I’m using approximately half of the shared memory, so I will try switching from 1-byte char to 4-byte int.
Enlarging the tables gave a slight improvement. The program now occupies 124 registers per thread and 11 KB of shared memory.