Frustrating error in simple program: am I doing something wrong?

I’m writing a CUDA program, and it is mysteriously failing. I started removing code in an attempt to isolate the problem, but now that I have a ‘minimal’ example, it is even more mind-boggling.

The program below reports a CUDA error every time it’s run, but I don’t see what could possibly be wrong. Removing arbitrary pieces or inserting couts mysteriously makes the error go away. It does almost nothing: it calls an empty kernel in a loop with a couple of arguments and checks cudaGetLastError.

[codebox]

#include <stdio.h>

void check_last_error() {
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess)
        fprintf(stderr, "Error: %s\n", cudaGetErrorString(err));
}

__global__ void some_kernel(int* a, int b) {}

int *a, b;

void do_something() {
    cudaMalloc((void**)&a, 8);
    b = 0;
    for (int i = 0; i < 10; i++) {
        check_last_error();
        some_kernel<<<1, 1>>>(a, b);
        check_last_error();
    }
}

int main() {
    do_something();
}

[/codebox]

After the first iteration, cudaGetLastError reports “invalid argument”. This makes no sense to me and is causing me problems (in the non-minified example where the kernel actually does stuff).

Am I doing something wrong? Is it a problem with my system configuration?

Any help would be appreciated!

Compiling with “nvcc a.cu -o a.out -O3” (the problem seems to go away without -O3; I don’t know whether this indicates I’m doing something unsafe or there is a compiler issue).

System information:

CUDA 2.3

Ubuntu 9.10 (I’ve tested on 9.04 as well, same result)

g++ 4.3.4

NVIDIA GTX 275

Driver version is 190.18

CPU is corei7 920

Have you tried restarting/reloading the driver? If you earlier wrote out of bounds, it can break future launches in undefined ways.

I have indeed, and it does not make a difference :(.

I am a newb here, so this suggestion probably won’t help, but… kernel launches are asynchronous, right? So you’re effectively trying to launch 10 copies of the same kernel in parallel (albeit a kernel which doesn’t do anything). If you stick a cudaThreadSynchronize() into the loop, does that get rid of the error?
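To spell the suggestion out, here is a minimal sketch of the asker’s loop with a cudaThreadSynchronize() after each launch (that was the pre-CUDA-4 name for what is now cudaDeviceSynchronize), so any launch or execution error surfaces at a specific iteration rather than later:

[codebox]#include <stdio.h>

__global__ void some_kernel(int* a, int b) {}

int main() {
    int* a;
    cudaMalloc((void**)&a, 8);
    for (int i = 0; i < 10; i++) {
        some_kernel<<<1, 1>>>(a, 0);
        // Block until the kernel finishes; any launch or execution
        // error is then reported right after this iteration.
        cudaError_t err = cudaThreadSynchronize();
        if (err != cudaSuccess)
            fprintf(stderr, "Error at iteration %d: %s\n",
                    i, cudaGetErrorString(err));
    }
    cudaFree(a);
    return 0;
}[/codebox]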

For what it’s worth, your code works fine on my rig, which has similar hardware but different OS:

CUDA 2.3

Windows XP SP3

Visual Studio Express 2008 SP1

NVIDIA GTX 285

CPU is corei7 920

Hope this helps.

Cheers

Raffles

PS: I modded the code slightly so that I could see the output before Visual Studio closed the window. I can’t see that it would change the behaviour, but here’s the code just in case. (On Windows, sleep is declared in windows.h, starts with a capital letter, and takes its argument in milliseconds; you learn a new thing every day!)

[codebox]#include <stdio.h>
#include <windows.h>

void check_last_error() {
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess)
        fprintf(stderr, "Error: %s\n", cudaGetErrorString(err));
    else
        fprintf(stdout, "OK: %s\n", cudaGetErrorString(err));
}

__global__ void some_kernel(int* a, int b) {}

int *a, b;

void do_something() {
    cudaMalloc((void**)&a, 8);
    b = 0;
    for (int i = 0; i < 1000; i++) {
        check_last_error();
        some_kernel<<<1, 1>>>(a, b);
        check_last_error();
    }
}

int main() {
    do_something();
    Sleep(9999);
}[/codebox]

Maybe try initialising CUDA before any other call? (cudaInit)
Usually CUDA gets initialised with the first CUDA call, but what if something went wrong at that point?

Does cudaThreadSynchronize() help?