Unspecified launch failure and garbage values: Please help

lalithsuresh · June 17, 2009, 5:55am

Hello all,

We’ve been trying to run this piece of code which is step one for us in a long series of steps. We’re getting an unspecified launch failure after the second kernel call. Without the cutilCheckMsg() call, input_d is filled with all zeros or zeros with garbage values. We’ve tried higher numbers for NO_BITS. Basically, NUMBLOCKS*NUMTHREADS should be equal to PERMUTATIONS. Please help.

[codebox]#include<stdio.h>
#include<stdlib.h>
#include<sys/time.h>
#include<math.h>
#include"macros.h"

#define NUMBLOCKS 144
#define NUMTHREADS 32
#define NO_BITS 9
#define PERMUTATIONS pow(2,NO_BITS)

__global__ void init(int *input_d){
	int index=blockIdx.x*blockDim.x + threadIdx.x;

	input_d[index]=0;
}

__global__ void input_set(int *input_d){
	int index=blockIdx.x*blockDim.x + threadIdx.x;
	int i,k;

	i=index;

	k=(index+1)*NO_BITS-1;
	do{
		input_d[k]=i%2;
		i=i/2;
		k--;
	}while(i!=0);
}

int main(){
	unsigned long long int k=0;
	float timetaken;
	struct timeval TimeValue_Start, TimeValue_Final;
	struct timezone TimeZone_Start, TimeZone_Final;
	int *input_h;
	int *input_d;
	dim3 dimGrid(NUMBLOCKS);
	dim3 dimBlock(NUMTHREADS);

	gettimeofday(&TimeValue_Start, &TimeZone_Start);

	input_h=(int *)malloc(PERMUTATIONS*NO_BITS*sizeof(int));
	cudaMalloc((void **)&input_d,PERMUTATIONS*NO_BITS*sizeof(int));

	init<<< dimGrid,dimBlock >>>(input_d);
	cudaThreadSynchronize();
	input_set<<< dimGrid,dimBlock >>>(input_d);

	cutilCheckMsg("Kernel execution failed");

	cudaMemcpy(input_h,input_d,PERMUTATIONS*NO_BITS*sizeof(int),
		cudaMemcpyDeviceToHost);

	for(k=0;k<PERMUTATIONS*NO_BITS;k++){
		if(k%NO_BITS==0)
			printf("\n");
		printf("%d",input_h[k]);
	}

	gettimeofday(&TimeValue_Final, &TimeZone_Final);

	timetaken = (double) (TimeValue_Final.tv_sec - TimeValue_Start.tv_sec) + (double) (TimeValue_Final.tv_usec - TimeValue_Start.tv_usec)/ 1000000.0;
	printf("\nTime taken : %lf \n", timetaken);

	free(input_h);
	cudaFree(input_d);
}[/codebox]

YDD · June 17, 2009, 1:07pm

An ‘unspecified launch failure’ is usually equivalent to a segfault. Compile your code in emulation mode, and run it through valgrind.
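
To make the segfault analogy concrete without emulation mode, here is a minimal host-side sketch in plain C (not CUDA). It replays the index arithmetic of `input_set` for every (block, thread) pair of a 1-D launch and counts the threads that would write outside a buffer of `PERMUTATIONS*NO_BITS` ints. The helper names `thread_writes_out_of_bounds` and `count_bad_threads` are made up for this illustration, and the sketch assumes `PERMUTATIONS` is exactly 2^`NO_BITS`.

```c
/* Host-side model of one input_set thread. Instead of writing to memory,
   it reports whether any element it would write lies outside [0, n_elems). */
static int thread_writes_out_of_bounds(long index, int no_bits, long n_elems)
{
    long i = index;
    long k = (index + 1) * no_bits - 1;   /* same starting offset as the kernel */
    do {
        if (k < 0 || k >= n_elems)
            return 1;                      /* on the GPU this write is the "segfault" */
        i /= 2;
        k--;
    } while (i != 0);
    return 0;
}

/* Walk every (block, thread) pair of a 1-D launch and count offending threads.
   Assumption: the buffer holds 2^no_bits * no_bits ints, as in the post. */
long count_bad_threads(int num_blocks, int num_threads, int no_bits)
{
    long n_elems = (1L << no_bits) * no_bits;   /* PERMUTATIONS * NO_BITS */
    long bad = 0;
    for (int b = 0; b < num_blocks; b++)
        for (int t = 0; t < num_threads; t++)
            bad += thread_writes_out_of_bounds((long)b * num_threads + t,
                                               no_bits, n_elems);
    return bad;
}
```

With the values from the original post, `count_bad_threads(144, 32, 9)` returns 4096, meaning 4096 of the 4608 launched threads would write past the end of the buffer. With 16 blocks of 32 threads it returns 0.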

MMB · July 5, 2010, 2:19pm

Hi YDD.

If emulation is no longer supported, how would you recommend doing this today?

MMB

SPWorley · July 5, 2010, 3:49pm

Ocelot. It catches almost all memory access errors in emulation mode.

LSChien · July 5, 2010, 7:49pm

#define NUMBLOCKS 144
#define NUMTHREADS 32

dim3 dimGrid(NUMBLOCKS);
dim3 dimBlock(NUMTHREADS);
init<<< dimGrid,dimBlock >>>(input_d);

__global__ void input_set(int *input_d){
	int index=blockIdx.x*blockDim.x + threadIdx.x;

So the maximum value of index is 143 * 32 + 31 = 4607, which is far greater than 512,

but your data is sized as

#define NO_BITS 9
#define PERMUTATIONS pow(2,NO_BITS)

input_h=(int *)malloc(PERMUTATIONS*NO_BITS*sizeof(int));
cudaMalloc((void **)&input_d,PERMUTATIONS*NO_BITS*sizeof(int));

When you do

k=(index+1)*NO_BITS-1;
do{
	input_d[k]=i%2;
	...
}

k is out of bounds for every thread with index >= 512, because it exceeds 512 * NO_BITS.
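
One possible way to act on this, sketched here in plain host-side C rather than CUDA so it can be run anywhere: size the grid from the number of permutations with a ceiling division, and let surplus threads return early. The function names `blocks_needed`, `input_set_thread`, and `run_input_set` are hypothetical, and `index` stands in for `blockIdx.x*blockDim.x + threadIdx.x`. In the real kernel the equivalent guard would be a line such as `if (index >= PERMUTATIONS) return;` at the top of `input_set`. This is a sketch of the idea, not necessarily the fix the original poster ended up using.

```c
#include <stdlib.h>

/* ceil(n_items / threads_per_block) for positive integers:
   a common way to size a 1-D grid. */
int blocks_needed(int n_items, int threads_per_block)
{
    return (n_items + threads_per_block - 1) / threads_per_block;
}

/* Body of a guarded input_set for a single thread. */
void input_set_thread(int *input, int index, int permutations, int no_bits)
{
    int i, k;
    if (index >= permutations)      /* surplus threads do nothing */
        return;
    i = index;
    k = (index + 1) * no_bits - 1;  /* last slot of this thread's row */
    do {
        input[k] = i % 2;           /* write bits from least significant up */
        i /= 2;
        k--;
    } while (i != 0);
}

/* Replays a whole 1-D launch on the host over a zeroed buffer of
   2^no_bits * no_bits ints. Returns NULL on allocation failure;
   the caller frees the result. */
int *run_input_set(int num_blocks, int num_threads, int no_bits)
{
    int permutations = 1 << no_bits;
    int *input = (int *)calloc((size_t)permutations * no_bits, sizeof *input);
    if (input == NULL)
        return NULL;
    for (int b = 0; b < num_blocks; b++)
        for (int t = 0; t < num_threads; t++)
            input_set_thread(input, b * num_threads + t, permutations, no_bits);
    return input;
}
```

For `NO_BITS` = 9 and 32 threads per block, `blocks_needed(512, 32)` gives 16 blocks. With the guard in place, even the original 144-block launch stays inside the buffer, since only the first 512 threads write anything.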
