Strange Kernel Freezes: freezing when the kernel uses any device memory variable

LordAlbior · January 15, 2010, 2:44pm

Hello,

Apologies in advance for my English.

I’m having strange problems with a kernel I’m writing for a photon mapper. Every time I try to access a device variable in the kernel (either static or dynamic), the program simply freezes the system. Unfortunately, I have only one video card, which handles both the display and CUDA (a GeForce 9800 GTX), so it is hard to debug the program this way. I’m using Windows 7 Ultimate with CUDA Toolkit 3.0 and the new SDK (both 32-bit). The installed driver is 195.39.

Now, some code:

[codebox]__global__ void photonmap( const int number_of_triangles,
                           const float4 light_pos,
                           const float4 light_color,
                           HostPhoton * out_data)
{
    unsigned int index = (blockDim.x * blockIdx.x) + threadIdx.x ;
    unsigned long seed = index;
    int photon_depth = 0;

    float4 p_dir;
    p_dir.z = 1;
    do{
        p_dir.x = 2*random(&seed) - 1;
        p_dir.y = 2*random(&seed) - 1;
        p_dir.z = 2*random(&seed) - 1;
    }while((p_dir.x*p_dir.x + p_dir.y*p_dir.y + p_dir.z*p_dir.z) > 1);

    __syncthreads(); // Synchronize the threads
    Photon p(light_pos, p_dir, light_color); // Create the photon
    PhotonHitRecord hit_p; // Create the structure that manages the loop
    float4 v0;
    float4 e1;
    float4 e2;

    for(photon_depth = 0; photon_depth < DEPTH; photon_depth++)
    {
        // search through the triangles and find the nearest hit point
        for(int i = 0; i < number_of_triangles; i++)
        {
            v0 = tex1Dfetch(triangle_texture,i*3);
            e1 = tex1Dfetch(triangle_texture,i*3+1);
            e2 = tex1Dfetch(triangle_texture,i*3+2);
            float t = PhotonTriangleIntersection(p, make_float3(v0.x,v0.y,v0.z),
                make_float3(e1.x,e1.y,e1.z),
                make_float3(e2.x,e2.y,e2.z));
            if(t < hit_p.t && t > 0.001)
            {
                hit_p.t = t;
                hit_p.hit_index = i;
            }
        }

        __syncthreads();

        if(hit_p.hit_index < 0){
            out_data[index+photon_depth].position = make_float4(0.0f,0.0f,0.0f,0.0f); // This variable freezes the system
            out_data[index+photon_depth].direction = make_float4(0.0f,0.0f,0.0f,0.0f); // This variable freezes the system
            out_data[index+photon_depth].power = make_float4(0.0f,0.0f,0.0f,0.0f); // This variable freezes the system
            photon_counter++; // This variable freezes the system (static device variable)

            if(index == 1) photon_counter++;

        }else{
            // Compute a normal
            hit_p.normal = cross(make_float3(e1.x,e1.y,e1.z), make_float3(e2.x,e2.y,e2.z));
            hit_p.normal = normalize(hit_p.normal);
            float4 hitpoint = p.pos + p.dir * hit_p.t;
            float3 L = make_float3(light_pos.x - hitpoint.x,light_pos.y - hitpoint.y, light_pos.z - hitpoint.z);

            float dist_to_light = length(L);
            L = normalize(L);
            float roulette = random(&seed);

            if( roulette <= DIFF){
                out_data[index+photon_depth].position = p.pos; // This variable freezes the system
                out_data[index+photon_depth].direction = p.dir; // This variable freezes the system
                out_data[index+photon_depth].power = p.power; // This variable freezes the system
                float4 reflection = make_float4(0.0f,0.0f,0.0f,0.0f);
                float r1 = random(&seed);
                float r2 = random(&seed);
                reflection.x = __cosf(2.0f*PI*r1)*__fsqrt_rn(1-__powf(r2,(2.0f/EXPO+1.0f)));
                reflection.y = __sinf(2.0f*PI*r1)*__fsqrt_rn(1-__powf(r2,(2.0f/EXPO+1.0f)));
                reflection.z = __fsqrt_rn(__powf(r2,(1.0f/EXPO+1.0f)));
                reflection.w = 0.0f;
                hit_p.resetT();
                p = Photon(hitpoint, reflection, light_color);
            } else if(roulette > DIFF && roulette <= (DIFF + SPEC)){
                hit_p.resetT();
                float3 reflected = reflect(make_float3(p.dir.x,p.dir.y,p.dir.z),hit_p.normal);
                p = Photon(hitpoint,
                           make_float4(reflected.x,reflected.y,reflected.z,0.0f),
                           light_color);
            } else{

            }
        }
    }
}[/codebox]

I found the location of the problem by isolating parts of the code and compiling the rest separately until I found what freezes the system. Although I have come to this conclusion, I’m not “the best programmer” and I am still a beginner when it comes to programming with CUDA, so feel free to correct my logic.

I’m really perplexed by this strange behavior of the program.

David_Olsson · January 16, 2010, 2:11am

I’ve been having strange kernel freezes myself that force me to reboot my system. Sometimes adding a single line of code will cause a crash even if that line couldn’t possibly ever be executed. Shared memory seems extra sensitive.

nitin.life · January 16, 2010, 6:17am

I just skimmed over your code:

4 suggestions:

1) Try using deviceemu and check your array access bounds.
2) I see a __syncthreads inside an if loop somewhere in your code. Try commenting that out first, then try putting it outside the loop.
3) Check your shared memory access pattern; you may be allocating/accessing it incorrectly.
4) What's your lmem usage?
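
On the array-bounds suggestion: the kernel writes to out_data[index+photon_depth], so thread i at depth d and thread i+1 at depth d-1 land on the same element. Whether the highest element touched is still inside the buffer depends on how out_data was allocated, which the post does not show. Below is a minimal host-side sketch in plain C++ (no GPU needed) that compares the posted indexing with a layout where each thread owns its own run of consecutive entries. The names kDepth, slot_as_posted, slot_per_thread and simulate are hypothetical, the depth value of 4 is made up for illustration, and the per-thread layout is only a guess at what was intended.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the DEPTH macro used by the posted kernel.
constexpr std::size_t kDepth = 4;

// Indexing as written in the posted kernel: out_data[index + photon_depth].
constexpr std::size_t slot_as_posted(std::size_t index, std::size_t depth) {
    return index + depth;
}

// Assumed alternative: each thread owns kDepth consecutive entries.
constexpr std::size_t slot_per_thread(std::size_t index, std::size_t depth) {
    return index * kDepth + depth;
}

struct Report {
    std::size_t collisions;  // writes that hit an already-written slot
    std::size_t max_slot;    // highest slot touched by any thread
};

// Replays every (thread, depth) write on the host and records where it lands.
template <typename SlotFn>
Report simulate(std::size_t num_threads, SlotFn slot) {
    std::vector<int> writes;
    Report report{0, 0};
    for (std::size_t index = 0; index < num_threads; ++index) {
        for (std::size_t depth = 0; depth < kDepth; ++depth) {
            const std::size_t s = slot(index, depth);
            if (s >= writes.size()) writes.resize(s + 1, 0);
            assert(s < writes.size());
            if (writes[s]++ > 0) ++report.collisions;
            if (s > report.max_slot) report.max_slot = s;
        }
    }
    return report;
}
```

With 8 threads and a depth of 4, simulate(8, slot_as_posted) reports 21 overlapping writes and a highest slot of 10, while simulate(8, slot_per_thread) reports no overlaps and a highest slot of 31. If out_data holds fewer elements than the highest slot plus one, the kernel writes out of bounds, which may be worth ruling out before looking elsewhere.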
