CUDA crash without any message

xiangxiango · January 19, 2015, 8:02am

Hi,
When I switch on the computer, the first time I start the software it crashes without any message.
But the next time I start the software, everything is normal.
It stays normal until the next time I switch on the computer and start the software.

Thanks,
TUshar

little_jimmy · January 19, 2015, 9:55am

perhaps then only switch on the computer every other time…

with “software”, are you referring to an application you wrote, or the ide itself?

xiangxiango · January 19, 2015, 10:21am

Yes, the software is an application I wrote.
When debugging, it crashes at “cudaDeviceSynchronize();”,
and this happens almost every time I switch on the computer.
But I do not know why it crashes the first time but runs normally on the next run.

little_jimmy · January 19, 2015, 11:27am

i would worry if it crashes in the debugger, and largely focus on that

if it crashes on the cudaDeviceSynchronize call, it is perhaps something upstream

i presume you have a kernel just prior to the crashing cudaDeviceSynchronize, with memory copies perhaps

perhaps check for errors (cudaSuccess) prior to the kernel launch, and after it

alternatively, post some code
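
For reference, a minimal sketch of what checking "prior to the kernel launch, and after it" could look like. The names cuda_failed, dummy_kernel and run_checked are made up for illustration and are not from the original application; the sketch assumes n > 0 and a single device.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: prints a message and returns true if err is not cudaSuccess.
static bool cuda_failed(cudaError_t err, const char* what)
{
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        return true;
    }
    return false;
}

// Hypothetical stand-in kernel: doubles each element.
__global__ void dummy_kernel(int* data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= 2;
    }
}

// Copies data to the device, runs the kernel, copies the result back.
// Every step is checked; returns true only if all of them succeeded.
bool run_checked(int* host, int n)
{
    int* dev = nullptr;
    const size_t bytes = static_cast<size_t>(n) * sizeof(int);

    if (cuda_failed(cudaMalloc(&dev, bytes), "cudaMalloc")) return false;

    bool ok = !cuda_failed(cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice), "host-to-device copy");

    // Check for an error left over from earlier work, so it is not blamed on the kernel.
    if (ok) ok = !cuda_failed(cudaGetLastError(), "work before the launch");

    if (ok) {
        dummy_kernel<<<(n + 127) / 128, 128>>>(dev, n);
        // Launch-time problems (for example a bad configuration) show up here.
        ok = !cuda_failed(cudaGetLastError(), "kernel launch");
    }

    // Errors raised while the kernel runs show up at the next synchronizing call.
    if (ok) ok = !cuda_failed(cudaDeviceSynchronize(), "kernel execution");

    if (ok) ok = !cuda_failed(cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost), "device-to-host copy");

    cudaFree(dev);
    return ok;
}
```

With checks like these, the first message printed narrows down which step went wrong, instead of everything surfacing at a later cudaDeviceSynchronize().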

xiangxiango · January 22, 2015, 1:53am

Hi:
I added a check for cudaSuccess before every memory operation, and I am sure that there is no memory error.
This is my code:

__global__ void Top_One_Function(DFX::Top_Tree::Top_Tree_Data* Top_Tree_Data_, DFX::Middle_Teee::Middle_Tree_Data* Middle_Tree_Data_
    , unsigned int* Index_One_Frame, unsigned short int* Shift_Index_One_Frame, bool* Middle_Tree_Bool, int Middle_limid /*, int* Bool_Index_Txt*/
    , int Top_Limid /*, int* Haha_Thread*/)
{
    if (threadIdx.x > 125)
    {
        return;
    }
    int offset = blockIdx.y * gridDim.x + blockIdx.x;
    int offset_Volume = offset * 125 + threadIdx.x;
    int Temple_Top_Tree = Index_One_Frame[offset_Volume];
    if (!Temple_Top_Tree)
    {
        return;
    }
    unsigned short int Shift = Shift_Index_One_Frame[offset_Volume];
    int Z = ((Shift >> 0) & 1) * 4;
    int Y = ((Shift >> 1) & 1) * 2;
    int X = ((Shift >> 2) & 1);
    int Top_One_Shift = Top_Tree_Data_[Temple_Top_Tree].Index;
    int Bool_Index = Top_One_Shift + X + Y + Z;
    /*Bool_Index_Txt[offset_Volume] = Bool_Index;*/
    /*__syncthreads();*/
    if (Bool_Index > Middle_limid || (!Middle_Tree_Bool) || Bool_Index < 0)
    {
        return;
    }

    if (!Middle_Tree_Data_[Bool_Index].Index)
    {
        Middle_Tree_Bool[Bool_Index] = true;
        /*Middle_Bool_Temp[Bool_Index] = true;*/
    }
    /*Haha_Thread[offset_Volume] = Bool_Index + 100;*/
}

It crashes in this function. I used “cudaDeviceSynchronize” to check the other functions, and I am sure that no error happens before this function.
Today I found that if I add “Haha_Thread”, it does not crash. I only use “Haha_Thread” to record some variables; if I comment it out, it crashes in this function.
So I suspect it may be caused by memory alignment.

allanmac · January 22, 2015, 2:01am

@xiangxiango, you could try: “cuda-memcheck --report-api-errors all ”.

Also, cuda-memcheck has many many other options that you can try.

little_jimmy · January 22, 2015, 6:31am

if (threadIdx.x > 125 )
{
return;
}

if (!Temple_Top_Tree)
{
return;
}

…

/*__syncthreads();*/

you like trouble, don’t you?

i am also not sure whether it is a brilliant idea to pass to a kernel: DFX::Top_Tree::Top_Tree_Data* Top_Tree_Data_ , DFX::Middle_Teee::Middle_Tree_Data* Middle_Tree_Data_

xiangxiango · January 22, 2015, 6:58am

Do you mean that I should not write it like that,
or that I should use __syncthreads()?

little_jimmy · January 22, 2015, 7:32am

it is never really a sound idea to call __syncthreads() in divergent code - all threads should be able to reach the __syncthreads(), otherwise you may end up ‘having a bad time’

i would interpret calling __syncthreads() after some threads have already exited, as a case of calling __syncthreads() in divergent code

just turn the conditionals around:

if (threadIdx.x < 125)
{
    work
}

__syncthreads(); // if necessary

if (Temple_Top_Tree)
{
    work
}

__syncthreads(); // if necessary
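
A small self-contained sketch of the same pattern: guard the work, not the barrier, so every thread of the block reaches __syncthreads(). The kernel reverse_slices, the wrapper reverse_on_device and the constant ACTIVE_THREADS are hypothetical and only mirror the 125-element slices from the thread; the sketch assumes blocks of 128 threads, so 3 threads per block stay idle.

```cuda
#include <cuda_runtime.h>

#define ACTIVE_THREADS 125  // assumption: mirrors the 125 used in the original kernel

// Hypothetical kernel: each block reverses its own 125-element slice via shared memory.
__global__ void reverse_slices(int* data)
{
    __shared__ int tile[ACTIVE_THREADS];
    const bool active = threadIdx.x < ACTIVE_THREADS;
    const int base = blockIdx.x * ACTIVE_THREADS;

    if (active) {
        tile[threadIdx.x] = data[base + threadIdx.x];
    }

    // Outside any conditional and before any return:
    // reached by every thread of the block, active or not.
    __syncthreads();

    if (active) {
        data[base + threadIdx.x] = tile[ACTIVE_THREADS - 1 - threadIdx.x];
    }
}

// Host wrapper: host must hold blocks * ACTIVE_THREADS ints. Returns true on success.
bool reverse_on_device(int* host, int blocks)
{
    const size_t bytes = static_cast<size_t>(blocks) * ACTIVE_THREADS * sizeof(int);
    int* dev = nullptr;
    if (cudaMalloc(&dev, bytes) != cudaSuccess) return false;

    bool ok = cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        reverse_slices<<<blocks, 128>>>(dev);  // 128 threads per block, 125 of them do work
        ok = cudaGetLastError() == cudaSuccess && cudaDeviceSynchronize() == cudaSuccess;
    }
    if (ok) ok = cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;

    cudaFree(dev);
    return ok;
}
```

The idle threads skip both guarded sections but still take part in the barrier, which is the point of turning the conditionals around instead of returning early.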

xiangxiango · January 22, 2015, 7:53am

Thank you, sir. I will modify my code as you suggest.
I used memcheck to find what causes the crash.
I restarted my PC twice.
The first time it did not crash, and memcheck reported “no error”.
But the second time it reported “unspecified launch failure”, and then the crash happened.

little_jimmy · January 22, 2015, 10:21am

“unspecified launch failure”

use cudaGetLastError() just before and after your kernel launch, to determine whether your kernel actually launches

as mentioned:

i am also not sure whether it is a brilliant idea to pass to a kernel: DFX::Top_Tree::Top_Tree_Data* Top_Tree_Data_ , DFX::Middle_Teee::Middle_Tree_Data* Middle_Tree_Data_
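
On the cudaGetLastError() suggestion: the "did not launch" and "launched, then faulted" cases can be told apart by keeping the two checks separate. A minimal sketch, assuming a hypothetical noop_kernel and helper launch_status (neither is from the original code). cudaGetLastError() right after the launch reports launch-time problems such as an invalid configuration, while the return value of cudaDeviceSynchronize() reports errors raised while the kernel runs. “unspecified launch failure” (cudaErrorLaunchFailure) belongs to the second group and commonly points to an invalid memory access inside the kernel.

```cuda
#include <cuda_runtime.h>

// Hypothetical kernel that does nothing; only the launch itself is of interest here.
__global__ void noop_kernel() {}

// Launches noop_kernel with the given block size and returns the first error seen:
// the launch-time error if there is one, otherwise the execution-time status.
cudaError_t launch_status(unsigned threads_per_block)
{
    // Read and reset any earlier non-sticky error so it is not misattributed.
    cudaGetLastError();

    noop_kernel<<<1, threads_per_block>>>();

    // Launch-time check: a rejected launch is reported here, before any device code runs.
    cudaError_t launch_err = cudaGetLastError();
    if (launch_err != cudaSuccess) {
        return launch_err;
    }

    // Execution-time check: faults inside the kernel are reported by the synchronizing call.
    return cudaDeviceSynchronize();
}
```

With this helper, launch_status(128) should return cudaSuccess, while a block size above the device limit (1024 threads per block on current GPUs), such as launch_status(4096), should return cudaErrorInvalidConfiguration straight from the launch-time check.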

xiangxiango · January 22, 2015, 10:34am

Yes, I have added cudaGetLastError(),
and I am sure that something is wrong in that function.
Sorry, sir, I do not know what “pass to kernel” means.
I am from China. My English is poor.

little_jimmy · January 22, 2015, 11:36am

your kernel declaration:

__global__ void Top_One_Function(DFX::Top_Tree::Top_Tree_Data* Top_Tree_Data_, DFX::Middle_Teee::Middle_Tree_Data* Middle_Tree_Data_
    , unsigned int* Index_One_Frame, unsigned short int* Shift_Index_One_Frame, bool* Middle_Tree_Bool, int Middle_limid /*, int* Bool_Index_Txt*/
    , int Top_Limid /*, int* Haha_Thread*/);

from that, your kernel parameters, passed to the kernel

having kernel parameter: unsigned int* Index_One_Frame, and passing to the kernel: unsigned int* Index_One_Frame; that i can accept

having kernel parameter: DFX::Top_Tree::Top_Tree_Data* Top_Tree_Data_ and passing to the kernel
DFX::Top_Tree::Top_Tree_Data* Top_Tree_Data_ ; that worries me slightly
