When should cudaThreadSynchronize() be called?

London · October 22, 2010, 8:05am

Hi, when should cudaThreadSynchronize() be called, and what does this function do?

In the code below, there are three cudaThreadSynchronize() calls. Which of them are redundant, and which are necessary? Don't kernel launches call cudaThreadSynchronize() automatically?

for(int i=0; i<NumberOfSimualtion; i++)
{
	for(int j=0; j<NumberOfSteps; j++)
	{
		kernel1<<<60,32>>>(a,b,c);
		cudaThreadSynchronize();
		kernel2<<<60,32>>>(d,e,f);
		cudaThreadSynchronize();
	}
}
cudaThreadSynchronize();
printf("Done");

Thanks

tera · October 22, 2010, 8:40am

The first two are redundant: kernels launched into the same stream execute in order, so kernel2 does not start until kernel1 has finished, and no code is executed on the CPU between the kernels.

The third call does serve a purpose. Kernel launches return to the CPU immediately, so without it the “Done” message would appear before execution has actually completed.
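
To illustrate the point about ordering, here is a minimal sketch with the two inner synchronization calls removed and only the final one kept. The kernels add_one and times_two, the helper run_steps, and the single-integer buffer are hypothetical stand-ins for kernel1, kernel2 and the data in the question. The sketch uses cudaDeviceSynchronize(), which replaced the now-deprecated cudaThreadSynchronize() in later CUDA releases; on a 2010-era toolkit the older name would be used instead.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernels standing in for kernel1/kernel2 from the question.
// Only one thread touches the value so the result is deterministic.
__global__ void add_one(int *x)
{
    if (blockIdx.x == 0 && threadIdx.x == 0) *x += 1;
}

__global__ void times_two(int *x)
{
    if (blockIdx.x == 0 && threadIdx.x == 0) *x *= 2;
}

// Runs `steps` iterations of add_one followed by times_two on a value that
// starts at 0 and returns the final value, or -1 if a CUDA call failed.
int run_steps(int steps)
{
    int *d_x = nullptr;
    if (cudaMalloc(&d_x, sizeof(int)) != cudaSuccess) return -1;
    cudaMemset(d_x, 0, sizeof(int));

    for (int j = 0; j < steps; ++j) {
        // Both launches go to the default stream, so times_two waits for
        // add_one without any explicit synchronization in between.
        add_one<<<60, 32>>>(d_x);
        times_two<<<60, 32>>>(d_x);
    }

    // Launches are asynchronous with respect to the host: block here so
    // that everything queued above has finished (and report any error).
    cudaError_t err = cudaDeviceSynchronize();

    int h_x = -1;
    if (err == cudaSuccess) {
        cudaMemcpy(&h_x, d_x, sizeof(int), cudaMemcpyDeviceToHost);
    }
    cudaFree(d_x);
    return h_x;
}

int main()
{
    int result = run_steps(3);   // 0 -> 2 -> 6 -> 14 if the kernels ran in order
    printf("Done, result = %d\n", result);
    return 0;
}
```

If the kernels did not run in launch order, the final value would differ from ((0+1)*2+1)*2... computed step by step, which is what makes the ordering visible here. Note that a blocking cudaMemcpy on the default stream also waits for earlier kernels, so the explicit synchronization matters most when the host does something else afterwards, such as printing "Done" or stopping a timer.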

London · October 22, 2010, 9:40am

Thanks. Very helpful.
