CUDA freezes computer

Hi.

I’m writing a CUDA kernel that computes a computational-geometry quantity from a set of points in the plane. The program divides the plane into a two-dimensional grid (like a window of pixels). For each of these “pixels”, the kernel runs a double loop over all the points in the initial set, so the algorithm has O(n²) complexity, where n is the number of points.

The CUDA kernel looks something like this:

__global__ void CUDA_kernel(float* points_list, int number_of_points, float* result)
{
	for (int i = 0; i < number_of_points; i++)
	{
		for (int j = i + 1; j < number_of_points; j++)
		{
			(do something);
		}
	}
	result[index] = something;
}

The problem appears (I am only guessing) when n grows and this double loop pushes the total number of instructions inside the CUDA kernel past a certain limit (the reference manual says a CUDA kernel can have at most 2 million native instructions). The PC freezes completely… I can still move the mouse, but I can do nothing except reset the machine.

Thinking that the number of instructions was the problem, I split the work of the CUDA kernel into smaller loops, something like this:

__global__ void CUDA_kernel(float* points_list, int number_of_points, float* result, int partition, int elements_in_partition)
{
	int begin = partition * elements_in_partition;
	for (int i = begin; i < begin + elements_in_partition; i++)
	{
		for (int j = i + 1; j < number_of_points; j++)
		{
			(do something);
		}
	}
	result[index] = something;
}
.....

elements_in_partition = something;

for (int partition = 0; partition < number_of_partitions; partition++)
{
	// launch configuration order is <<<grid, block>>>
	CUDA_kernel<<<grid_dim, block_dim>>>(..., partition, elements_in_partition);
}

....

But, surprisingly, the PC still freezes. Do you think the problem is the number of instructions inside the CUDA kernel? If it is, why does the program run fine with a smaller number of points (n), even with a greater number of instructions than in the second version posted here?

Could it be something else? Do you have any idea? :)

Thanks in advance.

I highly doubt it is the number of instructions. It is very likely that you’re running into the watchdog timer of the OS.

Search the forums for watchdog timer for more information.
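For reference, a minimal sketch of how to ask the runtime whether the kernel run time limit (the watchdog) is active on a given device. The helper name `watchdog_status` is made up for this example; `kernelExecTimeoutEnabled` is the field reported in `cudaDeviceProp`, and it is usually set when the GPU is also driving a display.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: returns 1 if the kernel run time limit is enabled on
// `device`, 0 if it is disabled, and -1 if the device could not be queried.
int watchdog_status(int device)
{
    cudaDeviceProp prop;
    cudaError_t err = cudaGetDeviceProperties(&prop, device);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaGetDeviceProperties: %s\n",
                     cudaGetErrorString(err));
        return -1;
    }
    std::printf("%s: kernel run time limit %s\n", prop.name,
                prop.kernelExecTimeoutEnabled ? "enabled" : "disabled");
    return prop.kernelExecTimeoutEnabled ? 1 : 0;
}
```

Calling `watchdog_status(0)` before a long run tells you whether device 0 (assumed here to be the GPU in question) is subject to the limit at all.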

I’ve read about the watchdog timer in your link, but in those threads people say that their CUDA kernels return an error after a certain number of seconds… I don’t get any error message… I just get a frozen computer that I have to reset completely.
Moreover, I am using Linux, and I launch the program from a console.

While the watchdog should give a nice error message, more often than not it bluescreens your computer. At least that’s how it works with pre-185 drivers; I haven’t checked the newer ones yet.
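To show where that error message would come from, here is a sketch of a launch followed by explicit checks. `cudaGetLastError` reports launch problems such as a bad configuration, and `cudaDeviceSynchronize` returns the status of the kernel itself; a kernel killed by the watchdog is reported as `cudaErrorLaunchTimeout`. The kernel `busy_kernel` and the helper `run_checked` are placeholders, not code from this thread.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernel: each thread writes its own global index.
__global__ void busy_kernel(int* out, int n)
{
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) out[idx] = idx;
}

// Launches the kernel and returns the first error seen, printing a message.
cudaError_t run_checked(int n)
{
    int* d_out = nullptr;
    cudaError_t err = cudaMalloc(&d_out, n * sizeof(int));
    if (err != cudaSuccess) return err;

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    busy_kernel<<<blocks, threads>>>(d_out, n);

    err = cudaGetLastError();            // problems with the launch itself
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();   // errors raised while the kernel ran

    if (err == cudaErrorLaunchTimeout)
        std::fprintf(stderr, "kernel killed by the watchdog\n");
    else if (err != cudaSuccess)
        std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));

    cudaFree(d_out);
    return err;
}
```

If the machine freezes before any of these messages is printed, that at least suggests the failure happens below the level the runtime can report.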

I don’t get an error message or a bluescreen. The computer just freezes completely, as if the CUDA kernel were still computing. The problem is that the CUDA kernel never “ends” and the computer stays frozen (frozen image, frozen keyboard input…) until I reset it. The time until it freezes varies from one PC to another, so I suspect it is not related to the computing time.

I’m using the 180.22 drivers on openSUSE 10.2 with a GeForce GTX 280.

Well, in my case it would freeze totally for 10+ seconds and then bluescreen.

Have you tried the new drivers?

Reduce the size of your problem and run it in debug and emulation mode (-deviceemu -D_DEVICEEMU /Od etc.).

A simple access violation inside your kernel could cause this problem. Emulation with debug information should detect the source of the problem.
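As an illustration of the kind of access violation meant here, a sketch of a kernel that guards its write with a bounds check. Without the `if`, the threads of the last block that fall past `n` would write beyond the buffer. The names `blocks_for`, `fill_guarded` and `fill_on_device` are invented for this example. Note that device emulation was removed from later toolkits (as far as I know from CUDA 3.1 onward); `cuda-memcheck`, and more recently `compute-sanitizer`, are the tools that report out-of-bounds accesses on the device.

```cuda
#include <cuda_runtime.h>

// Number of blocks needed to cover n elements (rounds up).
__host__ __device__ inline int blocks_for(int n, int threads_per_block)
{
    return (n + threads_per_block - 1) / threads_per_block;
}

// Each thread handles one element; threads past the end do nothing.
__global__ void fill_guarded(float* result, int n, float value)
{
    int index = blockIdx.x * blockDim.x + threadIdx.x;
    if (index < n)              // the guard that prevents the out-of-bounds write
        result[index] = value;
}

// Host wrapper: returns the first CUDA error encountered.
cudaError_t fill_on_device(float* d_result, int n, float value)
{
    const int threads = 256;
    fill_guarded<<<blocks_for(n, threads), threads>>>(d_result, n, value);
    cudaError_t err = cudaGetLastError();
    return err != cudaSuccess ? err : cudaDeviceSynchronize();
}
```

With 1000 elements and 256 threads per block, `blocks_for` gives 4 blocks, so 24 threads in the last block rely on the guard.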

I triggered the exact same problem on XP32 by deliberately reading beyond a global memory buffer. Everything freezes, except that the mouse can still move a little.

What if I have already double-checked my kernel for such errors? I guess the problem could also be caused by the kernel using too many resources…

The kernel from the tutorial freezes my computer as follows:

When executed, the process hangs. If I try to open Task Manager and kill it, the computer freezes. If I don’t, the computer freezes after a number of minutes. This is on a GTX 960M with Windows 10 and the latest driver.

#include <iostream>
#include <math.h>

// Kernel function to add the elements of two arrays
__global__
void add(int n, float *x, float *y)
{
  int index = threadIdx.x;
  int stride = blockDim.x;
  for (int i = index; i < n; i += stride)
    y[i] = x[i] + y[i];
}

int main(void)
{
  int N = 1<<20;
  float *x, *y;

  // Allocate Unified Memory – accessible from CPU or GPU
  cudaMallocManaged(&x, N*sizeof(float));
  cudaMallocManaged(&y, N*sizeof(float));

  // initialize x and y arrays on the host
  for (int i = 0; i < N; i++) {
    x[i] = 1.0f;
    y[i] = 2.0f;
  }

  // Run kernel on 1M elements on the GPU
  add<<<1, 256>>>(N, x, y);

  // Wait for GPU to finish before accessing on host
  cudaDeviceSynchronize();

  // Check for errors (all values should be 3.0f)
  float maxError = 0.0f;
  for (int i = 0; i < N; i++)
    maxError = fmax(maxError, fabs(y[i]-3.0f));
  std::cout << "Max error: " << maxError << std::endl;

  // Free memory
  cudaFree(x);
  cudaFree(y);

  return 0;
}

I seem to have a similar problem: https://devtalk.nvidia.com/default/topic/1043879/cuda-programming-and-performance/whole-system-freezes-when-using-cudamallocmanaged/
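Since the linked thread points at `cudaMallocManaged`, one way to narrow this down might be to run the same addition with explicit device allocations and copies instead of Unified Memory. This is only a diagnostic sketch under that assumption, not a fix; `add_explicit` is a name invented here, and unlike the tutorial kernel this version uses a grid-stride loop over several blocks.

```cuda
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

// Same addition as the tutorial kernel, written as a grid-stride loop.
__global__ void add(int n, const float* x, float* y)
{
    int index = blockIdx.x * blockDim.x + threadIdx.x;
    int stride = blockDim.x * gridDim.x;
    for (int i = index; i < n; i += stride)
        y[i] = x[i] + y[i];
}

// Returns the maximum deviation from 3.0f, or -1.0f if any CUDA call fails.
float add_explicit(int n)
{
    std::vector<float> x(n, 1.0f), y(n, 2.0f);
    size_t bytes = n * sizeof(float);
    float *d_x = nullptr, *d_y = nullptr;

    bool ok = cudaMalloc(&d_x, bytes) == cudaSuccess &&
              cudaMalloc(&d_y, bytes) == cudaSuccess &&
              cudaMemcpy(d_x, x.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess &&
              cudaMemcpy(d_y, y.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        add<<<(n + 255) / 256, 256>>>(n, d_x, d_y);
        // The blocking copy back also waits for the kernel to finish.
        ok = cudaGetLastError() == cudaSuccess &&
             cudaMemcpy(y.data(), d_y, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_x);   // cudaFree(nullptr) is a no-op
    cudaFree(d_y);
    if (!ok) return -1.0f;

    float maxError = 0.0f;
    for (int i = 0; i < n; i++)
        maxError = std::fmax(maxError, std::fabs(y[i] - 3.0f));
    return maxError;
}
```

If `add_explicit(1 << 20)` returns 0 while the managed-memory version hangs, that would suggest Unified Memory is involved; if both hang, the cause is probably elsewhere.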

Driver 418.81 was released today and may help with this issue. You may wish to try it.