Struct: “calling a __host__ function from a __device__ function is not allowed” error, or a crash

Hello, I’m working through the CUDA by Example lessons and have run into a problem in Chapter 4.

struct cuComplexG { // GPU version
	float   r;
	float   i;
	// cuComplexG(float a, float b) // get error
	__device__ cuComplexG(float a, float b) // Crashes card
	{
		r = a; i = b;
	}
	__device__ float magnitude2(void) {
		return r * r + i * i;
	}
	__device__ cuComplexG operator*(const cuComplexG& a) {
		return cuComplexG(r*a.r - i*a.i, i*a.r + r*a.i);
	}
	__device__ cuComplexG operator+(const cuComplexG& a) {
		return cuComplexG(r + a.r, i + a.i);
	}
};

I get the “calling a __host__ function from a __device__ function is not allowed” error when I use the original code (constructor without a qualifier), and when I add the __device__ qualifier to the constructor it compiles but just crashes the card.

So what is needed to fix this problem?

Using CUDA 7.5, VS 2013

Any function declared without a __host__ or __device__ qualifier is assumed to be __host__, I presume… meaning the constructor defaults to a host-only function. Trying to instantiate an object of that class on the GPU (which the __device__ operators apparently do, since they return new cuComplexG objects) would cause the reported error.
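To make the qualifier rule concrete, here is a minimal sketch, not the book's code. The names ComplexHD, stepKernel and hostStep are made up for illustration. It assumes you want the struct usable from both CPU and GPU code, so every member is marked __host__ __device__; marking them __device__ only, as in the question, also satisfies the compiler for GPU-only use.

```cuda
#include <cuda_runtime.h>

// Hypothetical stand-in for the struct under discussion.
struct ComplexHD {
    float r;
    float i;

    // __host__ __device__ compiles the constructor for both CPU and GPU,
    // so the device-side operators below are allowed to call it.
    __host__ __device__ ComplexHD(float a, float b) : r(a), i(b) {}

    __host__ __device__ float magnitude2() const { return r * r + i * i; }

    __host__ __device__ ComplexHD operator*(const ComplexHD& a) const {
        return ComplexHD(r * a.r - i * a.i, i * a.r + r * a.i);
    }

    __host__ __device__ ComplexHD operator+(const ComplexHD& a) const {
        return ComplexHD(r + a.r, i + a.i);
    }
};

// Device-side use: a single thread computes |z*z + c|^2 and stores it.
__global__ void stepKernel(float* out) {
    ComplexHD z(1.0f, 2.0f);
    ComplexHD c(-0.8f, 0.156f);
    *out = (z * z + c).magnitude2();
}

// Host-side use of the same struct; no kernel launch is involved.
float hostStep() {
    ComplexHD z(1.0f, 2.0f);
    ComplexHD c(-0.8f, 0.156f);
    return (z * z + c).magnitude2();
}
```

If the constructor's qualifiers are removed, nvcc should reject the two operators (and stepKernel) with the same "calling a __host__ function from a __device__ function" message, because each of them constructs a ComplexHD in device code.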

About the crash, I presume the test kernel using the cuComplexG has an unrelated bug in it?


Both the kernel and the julia_CoreG function run on the device.
I commented out the cuComplexG part of julia_CoreG and it runs without a hitch.

__device__ int julia_CoreG(int x, int y) {
	const float scale = 1.5;
	float jx = scale * (float)(DIM / 2 - x) / (DIM / 2);
	float jy = scale * (float)(DIM / 2 - y) / (DIM / 2);

	cuComplexG c(-0.8, 0.156);
	cuComplexG a(jx, jy);

	int i = 0;
	for (i = 0; i<200; i++) {
		a = a * a + c;
		if (a.magnitude2() > 1000)
			return 0;
	}

	return 1;
}

__global__ void kernelG(unsigned char *ptr) {
	// map from blockIdx to pixel position
	int x = blockIdx.x;
	int y = blockIdx.y;
	int offset = x + y * gridDim.x;

	// now calculate the value at that position
	int juliaValue = julia_CoreG(x, y);
	ptr[offset * 4 + 0] = 255 * juliaValue;
	ptr[offset * 4 + 1] = 0;
	ptr[offset * 4 + 2] = 0;
	ptr[offset * 4 + 3] = 255;
}

You might be running into a WDDM timeout. You might need to be more specific about “it just crashes the card”.
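One way to be more specific than "it crashes" is to check the error codes the runtime returns after the kernel. The following is only a sketch; isWatchdogTimeout, kernelTimeLimitEnabled and checkKernel are hypothetical helper names, not part of the CUDA API. It assumes a watchdog kill is reported as cudaErrorLaunchTimeout, which is the usual case, though a reset can also surface as a different error.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: true if err is the code the runtime uses for a
// kernel that ran past the display watchdog's time limit.
bool isWatchdogTimeout(cudaError_t err) {
    return err == cudaErrorLaunchTimeout;
}

// Hypothetical helper: asks the runtime whether kernels on this device
// are subject to a run-time limit (true for a GPU driving a display
// under WDDM). Returns false if the query itself fails.
bool kernelTimeLimitEnabled(int device) {
    int limited = 0;
    if (cudaDeviceGetAttribute(&limited, cudaDevAttrKernelExecTimeout,
                               device) != cudaSuccess) {
        return false;
    }
    return limited != 0;
}

// Hypothetical helper: call right after a kernel launch. It picks up
// launch errors first, then waits for the kernel and picks up errors
// raised during execution.
cudaError_t checkKernel(const char* label) {
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) {
        err = cudaDeviceSynchronize();
    }
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s%s\n", label,
                     cudaGetErrorString(err),
                     isWatchdogTimeout(err) ? " (watchdog timeout)" : "");
    }
    return err;
}
```

Calling checkKernel("kernelG") immediately after the kernelG launch should then print the reason for the failure instead of leaving only the driver-reset popup to go on.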

Perhaps you should review this thread:

The crash error is: Driver stopped responding and successfully recovered.

Ah, so Microsoft is the bugger in this… should have known…

Thank you for this information…

Oh fun, Microsoft changed where the TDR keys are…

Are there any other places where one can change the TDR values?

CUDA 7.5
Driver 358.50
VS 2013

If you’re running code from within Nsight Visual Studio Edition, there is an Nsight VSE setting for this. I haven’t played with it much, but I don’t think it affects code run from the command line when Nsight VSE is not running.

Also, a bit of googling may be in order. This is a common issue, and many folks have written up their methods. You may find someone who has detailed how to do it on your OS, which you haven’t indicated.

Thanks txbob

That link gave most of the answer I needed, along with the hint I needed to solve the problem.
And I did try googling; I might have just used the wrong words, or folks were not clear about what to do beyond the usual steps (there are newcomers to CUDA who may not know better).

The answer was:
I was using the WRONG debugger. I was using the “Local Windows Debugger”, not the “CUDA debugger”, which the book never says to use.

Though I should add some NOTES for folks:

  • If "Local Windows Debugger" crashes, go to NSIGHT -> Start CUDA Debugging and see if it still crashes.
  • The above also activates the Nsight taskbar app, where you can alter the TDR settings for debug use (this will require a system reboot).