CUDA function qualifiers in classes

Hi!

I have a Fermi card and the latest CUDA 3.1 toolkit. According to the programming guide, it should be possible to use function qualifiers like __global__ in class definitions inside .cu files (see programming guide, version 3.1.1, p. 133 and 136). Nonetheless, if I do something like the following code snippet, I get an “illegal combination of memory qualifiers” error (this is inside a .cu file which I compile with NVCC oc and include in some cpp file):

class TestsKernels
{
	public:
		TestsKernels();
		~TestsKernels();
		// ...
	private:
		// error in following line:
		template <unsigned int blockSize> __global__ void tick();
		// error in following line:
		__global__ void trick();
		// no error in next line:
		__forceinline__ __device__ void track();
};

The only topic I found related to this was back from an ancient time, when the gods were angry and kingdoms were forged with steel :)

Edit: Replacing __global__ with __device__ does work (but gives errors in other places because I need __global__ for kernels).

I don’t see anywhere in the guide that explicitly says that __global__ can be used for a member function. How would such a call even work? The this pointer passed into the function would be a host pointer.

The class support in CUDA 3.1 is such that you can declare a class where the constructor and all members are __device__ functions. You can then instantiate that class inside a normal __global__ kernel function and use it. For example:

class some_class
{
public:
	__device__ some_class(int a) : b(a) {}
	__device__ int getB() const { return b; }
private:
	int b;
};

__global__ void kernel()
{
	// each thread builds its own object in device code
	some_class my_class(threadIdx.x);
	// ... do something with my_class.getB()
}

In the programming guide (p. 133) it says:

It also says that restrictions specified in “previous parts of this programming guide […] still apply”. But it does not mention that you can’t use __global__ functions inside class scope.

On p. 136 regarding classes in detail it says:

These statements clearly allow __global__ functions in class scope, without naming any restrictions on them.

I see there are no examples with __global__ member functions, and I also understand your point about the this pointer, but would it be possible to use a device pointer to get around this obstacle? I hoped I could put my kernels into classes too…
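One way the device-pointer idea could look, as a minimal sketch and not something the guide confirms for member kernels: keep the kernel as a free __global__ function, copy the object into device memory, and pass its device address explicitly, so that member calls run with a device-side this. The names Counter, addKernel and runAddOnDevice are hypothetical and exist only for this illustration.

```cuda
#include <cuda_runtime.h>

// Hypothetical class for illustration (not from the thread).
class Counter
{
public:
    __host__ __device__ explicit Counter(int start) : value(start) {}
    __device__ void add(int n) { value += n; }
    __host__ __device__ int get() const { return value; }
private:
    int value;
};

// Free kernel: the object is reached through an explicit device pointer
// instead of an implicit host-side this pointer.
__global__ void addKernel(Counter* obj, int n)
{
    if (blockIdx.x == 0 && threadIdx.x == 0)
        obj->add(n);  // inside add(), this is a device address
}

// Host helper (assumed name): copies the object to the device, launches the
// kernel, copies the object back. Returns the counter value, or -1 on error.
int runAddOnDevice(int start, int n)
{
    Counter host(start);
    Counter* dev = 0;
    if (cudaMalloc((void**)&dev, sizeof(Counter)) != cudaSuccess)
        return -1;

    bool ok = cudaMemcpy(dev, &host, sizeof(Counter),
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        addKernel<<<1, 1>>>(dev, n);
        ok = cudaGetLastError() == cudaSuccess;
    }
    // cudaMemcpy waits for the kernel to finish before copying back.
    if (ok)
        ok = cudaMemcpy(&host, dev, sizeof(Counter),
                        cudaMemcpyDeviceToHost) == cudaSuccess;

    cudaFree(dev);
    return ok ? host.get() : -1;
}
```

Calling runAddOnDevice(40, 2) from host code should give 42 on a machine with a working CUDA device. The kernel itself still lives outside the class; only the data and the __device__ members are shared.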

hello, have you solved this problem or not?

And what do you think about the following situation?

class C {
private:
	float f;
public:
	C() {
		f = ...;
	}
	C(C & other) {
		this->f = other.f;
	}
	static __global__ void kernel(C obj) {
		float f = obj.f;
		...
	}
	__host__ void runKernel() {
		...
		kernel<<<numBlocks, numThreadsPerBlock>>>(*this);
	}
	~C() {
	}
};

In this case, the code executed on the GPU does not need the “this” pointer. However, this code does not compile with nvcc: “error: illegal combination of memory qualifiers”. Unfortunately, I cannot move the “kernel” function outside the class because it needs to access private fields and functions, and I don’t want to make them public! From my point of view, the code above should be considered 100% correct.

Pi2: in that case, can’t you declare the function as a “friend” of the class? That way it can access the private fields without being a class member.
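A minimal sketch of what the friend approach might look like, under a few assumptions that are not in the original post: the constructor takes the value of f as an argument, the kernel gets an extra output pointer out so the result can be checked, and the launch uses a single thread. The kernel is only declared as a friend inside the class and is defined afterwards at namespace scope.

```cuda
#include <cuda_runtime.h>

class C
{
public:
    explicit C(float value) : f(value) {}  // assumed constructor
    float runKernel() const;               // host-side launcher
private:
    float f;
    // Friend declaration only; the kernel body is defined outside the class.
    friend __global__ void kernel(C obj, float* out);
};

// Free kernel with access to the private member f through friendship.
// obj is passed by value, so no host this pointer reaches the device.
__global__ void kernel(C obj, float* out)
{
    if (blockIdx.x == 0 && threadIdx.x == 0)
        *out = obj.f * 2.0f;
}

// Launches the kernel on a copy of *this.
// Returns the value written by the kernel, or -1.0f on any CUDA error.
float C::runKernel() const
{
    float* devOut = 0;
    if (cudaMalloc((void**)&devOut, sizeof(float)) != cudaSuccess)
        return -1.0f;

    float result = -1.0f;
    kernel<<<1, 1>>>(*this, devOut);
    if (cudaGetLastError() != cudaSuccess ||
        cudaMemcpy(&result, devOut, sizeof(float),
                   cudaMemcpyDeviceToHost) != cudaSuccess)
        result = -1.0f;

    cudaFree(devOut);
    return result;
}
```

With this layout, C(21.0f).runKernel() should return 42.0f on a machine with a working CUDA device, and f stays private to everything except the kernel.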

Cedric