Sending class objects to CUDA?

Hi all,

I have written a class Triangle, and have created a list of triangles with

list<Triangle> triList;

I wish to send this data to the kernel to process and perform tests on these triangles.

First, can I send lists of objects of a class I wrote myself to CUDA? Can the kernel access the member functions of the Triangle class? (My guess is no?)

Second, what is the best way to send this list? I know I have to use something similar to

cudaMemcpy(deviceArray, hostArray, size, cudaMemcpyHostToDevice);

But I’m not sure how to get my triangle list from hostArray to deviceArray, or how the kernel can read that data structure and access its data members.

Thanks for your advice!

C++ is not supported in kernel code at this time (apart from templates)

There isn’t official C++ support, but some things work nonetheless. A homogeneous transformation class (which I had handy at the time of writing) would look something like this:

[codebox]
#ifdef __NVCC__
#define CUDA_FUNC_SPECIFIER __device__
#else
#define CUDA_FUNC_SPECIFIER
#endif

class __cuda_align__(16) Transform {
public:
    WpFloat rot[3][3];
    WpFloat trans[3];

    CUDA_FUNC_SPECIFIER void setRotation(WpFloat val) {
        /* simple set stuff here */
    }

    CUDA_FUNC_SPECIFIER float getRotation(void) {
        /* simple get stuff here */
    }
};
[/codebox]

The key here is that NVCC itself compiles the class methods for use on the device. You can keep more advanced host-side functions in the class, but anything that runs on the device should be kept simple.

As for using a C++ linked list, that really isn’t an option. Use a std::vector instead - that one you can pass along to the kernel as a plain array.
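To make the list-to-vector step concrete, here is a minimal host-side sketch. The Triangle struct is only a hypothetical stand-in for your class (plain data, no pointers), and flatten and byteCount are helper names made up for this example. The point is that the vector's elements sit in one contiguous block, so &tris[0] together with the byte count is exactly what cudaMemcpy expects.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Hypothetical stand-in for the poster's Triangle class: plain data only,
// so its bytes can be copied to the device as they are.
struct Triangle {
    float v[3][3];  // three vertices, xyz each
};

// Copy the list's elements into one contiguous block of memory.
std::vector<Triangle> flatten(const std::list<Triangle>& triList) {
    return std::vector<Triangle>(triList.begin(), triList.end());
}

// Number of bytes a cudaMemcpy of the whole vector would have to transfer.
std::size_t byteCount(const std::vector<Triangle>& tris) {
    return tris.size() * sizeof(Triangle);
}
```

This only works because the stand-in Triangle holds its data directly. If your class contains pointers or STL containers, those would still point into host memory after the copy.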

If you get really adventurous, you can use a custom memory allocator to modify the vector to transparently use page-locked memory allocated by cudaMallocHost.

I do this in my own collision test system.
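For the curious, a custom allocator along those lines could look roughly like the sketch below. This is not the code from my system, just a minimal C++11 version. PinnedAllocator and PinnedFloats are names invented for the example, and HAS_CUDA is assumed to be defined by your build when the CUDA headers and runtime are available. Without it, the sketch falls back to ordinary malloc so it still compiles.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>
#ifdef HAS_CUDA              // assumed to be set by the build system
#include <cuda_runtime.h>    // cudaMallocHost, cudaFreeHost
#endif

// Minimal allocator (hypothetical name) that hands std::vector page-locked memory.
template <typename T>
struct PinnedAllocator {
    typedef T value_type;

    PinnedAllocator() {}
    template <typename U>
    PinnedAllocator(const PinnedAllocator<U>&) {}

    T* allocate(std::size_t n) {
        void* p = nullptr;
#ifdef HAS_CUDA
        // Page-locked host memory from the CUDA runtime.
        if (cudaMallocHost(&p, n * sizeof(T)) != cudaSuccess) throw std::bad_alloc();
#else
        // Fallback without CUDA: ordinary pageable memory.
        p = std::malloc(n * sizeof(T));
        if (!p) throw std::bad_alloc();
#endif
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) {
#ifdef HAS_CUDA
        cudaFreeHost(p);
#else
        std::free(p);
#endif
    }
};

// The allocator has no state, so any two instances are interchangeable.
template <typename T, typename U>
bool operator==(const PinnedAllocator<T>&, const PinnedAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const PinnedAllocator<T>&, const PinnedAllocator<U>&) { return false; }

// Example: a vector of floats backed by the allocator above.
typedef std::vector<float, PinnedAllocator<float> > PinnedFloats;
```

The vector is used like any other, for example PinnedFloats v(1024, 0.0f), and &v[0] can be passed to cudaMemcpy as usual.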

Remember that pointer chasing is terribly inefficient on CUDA. You can only get coalesced memory accesses if the data is allocated in a contiguous block (like an array).

Thanks fugl, I’m trying to understand your code here. Is what you have written host code, or does it go in the device global function? How exactly would you call getRotation from device code? cuda_align->getRotation?

Is cuda_align a keyword? I’m sorry for my utter confusion. If this technique is all explained in a document somewhere, I’d really appreciate it if you pointed me in the right direction.

Thanks!

The Transform class should be defined in a header, used by both host-side functions (compiled with GCC) and the CUDA kernels (compiled with NVCC).

Sorry for the confusion regarding the macro “__cuda_align__”. It’s a stand-in for the ordinary “__align__”, so I can avoid including CUDA-specific headers if the system does not have them. It’s defined like this:

[codebox]#ifdef HAS_CUDA
#include <host_defines.h> /* the cuda specific header */
#define __cuda_align__(n) __align__((n)) /* the cuda alignment keyword */
#else
#define __cuda_align__(n) /* empty */
#endif
[/codebox]

You can just use the ordinary __align__(n) syntax if you have no problems including the CUDA headers.

I would call getRotation like this:

[codebox]__global__ void myKernel(Transform *tfsIn, float *out) {
    float elm;
    Transform tf = tfsIn[0];
    elm = tf.getRotation(0,0); /* call a version of getRotation that returns rot[i][j] */
    out[0] = elm;
}[/codebox]
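The kernel above calls a two-index getRotation that I didn't show, and it only touches element 0. A slightly fuller sketch, under a few assumptions, might look like this: WpFloat is taken to be a float typedef, the kernel name firstElements and the element count n are made up for the example, and the methods are marked __host__ __device__ (rather than just __device__) so the same header also works for host code compiled by nvcc. The kernel itself is only compiled when __CUDACC__ is defined, which nvcc does for CUDA source files.

```cpp
#ifdef __CUDACC__
#define CUDA_FUNC_SPECIFIER __host__ __device__
#else
#define CUDA_FUNC_SPECIFIER
#endif

typedef float WpFloat;  // assumption: WpFloat is a plain float typedef

class Transform {
public:
    WpFloat rot[3][3];
    WpFloat trans[3];

    // The indexed getter the kernel comment refers to: returns rot[i][j].
    CUDA_FUNC_SPECIFIER WpFloat getRotation(int i, int j) const {
        return rot[i][j];
    }
};

#ifdef __CUDACC__
// One thread per Transform; n is the number of elements in both arrays.
__global__ void firstElements(const Transform* tfsIn, float* out, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) {
        out[idx] = tfsIn[idx].getRotation(0, 0);
    }
}
#endif
```

A launch would then be something like firstElements<<<(n + 255) / 256, 256>>>(tfs_dev, out_dev, n), where tfs_dev and out_dev are device pointers you allocated yourself.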

If you have a std::vector of Transform, you obviously need to transfer it to the device. A pretty safe way to do it is:

[codebox]if (!vec.empty()) {
    /* copy contents of vector to a plain array of Transforms on the device;
       the third argument is a size in bytes, not an element count */
    cudaMemcpy(transform_dev, &vec[0], vec.size() * sizeof(Transform), cudaMemcpyHostToDevice);
}[/codebox]

How would this be accomplished in Visual Studio? My understanding is that anything to be compiled with nvcc has to have a custom build step. For example, I specify the nvcc command for my main CUDA file as:

"C:\CUDA\bin\nvcc.exe" -ccbin "$(VCInstallDir)bin" -c -DWIN32 -D_CONSOLE -D_MBCS -Xcompiler /EHsc,/W3,/nologo,/O2,/Zi,/MT -I"C:\CUDA\include" -I./ -I…/…/common/inc -o $(ConfigurationName)\cudaMain.obj cudaMain.cu

Is this the same command you would use for a header file? And how can you compile the header with both nvcc and the C++ compiler?

Many thanks!