Sending class objects to Cuda?

mischan · February 5, 2009, 10:17pm

Hi all,

I have written a class Triangle, and have created a list of triangles with

list<Triangle> triList;

I wish to send this data to the kernel to process and perform tests on these triangles.

First, can I send to Cuda lists of class objects that I wrote myself? Can the kernel access the member functions of the triangle class? (My guess is no?)

Second, what is the best way to send this list? I know I have to use something similar to

cudaMemcpy(deviceArray, hostArray, size, cudaMemcpyHostToDevice);

But I’m not sure how to send my triangle list from hostArray to deviceArray, and how the kernel can read that data structure and access its data members?

Thanks for your advice!

E.D_Riedijk · February 6, 2009, 5:31am

C++ is not supported in kernel code at this time (apart from templates).

Fugl · February 6, 2009, 9:14am

There isn’t C++ support, but some things work nonetheless. A homogeneous transformation class (which I had handy at the time of writing) would look something like this

[codebox]
#ifdef NVCC
#define CUDA_FUNC_SPECIFIER __device__
#else
#define CUDA_FUNC_SPECIFIER
#endif

class __cuda_align__(16) Transform {
public:
    WpFloat rot[3][3];
    WpFloat trans[3];

    CUDA_FUNC_SPECIFIER void setRotation(WpFloat val) {
        /* simple set stuff here */
    }

    CUDA_FUNC_SPECIFIER float getRotation(void) {
        /* simple get stuff here */
    }
};
[/codebox]

The key here is that NVCC itself compiles the class methods for use on the device. You can keep more advanced host-side functions, but anything used on the device should be kept simple.

As for using a C++ linked list, that really isn’t an option. Use a std::vector instead: its elements are stored contiguously, so you can copy them to the device and pass them along to the kernel as a plain array.

If you get really adventurous, you can use a custom memory allocator to modify the vector to transparently use page-locked memory allocated by cudaMallocHost.

I do this in my own collision test system.
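
To illustrate the std::vector suggestion, here is a minimal host-side sketch. It assumes a hypothetical plain-data Triangle struct (the original class was never posted) and two made-up helper names, flatten and byteCount. It copies the std::list into a std::vector so the elements sit in one contiguous block, which is what cudaMemcpy needs as its source.

```cpp
#include <cstddef>
#include <list>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the poster's Triangle class: plain data only,
// no pointers or virtual functions, so a byte-wise copy is meaningful.
struct Triangle {
    float v[3][3];  // three vertices, x/y/z each
};

static_assert(std::is_trivially_copyable<Triangle>::value,
              "Triangle must be safe to copy byte by byte");

// Copy the linked list into a contiguous vector.
std::vector<Triangle> flatten(const std::list<Triangle>& triList) {
    return std::vector<Triangle>(triList.begin(), triList.end());
}

// Number of bytes a copy of the whole vector has to move.
std::size_t byteCount(const std::vector<Triangle>& tris) {
    return tris.size() * sizeof(Triangle);
}
```

With `std::vector<Triangle> tris = flatten(triList);`, the pair `&tris[0]` and `byteCount(tris)` is the source pointer and size you would hand to cudaMemcpy, provided the vector is not empty.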

_Big_Mac · February 6, 2009, 12:38pm

Remember that pointer chasing is terribly inefficient on CUDA. You can only get coalesced memory accesses if the data is allocated in a contiguous block (like an array).

mischan · February 6, 2009, 6:48pm

Thanks Fugl. I’m trying to understand your code here: is what you posted host code, or does it belong in the device __global__ function? How exactly would you call getRotation from device code? cuda_align->getRotation?

Is cuda_align a keyword? I’m sorry for my utter confusion. If this technique is all explained in a document somewhere, I’d really appreciate it if you pointed me in the right direction.

Thanks!

Fugl · February 9, 2009, 12:50pm

The Transform class should be defined in a header, used by both host-side functions (compiled with GCC) and the CUDA kernels (compiled with NVCC).

Sorry for the confusion regarding the macro “__cuda_align__”. It’s a stand-in for the ordinary “__align__”, so I can avoid including CUDA-specific headers if the system does not have them. It’s defined like this:

[codebox]
#ifdef HAS_CUDA
#include <host_defines.h> /* the cuda specific header */
#define __cuda_align__(n) __align__((n)) /* the cuda alignment keyword */
#else
#define __cuda_align__(n) /* empty */
#endif
[/codebox]

You can just use the ordinary __align__(n) syntax if you have no problems including the CUDA headers.
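
Put together, a header shared between the host compiler and nvcc might look roughly like the sketch below. It makes several assumptions that are not from this thread: it keys off `__CUDACC__` (a macro nvcc predefines when compiling CUDA source) rather than the NVCC / HAS_CUDA switches, it uses `__host__ __device__` so the methods are callable on both sides, it takes WpFloat to be float, it names the alignment macro CUDA_ALIGN, and it gives the accessors an (i, j) signature. Under a plain C++ compiler both macros expand to nothing.

```cpp
#ifdef __CUDACC__
  // nvcc is compiling this translation unit: use the real CUDA qualifiers.
  #define CUDA_FUNC_SPECIFIER __host__ __device__
  #define CUDA_ALIGN(n) __align__(n)
#else
  // Plain host compiler: the qualifiers vanish.
  #define CUDA_FUNC_SPECIFIER
  #define CUDA_ALIGN(n)
#endif

typedef float WpFloat;  // assumption: the original typedef was not shown

class CUDA_ALIGN(16) Transform {
public:
    WpFloat rot[3][3];
    WpFloat trans[3];

    // Set one element of the rotation matrix.
    CUDA_FUNC_SPECIFIER void setRotation(int i, int j, WpFloat val) {
        rot[i][j] = val;
    }

    // Read one element of the rotation matrix.
    CUDA_FUNC_SPECIFIER WpFloat getRotation(int i, int j) const {
        return rot[i][j];
    }
};
```

The twelve floats occupy 48 bytes, a multiple of 16, so in this sketch sizeof(Transform) comes out the same whether or not the alignment attribute is applied. For a class whose size is not a multiple of the alignment, the host and device layouts could differ, which is worth checking before copying arrays across.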

I would call getRotation like this:

[codebox]
__global__ void myKernel(Transform *tfsIn, float *out) {
    float elm;
    Transform tf = tfsIn[0];
    elm = tf.getRotation(0,0); /* call a version of getRotation that returns rot[i][j] */
    out[0] = elm;
}
[/codebox]

If you have a std::vector of Transform, you obviously need to transfer it to the device. A pretty safe way to do it is:

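[codebox]
if (!vec.empty()) {
    /* copy contents of vector to a plain array of Transforms on the device;
       transform_dev must already be allocated with cudaMalloc, and the size is in bytes */
    cudaMemcpy(transform_dev, &vec[0], vec.size() * sizeof(Transform), cudaMemcpyHostToDevice);
}
[/codebox]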
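
For completeness, the allocation and copy could be sketched as below. The Transform here is a reduced stand-in for the class discussed above, and byteSize and uploadTransforms are hypothetical helper names. Error handling is limited to returning a null pointer. The CUDA calls are only compiled when nvcc is in use (`__CUDACC__` defined), so the size arithmetic can still be built and checked with a plain C++ compiler.

```cpp
#include <cstddef>
#include <vector>
#ifdef __CUDACC__
#include <cuda_runtime.h>
#endif

struct Transform {  // reduced stand-in for the class in this thread
    float rot[3][3];
    float trans[3];
};

// cudaMalloc and cudaMemcpy take sizes in bytes, not in elements.
inline std::size_t byteSize(const std::vector<Transform>& vec) {
    return vec.size() * sizeof(Transform);
}

#ifdef __CUDACC__
// Hypothetical helper: allocate device memory and copy the vector into it.
// Returns a null pointer if the vector is empty or a CUDA call fails.
inline Transform* uploadTransforms(const std::vector<Transform>& vec) {
    if (vec.empty()) return nullptr;
    Transform* dev = nullptr;
    if (cudaMalloc(reinterpret_cast<void**>(&dev), byteSize(vec)) != cudaSuccess) {
        return nullptr;
    }
    if (cudaMemcpy(dev, &vec[0], byteSize(vec), cudaMemcpyHostToDevice) != cudaSuccess) {
        cudaFree(dev);
        return nullptr;
    }
    return dev;  // the caller releases this later with cudaFree(dev)
}
#endif
```

The pointer returned by uploadTransforms is what would be passed to a kernel such as myKernel as its tfsIn argument.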

mischan · March 12, 2009, 2:58pm

How would this be accomplished in Visual Studio? My understanding is that anything to be compiled with nvcc has to have a custom build step. For example, I specify the nvcc command for my main CUDA file as:

"C:\CUDA\bin\nvcc.exe" -ccbin "$(VCInstallDir)bin" -c -DWIN32 -D_CONSOLE -D_MBCS -Xcompiler /EHsc,/W3,/nologo,/O2,/Zi,/MT -I"C:\CUDA\include" -I./ -I…/…/common/inc -o $(ConfigurationName)\cudaMain.obj cudaMain.cu

Is this the same command you would use for a header file? And how can you compile the header with both nvcc and the C++ compiler?

Many thanks!
