Structs in CUDA

Hi,

I need to program a neural network in CUDA.

The code for this network has a lot of structs.

I want to know if I can use these structs, or if I need to convert them to arrays.

I tried searching the forum for information about this, but found nothing about CUDA 4.2.

Thanks for your attention.

Hello,

could you please reply with your struct definitions?

Thank you

Paolo

The structs are:

typedef int INT;
typedef double REAL;

typedef struct {            /* A LAYER OF A NET:                     */
    INT    Units;           /* - number of units in this layer       */
    REAL*  Output;          /* - output of ith unit                  */
    REAL*  Error;           /* - error term of ith unit              */
    REAL** Weight;          /* - connection weights to ith unit      */
    REAL** WeightSave;      /* - saved weights for stopped training  */
    REAL** dWeight;         /* - last weight deltas for momentum     */
} LAYER;

typedef struct {            /* A NET:                                */
    LAYER** Layer;          /* - layers of this net                  */
    LAYER*  InputLayer;     /* - input layer                         */
    LAYER*  OutputLayer;    /* - output layer                        */
    REAL    Alpha;          /* - momentum factor                     */
    REAL    Eta;            /* - learning rate                       */
    REAL    Gain;           /* - gain of sigmoid function            */
    REAL    Error;          /* - total net error                     */
} NET;

If I use a C++ class instead, is there any problem with using arrays and getter/setter methods?

CUDA supports pointers and function/procedure calls.

The CPU has its own memory and therefore its own pointers. (Let's call these "local pointers", local to the CPU/main RAM.)

The GPU has its own memory and therefore its own pointers. (Let's call these "remote pointers", from the perspective of the CPU.)

Integers and floating-point values have the same format on both sides, so they can simply be copied back and forth and do not need any conversion or special treatment.

The CPU program has to allocate the memory on the GPU (remote allocation).

The CPU program has to keep track of the pointers returned by these remote allocations (remote pointers).

The CPU program has to pass the remote pointers to the kernel. This can be done either through kernel parameters (limited and more difficult), or through a single kernel parameter: a remote pointer to a remote piece of memory containing all the remote pointers (easier/unlimited/independent).

The CPU program should also allocate this same structure locally, matching the remote structure. (The structure could be called "KernelParameters"; keep it as simple as possible, a direct copy of everything needed.)

The CPU program must then first initialize the local memory (the kernel parameters) with the remote pointers.

Then the CPU program copies this local memory to the remote memory, thereby initializing the remote memory with the remote pointers from the local memory.

The CPU program should also copy any other memory which is necessary.

The CPU program can then invoke the kernel.
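The host-side steps above could be sketched roughly as follows. This is a minimal, assumed example: the `TKernelParameters` struct and its members are illustrative placeholders, not the poster's actual network code, and error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>

typedef double REAL;

/* Hypothetical parameter block: a direct copy of everything the kernel needs. */
typedef struct {
    int   Units;
    REAL* Output;   /* remote (device) pointer */
    REAL* Error;    /* remote (device) pointer */
} TKernelParameters;

__global__ void Kernel(TKernelParameters* KernelParameters); /* defined elsewhere */

int main(void)
{
    const int units = 256;
    TKernelParameters local;    /* local structure, lives in CPU RAM */
    local.Units = units;

    /* 1. Remote allocations: the returned pointers are only valid on the GPU. */
    cudaMalloc((void**)&local.Output, units * sizeof(REAL));
    cudaMalloc((void**)&local.Error,  units * sizeof(REAL));

    /* 2. Allocate the matching remote structure and copy the local one into it,
       so the remote memory now holds the remote pointers. */
    TKernelParameters* remote;
    cudaMalloc((void**)&remote, sizeof(TKernelParameters));
    cudaMemcpy(remote, &local, sizeof(TKernelParameters), cudaMemcpyHostToDevice);

    /* 3. Invoke the kernel with a single remote pointer as its parameter. */
    Kernel<<<1, units>>>(remote);
    cudaDeviceSynchronize();

    /* 4. Clean up the remote memory afterwards. */
    cudaFree(local.Output);
    cudaFree(local.Error);
    cudaFree(remote);
    return 0;
}
```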

The kernel then receives a pointer to its own memory, which has already been initialized by the CPU.

The kernel can then use this pointer and memory to initialize any structures (C structs or C++ classes) which contain pointers/arrays, etc. Simple assignments will do, for example:

__global__ void Kernel( TKernelParameters* KernelParameters )
{
    /* Using a remote pointer has the advantage that memory can be copied back from GPU to CPU as well. */
    MyStructure.MyPointer = KernelParameters->MyStructure.MyPointer;
    OtherStructure.OtherPointer = KernelParameters->OtherStructure.OtherPointer;
    /* etc. */
}

This will then initialize all the pointers on the kernel side.

Now the kernel is set up and ready to be used. It can run and simply access all its pointers as if they were arrays, since that is what the C language allows via the index operator, for example:
MyStructure.MyPointer[ MyIndex ] = SomeValue;

Doing remote allocations and remote frees has the advantage that it works well. CUDA also supports malloc and free inside kernels, but that support is currently buggy. Also, transferring data between CPU and GPU might still be required even when using malloc/free inside the kernel itself, so you might as well do all of it on the CPU side. This also has the advantage that the kernel stays relatively simple.
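For completeness, in-kernel malloc/free looks roughly like the sketch below (it requires compute capability 2.0 or higher; as noted above, relying on it is not recommended here). The kernel name and sizes are made up for illustration.

```cuda
#include <cuda_runtime.h>

/* Sketch of device-side malloc/free: each allocation comes from a
   separate device heap and is never visible to cudaMemcpy on the host. */
__global__ void ScratchKernel(int n)
{
    double* scratch = (double*)malloc(n * sizeof(double));
    if (scratch == NULL)
        return;                 /* the device heap may be exhausted */
    for (int i = 0; i < n; ++i)
        scratch[i] = (double)i;
    free(scratch);              /* must also be freed on the device side */
}

/* Host side: the device heap size can be raised before launching, e.g.
   cudaDeviceSetLimit(cudaLimitMallocHeapSize, 16 * 1024 * 1024); */
```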

After the CPU is done with the kernel, it should clean up the remote memory, and possibly the local memory as well if the program is about to terminate.

This should give a somewhat rough idea of what needs to be done! :)