I doubt that works “fine as expected” — you are using the incorrect size in the cudaMalloc call. As for allocating the array inside the structure, I think you will have to allocate a separate integer array, and then call a device-side function to assign its address to the pointer inside the structure.
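A minimal sketch of that second approach (the struct and kernel names here are hypothetical, not from any library): allocate the struct and the array separately on the device, then run a tiny kernel to store the array's address into the device-resident struct.

```cuda
#include <cuda_runtime.h>

// Hypothetical struct holding a pointer that must end up pointing
// at a separately allocated device array.
struct my_struct
{
    int *data;
    int size;
};

// Single-thread kernel that assigns the device array's address to
// the pointer member of the struct living in device memory.
__global__ void wire_pointer(my_struct *s, int *arr, int n)
{
    s->data = arr;
    s->size = n;
}

// Host side (paraphrased):
// my_struct *d_s;  cudaMalloc((void**)&d_s, sizeof(my_struct));
// int *d_arr;      cudaMalloc((void**)&d_arr, n * sizeof(int));
// wire_pointer<<<1,1>>>(d_s, d_arr, n);
```

The alternative, shown further down in this thread, is to do the wiring on the host and cudaMemcpy the whole struct over; both end up with the same device-side layout.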
You probably don’t need to malloc the soa per se, but you do need to malloc the individual arrays inside. I would take this approach:
struct soa
{
    // each member of the struct of arrays is an array:
    int *member0;
    int *member1;
    int *member2;
    // etc...
    int size;
};
__global__ void foo(soa arrays, int *result)
{
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= arrays.size) return;
    // these loads will coalesce
    result[i] = arrays.member0[i] * arrays.member1[i] + arrays.member2[i];
}
// this code is paraphrased, but you get the idea
int main(void)
{
    soa struct_of_arrays;
    struct_of_arrays.size = 100000;
    // malloc each array inside the soa individually
    cudaMalloc((void**)&struct_of_arrays.member0, struct_of_arrays.size * sizeof(int));
    cudaMalloc((void**)&struct_of_arrays.member1, struct_of_arrays.size * sizeof(int));
    cudaMalloc((void**)&struct_of_arrays.member2, struct_of_arrays.size * sizeof(int));
    // etc...
    // malloc space for the result
    int *result;
    cudaMalloc((void**)&result, struct_of_arrays.size * sizeof(int));
    // launch the kernel, passing struct_of_arrays by value:
    foo<<<gridDim,blockDim>>>(struct_of_arrays, result);
}
If what you want is dynamic typing with C/C++, this would be at best very ugly, and at worst impossible. Certainly no good solution would be possible in device code. You might consider some of the CUDA Python bindings for your problem, as Python is a language with dynamic typing.
If what you want is dynamically sized arrays, then you basically need to make your SOA a dynamically-sized array of pointers.
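A sketch of what that might look like (all names here are hypothetical): the member count is decided at run time, so the struct holds a device array of device pointers, built on the host and copied over.

```cuda
#include <cuda_runtime.h>
#include <stdlib.h>

// Hypothetical run-time-sized SOA: a device array of device pointers.
struct dyn_soa
{
    int **members;    // device array holding num_members device pointers
    int num_members;  // decided at run time
    int size;         // length of each member array
};

// Host-side setup (paraphrased):
// int **h_ptrs = (int**)malloc(num_members * sizeof(int*));
// for (int m = 0; m < num_members; ++m)
//     cudaMalloc((void**)&h_ptrs[m], size * sizeof(int));
// dyn_soa s;
// cudaMalloc((void**)&s.members, num_members * sizeof(int*));
// cudaMemcpy(s.members, h_ptrs, num_members * sizeof(int*),
//            cudaMemcpyHostToDevice);

// In a kernel, member m element i is then s.members[m][i]; note the
// extra pointer chase compared to a struct with fixed members.
```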
If you know what types you’ll need at compile time, you could create a tuple of arrays.
In fact, Boost’s zip_iterator nicely encapsulates the SOA idea. Unfortunately, it’s not usable in CUDA yet.
You just have to marshal things? I’m not sure why this is hard or any different from structs in the first place? Something like
array a_host, *a_device; //a_host is only used for marshalling the cudaMemcpy, not for actual storage
soa s_host, *s_device; //same for s_host
cudaMalloc((void**)&a_host.elements, sizeof(int) * someNumberOfElements);
cudaMalloc((void**)&a_device, sizeof(array));
cudaMemcpy(a_device, &a_host, sizeof(array), cudaMemcpyHostToDevice);
cudaMalloc((void**)&s_device, sizeof(soa));
s_host.array = a_device;
cudaMemcpy(s_device, &s_host, sizeof(soa), cudaMemcpyHostToDevice);
It’s annoying, but that’s the price you pay when you want to use pointer-based structures in remote address spaces.
Thanks. I’ve tried this and I can make it work when these structs represent an output of my kernel, which is what I want :)
But I can’t make it work when I want to initialize a struct like this on the host, pass it to the device, and, for example, copy all the elements from the input to the output (although what I want to do later on is a bit more complicated).
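For the input direction the same pattern works in reverse; a minimal sketch (the names io_soa, copy_kernel, and N are hypothetical), assuming the input data is filled on the host and uploaded into the device array the struct points at before the kernel runs:

```cuda
#include <cuda_runtime.h>

// Hypothetical SOA with one input and one output member array.
struct io_soa
{
    int *in;
    int *out;
    int size;
};

// Each thread copies one input element to the output.
__global__ void copy_kernel(io_soa s)
{
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < s.size)
        s.out[i] = s.in[i];
}

int main(void)
{
    const int N = 1024;
    int h_in[N];
    for (int i = 0; i < N; ++i) h_in[i] = i;   // host-side initialization

    io_soa s;
    s.size = N;
    cudaMalloc((void**)&s.in,  N * sizeof(int));
    cudaMalloc((void**)&s.out, N * sizeof(int));
    // upload the host data into the device array the struct points at
    cudaMemcpy(s.in, h_in, N * sizeof(int), cudaMemcpyHostToDevice);

    // struct is passed by value; it only carries device pointers
    copy_kernel<<<(N + 255) / 256, 256>>>(s);

    int h_out[N];
    cudaMemcpy(h_out, s.out, N * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(s.in);
    cudaFree(s.out);
    return 0;
}
```

The key point is that the struct itself never needs to live in device memory for this direction either: it travels by value in the kernel argument, and only the arrays it points to are device allocations.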