Using structs in kernel calls

I’m trying to use a struct I’ve written in one of my kernel calls, but for some reason the data doesn’t seem to be getting into the device’s memory. The struct is very simple:

typedef struct {

	float A,B,C,D;

} Epipolar;

I’m declaring a variable in device memory to hold the data, allocating it, then memcopying the data into the device variable, but it seems that the data isn’t actually there:

Epipolar * pLine_d;

cudaMalloc( ( void**) &pLine_d, sizeof(Epipolar));

cudaMemcpy(pLine_d,image->pLine, sizeof(image->pLine), cudaMemcpyHostToDevice);

//kernel call with pLine_d as argument

image->pLine is a pointer to an Epipolar that serves as my source data. However, it seems it’s getting copied incorrectly (or not at all), as my output comes out as gibberish. I’m guessing I’m simply doing something wrong. Does anyone have any idea what? Thanks very much for your time.

cudaMemcpy(pLine_d,image->pLine, sizeof(image->pLine), cudaMemcpyHostToDevice);

should read

cudaMemcpy(pLine_d,image->pLine, sizeof(Epipolar), cudaMemcpyHostToDevice);

since the size of the copy is the size of a single Epipolar, not the size of the pointer image->pLine.
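
To see why, it may help to compare the two sizes on the host. The sketch below is plain C with no CUDA calls; the Image struct and the two helper functions are made-up names for illustration and only mirror the names used in the question.

```c
#include <stddef.h>

typedef struct {
    float A, B, C, D;
} Epipolar;

/* Hypothetical stand-in for the poster's image type. */
typedef struct {
    Epipolar *pLine;
} Image;

/* What the original call passed: the size of the pointer itself
   (typically 4 or 8 bytes, depending on the platform). */
size_t bytes_original(const Image *image) {
    return sizeof(image->pLine);
}

/* What the copy needs: the size of the n structs being pointed at. */
size_t bytes_needed(size_t n) {
    return n * sizeof(Epipolar);
}
```

On a typical 64-bit host, bytes_original returns 8 while bytes_needed(1) returns 16, so the original call would have copied only part of the struct and left the rest of the device allocation uninitialized.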

Thanks for the help. Just to make sure I understand, image->pLine is a pointer acting as an array of Epipolars. Should the memcpy size be number of elements in image->pLine * sizeof(Epipolar) in this case?

Ah, I misunderstood you before. Correct, the memcpy size should be the size of Epipolar times the number of Epipolar objects in the array. i.e.,

cudaMemcpy(pLine_d,image->pLine, N*sizeof(Epipolar), cudaMemcpyHostToDevice);

My situation is actually a little different from what I thought: image->pLine is actually a pointer to a pointer:

Epipolar ** pLine;

How exactly would I go about allocating and copying that data into a variable on the device?

This is how I have used structs in my kernel:

typedef struct __align__(16)
{
float a, b, c, f;
} Matrix;

and then in main

Matrix *host, *device;

host=(Matrix*)malloc(SIZE * sizeof (Matrix));
CUDA_SAFE_CALL(cudaMalloc((void **) &device, SIZE * sizeof (Matrix)));

CUDA_SAFE_CALL(cudaMemcpy(device, host, SIZE * sizeof (Matrix), cudaMemcpyHostToDevice));

First, I think your cudaMalloc is allocating space for only a single structure:

cudaMalloc( ( void**) &pLine_d, sizeof(Epipolar));

You should multiply the number of structs you need by sizeof(Epipolar).

I don’t know if that will solve your problem :)

It’s hard to say without all the details. Can you post the relevant code?
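
In the meantime, one common way to handle an Epipolar ** is to flatten it into a single contiguous host buffer and copy that with one cudaMemcpy, since the device cannot follow host row pointers. Below is a minimal host-side sketch in plain C. It assumes pLine points to rows row pointers, each pointing to cols Epipolars; rows, cols, and flatten_lines are hypothetical names, not taken from your code, and the CUDA calls appear only in the trailing comment.

```c
#include <stdlib.h>
#include <string.h>

typedef struct {
    float A, B, C, D;
} Epipolar;

/* Copy rows x cols Epipolars from a pointer-to-pointer layout into one
   contiguous, row-major buffer. Returns NULL if the table is empty or
   the allocation fails. The caller frees the result. */
Epipolar *flatten_lines(Epipolar **pLine, size_t rows, size_t cols) {
    if (rows == 0 || cols == 0) return NULL;
    Epipolar *flat = (Epipolar *)malloc(rows * cols * sizeof(Epipolar));
    if (flat == NULL) return NULL;
    for (size_t r = 0; r < rows; ++r) {
        /* Each row is contiguous on its own, so it can be copied in one go. */
        memcpy(flat + r * cols, pLine[r], cols * sizeof(Epipolar));
    }
    return flat;
}

/* The flat buffer can then be sent to the device in one transfer:
 *
 *   size_t bytes = rows * cols * sizeof(Epipolar);
 *   cudaMalloc((void **)&pLine_d, bytes);
 *   cudaMemcpy(pLine_d, flat, bytes, cudaMemcpyHostToDevice);
 *
 * and the kernel indexes element (r, c) as pLine_d[r * cols + c]. */
```

If your rows have different lengths, or pLine is really a pointer to a single array, the layout assumption above does not hold, which is why seeing the relevant code would help.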