Cudamemcpy doesn't seem to work

PHamnett · July 19, 2010, 2:05pm

Hey everyone, thanks for reading.

I’m still fairly new to CUDA. I have written a program in which cudaMemcpy isn’t working; here is an example of the code:

#include <iostream>
#include <cutil.h>

struct Particle_drift{ //This structure is for the particle drift
	Particle_drift() : x(0),p_x(0),y(0),p_y(0),t(0),p_t(0) {}
	double* x;
	double* p_x;
	double* y;
	double* p_y;
	double* t;
	double* p_t;
};

struct Particle_quadrupole{ //This is a structure to record particle data as it passes through a quadrupole
	double* x;
	double* p_x;
	double* y;
	double* p_y;
	double* t;
	double* p_t;
};

struct transport_map1{ //This is a structure containing the 1st order transport maps for all the various apparatus
	double drift[6][6];
	struct bend{ //Nested struct for the s and r bends with appropriate matrices
		double edge_focus[6][6];
		double dipole[6][6];
	} sbend, rbend;
	double quadrupole[6][6];
	double sextupole[6][6];
};

__global__ void kernel(double* x, double* p_x, double* y, double* p_y, double* t, double* p_t, Particle_quadrupole* p_device_particle_quadrupole, transport_map1* p_R){
	const unsigned int tid = blockIdx.x * blockDim.x + threadIdx.x;
	double dummy;
	dummy = x[tid] * p_R[0][0] + p_x[tid] * p_R[0][1]
	...
	...Insert more calculations here
	...
	x[tid] = dummy;
}

int main(){
	Particle_drift device_particle_drift, host_particle_drift;
	Particle_quadrupole device_particle_quadrupole, host_particle_quadrupole;
	transport_map1 R;
	...
	...Stuff gets initialised and stuff here
	...
	Particle_drift* p_device_particle_drift = &device_particle_drift;
	Particle_drift* p_device_particle_drift = &host_particle_drift;
	Particle_quadrupole* p_device_particle_quadrupole = &device_particle_quadrupole;
	Particle_quadrupole* p_device_particle_quadrupole = &host_particle_quadrupole;
	transport_map1* p_R = &R;

	size_t size = no_particles*sizeof(double);
	size_t Rsize = 6*6*7*sizeof(double);

	cutilSafeCall(cudaMalloc((void**)&p_device_particle_drift->x, size));
	cutilSafeCall(cudaMalloc((void**)&p_device_particle_drift->p_x, size));
	cutilSafeCall(cudaMalloc((void**)&p_device_particle_drift->y, size));
	cutilSafeCall(cudaMalloc((void**)&p_device_particle_drift->p_y, size));
	cutilSafeCall(cudaMalloc((void**)&p_device_particle_drift->t, size));
	cutilSafeCall(cudaMalloc((void**)&p_device_particle_drift->p_t, size));
	cutilSafeCall(cudaMalloc((void**)&p_device_particle_quadrupole, size*6));
	cutilSafeCall(cudaMalloc((void**)&p_R, Rsize));

	cudaError_t retval;
	retval = cudaMemcpy(p_device_particle_drift->x, p_particle->x, size/6, cudaMemcpyHostToDevice);
	if (retval != cudaSuccess)
	{
		cout << "cudaMemcpy error at p_device_particle_drift: " << cudaGetErrorString(retval) << "value " << retval << endl;
	}

	cutilSafeCall(cudaMemcpy(p_device_particle_drift->p_x, p_particle->p_x, size, cudaMemcpyHostToDevice));
	cutilSafeCall(cudaMemcpy(p_device_particle_drift->y, p_particle->y, size, cudaMemcpyHostToDevice));
	cutilSafeCall(cudaMemcpy(p_device_particle_drift->p_y, p_particle->p_y, size, cudaMemcpyHostToDevice));
	cutilSafeCall(cudaMemcpy(p_device_particle_drift->t, p_particle->t, size, cudaMemcpyHostToDevice));
	cutilSafeCall(cudaMemcpy(p_device_particle_drift->p_t, p_particle->p_t, size, cudaMemcpyHostToDevice));
	cutilSafeCall(cudaMemcpy(p_R, p_Rhost, matrixRsize, cudaMemcpyHostToDevice));
	cutilSafeCall(cudaMemcpy(p_T, p_Thost, matrixTsize, cudaMemcpyHostToDevice));

	...kernel is executed...
	...Program continues...

When I run the code, my own error handling reports a failure: cudaGetErrorString(retval) returns "unknown error" and retval is 30.

If anyone has any idea why it’s not copying from the host to the device, please let me know.

Many Thanks,

Phill

o.stava · July 20, 2010, 8:55pm

Is this just a typo in your post? I don’t think you would be able to compile it but who knows…

Particle_drift* p_device_particle_drift = &device_particle_drift;

Particle_drift* p_device_particle_drift = &host_particle_drift;

//the same on the next lines

Ken_Domino · July 21, 2010, 10:13am

Hi PHamnett,

You don’t say where you get the error return, so it’s hard to tell which of the many CUDA calls I see there is reporting rv=30. In general, I get rv=30 if I call a kernel with a bad pointer. After that, CUDA calls like cudaMalloc go sour and keep reporting that kernel error, probably because the threads are unresponsive. After kernel calls, check the error returned by cudaGetLastError.

What I can tell from your code is that you have no guard against out-of-bounds array references in the kernel, so tid may go way past the bounds of one of the arrays (x or p_x) and cause a segv. This can happen if you don’t set up the grid to cover exactly the size of the arrays you allocated. Also, if there is an error after the kernel, I always seem to have to call cudaThreadExit to set things straight for subsequent CUDA calls.

You have a bunch of cudaMalloc and cudaMemcpy calls. It’s hard to say whether anything is wrong there. Try adding my memory debugging package (http://code.google.com/p/cuda-memory-debug/). If it helps, great. If not, oh well.
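To make the out-of-bounds point concrete, here is a minimal sketch of the grid sizing and the tid guard. It is written as plain host-side C++ so it compiles without the CUDA toolkit: the two loops stand in for the grid and the block, and tid is computed the same way as blockIdx.x * blockDim.x + threadIdx.x in the posted kernel. The names blocks_for and emulate_guarded_launch are hypothetical helpers for illustration only, and the doubling of x[tid] is a placeholder for the real per-particle calculation.

```cpp
#include <cstddef>
#include <vector>

// Number of blocks needed to cover n elements with block_dim threads per
// block (ceiling division). Hypothetical helper, not from the original post.
std::size_t blocks_for(std::size_t n, std::size_t block_dim) {
    return (n + block_dim - 1) / block_dim;
}

// Host-side stand-in for a guarded kernel launch. Each (block, thread) pair
// computes the same tid a CUDA thread would, and skips the work when tid
// falls past the end of the array. Returns how many threads the guard
// turned away.
std::size_t emulate_guarded_launch(std::vector<double>& x, std::size_t block_dim) {
    const std::size_t n = x.size();
    const std::size_t grid_dim = blocks_for(n, block_dim);
    std::size_t skipped = 0;
    for (std::size_t block = 0; block < grid_dim; ++block) {
        for (std::size_t thread = 0; thread < block_dim; ++thread) {
            const std::size_t tid = block * block_dim + thread;
            if (tid >= n) {   // the guard; inside a real kernel this would be `return;`
                ++skipped;
                continue;
            }
            x[tid] *= 2.0;    // placeholder for the real per-particle update
        }
    }
    return skipped;
}
```

With 1000 particles and 256 threads per block, blocks_for gives 4 blocks, so 1024 threads are launched and the guard turns away the last 24. Without the guard, those 24 threads would read and write past the end of x. In an actual kernel the same idea amounts to passing the element count as an extra argument and returning early when tid is not less than it.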
