Problem with copying char array to device

Hi, I have a problem copying a char array to the device and back. Even though I do not modify the array, the copy operations alone seem to leave it in a weird state: after copying the host char array to the device and then back, the original char values are set to \u0001…

Device code (it does not really do anything):

__global__ void cuda_move(char *our_states_dev, int *our_x_locations_dev, int *our_y_locations_dev, char DEAD, 
	int environment_width, int environment_height, curandState *cuda_states)
{
	int id = threadIdx.x + blockIdx.x * blockDim.x;

	our_x_locations_dev[id] = 1;
	our_y_locations_dev[id] = 1;
}
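
One possibility that may be worth ruling out: the kernel above writes to index `id` for every launched thread, and the host launches `numBlock * numThread` threads (at least 128), while the printed sizes (16 and 18) suggest far fewer elements. If that is the case, the extra threads would write past the end of the arrays. The following is a minimal sketch, not the original kernel: `cuda_move_guarded` and its element-count parameter `n` are hypothetical names, and the grid-stride loop is a technique added here so that the capped block count still covers every element.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical guarded variant of cuda_move: `n` (the element count) is an
// added parameter that the kernel in the post does not take.
__global__ void cuda_move_guarded(int *x, int *y, int n)
{
    int stride = blockDim.x * gridDim.x;
    // Grid-stride loop: each thread handles id, id + stride, ... and never
    // touches an index >= n, even if the grid has more threads than elements.
    for (int id = threadIdx.x + blockIdx.x * blockDim.x; id < n; id += stride) {
        x[id] = 1;
        y[id] = 1;
    }
}

int main()
{
    const int n = 18;          // logical number of elements
    const int capacity = 128;  // padded allocation, so untouched slots are visible
    int *x_dev = nullptr, *y_dev = nullptr;
    if (cudaMalloc(&x_dev, capacity * sizeof(int)) != cudaSuccess ||
        cudaMalloc(&y_dev, capacity * sizeof(int)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }
    cudaMemset(x_dev, 0, capacity * sizeof(int));
    cudaMemset(y_dev, 0, capacity * sizeof(int));

    // 128 threads for 18 elements: the guard keeps threads 18..127 idle.
    cuda_move_guarded<<<1, 128>>>(x_dev, y_dev, n);

    int x_host[capacity];
    cudaError_t err = cudaMemcpy(x_host, x_dev, sizeof(x_host),
                                 cudaMemcpyDeviceToHost);
    if (err != cudaSuccess) {
        fprintf(stderr, "copy back failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    // Indices >= n should still hold the 0 written by cudaMemset.
    printf("x[%d] = %d, x[%d] = %d\n", n - 1, x_host[n - 1], n, x_host[n]);

    cudaFree(x_dev);
    cudaFree(y_dev);
    return 0;
}
```

Without the `id < n` guard, the same launch would write 128 ints into whatever memory follows an 18-int allocation, which could include a neighbouring buffer such as the char array.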

Host code, as an extern function:

extern "C" void kernel_functions(struct global_t *global, struct our_t *our, 
	struct const_t *constant, struct stats_t *stats, struct cuda_t *cuda)
{

	printf("rank %d our x position %d and our y position %d and state %c \n", 
		our->our_rank, our->our_x_locations[0], our->our_y_locations[0], our->our_states[0]);

	int environment_width = constant->environment_width;
	int environment_height = constant->environment_height;

	// set up cuda Random Number Generator
	curandState *cuda_states;	
	cudaMalloc(&cuda_states, cuda->numThread * cuda->numBlock);
	time_t current_time;
	time(&current_time);
	rand_kernel<<<cuda->numBlock, cuda->numThread>>>(cuda_states, (unsigned long)current_time);

	//cuda_init(global, our, stats, cuda, cuda_states);
	cuda->our_size = sizeof(int) * our->our_number_of_people;
	cuda->our_states_size = sizeof(char) * our->our_number_of_people;

	// cuda memory allocation
	HANDLE_ERROR(cudaMalloc((void**)&cuda->our_x_locations_dev, cuda->our_size));
	HANDLE_ERROR(cudaMalloc((void**)&cuda->our_y_locations_dev, cuda->our_size));
	HANDLE_ERROR(cudaMalloc((void**)&cuda->our_states_dev, cuda->our_states_size));
	
	// copy host memory to device
	cudaMemcpy(cuda->our_x_locations_dev, our->our_x_locations, cuda->our_size, cudaMemcpyHostToDevice);
	cudaMemcpy(cuda->our_y_locations_dev, our->our_y_locations, cuda->our_size, cudaMemcpyHostToDevice);
	cudaMemcpy(cuda->our_states_dev, our->our_states, cuda->our_states_size, cudaMemcpyHostToDevice);

	// set up 1D array for cuda
	cuda->numThread = 128;
	int tempBlock = (our->our_number_of_people + cuda->numThread - 1)/cuda->numThread;
	cuda->numBlock = (32 < tempBlock ? 32 : tempBlock);

	// execute device code on updating people's movement
	cuda_move<<<cuda->numBlock, cuda->numThread>>>(cuda->our_states_dev, 
		cuda->our_x_locations_dev, cuda->our_y_locations_dev, DEAD, 
		environment_width, environment_height, cuda_states);
	// Sync
	cudaThreadSynchronize();

	//cuda_finish(global, our, stats, cuda);
	cudaMemcpy(our->our_x_locations, cuda->our_x_locations_dev, cuda->our_size, cudaMemcpyDeviceToHost);
	cudaMemcpy(our->our_y_locations, cuda->our_y_locations_dev, cuda->our_size, cudaMemcpyDeviceToHost);
	
	cudaMemcpy(our->our_states, cuda->our_states_dev, cuda->our_states_size, cudaMemcpyDeviceToHost);

	cudaFree(cuda->our_x_locations_dev);
	cudaFree(cuda->our_y_locations_dev);
	cudaFree(cuda->our_states_dev);
	cudaFree(cuda_states);
}
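
Two details in the host code above may be worth a second look. The size passed to `cudaMalloc` for `cuda_states` is a thread count rather than a byte count, and `cuda->numThread` and `cuda->numBlock` are used for that allocation and for the `rand_kernel` launch before the "set up 1D array" step assigns them (unless they are initialized somewhere not shown). The sketch below shows one way the ordering and sizing could look. The `init_rng` kernel is a hypothetical stand-in for `rand_kernel`, whose body is not in the post, and `block_count` is a hypothetical helper that repeats the post's block-count rule.

```cuda
#include <cstdio>
#include <ctime>
#include <cuda_runtime.h>
#include <curand_kernel.h>

// Hypothetical stand-in for rand_kernel (not shown in the post).
__global__ void init_rng(curandState *states, unsigned long long seed, int total)
{
    int id = threadIdx.x + blockIdx.x * blockDim.x;
    if (id < total)
        curand_init(seed, id, 0, &states[id]);  // one subsequence per thread
}

// Same rule as the host code: ceil(n / threads), capped at 32 blocks.
int block_count(int n, int threads)
{
    int blocks = (n + threads - 1) / threads;
    return blocks > 32 ? 32 : blocks;
}

int main()
{
    const int n = 50;
    const int threads = 128;
    const int blocks = block_count(n, threads);  // decided before first use
    const int total = threads * blocks;

    curandState *states = nullptr;
    // cudaMalloc takes a size in bytes: element count * sizeof(curandState).
    size_t bytes = (size_t)total * sizeof(curandState);
    cudaError_t err = cudaMalloc(&states, bytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    init_rng<<<blocks, threads>>>(states, (unsigned long long)time(nullptr), total);
    err = cudaDeviceSynchronize();
    printf("%d block(s) x %d threads, %zu bytes of RNG state: %s\n",
           blocks, threads, bytes, cudaGetErrorString(err));

    cudaFree(states);
    return err == cudaSuccess ? 0 : 1;
}
```

If the RNG buffer is undersized, `curand_init` would write past its end, which is another way unrelated device allocations could be overwritten.
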
Iteration 0 ! 
Iteration 0 ! 
Iteration 0 ! 
rank 0 our x position 6 and our y position 20 and state o 
rank 2 our x position 16 and our y position 8 and state X 
rank 1 our x position 24 and our y position 1 and state o 
The size is 16 
Iteration 1 ! 
The size is 18 
Iteration 1 ! 
The size is 16 
Iteration 1 ! 
ERROR: person 0 has state '\u0001'
rank 1 our x position 1 and our y position 1 and state \u0001
rank 2 our x position 1 and our y position 1 and state \u0001
The size is 18 
The size is 16 
Iteration 2 ! 
Iteration 2 !

I need to add some more information I just found out…

If I comment out the second line of the device code, the program reports no error, but “our_x_locations_dev” and “our_y_locations_dev” are all set to 2. I think this means that the device code is somehow overstepping the bounds of the “our_x_locations_dev” array.

Still, I cannot find where the bug is…

I am running an MPI and CUDA hybrid; is that the problem?

__global__ void cuda_move(int *our_x_locations_dev, int *our_y_locations_dev, char *our_states_dev, 
	char DEAD, int environment_width, int environment_height, curandState *cuda_states)
{
	int id = threadIdx.x + blockIdx.x * blockDim.x;
	our_x_locations_dev[id] = 2;
	//our_y_locations_dev[id] = 1;
}

I did a little more investigation, and it turns out that the 48th line of the host code is doing something really weird…

I printed out the “our_states” array in host memory before and after line 48.

Before is

rank 0 and person 0 and x position 5 and y position 22 and state X and day 0 
rank 0 and person 1 and x position 12 and y position 1 and state o and day 0 
rank 0 and person 2 and x position 8 and y position 13 and state o and day 0 
rank 0 and person 3 and x position 20 and y position 3 and state o and day 0 
rank 0 and person 4 and x position 25 and y position 19 and state o and day 0 
rank 0 and person 5 and x position 7 and y position 28 and state o and day 0 
rank 0 and person 6 and x position 20 and y position 6 and state o and day 0 
rank 0 and person 7 and x position 10 and y position 14 and state o and day 0 
rank 0 and person 8 and x position 1 and y position 17 and state o and day 0 
rank 0 and person 9 and x position 19 and y position 10 and state o and day 0 
rank 0 and person 10 and x position 7 and y position 20 and state o and day 0 
rank 0 and person 11 and x position 17 and y position 0 and state o and day 0 
rank 0 and person 12 and x position 16 and y position 6 and state o and day 0 
rank 0 and person 13 and x position 11 and y position 22 and state o and day 0 
rank 0 and person 14 and x position 17 and y position 29 and state o and day 0 
rank 0 and person 15 and x position 8 and y position 14 and state o and day 0 
rank 0 and person 16 and x position 14 and y position 12 and state o and day 0 
rank 0 and person 17 and x position 16 and y position 22 and state o and day 0 
rank 0 and person 18 and x position 25 and y position 6 and state o and day 0 
rank 0 and person 19 and x position 17 and y position 12 and state o and day 0 
rank 0 and person 20 and x position 17 and y position 16 and state o and day 0 
rank 0 and person 21 and x position 10 and y position 7 and state o and day 0 
rank 0 and person 22 and x position 23 and y position 20 and state o and day 0 
rank 0 and person 23 and x position 21 and y position 24 and state o and day 0 
rank 0 and person 24 and x position 29 and y position 3 and state o and day 0 
rank 0 and person 25 and x position 26 and y position 7 and state o and day 0 
rank 0 and person 26 and x position 15 and y position 13 and state o and day 0 
rank 0 and person 27 and x position 7 and y position 1 and state o and day 0 
rank 0 and person 28 and x position 12 and y position 18 and state o and day 0 
rank 0 and person 29 and x position 23 and y position 21 and state o and day 0 
rank 0 and person 30 and x position 9 and y position 1 and state o and day 0 
rank 0 and person 31 and x position 5 and y position 23 and state o and day 0 
rank 0 and person 32 and x position 13 and y position 13 and state o and day 0 
rank 0 and person 33 and x position 7 and y position 0 and state o and day 0 
rank 0 and person 34 and x position 12 and y position 17 and state o and day 0 
rank 0 and person 35 and x position 12 and y position 29 and state o and day 0 
rank 0 and person 36 and x position 3 and y position 15 and state o and day 0 
rank 0 and person 37 and x position 7 and y position 26 and state o and day 0 
rank 0 and person 38 and x position 27 and y position 20 and state o and day 0 
rank 0 and person 39 and x position 13 and y position 27 and state o and day 0 
rank 0 and person 40 and x position 15 and y position 9 and state o and day 0 
rank 0 and person 41 and x position 4 and y position 0 and state o and day 0 
rank 0 and person 42 and x position 23 and y position 11 and state o and day 0 
rank 0 and person 43 and x position 24 and y position 27 and state o and day 0 
rank 0 and person 44 and x position 21 and y position 17 and state o and day 0 
rank 0 and person 45 and x position 18 and y position 22 and state o and day 0 
rank 0 and person 46 and x position 11 and y position 23 and state o and day 0 
rank 0 and person 47 and x position 16 and y position 16 and state o and day 0 
rank 0 and person 48 and x position 29 and y position 15 and state o and day 0 
rank 0 and person 49 and x position 17 and y position 11 and state o and day 0

and after is

rank 0 and person 0 and x position 0 and y position 1 and state A and day 0 
rank 0 and person 1 and x position 1 and y position 2 and state  and day 0 
rank 0 and person 2 and x position 2 and y position 3 and state  and day 0 
rank 0 and person 3 and x position 3 and y position 4 and state  and day 0 
rank 0 and person 4 and x position 4 and y position 5 and state B and day 0 
rank 0 and person 5 and x position 5 and y position 6 and state  and day 0 
rank 0 and person 6 and x position 6 and y position 7 and state  and day 0 
rank 0 and person 7 and x position 7 and y position 8 and state  and day 0 
rank 0 and person 8 and x position 8 and y position 9 and state C and day 0 
rank 0 and person 9 and x position 9 and y position 10 and state  and day 0 
rank 0 and person 10 and x position 10 and y position 11 and state  and day 0 
rank 0 and person 11 and x position 11 and y position 12 and state  and day 0 
rank 0 and person 12 and x position 12 and y position 13 and state D and day 0 
rank 0 and person 13 and x position 13 and y position 14 and state  and day 0 
rank 0 and person 14 and x position 14 and y position 15 and state  and day 0 
rank 0 and person 15 and x position 15 and y position 16 and state  and day 0 
rank 0 and person 16 and x position 16 and y position 17 and state E and day 0 
rank 0 and person 17 and x position 17 and y position 18 and state  and day 0 
rank 0 and person 18 and x position 18 and y position 19 and state  and day 0 
rank 0 and person 19 and x position 19 and y position 20 and state  and day 0 
rank 0 and person 20 and x position 20 and y position 21 and state F and day 0 
rank 0 and person 21 and x position 21 and y position 22 and state  and day 0 
rank 0 and person 22 and x position 22 and y position 23 and state  and day 0 
rank 0 and person 23 and x position 23 and y position 24 and state  and day 0 
rank 0 and person 24 and x position 24 and y position 25 and state G and day 0 
rank 0 and person 25 and x position 25 and y position 26 and state  and day 0 
rank 0 and person 26 and x position 26 and y position 27 and state  and day 0 
rank 0 and person 27 and x position 27 and y position 28 and state  and day 0 
rank 0 and person 28 and x position 28 and y position 29 and state H and day 0 
rank 0 and person 29 and x position 29 and y position 30 and state  and day 0 
rank 0 and person 30 and x position 30 and y position 31 and state  and day 0 
rank 0 and person 31 and x position 31 and y position 32 and state  and day 0 
rank 0 and person 32 and x position 32 and y position 33 and state I and day 0 
rank 0 and person 33 and x position 33 and y position 34 and state  and day 0 
rank 0 and person 34 and x position 34 and y position 35 and state  and day 0 
rank 0 and person 35 and x position 35 and y position 36 and state  and day 0 
rank 0 and person 36 and x position 36 and y position 37 and state J and day 0 
rank 0 and person 37 and x position 37 and y position 38 and state  and day 0 
rank 0 and person 38 and x position 38 and y position 39 and state  and day 0 
rank 0 and person 39 and x position 39 and y position 40 and state  and day 0 
rank 0 and person 40 and x position 40 and y position 41 and state K and day 0 
rank 0 and person 41 and x position 41 and y position 42 and state  and day 0 
rank 0 and person 42 and x position 42 and y position 43 and state  and day 0 
rank 0 and person 43 and x position 43 and y position 44 and state  and day 0 
rank 0 and person 44 and x position 44 and y position 45 and state L and day 0 
rank 0 and person 45 and x position 45 and y position 46 and state  and day 0 
rank 0 and person 46 and x position 46 and y position 47 and state  and day 0 
rank 0 and person 47 and x position 47 and y position 48 and state  and day 0 
rank 0 and person 48 and x position 48 and y position 49 and state M and day 0 
rank 0 and person 49 and x position 49 and y position 50 and state  and day 0

I also printed out the pointer “our_states”, and it remains the same before and after line 48…

What could have caused cudaMemcpy to copy the wrong part of memory back to the host?
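
To separate the copy itself from anything a kernel might do, one option is a round trip with no kernel launch at all, checking every return code along the way. This is only a sketch under assumptions: `CHECK_CUDA` and `round_trip_ok` are hypothetical names, not part of the code above or of the CUDA API.

```cuda
#include <cstdio>
#include <cstring>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper macro: print the failing call and make the
// enclosing function return false.
#define CHECK_CUDA(call)                                              \
    do {                                                              \
        cudaError_t e_ = (call);                                      \
        if (e_ != cudaSuccess) {                                      \
            fprintf(stderr, "%s failed: %s\n", #call,                 \
                    cudaGetErrorString(e_));                          \
            return false;                                             \
        }                                                             \
    } while (0)

// Copies n chars to the device and straight back, with no kernel in
// between, and reports whether the bytes came back unchanged.
bool round_trip_ok(const char *src, size_t n)
{
    std::vector<char> back(n, 0);
    char *dev = nullptr;
    CHECK_CUDA(cudaMalloc(&dev, n));
    CHECK_CUDA(cudaMemcpy(dev, src, n, cudaMemcpyHostToDevice));
    CHECK_CUDA(cudaMemcpy(back.data(), dev, n, cudaMemcpyDeviceToHost));
    CHECK_CUDA(cudaFree(dev));
    return std::memcmp(src, back.data(), n) == 0;
}

int main()
{
    // Same shape as the states in the post: one 'X' followed by 'o's.
    std::vector<char> states(50, 'o');
    states[0] = 'X';
    bool ok = round_trip_ok(states.data(), states.size());
    printf("round trip %s\n", ok ? "preserved the data" : "changed the data");
    return ok ? 0 : 1;
}
```

If this round trip preserves the data while the full program does not, the copies are probably fine and something between them (a kernel writing out of bounds, for example) is the more likely suspect.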