I am working on an NVIDIA Tesla card. I have a 1D array and I would like to assign every element to its own thread, so the number of threads equals the array size (the exact mapping I have in mind is sketched at the end of this post). Whatever thread/block/grid structure I use, I can only ever access the first 8 elements, never the rest.
I have written several CUDA programs with similar and different data structures on other platforms and never ran into anything like this. What am I missing?
This is really basic: I am just trying to read the data back. I have even tried some deliberately silly thread structures, and the result is always the same.
main:
int *hostX = (int*)malloc(sizeof(int) * N);
for (int i = 0; i < N; i++){
    hostX[i] = i;   // fill the host array with its own indices
}
int *deviceX = NULL;
CUDA_SAFE_CALL(cudaMalloc((void**)&deviceX, N));
CUDA_SAFE_CALL(cudaMemcpy(deviceX, hostX, N, cudaMemcpyHostToDevice));
//dim3 block(N/8, 1, 1); // whatever I put in
//dim3 threads(N, 1, 1); // whatever I put in
dim3 threads(N, 1, 1); // whatever I put in, let's leave it this time
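The launch itself is just one block of N threads (I'm writing it out from memory here, so treat the exact call as approximate):

access<<<1, threads>>>(deviceX);
cudaDeviceSynchronize(); // make sure the kernel's printf output is flushed before main exits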
kernel:
__global__ void access(int *x){
printf("I am reading x[%d] = %d\n", threadIdx.x, x[threadIdx.x]); // I change the array/thread index according to the block structure
}
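For reference, this is the one-thread-per-element mapping I expect to work, including the multi-block case. It is only a sketch of the pattern, not my exact code (BLOCK_SIZE and the extra length parameter are names I'm introducing here for illustration):

#define BLOCK_SIZE 256

__global__ void access_all(int *x, int n){
    int i = blockIdx.x * blockDim.x + threadIdx.x; // global element index
    if (i < n){                                    // guard: the last block may have extra threads
        printf("I am reading x[%d] = %d\n", i, x[i]);
    }
}

// host side: launch enough blocks of BLOCK_SIZE threads to cover all N elements
dim3 threads(BLOCK_SIZE, 1, 1);
dim3 blocks((N + BLOCK_SIZE - 1) / BLOCK_SIZE, 1, 1);
access_all<<<blocks, threads>>>(deviceX, N);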