Simple Iterating Code -- HELP!

Hey everyone,
I have code for a simple diffusion problem (with Q=5 iterations) that fails to compile. The compiler reports “error: expression must have pointer-to-object type” twice at Line 15 and three times at Line 18. Does anyone know how to fix this? (Again, I just want to run this diffusion equation for a given number of steps.)
Thanks for looking!

    #include <stdio.h>
    #include <stdlib.h>
    #include <cuda.h>

    __global__ void incrementArrayOnDevice(float *c, float *q, int N)
    {
        int idx = blockIdx.x * blockDim.x + threadIdx.x;
        int x;
        int Q = 5;
        for(x=0; x<Q; x++)
            q[idx][x] = c[idx][x];
        for(idx=1; idx<N-1; idx++)
            c[idx][x] = 0.25 * q[idx-1] + 0.5*q[idx][x] + 0.25*q[idx+1];
    }

    int main(void)
    {
        float *a_h, *b_h;
        float *a_d, *q;
        int i;
        int N = 10;
        size_t size = N * sizeof(float);
        a_h = (float *)malloc(size);
        b_h = (float *)malloc(size);
        cudaMalloc((void **) &a_d, size);
        cudaMalloc((void **) &q, size);
        for (i=0; i<1; i++)
            a_h[i] = 5;
        for (i=1; i<N; i++)
            a_h[i] = 0;
        cudaMemcpy(a_d, a_h, sizeof(float)*N, cudaMemcpyHostToDevice);
        int blockSize = 4;
        int nBlocks = N/blockSize + (N%blockSize == 0?0:1);
        incrementArrayOnDevice <<< nBlocks, blockSize >>> (a_d, q, N);
        cudaMemcpy(b_h, a_d, sizeof(float)*N, cudaMemcpyDeviceToHost);

        for(i=0; i<N; i++)
        free(a_h); free(b_h); cudaFree(a_d);
    }

Well, I don’t see an issue in the first function. Which compiler/system are you building with/on?

    c[idx] = 0.25*q[idx-1][x] + 0.5*q[idx] + 0.25*q[idx+1];

The declaration of q is float *, but you're trying to reference it as a two-dimensional array. The compiler doesn't know how big each row is, so it can't calculate the offset.
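To make the row-size point concrete, here is a small host-side C sketch (not the poster's code). It assumes a compile-time row width W, which is a made-up name and value, and read_cell is a hypothetical helper. Once the flat pointer is viewed as a pointer to rows of W floats, the compiler has what it needs to resolve two subscripts.

```c
#include <assert.h>
#include <stddef.h>

#define W 5  /* hypothetical row width, not taken from the original post */

/* q is a flat float*: q[row] is a single float, so q[row][col] cannot compile.
   Viewing the same memory as "pointer to rows of W floats" gives the compiler
   the row size, and grid[row][col] then resolves to element row*W + col. */
float read_cell(const float *q, size_t rows, size_t row, size_t col)
{
    assert(row < rows && col < W);               /* stay inside the buffer */
    const float (*grid)[W] = (const float (*)[W])q;
    return grid[row][col];
}
```

With a buffer holding 0, 1, 2, ... in order, read_cell(buf, 2, 1, 2) picks out element 1*5 + 2 = 7. This only works when W is known at compile time; otherwise you have to do the index arithmetic yourself.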


Thanks, that puts me on the right track. Do you happen to know how I can specify the size of each row?

Compiler is “nvcc,” system is Mac OS X 10.5.

Simplest thing is to do your own index calculation:

So, assuming a variable or #define named W that holds the number of elements per row, this:

c[idx][x] = 0.25*q[idx-1][x] + 0.5*q[idx][x] + 0.25*q[idx+1][x];

would become:

c[idx*W+x] = 0.25*q[(idx-1)*W+x] + 0.5*q[idx*W+x] + 0.25*q[(idx+1)*W+x];
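For reference, the rewritten line can be wrapped up roughly like this. It is a plain C sketch that runs on the host, not a drop-in kernel; flat_index and smooth_column are hypothetical names, and W is passed as a parameter here rather than being a #define. The same arithmetic should carry over unchanged inside a __global__ function.

```c
#include <assert.h>
#include <stddef.h>

/* Flat offset of element (row, col) in a row-major array with W columns. */
static size_t flat_index(size_t row, size_t col, size_t W)
{
    return row * W + col;
}

/* One smoothing pass over column x: reads q and writes the interior rows
   (1 .. N-2) of c. Rows 0 and N-1 of c are left untouched. */
void smooth_column(float *c, const float *q, size_t N, size_t W, size_t x)
{
    assert(x < W);
    for (size_t idx = 1; idx + 1 < N; idx++) {
        c[flat_index(idx, x, W)] = 0.25f * q[flat_index(idx - 1, x, W)]
                                 + 0.5f  * q[flat_index(idx,     x, W)]
                                 + 0.25f * q[flat_index(idx + 1, x, W)];
    }
}
```

Both c and q need room for N*W floats, so the allocation would be N * W * sizeof(float) rather than N * sizeof(float).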

Since you're looping, you could get rid of the multiplication by just adding W to an index each time through the loop.
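A rough sketch of that running-index idea, again as host-side C with hypothetical names (smooth_column_running, off). The offset starts at row 1 of column x and is bumped by W on each pass, so no multiply is needed inside the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Same smoothing pass over column x, but the flat offset is carried along
   and advanced by W each iteration instead of being recomputed as idx*W + x. */
void smooth_column_running(float *c, const float *q, size_t N, size_t W, size_t x)
{
    assert(x < W);
    size_t off = W + x;                      /* offset of (row 1, column x) */
    for (size_t idx = 1; idx + 1 < N; idx++, off += W) {
        /* off - W is the row above, off + W is the row below */
        c[off] = 0.25f * q[off - W] + 0.5f * q[off] + 0.25f * q[off + W];
    }
}
```

A compiler will often do this transformation on its own, so it is worth measuring before assuming it helps.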