I have a problem generating numbers with the MTGP32 generator. I have already worked with XORWOW and MRG32k3a, so I proceeded the same way: on entering a kernel, I copy the state into a local variable and then work on that copy. Here I try to do the same thing, but the generator keeps returning the same random numbers, whereas when I work through a pointer to the state everything is fine. Here’s the code with the copy:

```
__global__ void generate_kernel(curandStateMtgp32 *state, int n)
{
    int id = threadIdx.x + blockIdx.x * blockDim.x;
    float x;
    /* Each thread takes its own copy of the block's MTGP32 state */
    curandStateMtgp32 localState = state[blockIdx.x];
    /* Generate pseudo-random normal variable */
    for (int i = 0; i < n; i++) {
        x = curand_normal(&localState);
        printf("tid: %d x: %f\n", id, x);
    }
}
```

And here is the output (threads 1 and 2 get the same value on every iteration, while thread 0 does not):

```
tid: 0 x: 0.207837
tid: 1 x: -0.091346
tid: 2 x: 0.294019
tid: 0 x: 2.684819
tid: 1 x: -0.091346
tid: 2 x: 0.294019
tid: 0 x: 1.433268
tid: 1 x: -0.091346
tid: 2 x: 0.294019
```

When I work with a pointer to the state instead, the results are correct. Here’s the code:

```
__global__ void generate_kernel(curandStateMtgp32 *state, int n)
{
    int id = threadIdx.x + blockIdx.x * blockDim.x;
    float x;
    /* All threads of the block work on the same state through a pointer */
    curandStateMtgp32 *localState = &state[blockIdx.x];
    /* Generate pseudo-random normal variable */
    for (int i = 0; i < n; i++) {
        x = curand_normal(localState);
        printf("tid: %d x: %f\n", id, x);
    }
}
```

and the results are:

```
tid: 0 x: 0.207837
tid: 1 x: -0.091346
tid: 2 x: 0.294019
tid: 0 x: 2.684819
tid: 1 x: -1.183960
tid: 2 x: -0.621348
tid: 0 x: 1.433268
tid: 1 x: 0.571323
tid: 2 x: -0.735758
```
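
For comparison, this is roughly the local-copy pattern I mean, which works with XORWOW. It is a minimal sketch along the lines of the cuRAND documentation example; the kernel names, the seed `1234`, and the 1 block of 3 threads are illustrative choices, not taken from my real program.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <curand_kernel.h>

/* One XORWOW state per thread, each on its own subsequence. */
__global__ void setup_kernel(curandState *state, unsigned long long seed)
{
    int id = threadIdx.x + blockIdx.x * blockDim.x;
    curand_init(seed, id, 0, &state[id]);
}

/* Hypothetical name: XORWOW counterpart of generate_kernel. */
__global__ void generate_xorwow_kernel(curandState *state, int n)
{
    int id = threadIdx.x + blockIdx.x * blockDim.x;
    /* Copy this thread's private state into a local variable. */
    curandState localState = state[id];
    for (int i = 0; i < n; i++) {
        float x = curand_normal(&localState);
        printf("tid: %d x: %f\n", id, x);
    }
    /* Store the advanced state so a later launch continues the sequence. */
    state[id] = localState;
}

int main()
{
    const int threads = 3;   /* illustrative size */
    const int n = 3;
    curandState *devStates = NULL;

    if (cudaMalloc((void **)&devStates, threads * sizeof(curandState)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    setup_kernel<<<1, threads>>>(devStates, 1234ULL);
    generate_xorwow_kernel<<<1, threads>>>(devStates, n);

    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
    }
    cudaFree(devStates);
    return err == cudaSuccess ? 0 : 1;
}
```

The difference I can see is that here the state array is indexed by the global thread id, so each thread owns its state, while in the MTGP32 kernels the state is indexed by `blockIdx.x` and shared by the whole block.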

Can someone explain what I’m doing wrong, or whether this is a compiler bug? I don’t understand why, when I work with a copy of the state, the first thread gets different numbers while the others don’t.

Thank you.

I can post the whole code if you want to test it yourself.
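
In the meantime, here is a minimal host-side sketch that should be enough to reproduce the behaviour. It assumes the MTGP32 setup shown in the cuRAND documentation (`curandMakeMTGP32Constants` and `curandMakeMTGP32KernelState` with the pre-computed `mtgp32dc_params_fast_11213` parameter set). The seed `1234`, the launch of 1 block with 3 threads, and `n = 3` are assumptions chosen to match the output above, and the error handling is reduced to the bare minimum.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <curand.h>
#include <curand_kernel.h>
/* MTGP32 host helper functions */
#include <curand_mtgp32_host.h>
/* MTGP32 pre-computed parameter sets */
#include <curand_mtgp32dc_p_11213.h>

/* Pointer version; replace the body with the local-copy version to compare. */
__global__ void generate_kernel(curandStateMtgp32 *state, int n)
{
    int id = threadIdx.x + blockIdx.x * blockDim.x;
    curandStateMtgp32 *localState = &state[blockIdx.x];
    for (int i = 0; i < n; i++) {
        float x = curand_normal(localState);
        printf("tid: %d x: %f\n", id, x);
    }
}

int main()
{
    /* Assumed launch configuration: one block, three threads, three draws. */
    const int blocks = 1, threads = 3, n = 3;
    curandStateMtgp32 *devStates = NULL;
    mtgp32_kernel_params *devParams = NULL;

    /* One MTGP32 state per block, plus one shared parameter structure. */
    if (cudaMalloc((void **)&devStates, blocks * sizeof(curandStateMtgp32)) != cudaSuccess ||
        cudaMalloc((void **)&devParams, sizeof(mtgp32_kernel_params)) != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed\n");
        return 1;
    }

    /* Copy the generator parameters to the device, then seed the states. */
    if (curandMakeMTGP32Constants(mtgp32dc_params_fast_11213, devParams)
            != CURAND_STATUS_SUCCESS ||
        curandMakeMTGP32KernelState(devStates, mtgp32dc_params_fast_11213,
                                    devParams, blocks, 1234)
            != CURAND_STATUS_SUCCESS) {
        fprintf(stderr, "MTGP32 setup failed\n");
        return 1;
    }

    generate_kernel<<<blocks, threads>>>(devStates, n);

    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
    }

    cudaFree(devStates);
    cudaFree(devParams);
    return err == cudaSuccess ? 0 : 1;
}
```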

I’m working with Red Hat 6.x, a K20Xm GPU, and CUDA 5.5. Compilation line:

```
nvcc -arch=sm_35 -lcurand x.cu
```