curand_init() in the same kernel function

Hi guys:

I posted a topic about Visual Cryptography earlier. Now I have a question about my code. I just want to generate random values for the encryption pixels. The code is below:

int ih = blockIdx.y * blockDim.y + threadIdx.y; // index for height of original image in device data array
  int iw = blockIdx.x * blockDim.x + threadIdx.x; // index for width of original image in device data array

  // Function scope variables
  int track1, track2;
  int count0 = 0;
  int count1 = 0;

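	// Unique variables here
	float random;
	// Note: clock64() differs from thread to thread; a single seed passed in from the host is the more usual choice
	unsigned int seed = (unsigned int)clock64();
	// One generator state per thread; a variable-length array such as s[ih * iw] is not valid in device code
	curandState s;
	// random number generator block begin
	// The sequence number must be unique per thread, so use the linear pixel index rather than iw alone
	curand_init(seed, (unsigned long long)ih * iWidth + iw, 0, &s);

	// curand_uniform returns a float in (0, 1], so the (random <= 0.5) test below is a fair coin flip;
	// curand_normal (mean 0, standard deviation 1) would make that test true about 69% of the time
	random = curand_uniform(&s);
	// Random number generator block end	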
  
if (iCodecPath == ENCODE) {
	//Setup loop scan over entire original image array	
    //Scan loop begin
	if ((ih < iHeight) && (iw < iWidth)) {	
		if (pImage_d[ih * iWidth + iw] == BLACK) {
			for (track1 = 0; track1 < 2; track1++) {
				for (track2 = 0; track2 < 2; track2++) {
					if (count1 == 2) {
						pShare1_d[2*(ih*2 + track1)*iWidth + (iw*2 + track2)] = 0;
						pShare2_d[2*(ih*2 + track1)*iWidth + (iw*2 + track2)] = 1;
						count0 = count0 + 1;
					}
					else if (count0 == 2) {
						pShare1_d[2*(ih*2 + track1)*iWidth + (iw*2 + track2)] = 1;
						pShare2_d[2*(ih*2 + track1)*iWidth + (iw*2 + track2)] = 0;
						count1 = count1 + 1;
					}
					else {
						if (random <= 0.5) {
							pShare1_d[2*(ih*2 + track1)*iWidth + (iw*2 + track2)] = 0;
							pShare2_d[2*(ih*2 + track1)*iWidth + (iw*2 + track2)] = 1;
							count0 = count0 + 1;
						}
						else {
							pShare1_d[2*(ih*2 + track1)*iWidth + (iw*2 + track2)] = 1;
							pShare2_d[2*(ih*2 + track1)*iWidth + (iw*2 + track2)] = 0;
							count1 = count1 + 1;
						}
					}
					//__syncthreads();
				}
			}
		}
		else {		
			for (track1 = 0; track1 < 2; track1++) {
				for (track2 = 0; track2 < 2; track2++) {
					if (count1 == 2) {
						pShare1_d[2*(ih*2 + track1)*iWidth + (iw*2 + track2)] = 0;
						pShare2_d[2*(ih*2 + track1)*iWidth + (iw*2 + track2)] = 0;
						count0 = count0 + 1;
					}
					else if (count0 == 2) {
						pShare1_d[2*(ih*2 + track1)*iWidth + (iw*2 + track2)] = 1;
						pShare2_d[2*(ih*2 + track1)*iWidth + (iw*2 + track2)] = 1;
						count1 = count1 + 1;
					}
					else {
						if (random <= 0.5) {
							pShare1_d[2*(ih*2 + track1)*iWidth + (iw*2 + track2)] = 0;
							pShare2_d[2*(ih*2 + track1)*iWidth + (iw*2 + track2)] = 0;
							count0 = count0 + 1;
						}
						else {
							pShare1_d[2*(ih*2 + track1)*iWidth + (iw*2 + track2)] = 1;
							pShare2_d[2*(ih*2 + track1)*iWidth + (iw*2 + track2)] = 1;
							count1 = count1 + 1;
						}
					}
					//__syncthreads();
				}
			}
		}		
	}
 }

Question:
Can I call curand_init() only once for all threads? (That is, use a single kernel function rather than two kernel functions, one of which is used only for curand_init().) Is that possible?
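For reference, the single-kernel pattern being asked about can be sketched roughly as follows. This is only a minimal illustration, not the code from the post: the names coinFlipKernel, countOnes and flips are hypothetical, and it assumes one seed supplied by the host with the linear pixel index used as the per-thread sequence number. Each thread calls curand_init() on its own state and draws from it in the same kernel.

```cuda
#include <vector>
#include <cuda_runtime.h>
#include <curand_kernel.h>

// Hypothetical kernel: every thread seeds its own generator and draws one
// uniform value, all inside a single kernel (no separate setup kernel).
__global__ void coinFlipKernel(unsigned char *flips, int width, int height,
                               unsigned long long seed)
{
    int iw = blockIdx.x * blockDim.x + threadIdx.x;
    int ih = blockIdx.y * blockDim.y + threadIdx.y;
    if (iw >= width || ih >= height) return;

    int idx = ih * width + iw;

    curandState state;                  // per-thread state, not shared with other threads
    curand_init(seed, idx, 0, &state);  // same seed, unique sequence number per thread

    float r = curand_uniform(&state);   // float in (0, 1]
    flips[idx] = (r <= 0.5f) ? 0 : 1;   // coin flip
}

// Hypothetical host helper: runs the kernel and returns how many pixels
// drew a 1, or -1 if a CUDA call failed.
int countOnes(int width, int height, unsigned long long seed)
{
    size_t n = (size_t)width * height;
    unsigned char *flips_d = nullptr;
    if (cudaMalloc(&flips_d, n) != cudaSuccess) return -1;

    dim3 block(16, 16);
    dim3 grid((width + block.x - 1) / block.x,
              (height + block.y - 1) / block.y);
    coinFlipKernel<<<grid, block>>>(flips_d, width, height, seed);

    std::vector<unsigned char> flips(n);
    cudaError_t err = cudaMemcpy(flips.data(), flips_d, n,
                                 cudaMemcpyDeviceToHost);
    cudaFree(flips_d);
    if (err != cudaSuccess) return -1;

    int ones = 0;
    for (unsigned char f : flips) ones += f;
    return ones;
}
```

Calling curand_init() in every thread on every launch costs some time, which is the usual reason people split it into a separate setup kernel and keep the states in global memory; for a single pass over an image the one-kernel form may be acceptable.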

How many random numbers do you want: one per block, or one per thread?

curand appears to be a device function and a sequence generator; it likely stores a sequence count, pointer, or increment in the state (curandState) passed to it.
I would therefore expect that multiple threads can call curand with the same state, but only sequentially (one at a time), not concurrently; otherwise the unsynchronized updates to the state would race, and you may very well end up with undefined behaviour.
The state should then be stored in shared memory, and the result should be the same as a single thread calling curand multiple times.
Something like that.
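One way to read the suggestion above is sketched below, under stated assumptions: one curandState per block lives in shared memory, thread 0 is the only thread that ever touches it, and it draws one value for each thread in the block before a barrier. The names sharedStateKernel, drawShared, draws and MAX_THREADS_PER_BLOCK are hypothetical, and the sketch assumes a 1D block of at most 256 threads.

```cuda
#include <vector>
#include <cuda_runtime.h>
#include <curand_kernel.h>

#define MAX_THREADS_PER_BLOCK 256   // assumption for this sketch

// Hypothetical kernel: one generator state per block, kept in shared memory.
// Only thread 0 calls curand on it, so the calls are sequential by construction.
__global__ void sharedStateKernel(float *out, int n, unsigned long long seed)
{
    __shared__ curandState state;                   // one generator per block
    __shared__ float draws[MAX_THREADS_PER_BLOCK];  // one value per thread

    int tid = threadIdx.x;
    int gid = blockIdx.x * blockDim.x + tid;

    if (tid == 0) {
        // Same seed for all blocks, block index as the sequence number
        curand_init(seed, blockIdx.x, 0, &state);
        for (int i = 0; i < blockDim.x && i < MAX_THREADS_PER_BLOCK; ++i)
            draws[i] = curand_uniform(&state);      // float in (0, 1]
    }
    __syncthreads();  // every thread reaches this barrier before reading draws

    if (gid < n) out[gid] = draws[tid];
}

// Hypothetical host helper: returns n values, or an empty vector on error.
std::vector<float> drawShared(int n, unsigned long long seed)
{
    float *out_d = nullptr;
    if (cudaMalloc(&out_d, n * sizeof(float)) != cudaSuccess) return {};

    int block = MAX_THREADS_PER_BLOCK;
    int grid = (n + block - 1) / block;
    sharedStateKernel<<<grid, block>>>(out_d, n, seed);

    std::vector<float> out(n);
    cudaError_t err = cudaMemcpy(out.data(), out_d, n * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    cudaFree(out_d);
    if (err != cudaSuccess) return {};
    return out;
}
```

This serializes the draws within each block, so it trades speed for having only one state per block; giving each thread its own state avoids that serialization at the cost of more curand_init() calls.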