Use of global memory

Hi guys!!

I hope you can help me with this question, because I don't know what to do.

Suppose we have the following code:

__global__ void function(float *A, int *count, int limit){
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    A[*count] = randFunction();
}


More than one thread is going to write 12.0, right?

If so, what do I have to do if I want the value 12.0 written only when *count == 0, that is, A[0] = 12.0f?

Isn't there something like a semaphore?

Note: sorry for my writing, but I only speak a little English, so I hope you can understand. Thanks a lot!!! :)

Do you need something like atomicMin(), atomicMax() or atomicCAS()?

Can you explain what those are, please?
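For context: an atomic function performs a read-modify-write on one word in global or shared memory as a single indivisible operation, so no other thread can interleave with it. Below is a minimal sketch of how atomicCAS() could let exactly one thread in the grid write A[0]. The `claimed` flag, the `winners` counter, and the names `writeFirstOnce` and `countWinners` are made up for illustration and are not part of the original kernel.

```cuda
#include <cuda_runtime.h>
#include <cassert>

// Hypothetical kernel: exactly one thread of the whole grid writes A[0].
// `claimed` is an assumed extra flag in global memory, zeroed before launch.
__global__ void writeFirstOnce(float *A, int *claimed, int *winners)
{
    // atomicCAS(addr, compare, val) returns the old value stored at addr and
    // writes val only if that old value equals compare. Only one thread can
    // ever observe the old value 0.
    if (atomicCAS(claimed, 0, 1) == 0) {
        A[0] = 12.0f;
        atomicAdd(winners, 1);   // bookkeeping only, to check the result
    }
}

// Hypothetical host helper: launches the kernel and returns how many
// threads won the race (expected to be one for any launch size).
int countWinners(int blocks, int threads)
{
    float *dA = nullptr;
    int *dClaimed = nullptr, *dWinners = nullptr;
    int hWinners = -1;

    cudaMalloc(&dA, sizeof(float));
    cudaMalloc(&dClaimed, sizeof(int));
    cudaMalloc(&dWinners, sizeof(int));
    cudaMemset(dClaimed, 0, sizeof(int));
    cudaMemset(dWinners, 0, sizeof(int));

    writeFirstOnce<<<blocks, threads>>>(dA, dClaimed, dWinners);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaError_t err = cudaMemcpy(&hWinners, dWinners, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    assert(err == cudaSuccess);

    cudaFree(dA);
    cudaFree(dClaimed);
    cudaFree(dWinners);
    return hWinners;
}
```

Calling countWinners(4, 64) launches 256 threads, of which only one should pass the atomicCAS test. This works as a "first come, first served" gate; it is not a general semaphore.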

Perhaps explain what int *count is and what it does.

…And perhaps you really just want to use A[i] or A[count[i]], rather than A[*count]…
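If each thread is meant to handle its own element, a minimal sketch of the A[i] indexing suggested above might look like this. It assumes that `limit` is the number of elements in A, which the original post does not state; `fillPerThread` and `countFilled` are made-up names.

```cuda
#include <cuda_runtime.h>
#include <cassert>
#include <vector>

// Hypothetical kernel: each thread writes only its own element A[i], so no
// two threads ever touch the same address and no synchronization is needed.
// `limit` is assumed to be the number of elements in A.
__global__ void fillPerThread(float *A, int limit)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < limit) {          // guard the threads past the end of the array
        A[i] = 12.0f;
    }
}

// Hypothetical host helper: returns how many elements ended up as 12.0f.
int countFilled(int limit)
{
    float *dA = nullptr;
    std::vector<float> hA(limit, 0.0f);

    cudaMalloc(&dA, limit * sizeof(float));
    cudaMemset(dA, 0, limit * sizeof(float));

    int threads = 128;
    int blocks = (limit + threads - 1) / threads;   // round up
    fillPerThread<<<blocks, threads>>>(dA, limit);

    cudaError_t err = cudaMemcpy(hA.data(), dA, limit * sizeof(float),
                                 cudaMemcpyDeviceToHost);
    assert(err == cudaSuccess);
    cudaFree(dA);

    int filled = 0;
    for (int k = 0; k < limit; ++k) {
        if (hA[k] == 12.0f) ++filled;
    }
    return filled;
}
```

With limit = 1000 the launch uses 8 blocks of 128 threads; the bounds check keeps the 24 surplus threads from writing out of range.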

Yes, assuming you launch more than 1 thread, then “More than one thread are going to put 12.0”. In fact, the code in this kernel is not conditioned on thread ID at all, so every thread will behave exactly the same way with this particular kernel.

I don’t really understand the next question. If you only want one thread in the whole grid to write the value of 12.0, you could do:

if ((i==0) && (*count == 0)) A[*count] = 12.0f;

However, as already pointed out, this code is likely to have threads stepping all over each other.

this line will also probably not produce the result you want:


Thanks to all of you for your comments.

The situation is the following:

I'm programming multi-objective metaheuristics and I'm having trouble implementing the Pareto front. In the implementation I have the population's fitness ( A ) and a counter ( *count ) that holds the number of elements in the Pareto front. When the Pareto front is empty, I have to insert the first element; when it isn't empty, I have to check Pareto dominance. This counter is very important because it is the number of elements in the front.

__global__ void EvaluarEnFrentePareto_kernel(float *A, float *ConjSolPareto, int *nElementosFrenteCont, float *FitnessPobl, float *x, int nObjetivos, int nDimenciones, int nParticulas, char *TipoOptimizacion, int nElementosFrente){
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (*nElementosFrenteCont == 0) { // the front is empty
        memcpy(&A[(*nElementosFrenteCont) * nObjetivos], &FitnessPobl[i * nObjetivos], sizeof(float) * nObjetivos);
        memcpy(&ConjSolPareto[(*nElementosFrenteCont) * nDimenciones], &x[i * nDimenciones], sizeof(float) * nDimenciones);
    } else {
        int j;

        for (j = 0; j < *nElementosFrenteCont; j++) {
            // the front is not empty: check Pareto dominance against each element in the front
            memcpy(&A[j * nObjetivos], &FitnessPobl[i * nObjetivos], sizeof(float) * nObjetivos);
            memcpy(&ConjSolPareto[j * nDimenciones], &x[i * nDimenciones], sizeof(float) * nDimenciones);
        }
    }
}
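To make the dominance test and the shared counter concrete, here is a rough sketch. It is not the incremental insertion attempted in the kernel above: instead, each thread compares its particle against the whole population and, if nothing dominates it, reserves a unique slot in the front with atomicAdd(). It assumes every objective is minimized (the original uses a TipoOptimizacion array whose meaning is not shown), and the names `dominates`, `buildFront`, and `frontSizeDemo` are hypothetical.

```cuda
#include <cuda_runtime.h>
#include <cassert>
#include <vector>

// Assumption: all objectives are minimized. a dominates b if a is no worse
// in every objective and strictly better in at least one.
__device__ bool dominates(const float *a, const float *b, int nObj)
{
    bool strictlyBetter = false;
    for (int k = 0; k < nObj; ++k) {
        if (a[k] > b[k]) return false;
        if (a[k] < b[k]) strictlyBetter = true;
    }
    return strictlyBetter;
}

// Hypothetical kernel: one thread per particle. A particle that no other
// particle dominates is appended to `front`; `count` must be zero at launch.
__global__ void buildFront(const float *fitness, float *front, int *count,
                           int nObj, int nParticles)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nParticles) return;

    for (int j = 0; j < nParticles; ++j) {
        if (j != i && dominates(&fitness[j * nObj], &fitness[i * nObj], nObj))
            return;   // particle i is dominated, so it is not in the front
    }

    // atomicAdd returns the value before the increment, so every surviving
    // thread gets a different slot even when many threads arrive at once.
    int slot = atomicAdd(count, 1);
    for (int k = 0; k < nObj; ++k)
        front[slot * nObj + k] = fitness[i * nObj + k];
}

// Hypothetical host helper: returns the front size for a tiny data set.
int frontSizeDemo()
{
    const int nObj = 2, nParticles = 4;
    // (1,4), (2,2) and (4,1) are mutually non-dominated; (3,3) is dominated by (2,2).
    std::vector<float> h = {1.0f, 4.0f, 2.0f, 2.0f, 4.0f, 1.0f, 3.0f, 3.0f};
    float *dFit = nullptr, *dFront = nullptr;
    int *dCount = nullptr;
    int hCount = -1;

    cudaMalloc(&dFit, h.size() * sizeof(float));
    cudaMalloc(&dFront, h.size() * sizeof(float));   // worst case: all particles
    cudaMalloc(&dCount, sizeof(int));
    cudaMemcpy(dFit, h.data(), h.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(dCount, 0, sizeof(int));

    buildFront<<<1, 32>>>(dFit, dFront, dCount, nObj, nParticles);

    cudaError_t err = cudaMemcpy(&hCount, dCount, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    assert(err == cudaSuccess);

    cudaFree(dFit);
    cudaFree(dFront);
    cudaFree(dCount);
    return hCount;
}
```

The order of the entries in `front` depends on which thread reaches atomicAdd() first, so it can change from run to run. The all-pairs comparison costs O(nParticles²) dominance tests, which may or may not be acceptable for a given population size.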


Thanks a lot!!