CUDA & modulos

davidv1992 · February 27, 2010, 4:44pm

I wrote some code that uses the modulo operator and noticed that it gives weird results, especially in comparisons.

Example:

#include <stdio.h>
#include <assert.h>
#include <math.h>

#define BLOCKSIZE 512

__global__ void kfinddiv(int *d_a, int k)
{
	long long idx = gridDim.x*blockIdx.y+blockDim.x*blockIdx.x+threadIdx.x;
	if (idx+2 < k/2)
	{
		if (k%(idx+2) == 0)
			*d_a = idx+2;
	}
}

int main()
{
	int input;
	scanf("%d", &input);

	dim3 dimGrid(int(ceil(float(input/BLOCKSIZE))), int(ceil(float(input/BLOCKSIZE))));
	dim3 dimBlock(BLOCKSIZE);

	printf("%d\n", int(ceil(float(input/BLOCKSIZE))));

	int *d_a;
	int *h_a;
	h_a = (int*)malloc(sizeof(int));
	cudaMalloc(&d_a, sizeof(int));
	*h_a = 0;
	cudaMemcpy(d_a, h_a, sizeof(int), cudaMemcpyHostToDevice);

	kfinddiv<<< dimGrid, dimBlock >>>(d_a, input);

	cudaMemcpy(h_a, d_a, sizeof(int), cudaMemcpyDeviceToHost);
	printf("%d\n", *h_a);
	return 0;
}

This works fine when compiled with -deviceemu, but as soon as it runs on the actual hardware the modulo seems to be evaluated incorrectly.

Does anyone know whether, or when, this will be fixed? (I’m 100% sure it is the modulo operator: add an if (idx == 0 && k%(idx+2)) at the end and use an even number as input and you’ll see what I mean. Then try removing the modulo portion.)
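
Separately from the modulo question, the grid-size expression in the code above, int(ceil(float(input/BLOCKSIZE))), divides two ints before converting to float, so ceil() receives a whole number and never rounds up. The host-side sketch below compares that arithmetic with the usual integer ceiling division. The helper names blocksTruncated and blocksCeil are hypothetical and not part of the posted code, and this is not claimed to be the cause of the reported behavior.

```cpp
#include <cmath>

const int BLOCKSIZE = 512;

// Same arithmetic as the posted expression: the int division truncates
// first, so ceil() has nothing left to round up.
int blocksTruncated(int n) {
    return int(std::ceil(float(n / BLOCKSIZE)));
}

// Hypothetical replacement: integer ceiling division, which rounds up
// whenever n is not a multiple of BLOCKSIZE.
int blocksCeil(int n) {
    return (n + BLOCKSIZE - 1) / BLOCKSIZE;
}
```

For an input of 1000, the posted expression yields 1 block per grid dimension while the ceiling form yields 2. For any input below 512 the posted expression yields 0, and a grid dimension of 0 is not a valid launch configuration.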

LSChien · February 28, 2010, 1:06am

I am confused by your index computation:

long long idx = gridDim.x*blockIdx.y+blockDim.x*blockIdx.x+threadIdx.x;

If you want to sweep all elements of 2-D data, then it should be

long long idx = (gridDim.x*blockIdx.y+blockIdx.x)*blockDim.x+threadIdx.x;
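
To see why the index formula matters, the host-side sketch below enumerates every (blockIdx.x, blockIdx.y, threadIdx.x) combination of a small launch and counts how many distinct indices each formula produces. It compares the formula from the original post with the conventional row-major linearization of a 2-D grid of 1-D blocks, (gridDim.x*blockIdx.y + blockIdx.x)*blockDim.x + threadIdx.x. The Launch struct and the function names are hypothetical stand-ins for the CUDA built-ins, introduced only for this illustration.

```cpp
#include <cstddef>
#include <set>

// Hypothetical stand-in for gridDim.x, gridDim.y and blockDim.x.
struct Launch { int gridX, gridY, blockX; };

// Formula from the original post.
long long idxPosted(const Launch& l, int bx, int by, int tx) {
    return (long long)l.gridX * by + (long long)l.blockX * bx + tx;
}

// Row-major linearization of a 2-D grid of 1-D blocks.
long long idxLinear(const Launch& l, int bx, int by, int tx) {
    return ((long long)l.gridX * by + bx) * l.blockX + tx;
}

// Count the distinct indices generated over the whole launch.
template <typename F>
std::size_t distinctIndices(const Launch& l, F f) {
    std::set<long long> seen;
    for (int by = 0; by < l.gridY; ++by)
        for (int bx = 0; bx < l.gridX; ++bx)
            for (int tx = 0; tx < l.blockX; ++tx)
                seen.insert(f(l, bx, by, tx));
    return seen.size();
}
```

With a 2x2 grid of 4-thread blocks (16 threads in total), the row-major form produces 16 distinct indices, while the posted formula produces only 10, so several threads test the same candidate.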

Second, you have a race condition when updating *d_a:

	if (idx+2 < k/2)
	{
		if (k%(idx+2) == 0)
			*d_a = idx+2;
	}
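
Every thread whose idx+2 divides k writes to *d_a, so which divisor ends up in memory depends on scheduling. One way to make the result deterministic is to keep only the largest divisor with an atomic maximum; in a CUDA kernel that role could be played by atomicMax on the int. The sketch below is a host-side analogue using std::atomic, with a serial loop standing in for the threads. The names atomicMaxHost and largestDivisorBelowHalf are hypothetical, and choosing the largest rather than the smallest divisor is an assumption made so that the initial value 0 from the post still means "nothing found".

```cpp
#include <atomic>

// Host-side analogue of an atomic max: raise `target` to `value` if larger.
void atomicMaxHost(std::atomic<int>& target, int value) {
    int current = target.load();
    // compare_exchange_weak refreshes `current` on failure, so the loop ends
    // once the store succeeds or a value at least as large is already stored.
    while (current < value && !target.compare_exchange_weak(current, value)) {
    }
}

// Serial stand-in for the kernel body: each idx plays the role of one thread.
int largestDivisorBelowHalf(int k) {
    std::atomic<int> result(0);  // 0 means "no divisor found", as in the post
    for (long long idx = 0; idx + 2 < k / 2; ++idx) {
        if (k % (idx + 2) == 0)
            atomicMaxHost(result, int(idx + 2));
    }
    return result.load();
}
```

For k = 100 the candidates below k/2 are 2, 4, 5, 10, 20 and 25, so the result is always 25 regardless of the order in which the updates arrive.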

Could you provide a CPU version, so that we can check your parallelized version against it?
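
As a rough idea of what such a CPU reference could look like (a sketch under assumptions, not the original poster's code): since the kernel keeps whichever write lands last, the reference below collects every divisor d with 2 <= d < k/2 and accepts any of them as a valid GPU result. The function names divisorsBelowHalf and resultIsPlausible are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Every d with 2 <= d < k/2 and k % d == 0, i.e. every value the kernel
// could legitimately write to *d_a.
std::vector<int> divisorsBelowHalf(int k) {
    std::vector<int> out;
    for (int d = 2; d < k / 2; ++d)
        if (k % d == 0) out.push_back(d);
    return out;
}

// True when a GPU result is consistent with the reference: 0 (the initial
// value) if no divisor exists, otherwise any one of the divisors.
bool resultIsPlausible(int k, int gpuResult) {
    std::vector<int> ds = divisorsBelowHalf(k);
    if (ds.empty()) return gpuResult == 0;
    return std::find(ds.begin(), ds.end(), gpuResult) != ds.end();
}
```

Checking the value copied back into h_a with resultIsPlausible(input, *h_a) separates a genuinely wrong modulo result from an ordering effect of the race.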
