Hello, NVIDIA.
I want to compare the speed of float versus int/long operations in CUDA, so I made a test program (release build) like this:
#include <opencv2/imgproc/imgproc.hpp>
#include "opencv2/core.hpp"
#include <opencv2/core/utility.hpp>
#include "opencv2/highgui.hpp"
#include "opencv2/cudaarithm.hpp"
#include "device_launch_parameters.h"
#include "cuda_runtime_api.h"
#include "helper_cuda.h"
#include <stdio.h>
#include <time.h>
#define mySIZE 10000
__global__ void testFloat()
{
    float sum = 0.f;
    float tmp = 1.f;
    for (int i = 0; i < mySIZE; i++)
    {
        sum += (tmp * i * 1.123f) / 2.f;
    }
    printf("================%f\n", sum);
}
__global__ void testInt()
{
    long long sum = 0;
    long long tmp = 1;
    for (int i = 0; i < mySIZE; i++)
    {
        sum += (tmp * i * 1.123f) / 2.f;
    }
    printf("+++++++++++++++++%lld\n", sum);
}
int main()
{
    clock_t str = clock();
    testFloat<<<1, 1>>>();
    cudaDeviceSynchronize();
    clock_t end = clock();
    printf("float time%fms\n", (double)(end - str) / 1000.f);

    clock_t str1 = clock();
    testInt<<<1, 1>>>();
    cudaDeviceSynchronize();
    clock_t end1 = clock();
    printf("long time%fms\n", (double)(end1 - str1) / 1000.f);
    return 0;
}
Each kernel does 10000 add operations, each involving a multiply and a division, and I print the result at the end so the compiler does not optimize the loop away.
But I feel the way I am testing is not correct: the INT kernel appears to run more than 200 times faster than the FLOAT kernel. Did I test this correctly?
And the result:
================28072190.000000
float time248.643000ms
+++++++++++++++++28070078
long time0.902000ms
GPU: NVIDIA Tesla P4
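For reference, here is a minimal sketch of how I could time each kernel with cudaEvent plus a warm-up launch instead of clock(), in case the very first kernel launch (context creation) is what is skewing the numbers. This is just an idea, not something I have verified: timeKernelMs, launchFloat and launchInt are helper names I made up, and it assumes the two kernels above are in the same .cu file (it would replace my main()).

// Sketch only: cudaEvent timing with an untimed warm-up launch per kernel.
// Assumes testFloat/testInt from the code above are defined in this file.
#include <stdio.h>
#include "cuda_runtime_api.h"

__global__ void testFloat();   // same kernels as above
__global__ void testInt();

static void launchFloat() { testFloat<<<1, 1>>>(); }
static void launchInt()   { testInt<<<1, 1>>>(); }

static float timeKernelMs(void (*launch)())
{
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    launch();                     // warm-up launch, not timed
    cudaDeviceSynchronize();

    cudaEventRecord(start);
    launch();                     // timed launch
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;
}

int main()
{
    printf("float time %f ms\n", timeKernelMs(launchFloat));
    printf("long  time %f ms\n", timeKernelMs(launchInt));
    return 0;
}

Would this be a fairer way to measure the two kernels, or is there still something wrong with the comparison itself?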