FP16AndFP32 speed question

454193977 · May 21, 2019, 8:50am

Why is FP16 much slower than FP32 on the Jetson Xavier? I'm just doing simple addition.
CUDA 10.0
The code looks like this:

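// FP16 version: each half2 holds two packed half-precision values.
__global__ static void squaresSum_half2(half2 *data, half2 *sum, half2 *des)
{
    // Repeat the addition many times; the outer counter is named rep so it
    // does not clash with the index i declared by CUDA_KERNEL_LOOP.
    for (int rep = 0; rep < 51210; rep++)
    {
        CUDA_KERNEL_LOOP(i, HLAF2_DATA_SIZE) {
            des[i] = __hadd2(data[i], sum[i]);
        }
    }
}

// FP32 version of the same element-wise addition.
__global__ static void squaresSum(float *data, float *sum, float *des)
{
    for (int rep = 0; rep < 51210; rep++)
    {
        CUDA_KERNEL_LOOP(i, FLOAT_DATA_SIZE) {
            des[i] = sum[i] + data[i];
        }
    }
}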

kayccc · May 23, 2019, 5:24am

Hi 454193977,

How are you compiling the code? (What is your exact compile command line?)
How are you running the code? (What is your exact execution command line?)
What is the exact output generated when you run it on your Xavier?
Which Jetpack is installed on your Xavier?

454193977 · May 23, 2019, 5:54am

I solved this problem. The slowdown happened because I was compiling with the debug parameter.
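
For anyone who wants to rerun the comparison, here is a minimal sketch of a timing harness. It makes several assumptions not stated in this thread: that the debug parameter was nvcc's `-G` (which turns off device-code optimization), that `CUDA_KERNEL_LOOP` is a Caffe-style grid-stride loop, and that the sizes `N` and `REPS`, the kernel names, and the launch configuration are hypothetical values chosen for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>
#include <cuda_fp16.h>

// Assumed Caffe-style grid-stride loop; the thread does not show the macro's definition.
#define CUDA_KERNEL_LOOP(i, n) \
  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < (n); i += blockDim.x * gridDim.x)

constexpr int N = 1 << 20;   // hypothetical number of scalar elements
constexpr int REPS = 100;    // hypothetical repeat count inside each kernel

__global__ void addFloat(const float *a, const float *b, float *out, int n) {
  for (int rep = 0; rep < REPS; ++rep) {
    CUDA_KERNEL_LOOP(i, n) { out[i] = a[i] + b[i]; }
  }
}

// Each half2 packs two FP16 values, so n here is half the scalar element count.
__global__ void addHalf2(const half2 *a, const half2 *b, half2 *out, int n) {
  for (int rep = 0; rep < REPS; ++rep) {
    CUDA_KERNEL_LOOP(i, n) { out[i] = __hadd2(a[i], b[i]); }
  }
}

// Times one launch with CUDA events, after a warm-up launch.
template <typename Kernel, typename T>
float timeKernel(Kernel k, const T *a, const T *b, T *out, int n) {
  cudaEvent_t start, stop;
  cudaEventCreate(&start);
  cudaEventCreate(&stop);
  k<<<128, 256>>>(a, b, out, n);  // warm-up, not timed
  cudaEventRecord(start);
  k<<<128, 256>>>(a, b, out, n);
  cudaEventRecord(stop);
  cudaEventSynchronize(stop);
  float ms = 0.0f;
  cudaEventElapsedTime(&ms, start, stop);
  cudaEventDestroy(start);
  cudaEventDestroy(stop);
  return ms;
}

int main() {
  float *fa, *fb, *fo;
  half2 *ha, *hb, *ho;
  cudaMalloc(&fa, N * sizeof(float));
  cudaMalloc(&fb, N * sizeof(float));
  cudaMalloc(&fo, N * sizeof(float));
  cudaMalloc(&ha, (N / 2) * sizeof(half2));
  cudaMalloc(&hb, (N / 2) * sizeof(half2));
  cudaMalloc(&ho, (N / 2) * sizeof(half2));
  // All-zero bytes represent 0.0 in both FP32 and FP16.
  cudaMemset(fa, 0, N * sizeof(float));
  cudaMemset(fb, 0, N * sizeof(float));
  cudaMemset(ha, 0, (N / 2) * sizeof(half2));
  cudaMemset(hb, 0, (N / 2) * sizeof(half2));

  printf("FP32:         %.3f ms\n", timeKernel(addFloat, fa, fb, fo, N));
  printf("FP16 (half2): %.3f ms\n", timeKernel(addHalf2, ha, hb, ho, N / 2));

  cudaFree(fa); cudaFree(fb); cudaFree(fo);
  cudaFree(ha); cudaFree(hb); cudaFree(ho);
  return 0;
}
```

Building it once as `nvcc -arch=sm_72 -O3 bench.cu -o bench` and once as `nvcc -arch=sm_72 -G bench.cu -o bench_debug` (the file name `bench.cu` is a placeholder) should show whether the debug build alone accounts for the gap.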

AastaLLL · May 27, 2019, 4:51am

Good to know this.
Thanks for the feedback.
