Why are FFMA instructions still there even if --fmad=false is set?

502776420 · June 6, 2021, 3:37am

Hi, all.

I’m writing kernels for PyTorch and I want to disable fused multiply-add for some reasons. I added ‘--fmad=false’ according to the official tutorial. However, when I checked the SASS assembly of the generated pyd with cuobjdump to make sure everything was correct, I found that only some of the FFMA instructions had been replaced.

That’s weird. Did I do something wrong, or is that the expected behavior?
I’m using Windows 10, PyTorch 1.7 + CUDA 11.0, gencode=arch=compute_61,code=sm_61

njuffa · June 6, 2021, 4:13am

-fmad=false prevents the compiler from contracting an FMUL and a dependent FADD into an FMA. Calls to the standard math functions fma() and fmaf() will result in FMA instructions (FFMA, DFMA) being emitted regardless of the setting of this switch. Such calls can occur inside inlined standard math functions, for example.
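
A minimal sketch of the distinction described above, not taken from the original poster's code. The kernel names, the `run` helper, and the input values are all made up for illustration. It compares a plain `a * b + c` (eligible for contraction unless -fmad=false is given), an explicit `fmaf()` call (FFMA regardless of the switch), and the `__fmul_rn()` / `__fadd_rn()` intrinsics, which the CUDA Math API documents as never being merged into a multiply-add. Error checking is omitted for brevity.

```cuda
#include <cmath>
#include <cstdio>
#include <cuda_runtime.h>

// Plain operators: the compiler may contract this into one FFMA by default;
// with -fmad=false it should remain a separate FMUL and FADD.
__global__ void plain_mul_add(const float *in, float *out)
{
    out[0] = in[0] * in[1] + in[2];
}

// Explicit fmaf(): emits an FFMA whatever -fmad is set to.
__global__ void explicit_fma(const float *in, float *out)
{
    out[0] = fmaf(in[0], in[1], in[2]);
}

// Rounding-mode intrinsics: documented as never merged into a multiply-add.
__global__ void unfused_intrinsics(const float *in, float *out)
{
    out[0] = __fadd_rn(__fmul_rn(in[0], in[1]), in[2]);
}

typedef void (*Kernel)(const float *, float *);

// Hypothetical helper: copies three inputs to the device, runs one thread,
// and returns the single result. Inputs come from memory so the compiler
// cannot fold the arithmetic at compile time.
static float run(Kernel k, const float *h_in)
{
    float *d_in = nullptr, *d_out = nullptr, h_out = NAN;
    cudaMalloc(&d_in, 3 * sizeof(float));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_in, h_in, 3 * sizeof(float), cudaMemcpyHostToDevice);
    k<<<1, 1>>>(d_in, d_out);
    cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}

int main()
{
    // x * x = 1 + 2^-11 + 2^-24 exactly. Rounded to float, the product is
    // 1 + 2^-11, so the unfused result is 0, while a fused multiply-add
    // keeps the low bit and yields 2^-24 (about 5.96e-8).
    const float x = 1.0f + ldexpf(1.0f, -12);
    const float h_in[3] = { x, x, -(1.0f + ldexpf(1.0f, -11)) };

    printf("plain a*b+c        : %g\n", run(plain_mul_add, h_in));
    printf("fmaf(a, b, c)      : %g\n", run(explicit_fma, h_in));
    printf("__fmul_rn/__fadd_rn: %g\n", run(unfused_intrinsics, h_in));
    return 0;
}
```

Assuming the file is saved as demo.cu (a hypothetical name), building it once with `nvcc -arch=sm_61 demo.cu -o demo` and once with `-fmad=false` added should change only the first line of output. Running `cuobjdump -sass` on the executable should still show an FFMA in `explicit_fma` in both builds. If the remaining FFMAs in a real kernel come from inlined standard math functions, they are part of those functions' implementations, not contractions of user-written code.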
