Hello,
Since some recent version of CUDA, the Visual Studio integration has provided a “Use Fast Math” property (CUDA C/C++ > Device > Use Fast Math), but it seems to have no effect on PTX generation (NVCC Compilation Type == Generate device-only .ptx file (-ptx)).
I tested with a CUDA file like this:
#include <cstdio>
#include <cstdint>
extern "C" __global__ void test(float* values) {
    uint32_t globalIdx = blockDim.x * blockIdx.x + threadIdx.x;
    values[globalIdx] = std::sin(values[globalIdx]);
}
The generated PTX file doesn’t contain the sin.approx.ftz.f32 instruction even if I set the “Use Fast Math” property to true. When I manually specify “--use_fast_math” in the additional command-line options, the instruction does appear.
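In case it helps anyone reproduce this, the check on the generated PTX can be automated with a small host-side C++ helper. This is only a rough sketch: the function names ptxContains and readPtx are made up for illustration, and it does nothing more than a plain substring search on the PTX text.

```cpp
#include <cassert>   // only needed for quick sanity checks around the helpers
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: true if the PTX text contains the given mnemonic.
// This is a plain substring search, not a PTX parser.
bool ptxContains(const std::string& ptxText, const std::string& mnemonic) {
    return ptxText.find(mnemonic) != std::string::npos;
}

// Hypothetical helper: reads a whole PTX file into a string.
// Returns an empty string if the file cannot be opened.
std::string readPtx(const std::string& path) {
    std::ifstream in(path);
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}
```

For example, ptxContains(readPtx("test.ptx"), "sin.approx.ftz.f32") should only be true when fast math actually reached nvcc (test.ptx is just a placeholder file name).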
CUDA 11.1
Visual Studio Community 2019 16.7.6
Thanks