How to do -O3 optimization in Visual Studio for CUDA code

I don’t know much about nvcc compilation and related topics. I read that setting the -O3 flag for nvcc compilation can improve computation time.

PROBLEM: I am using Visual Studio 2013 and compile my code with it (by building). I can open the compiler and linker properties through the project properties. Now, my question is: how can I set the -O3 flag in Visual Studio to get the -O3 optimizations?

-O3 is the default; you don’t have to specify anything. Alternatively, go to CUDA C/C++ -> Command Line and add -Xptxas -O3.

http://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/#ptxas-options
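For reference, the same flag can also be passed directly on an nvcc command line outside Visual Studio; the file names below are placeholders:

```
# -Xptxas forwards the following option to the PTX assembler (ptxas):
nvcc -Xptxas -O3 -c kernel.cu -o kernel.obj

# Adding -Xptxas -v makes ptxas print register and memory usage,
# which confirms the flag actually reached the PTX assembler:
nvcc -Xptxas -O3 -Xptxas -v -c kernel.cu -o kernel.obj
```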

Even if I put -Xptxas -O3 in “CUDA C/C++ -> Command Line”, I cannot see any changes in the lines below, which are what appears at my “CUDA C/C++ -> Command Line”.

Driver API (NVCC Compilation Type is .cubin, .gpu, or .ptx)

set CUDAFE_FLAGS=--sdk_dir "C:\Program Files (x86)\Windows Kits\8.1"
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\bin\nvcc.exe" --use-local-env --cl-version 2013 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64" --keep-dir x64\Release -maxrregcount=0 --machine 64 --compile -cudart static -o x64\Release\%(Filename)%(Extension).obj "%(FullPath)"

Runtime API (NVCC Compilation Type is hybrid object or .c file)

set CUDAFE_FLAGS=--sdk_dir "C:\Program Files (x86)\Windows Kits\8.1"
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.0\bin\nvcc.exe" --use-local-env --cl-version 2013 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin\x86_amd64" --keep-dir x64\Release -maxrregcount=0 --machine 64 --compile -cudart static -Xcompiler "/EHsc /nologo /Zi " -o x64\Release\%(Filename)%(Extension).obj "%(FullPath)"

Of course not; that’s why they’re called additional options. You have to look at VS2013’s build output to see the added options.

Yes, in the build output I am able to see it. But I have set

-Xptxas=-O3

instead of

-Xptxas -O3

Are the two the same thing?

Moreover, one answer to this post https://devtalk.nvidia.com/default/topic/497960/is-o3-always-good-option-in-nvcc-compiling-with-nvcc-when-there-is-no-error-at-least-/#reply says to use

-Xopencc=-O3

Could you please tell me the difference, or maybe suggest some literature on it?

They are the same. See
http://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/#command-option-types-and-notation

The default optimization level for the PTXAS component of the CUDA compiler (the PTX-to-SASS compilation) is the same as -Xptxas -O3, so you don’t need to set anything. One would typically lower the optimization setting via this flag, e.g. -Xptxas -O2.
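As a sketch (file names are placeholders), lowering the device-code optimization level would look like this:

```
# ptxas already defaults to -O3, so the flag only matters when lowering
# the level, e.g. to rule out an optimization bug or speed up compilation:
nvcc -Xptxas -O2 -c kernel.cu -o kernel.obj
```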

All -Xopencc settings apply only to the Open64 component of the CUDA compiler (the C++-to-PTX compilation). Several years ago, Open64 was relegated to the role of legacy compiler for sm_1x architectures only. With CUDA 7.0, sm_1x support and the Open64 compiler were removed entirely.

The CUDA compiler now uses NVVM for C++-to-PTX compilation. NVVM is derived from LLVM, a modern compiler infrastructure. I am not aware of any -O optimization switches one can pass to NVVM; by default it applies full optimization.

To set the optimization level for the host compiler, one can use -Xcompiler. However, the Visual C/C++ compiler does not seem to have an optimization level /O3 (at least not in MSVS 2010). It does have /Ox, though, so -Xcompiler /Ox is accepted.
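For completeness, a hypothetical command line passing a host-compiler switch through nvcc (file names are placeholders):

```
# /Ox asks MSVC (the host compiler here) for full optimization of the
# host-side code; device code is unaffected by -Xcompiler flags.
nvcc -Xcompiler /Ox -c kernel.cu -o kernel.obj
```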