Do the global memory caching compiler options work on Windows? I tried ca and cg and see no difference in speed. The PTX code is also the same. I use the runtime API. Has anybody seen any difference with this option? Does it change the PTX code by adding cache modifiers to the instructions?
I think this was mentioned in another forum thread. When you use those options, the cache usage modification is applied by ptxas. You won’t see the modified instructions when you look at the PTX output of nvcc because ptxas has not run yet. Compile a .cubin and use cuobjdump to see what instructions were actually generated after ptxas ran.
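A minimal sketch of that check, assuming a hypothetical file copy.cu and a Fermi (sm_20) target, which matches the CUDA 3.x toolkits discussed here but has been dropped from newer ones. The kernel and file names are made up for illustration; the build commands are in the comments.

```cuda
// copy.cu: hypothetical test kernel with one global load and one global store.
//
// Build one Fermi cubin per cache setting, then dump the machine code (SASS):
//   nvcc -arch=sm_20 -Xptxas -dlcm=ca -cubin -o copy_ca.cubin copy.cu
//   nvcc -arch=sm_20 -Xptxas -dlcm=cg -cubin -o copy_cg.cubin copy.cu
//   cuobjdump -sass copy_ca.cubin
//   cuobjdump -sass copy_cg.cubin
//
// Compare the global load instruction in the two dumps; the PTX written by
// nvcc --keep is produced before ptxas runs and should be identical.
__global__ void copyKernel(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = in[i];   // global load from in[], global store to out[]
}
```

If the two SASS dumps show the same load instruction, the option did not reach ptxas for that target.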
Thanks! I was confused because I saw no difference in speed. By the way, if I target 1.2 but pass this option, will the option be applied when the code runs on a Fermi card?
No. In order to apply this option to the code, it has to be run through ptxas. And if you run it through ptxas with compute capability 1.2 as target, it will not run on Fermi at all.
I put 1.2 in the compiler options and it runs on Fermi. It generates code for 1.2, but it runs on Fermi.
Hm, can I use this option with the runtime API?
It runs because it uses the PTX representation of the code, which is not influenced by the ptxas options. ptxas-compiled code for compute capability 1.x will not run on Fermi.
It will; it gets recompiled, etc. My program contains several versions of the PTX code for different architectures, but I prefer to run the 1.2 version on Fermi.
Here are my options:
"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v3.2\bin\nvcc.exe" --keep -Xptxas -dlcm=ca -gencode=arch=compute_11,code="sm_11,compute_11" --machine 32 -ccbin "C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin" -use_fast_math -Xcompiler "/EHsc /W3 /nologo /Ox /Zi /MT" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v3.2\include" -maxrregcount=32 --ptxas-options=-v --compile
If I put cg instead, will it have any effect?
Sorry, can’t help you with that one. Maybe someone else can explain that code=sm_11,compute_11 means that both the PTX code for compute capability 1.1 (which has not run through ptxas and thus is not influenced by -Xptxas -dlcm=ca) and the cubin binary (which has run through ptxas and thus uses the cache operator specified by -dlcm=…, but does not run on Fermi) will be included in the binary file, but neither will help in getting the right cache operator on Fermi.
By the way, isn’t -Xptxas -dlcm=ca the default anyway?
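One way to see which of the embedded images actually runs is to ask the runtime, as in the rough sketch below. The file name, kernel, and build line are assumptions for illustration, and the extra -gencode entry for sm_20 is only a guess at how a Fermi cubin (compiled by ptxas with the -dlcm setting) could be embedded next to the 1.1 code.

```cuda
// which_binary.cu: hypothetical check of the code image the runtime selected.
//
// Assumed build line (one line) embedding a 1.1 cubin, 1.1 PTX, and a Fermi cubin:
//   nvcc -Xptxas -dlcm=cg -gencode arch=compute_11,code=sm_11 -gencode arch=compute_11,code=compute_11 -gencode arch=compute_20,code=sm_20 -o which_binary which_binary.cu
#include <cstdio>
#include <cuda_runtime.h>

__global__ void copyKernel(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = in[i];
}

int main()
{
    // Query the attributes of the kernel image loaded for the current device.
    cudaFuncAttributes attr;
    cudaError_t err = cudaFuncGetAttributes(&attr, copyKernel);
    if (err != cudaSuccess) {
        std::printf("cudaFuncGetAttributes failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    // Versions are encoded as major * 10 + minor (for example 11 or 20).
    std::printf("ptxVersion=%d binaryVersion=%d\n",
                attr.ptxVersion, attr.binaryVersion);
    return 0;
}
```

As far as I understand it, a ptxVersion of 11 together with a binaryVersion of 20 on a Fermi card would suggest the kernel was JIT-compiled by the driver from the 1.1 PTX, in which case the -Xptxas setting given to nvcc was not involved.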
Yes, it is the default; I am trying to switch to another one. These options are from Visual Studio. I put the CUDA options in a panel. Probably one more NVIDIA issue. In Visual Studio, you select which GPU to target. It creates a few different variants of PTX for different architectures.