nvcc -O0 not working (CUDA 3.2)

CUDA 3.2.

This compiles without errors but clearly doesn’t work. I still get optimized code!
/usr/local/cuda/bin/nvcc -gencode=arch=compute_20,code="sm_20,compute_20" -m32 -O0 --ptxas-options -O0 --compiler-options -fno-strict-aliasing -I. -I/usr/local/cuda/include -I…/…/common/inc -I…/…/…/shared//inc -DUNIX -o vectorAdd.cuo -c vectorAdd.cu

Thanks a lot for responding.

Reza.

Okay, it sort-of works. I guess it’s eliminating completely unused variables/counters and such.
It apparently works for me now.

Thanks,
Reza.

Actually, I’m very sorry, my question still stands:

-O0 isn’t working. Why?

Reza.

I believe Nvidia made the compiler run many optimization passes unconditionally since, for example, compute capability 1.x devices have to rely on inlining for functions to work at all.

I think -O applies to host code and (I think) it does not apply to GPU kernel code.

Bill

Yes, I suspect you’re right. I think I strace’d my nvcc and it’s just passing that to gcc. Presumably that’s all it’s doing(?)

Thanks.

Oops, yes. You need to pass --opencc-options=-O0 to nvcc. Somehow I missed you were doing that for ptxas and for the host compiler, but not for nvopencc.
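
For reference, here is a rough sketch of what the command might look like with -O0 requested at all three stages (host compiler, nvopencc, ptxas). The NVCC variable and the nvcc_o0_cmd helper are made up for illustration, the include paths and -D flags from the original command are left out for brevity, and the flags are only the ones mentioned in this thread. It applies to toolchains that still use nvopencc.

```shell
#!/bin/sh
# Sketch only: assemble an nvcc command that asks for -O0 at every stage.
# NVCC and nvcc_o0_cmd are illustrative names, not part of the CUDA toolkit.

NVCC="${NVCC:-/usr/local/cuda/bin/nvcc}"

# Print the command line for compiling $1 into object file $2.
nvcc_o0_cmd() {
    # -O0                    : passed on to the host compiler
    # --opencc-options=-O0   : front end (nvopencc), CUDA C -> PTX
    # --ptxas-options -O0    : back end (ptxas), PTX -> machine code
    printf '%s\n' "$NVCC -gencode=arch=compute_20,code=sm_20 -m32 -O0 --opencc-options=-O0 --ptxas-options -O0 -c $1 -o $2"
}

CMD="$(nvcc_o0_cmd vectorAdd.cu vectorAdd.cuo)"
echo "$CMD"

# Run it only if nvcc and the source file are actually present.
if [ -x "$NVCC" ] && [ -f vectorAdd.cu ]; then
    $CMD
fi
```

Printing the command first makes it easy to check that the option really reaches each stage before blaming the compiler.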

Oh thanks! nvcc.doc probably should make that more clear.

I appreciate it.

What does the number 0 following -O mean? Does it disable optimization? I didn’t find any information about this in the NVCC manual.

Heh… I would definitely believe so because that’s the tradition in gcc, but a quick glance at the nvcc manual didn’t turn up anything for me either.

You might try it and then look at the assembly code like this to see if it does what you want:

cuobjdump -sass test > machine-code.txt

(where test is an executable CUDA program.)

Do check out the other cuobjdump options by running ‘cuobjdump --help’.
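
To see whether a flag changed anything, one option is to dump the SASS of two builds and compare them. This is only a sketch: test_O0 and test_default are hypothetical executables built with and without the -O0 options, and sass_differs is a helper name invented here. The dumps may also contain header lines that differ for reasons unrelated to optimization, so treat the result as a hint and read the diff yourself.

```shell
#!/bin/sh
# Sketch only: compare the machine code (SASS) of two builds.
# test_O0 and test_default are hypothetical executable names.

# Succeeds (exit status 0) when the two dump files differ.
sass_differs() {
    ! cmp -s "$1" "$2"
}

if command -v cuobjdump >/dev/null 2>&1 && [ -f test_O0 ] && [ -f test_default ]; then
    cuobjdump -sass test_O0 > sass_O0.txt
    cuobjdump -sass test_default > sass_default.txt
    if sass_differs sass_O0.txt sass_default.txt; then
        echo "SASS differs; inspect with: diff sass_O0.txt sass_default.txt"
    else
        echo "SASS is identical; the flags made no difference"
    fi
else
    echo "cuobjdump or the test executables were not found; nothing to compare"
fi
```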