Options to optimize device code?

I have been looking through all of the compiler options for nvcc and I can't find any for optimizing the code that executes on the GPU. I know there is an option for the host code, specified with -Ox, but is there something similar for the device code? Also, when I use gcc there are tons of other options I can use as well, such as -funroll-loops. Are there options like this for CUDA?

The ptxas options section of the nvcc documentation may be of interest:

http://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html#ptxas-options
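
To make that a bit more concrete, here is a minimal sketch. The file name, kernel name and sizes are made up for illustration, and the flag combinations in the comments are only examples, so check them against the nvcc documentation for your toolkit version. As far as I know, device code is optimized by default, and the closest thing to -funroll-loops on the device side is a per-loop "#pragma unroll" in the kernel itself.

```cuda
// weighted_sum.cu : hypothetical file name, used only for this illustration.
//
// Example command lines (pick what applies to your build):
//   nvcc -Xptxas -O3,-v -o weighted_sum weighted_sum.cu
//       -Xptxas forwards options to ptxas; -v prints register and memory usage.
//   nvcc -Xcompiler -funroll-loops ...   forwards a flag to the host compiler (host code only)
//   nvcc --use_fast_math ...             faster, less accurate device float math
//   nvcc -maxrregcount=32 ...            caps registers per thread
//   nvcc -G ...                          device debug build; turns device optimization off
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

constexpr int kTaps = 4;  // compile-time trip count, so the loop can be fully unrolled

__global__ void weighted_sum(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float acc = 0.0f;
    // Per-loop unrolling request for device code.
    #pragma unroll
    for (int k = 0; k < kTaps; ++k) {
        acc += in[i] * (k + 1);  // weights 1 + 2 + 3 + 4 = 10
    }
    out[i] = acc;
}

// Runs the kernel on n elements and returns the number of wrong results,
// or -1 if a CUDA call fails.
int count_mismatches(int n) {
    std::vector<float> h_in(n), h_out(n, 0.0f);
    for (int i = 0; i < n; ++i) h_in[i] = static_cast<float>(i);

    float* d_in = nullptr;
    float* d_out = nullptr;
    size_t bytes = n * sizeof(float);
    if (cudaMalloc(&d_in, bytes) != cudaSuccess) return -1;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) { cudaFree(d_in); return -1; }

    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);
    int threads = 128;
    int blocks = (n + threads - 1) / threads;
    weighted_sum<<<blocks, threads>>>(d_in, d_out, n);
    cudaError_t launch_err = cudaGetLastError();
    cudaError_t copy_err = cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (launch_err != cudaSuccess || copy_err != cudaSuccess) return -1;

    int bad = 0;
    for (int i = 0; i < n; ++i) {
        if (h_out[i] != 10.0f * h_in[i]) ++bad;  // small integers, so exact in float
    }
    return bad;
}

int main() {
    int bad = count_mismatches(256);
    std::printf("mismatches: %d\n", bad);
    return bad == 0 ? 0 : 1;
}
```

Building once with -Xptxas -v and comparing the reported register counts with and without the pragma is a quick way to see whether the unrolling request changed anything for a given kernel.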

Also, how do I add options to nvcc when using Nsight Eclipse? I tried to modify the command option under Project->Properties->Build->Settings->Tool Settings->NVCC Compiler. I changed it from "nvcc" to "nvcc --someoption". However, when it compiles I see this output:

/usr/local/cuda-7.0/bin/nvcc -O3 -ccbin gcc-4.9 -std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_52,code=sm_52 -odir "." -M -o "binomial.d" "…/binomial.cu"

Notice that --someoption is not in it. How can I add options in Eclipse?