Usage : nvcc [options]
Options for specifying the compilation phase
============================================
More exactly, this option specifies up to which stage the input files must be compiled,
according to the following compilation trajectories for different input file types:
.c/.cc/.cpp/.cxx : preprocess, compile, link
.cu : preprocess, cuda frontend, ptxassemble,
merge with host C code, compile, link
.gpu : nvopencc compile into cubin
.ptx : ptxassemble into cubin.
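The trajectories above amount to a lookup from input file suffix to an ordered list of stages. A minimal Python sketch of that lookup follows. The names TRAJECTORIES and trajectory_for are hypothetical and are not part of nvcc; the stage names are copied from the list above.

```python
import os

# Stage lists copied from the trajectories above. The dict and the function
# name are hypothetical; nvcc exposes no such API.
_HOST_STAGES = ["preprocess", "compile", "link"]
TRAJECTORIES = {
    ".c": _HOST_STAGES,
    ".cc": _HOST_STAGES,
    ".cpp": _HOST_STAGES,
    ".cxx": _HOST_STAGES,
    ".cu": ["preprocess", "cuda frontend", "ptxassemble",
            "merge with host C code", "compile", "link"],
    ".gpu": ["nvopencc compile into cubin"],
    ".ptx": ["ptxassemble into cubin"],
}


def trajectory_for(filename):
    """Return the ordered list of stages for one input file."""
    suffix = os.path.splitext(filename)[1]
    if suffix not in TRAJECTORIES:
        raise ValueError("no trajectory listed for suffix %r" % suffix)
    return list(TRAJECTORIES[suffix])
```

A phase option such as --ptx or --compile then corresponds to stopping partway along the returned list.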
--cuda (-cuda)
Compile all .cu input files to .cu.c output.
--cubin (-cubin)
Compile all .cu/.ptx/.gpu input files to device-only .cubin files. This
step discards the host code for each .cu input file.
--ptx (-ptx)
Compile all .cu/.gpu input files to device-only .ptx files. This step discards
the host code for each of these input files.
--gpu (-gpu)
Compile all .cu input files to device-only .gpu files. This step discards
the host code for each .cu input file.
--preprocess (-E)
Preprocess all .c/.cc/.cpp/.cxx/.cu input files.
--generate-dependencies (-M)
Generate a dependency file, suitable for inclusion in a makefile, for the
single .c/.cc/.cpp/.cxx/.cu input file (more than one input file is not
allowed in this mode).
--compile (-c)
Compile each .c/.cc/.cpp/.cxx/.cu input file into an object file.
--link (-link)
This option specifies the default behavior: compile and link all inputs.
--lib (-lib)
Compile all inputs into object files (if necessary) and add the results to
the specified output library file.
--run (-run)
This option compiles and links all inputs into an executable, and executes
it. Or, when the input is a single executable, it is executed without any
compilation or linking. This step is intended for developers who do not want
to be bothered with setting the necessary cuda dll search paths (these will
be set temporarily by nvcc).
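As a rough illustration of the phase options in this section, the Python sketch below assembles an nvcc argument list for one chosen phase and enforces the single-input restriction stated for --generate-dependencies. The helper phase_command is hypothetical; only the long option names are taken from the text above.

```python
# Long option names listed in this section of the help text.
PHASES = (
    "cuda", "cubin", "ptx", "gpu", "preprocess",
    "generate-dependencies", "compile", "link", "lib", "run",
)


def phase_command(phase, inputs):
    """Build an argument list such as ['nvcc', '--ptx', 'kernel.cu'].

    Hypothetical helper: it only assembles the list and does not run nvcc.
    """
    if phase not in PHASES:
        raise ValueError("unknown phase option: --%s" % phase)
    if phase == "generate-dependencies" and len(inputs) != 1:
        # The help text allows exactly one input file in this mode.
        raise ValueError("--generate-dependencies takes exactly one input")
    return ["nvcc", "--" + phase, *inputs]
```

The resulting list could be handed to subprocess.call on a machine where nvcc is installed.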
File and path specifications
============================
--output-file (-o)
Specify name and location of the output file. Only a single input file is
allowed when this option is present in nvcc non-linking/archiving mode.
--pre-include <file>,... (-include)
Specify header files that must be preincluded during preprocessing.
--library <library>,... (-l)
Specify libraries to be used in the linking stage. The libraries are searched
for on the library search paths that have been specified using option '-L'.
--define-macro <macro>,... (-D)
Specify macro definitions to define for use during preprocessing or compilation.
--undefine-macro <macro>,... (-U)
Specify macro definitions to undefine for use during preprocessing or compilation.
--include-path <path>,... (-I)
Specify include search paths.
--system-include <path>,... (-isystem)
Specify system include search paths.
--library-path <path>,... (-L)
Specify library search paths.
--output-directory (-odir)
Specify the directory of the output file. This option is intended for letting
the dependency generation step (option '--generate-dependencies') generate
a rule that defines the target object file in the proper directory.
--compiler-bindir (-ccbin)
Specify the directory in which the compiler executable (Microsoft Visual
Studio cl, or a gcc derivative) resides. By default, this executable is
expected in the current executable search path.
expected in the current executable search path.
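The list-valued options in this section are marked with ',...'. The Python sketch below reads that notation as a comma-separated list, which is an assumption, and also checks the single-input rule stated for --output-file. The helpers path_args and output_args, and every file or directory name in the usage, are hypothetical.

```python
def path_args(include_paths=(), library_paths=(), libraries=(), defines=()):
    """Return nvcc arguments for the -I, -L, -l and -D options.

    Assumption: each option takes its values as one comma-separated list.
    """
    args = []
    for flag, values in (("-I", include_paths), ("-L", library_paths),
                         ("-l", libraries), ("-D", defines)):
        if values:
            args += [flag, ",".join(values)]
    return args


def output_args(output, inputs, linking):
    """Return ['-o', output], rejecting multiple inputs when not linking."""
    if not linking and len(inputs) != 1:
        # Per the help text, -o allows only a single input file in
        # non-linking/archiving mode.
        raise ValueError("-o needs exactly one input in non-linking mode")
    return ["-o", output]
```

For example, path_args(include_paths=["inc", "third_party/inc"], defines=["NDEBUG"]) yields the four arguments -I inc,third_party/inc -D NDEBUG.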
Options for specifying behaviour of compiler/linker
===================================================
--host-compilation
Specify C vs. C++ language for host compilation.
Allowed values for this option: 'C','C++','c','c++'.
Default value: 'C'.
--profile (-pg)
Instrument generated code/executable for use by gprof (Linux only).
--debug (-g)
Generate debug information for host code.
--optimize (-O)
Specify optimization level for host code.
--shared (-shared)
Generate a shared library during linking. Note: when other linker options
are required for controlling dll generation, use option -Xlinker.
--machine (-m)
Specify 32 vs 64 bit architecture.
Allowed values for this option: 32,64.
Default value: 64.
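The allowed values and defaults listed in this section can be checked before nvcc is invoked. The following Python sketch does so. The helper host_args is hypothetical, and passing each value as a separate argument after the long option name is an assumption about the command-line syntax.

```python
# Allowed values and defaults as listed in the help text above.
HOST_LANGUAGES = ("C", "C++", "c", "c++")
MACHINE_BITS = (32, 64)


def host_args(language="C", machine=64, debug=False, opt_level=None,
              shared=False):
    """Return nvcc arguments for the host compiler/linker options.

    Hypothetical helper; it validates values and builds a list only.
    """
    if language not in HOST_LANGUAGES:
        raise ValueError("--host-compilation must be one of %r" % (HOST_LANGUAGES,))
    if machine not in MACHINE_BITS:
        raise ValueError("--machine must be 32 or 64")
    args = ["--host-compilation", language, "--machine", str(machine)]
    if debug:
        args.append("--debug")
    if opt_level is not None:
        args += ["--optimize", str(opt_level)]
    if shared:
        args.append("--shared")
    return args
```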
Options for passing specific phase options
==========================================
These allow for passing options directly to the intended compilation phase. Using
these, users have the ability to pass options to the lower level compilation tools,
without the need for nvcc to know about each and every such option.
--compiler-options <options>,... (-Xcompiler)
Specify options directly to the compiler/preprocessor.
--linker-options <options>,... (-Xlinker)
Specify options directly to the linker.
--opencc-options <options>,... (-Xopencc)
Specify options directly to nvopencc.
--cudafe-options <options>,... (-Xcudafe)
Specify options directly to cudafe.
--ptxas-options <options>,... (-Xptxas)
Specify options directly to the ptx optimizing assembler.
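To illustrate the pass-through options above, the Python sketch below groups lower-level tool options by target phase and emits the matching -X flags. The helper passthrough_args is hypothetical, the comma-joined form is an assumption based on the ',...' notation, and the forwarded options in the usage (-Wall, -fPIC for gcc, -v for ptxas) are only examples.

```python
# Short option names taken from the list above, in the order listed.
PHASE_FLAGS = (
    ("compiler", "-Xcompiler"),
    ("linker", "-Xlinker"),
    ("opencc", "-Xopencc"),
    ("cudafe", "-Xcudafe"),
    ("ptxas", "-Xptxas"),
)


def passthrough_args(options_by_phase):
    """Map {'compiler': [...], 'ptxas': [...]} to nvcc -X... arguments."""
    known = {name for name, _ in PHASE_FLAGS}
    unknown = set(options_by_phase) - known
    if unknown:
        raise ValueError("unknown phase(s): %s" % ", ".join(sorted(unknown)))
    args = []
    for name, flag in PHASE_FLAGS:
        options = options_by_phase.get(name)
        if options:
            # Assumption: several options for one phase are comma-separated.
            args += [flag, ",".join(options)]
    return args
```

For example, passthrough_args({"compiler": ["-Wall", "-fPIC"], "ptxas": ["-v"]}) gives -Xcompiler -Wall,-fPIC -Xptxas -v.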
Miscellaneous options for guiding the compiler driver
=====================================================
--dont-use-profile (-noprof)
This is intended for use during the cuda build, when no profile is present
yet.
--foreign (-foreign)
This option is for test purposes only. By default, on GNU platforms gcc/g++
is assumed to be the compiler that is to be used. On pure Windows platforms,
the compiler to be used is expected to be cl. This option reverses this assumption.
--dryrun (-dryrun)
Do not execute the compilation commands generated by nvcc. Instead, list
them.
--verbose (-v)
List the compilation commands generated by this compiler driver, but do not
suppress their execution.
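The difference between --dryrun and --verbose can be sketched as a small driver loop: both list the generated commands, but only --dryrun suppresses their execution. This Python sketch is an illustration of that description, not nvcc's actual implementation. The function drive and its execute and log parameters are hypothetical.

```python
def drive(commands, execute, dryrun=False, verbose=False, log=print):
    """Walk a list of generated commands the way the help text describes.

    commands: list of argument lists; execute: callable that runs one command.
    """
    for command in commands:
        if dryrun or verbose:
            # Both modes list each generated command.
            log(" ".join(command))
        if not dryrun:
            # Only --dryrun suppresses execution.
            execute(command)
```

In practice execute could be subprocess.call; in a test it can simply record the commands it receives.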
--keep (-keep)
Keep all intermediate files that are generated during internal compilation
steps.
--save-temps (-save-temps)
This option is an alias of '--keep'.
--clean-targets (-clean)
This option reverses the behaviour of nvcc. When specified, none of the compilation
phases will be executed. Instead, all of the non-temporary files that nvcc
would otherwise create will be deleted.
--run-args <arguments>,... (-run-args)
Used in combination with option -run, to specify command line arguments for
the executable.
--input-drive-prefix (-idp)
On Windows platforms, all command line arguments that refer to file names
must be converted to Windows native format before they are passed to pure
Windows executables. This option specifies how the 'current' development
environment represents absolute paths. Use '-idp /cygwin/' for CygWin build
environments, and '-idp /' for Mingw.
--dependency-drive-prefix (-ddp)
On Windows platforms, when generating dependency files (option -M), all file
names must be converted to whatever the used instance of 'make' will recognize.
Some instances of 'make' have trouble with the colon in absolute paths in
native Windows format, which depends on the environment in which this 'make'
instance has been compiled. Use '-ddp /cygwin/' for a CygWin make, and '-ddp
/' for Mingw. Or leave these file names in native Windows format by specifying
nothing.
--drive-prefix (-dp)
Specifies the given prefix as both input-drive-prefix and dependency-drive-prefix.
Options for steering GPU code generation
========================================
--gpu-name (-arch)
Specify the name of the nVidia GPU to compile for. This can either be a 'real'
GPU, or a 'virtual' ptx architecture. Ptx code represents an intermediate
format that can still be further compiled and optimized for, depending on
the ptx version, a specific class of actual GPUs.
The architecture specified with this option is the architecture that is assumed
by the compilation chain up to the ptx stage, while the architecture(s) specified
with the -code option are assumed by the last, potentially runtime compilation
stage.
Allowed values for this option: 'compute_10','compute_11','sm_10','sm_11'.
Default value: 'sm_10'.
--gpu-code <gpu>,... (-code)
Specify the names of the nVidia GPUs to generate code for.
Unless option -export-dir is specified (see below), nvcc will embed a compiled
code image in the executable for each specified 'code' architecture, which
is a true binary load image for each 'real' architecture (such as sm_13),
and ptx code for each virtual architecture (such as compute_10). During runtime,
such embedded ptx code will be dynamically compiled by the cuda runtime system
if no binary load image is found for the 'current' GPU, and provided that
the ptx level is compatible with this current GPU.
Architectures specified for options -arch and -code may be virtual as well
as real, but the 'code' architectures must be compatible with the 'arch'
architecture.
For instance, 'arch'=compute_13 is not compatible with 'code'=sm_10, because
the earlier compilation stages will assume the availability of compute_13
features that are not present on sm_10.
This option defaults to the value of option '-arch'.
Allowed values for this option: 'compute_10','compute_11','sm_10','sm_11'.
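The relationship between -arch and -code can be sketched in Python as follows. The help text gives only one example of incompatibility ('arch'=compute_13 with 'code'=sm_10), so the rule used here, that a 'code' architecture must not have a lower number than the 'arch' architecture, is an assumption generalized from that example. The names parse_arch and resolve_codes are hypothetical.

```python
import re

_ARCH_NAME = re.compile(r"^(compute|sm)_(\d+)$")


def parse_arch(name):
    """Split e.g. 'sm_11' into ('real', 11) or 'compute_10' into ('virtual', 10)."""
    match = _ARCH_NAME.match(name)
    if match is None:
        raise ValueError("not an architecture name: %r" % name)
    kind = "virtual" if match.group(1) == "compute" else "real"
    return kind, int(match.group(2))


def resolve_codes(arch, codes=None):
    """Return the effective -code list, defaulting to the -arch value."""
    if not codes:
        codes = [arch]  # -code defaults to the value of -arch
    _, arch_level = parse_arch(arch)
    for code in codes:
        _, code_level = parse_arch(code)
        # Assumed rule: earlier stages may rely on features of 'arch',
        # so a lower-numbered 'code' target is rejected.
        if code_level < arch_level:
            raise ValueError("%s is not compatible with arch %s" % (code, arch))
    return list(codes)
```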
--export-dir (-dir)
Specify the name of a file to which all 'external' code images will be copied,
intended as a device code repository that can be inspected by the cuda driver
at application runtime when it occurs in the appropriate device code search
paths.
This file can be either a directory, or a zip file. In either case, this
tool will maintain a directory structure in order to facilitate code lookup
by the cuda driver. When this option is not used, all 'external' images will
be silently discarded. When a directory is specified, but does not currently
exist, then it will be created as a common directory (not a zip file).
--extern-mode (-ext)
Specify which of the listed images will be copied into the directory specified
with option 'export-dir'.
If this option is not specified, the behavior is as follows: if option 'intern-mode'
is specified then all listed images that are not defined as intern will be
considered extern. Otherwise, if neither of these options are specified,
then all listed images will be considered as intern. Note that it is allowed
to both embed code images and keep them extern.
Allowed values for this option: 'all','none','real','virtual'.
--intern-mode (-int)
Specify which of the listed images will be copied into the embedded fat binary
structure (option 'embedded-fatbin').
If this option is not specified, the behavior is as follows: if option 'extern-mode'
is specified then all listed images that are not defined as extern will be
considered intern. Otherwise, if neither of these options are specified,
then all listed images will be considered as intern. Note that it is allowed
to both embed code images and keep them extern.
Allowed values for this option: 'all','none','real','virtual'.
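The defaulting rules for --intern-mode and --extern-mode can be sketched in Python as follows. The sketch assumes that when only one of the two options is given, every listed image not selected by it falls into the other category, and that 'real' and 'virtual' select the sm_ and compute_ names respectively. The functions select and classify are hypothetical.

```python
def select(images, mode):
    """Pick the images named by one of 'all', 'none', 'real', 'virtual'."""
    if mode == "all":
        return set(images)
    if mode == "none":
        return set()
    if mode == "real":
        return {name for name in images if name.startswith("sm_")}
    if mode == "virtual":
        return {name for name in images if name.startswith("compute_")}
    raise ValueError("mode must be 'all', 'none', 'real' or 'virtual'")


def classify(images, intern_mode=None, extern_mode=None):
    """Return (intern, extern) sets of image names."""
    everything = set(images)
    if intern_mode is None and extern_mode is None:
        return everything, set()           # all images are intern
    if intern_mode is None:
        extern = select(images, extern_mode)
        return everything - extern, extern  # the rest is intern (assumption)
    if extern_mode is None:
        intern = select(images, intern_mode)
        return intern, everything - intern  # the rest is extern
    # Both given: an image may be embedded and kept extern at the same time.
    return select(images, intern_mode), select(images, extern_mode)
```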
--maxrregcount (-maxrregcount)
Specify the maximum number of registers that GPU functions can use. Up to
a function-specific limit, a higher value will generally increase the performance
of individual GPU threads that execute this function. However, because thread
registers are allocated from a global register pool on each GPU, a higher
value of this option will also reduce the maximum thread block size, thereby
reducing the amount of thread parallelism. Hence, a good maxrregcount value
is the result of a trade-off.
If this option is not specified, then no maximum is assumed. Otherwise the
specified value will be rounded up to the next multiple of 4 registers, up to
the GPU specific maximum of 128 registers.
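The rounding rule for maxrregcount can be written out as a short calculation. The Python sketch below assumes "next multiple of 4" means rounding up, with exact multiples left unchanged. The function name effective_maxrregcount is hypothetical.

```python
def effective_maxrregcount(requested=None, step=4, gpu_max=128):
    """Return the register limit after rounding, or None if unspecified.

    step and gpu_max are the values quoted in the help text above.
    """
    if requested is None:
        return None  # no maximum is assumed
    if requested < 1:
        raise ValueError("maxrregcount must be positive")
    rounded = -(-requested // step) * step  # ceiling to a multiple of step
    return min(rounded, gpu_max)
```

Under these assumptions a request of 33 becomes 36, a request of 32 stays 32, and anything above 128 is capped at 128.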
Options for steering cuda compilation
=====================================
--device-emulation (-deviceemu)
Generate code for the GPGPU emulation library.
--use_fast_math (-use_fast_math)
Make use of fast math library.
Generic tool options
====================
--help (-h)
Print this help information on this tool.
--version (-V)
Print version information on this tool.
--options-file <file>,... (-optf)
Include command line options from specified file.