The sample program 6_Advanced/ptxjit doesn’t capture the wall time for a link. The relevant part of the code is:
CUjit_option options[6];
void *optionVals[6];
float walltime = 0; /* I added this initial value myself */
char error_log[8192], info_log[8192];
unsigned int logSize = 8192;
void *cuOut;
size_t outSize;
int myErr = 0;
std::string module_path, ptx_source;
// Setup linker options
// Return walltime from JIT compilation
options[0] = CU_JIT_WALL_TIME;
optionVals[0] = (void *)&walltime;
...
checkCudaErrors(cuLinkComplete(*lState, &cuOut, &outSize));
printf("CUDA Link Completed in %fms. Linker Output:\n%s\n", walltime,
info_log);
After executing this code, the value of walltime is unchanged.
What actually happens…
I looked at the optionVals[0] with a debugger.
Each of the three cuLink calls (create, add, complete) writes a floating-point value into the lower 32 bits of optionVals[0], regardless of what that slot held before the call. I also added several more AddFile calls for dummy PTX files, and each of these left a value in optionVals[0] slightly greater than the one before.
The documentation for CU_JIT_WALL_TIME is wrong. It should note that the time is written separately for each cuLink call, and that only the low 32 bits of the option value are overwritten (the high 32 bits on a 64-bit host are ignored). It should also state (and I don’t know the answer here) whether each time covers an individual step or is cumulative for the entire link. If it is cumulative, it should state that the total does not include any time spent between steps, only the steps themselves.
The sample code is wrong. Line 3 and line 14 are irrelevant. Line 17 ‘walltime’ should read something like ‘*reinterpret_cast<float *>(&optionVals[0])’.
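For anyone who wants to try the read-back without relying on pointer casts, here is a minimal sketch of the same idea using memcpy. It is not taken from the sample: floatFromOptionSlot and fakeDriverWrite are hypothetical names of mine, and fakeDriverWrite only imitates the behaviour I saw in the debugger (a float stored into the low 32 bits of the slot). It also assumes a little-endian host, where the low 32 bits are the first four bytes of the pointer-sized slot.

```cpp
#include <cstring>

// Hypothetical helper (not part of the CUDA sample): reads a float out of the
// first four bytes of a pointer-sized option slot. On a little-endian host
// those bytes are the low 32 bits of the slot. memcpy is used instead of a
// pointer cast to avoid aliasing problems.
static float floatFromOptionSlot(void *const &slot) {
    static_assert(sizeof(float) <= sizeof(void *),
                  "option slot must be at least as wide as a float");
    float value;
    std::memcpy(&value, &slot, sizeof value);
    return value;
}

// Stand-in for the observed driver behaviour (an assumption, not a real API):
// overwrites only the first four bytes of the slot with a float and leaves
// the remaining bytes as they were.
static void fakeDriverWrite(void **slot, float milliseconds) {
    std::memcpy(slot, &milliseconds, sizeof milliseconds);
}
```

In the sample, the printf argument would then be floatFromOptionSlot(optionVals[0]) instead of walltime, read after whichever cuLink call you want the time for.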