I compiled sample.cu to sample.ptx and then edited sample.ptx by hand. How can I generate an executable from the modified sample.ptx and sample.cu?
You can use nvcc to compile PTX input directly to a cubin file, which you can then load with the CUDA Driver API, or to generate a fatbinary file.
Thank you for the reply. Could you please explain a bit more? How can I generate output for a fat binary file?
As stated, one option is to use the CUDA Driver API to explicitly load PTX sources as modules. This presents some challenges as you might not already be familiar with the CUDA Driver API, or your existing host application may need to be rewritten.
An alternative is to use GPU Ocelot as the CUDA Runtime API implementation instead of libcudart. Ocelot includes additional API functions to register PTX on the fly and execute kernels. See the following example (from a source file in Ocelot’s unit test suite):
    ...
    //
    // Ocelot API function:
    //   register a PTX module explicitly and bind to string 'indirectCall.cu'
    //
    std::ifstream file("ocelot/cuda/test/functions/indirectCall.ptx");
    registerPTXModule(file, "indirectCall.cu");

    //
    // configure call using existing CUDA Runtime API functions
    //
    dim3 block(32, 1);
    dim3 grid((N + block.x - 1) / block.x, 1);
    cudaConfigureCall(grid, block);
    cudaSetupArgument(&A_gpu, sizeof(A_gpu), 0);
    cudaSetupArgument(&P, sizeof(P), sizeof(A_gpu));

    //
    // Ocelot API function:
    //   launch kernel referenced by module name and kernel name
    //
    launch("indirectCall.cu", "kernelEntry");
    ...