- Linux/Mac “common.mk”
Removed -m32 for CUBIN output.
Added support for PTX compilation of CUDA source files.
The following specifies that matrixMul_kernel.cu will be compiled to PTX:
    # Cuda source files (compiled with cudacc)
    PTXFILES := matrixMul_kernel.cu
Windows Cuda.rules file: no longer uses -m32 to generate CUBIN/PTX output. Cuda.rules adds an option to generate PTX and to inline CUDA source with the generated PTX assembly.
CUDA Driver API samples: simpleTextureDrv, matrixMulDrv, and threadMigration have been modified to reflect the following changes:
Previously, compiling these CUDA SDK samples on a 64-bit Linux OS produced a gcc compilation error if the 32-bit glibc compatibility libraries were not installed. The CUDA Driver API samples have been modified to solve this problem. Device memory pointers are now cast and aligned correctly before being passed as parameters to CUDA kernels.
When setting parameters for CUDA kernel functions, the address offset calculation is now properly aligned so that CUDA code and applications will be compatible on 32-bit and 64-bit Linux platforms.
The new CUDA Driver API samples now build CUDA kernels to PTX instead of CUBIN by default, and use PTXJIT to load and launch CUDA kernel functions.
Using PTXJIT, together with correctly cast device memory pointers and properly aligned parameters, helps ensure that CUDA programs remain compatible with future GPUs.
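
The aligned parameter offset calculation mentioned above can be sketched in plain C. This is only an illustration, not the code used in the samples: align_up and push_param are hypothetical helper names, and a host byte buffer stands in for the kernel parameter space that the Driver API of this era fills through calls such as cuParamSetv.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round offset up to the next multiple of alignment.
   alignment must be a nonzero power of two. */
static size_t align_up(size_t offset, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: copy one kernel argument into buf at a correctly
   aligned offset. Returns the offset where the argument was stored and
   advances *offset past it. The caller must ensure buf is large enough. */
static size_t push_param(unsigned char *buf, size_t *offset,
                         const void *arg, size_t size, size_t alignment)
{
    size_t at = align_up(*offset, alignment);
    memcpy(buf + at, arg, size);
    *offset = at + size;
    return at;
}
```

With a 4-byte integer pushed first and an 8-byte device pointer pushed second, the pointer lands at offset 8 rather than 4. Using the same rule on both 32-bit and 64-bit hosts keeps the parameter layout consistent with what the kernel expects.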
- Added pitchLinearTexture SDK sample. This sample illustrates how CUDA kernels can texture from pitch linear memory. CUDA kernels can write to pitch linear textures.
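
As a rough illustration of what "pitch linear" means, the host-side C sketch below computes a padded row pitch and the address of element (x, y). It is not taken from the pitchLinearTexture sample: row_pitch and pitch_element are hypothetical names, and the 256-byte alignment used in the usage note is an assumed value. In real CUDA code the pitch is normally returned by an allocator such as cudaMallocPitch rather than computed by hand.

```c
#include <assert.h>
#include <stddef.h>

/* Row pitch in bytes: the row width rounded up to pitch_alignment,
   which must be a nonzero power of two. The alignment is an assumption
   here; real code should use the pitch reported by the allocator. */
static size_t row_pitch(size_t width_bytes, size_t pitch_alignment)
{
    assert(pitch_alignment != 0 &&
           (pitch_alignment & (pitch_alignment - 1)) == 0);
    return (width_bytes + pitch_alignment - 1) & ~(pitch_alignment - 1);
}

/* Address of element (x, y) in a pitch linear array of floats.
   Rows are pitch bytes apart, so y is scaled in bytes and x in elements. */
static float *pitch_element(void *base, size_t pitch, size_t x, size_t y)
{
    return (float *)((char *)base + y * pitch) + x;
}
```

For a row of 100 floats (400 bytes) and an assumed 256-byte alignment, row_pitch returns 512, so each row carries 112 bytes of padding and element (3, 1) sits 512 + 12 bytes from the base.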