64-bit device code

Hi all

What exactly is the point of using 64-bit pointers on the device? I mean, with 32 bits you can address up to 4 GB of memory, which is the maximum you can find on a card.

Am I missing something, or is 32-bit code sufficient?

Thanks for your answers.

There are, or will be very soon, 6 GB Fermi-based Tesla cards coming, and the GPU has the ability to map host memory, which could potentially be an aperture larger than 4 GB.

This means that if I do not want to use more than 4 GB of memory or access host memory, using 64-bit pointers will just waste registers. Thanks.

It is worth pointing out that if you are using a 64-bit host operating system, existing versions of CUDA use 64-bit values for pointers anyway.
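
(A quick way to see this for yourself, assuming a 64-bit host toolchain: the trivial program below, with an illustrative buffer size, reports 8-byte pointers, since device pointers are carried in the host's native pointer width.)

#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    float *d_buf = 0;
    // cudaMalloc stores the device address in an ordinary host pointer,
    // so on a 64-bit build it occupies 8 bytes.
    cudaMalloc((void **)&d_buf, 1024 * sizeof(float));
    printf("sizeof(device pointer) = %u bytes\n", (unsigned)sizeof(d_buf));
    cudaFree(d_buf);
    return 0;
}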

In this case, I’m screwed…

Many thanks anyway for your help.

So don’t use pointers if you have code that needs to build trees or graphs or whatever. Use array indexing and store the indices in a 32-bit type instead.
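
(For example, a node layout along these lines keeps each link at 4 bytes regardless of how the module is compiled; the struct and traversal below are just an illustrative sketch, not code from this thread.)

#define INVALID_INDEX 0xffffffffu   // sentinel meaning "no child"

// Children are 32-bit indices into a flat array of nodes instead of
// 64-bit device pointers, so each link costs 4 bytes either way.
struct TreeNode {
    float value;
    unsigned int left;    // index of left child, or INVALID_INDEX
    unsigned int right;   // index of right child, or INVALID_INDEX
};

// Device-side traversal follows indices rather than dereferencing pointers.
__device__ float find_min(const TreeNode *nodes, unsigned int root)
{
    unsigned int i = root;
    while (nodes[i].left != INVALID_INDEX)
        i = nodes[i].left;
    return nodes[i].value;
}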

Alternatively, use the driver API and compile device code with -m32.
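
(A rough sketch of that route: build a 32-bit cubin separately and load it through the driver API. The file and kernel names below are placeholders, and error checking is omitted for brevity.)

// Build the device code as a 32-bit cubin, e.g.:
//   nvcc -m32 -arch=sm_13 -cubin kernel.cu -o kernel.cubin
#include <cuda.h>
#include <cstdio>

int main()
{
    CUdevice dev;
    CUcontext ctx;
    CUmodule mod;
    CUfunction fn;

    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);

    cuModuleLoad(&mod, "kernel.cubin");        // placeholder file name
    cuModuleGetFunction(&fn, mod, "myKernel"); // placeholder kernel name

    printf("module loaded\n");

    cuCtxDestroy(ctx);
    return 0;
}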

Loading 32-bit kernels in a 64-bit host app does not work with new applications in CUDA 3.2 (for various reasons that should be pretty obvious once it’s out). You shouldn’t be doing that now.

Ok. Thanks.

I’m slightly concerned about this. I have an application with a hand-coded sm_13 cubin targeting G90/G92/GT200. These are all fundamentally 32-bit GPUs, right? They are not, for instance, suddenly going to be able to address more than 4 GB of system memory with CUDA 3.2, are they? Device pointers returned from cudaMalloc() will still be in the 0-4 GB range, won’t they? Given that none of the kernels take pointers as parameters (they take pointers, but cast as 32-bit integers), how will CUDA 3.2 actually know that my cubin is 32-bit when I try to load it from a 64-bit application? I’m not trying to be awkward, but I really can’t afford the expense, in terms of registers, constant memory and instructions, of doing a full 64-bit treatment of pointers on 32-bit GPUs.
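
(For context, the pattern looks roughly like this; the kernel below is an illustrative sketch rather than the actual code, and it only works while device addresses fit in 32 bits.)

// Sketch: the kernel parameter is a plain 32-bit value holding a device
// address, and the kernel reinterprets it as a pointer.
__global__ void scale(unsigned int buf_addr, float factor, int n)
{
    float *buf = (float *)(size_t)buf_addr;   // widen, then reinterpret
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        buf[i] *= factor;
}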

From what I remember of the CUDA Toolkit 3.2 Readiness TechBrief that was recently posted on the developer site, a compatibility layer allows you to keep using 32-bit kernels on 64-bit architectures. The new API features from SDK versions 3.2 and later will not be added to this layer, which means that if you insist on using 32-bit kernels in 64-bit apps, you won’t be getting all the goodies from CUDA SDK 3.2 and above.

Old 64-bit apps using 32-bit kernels will continue to run.

Just out of curiosity: what field of science or technology requires hand-coded cubins? ;) I usually beat nvcc with a stick until it gives me the low register count I need.