64-bit device code

Hi all

What exactly is the point of using 64-bit pointers on the device? I mean, with 32 bits you can address up to 4 GB of memory, which is the maximum you can find on a card.

Am I missing something, or is 32-bit code sufficient?

Thanks for your answers.

There are, or will be very soon, 6 GB Fermi-based Tesla cards coming, and the GPU has the ability to map host memory, which could potentially be an aperture larger than 4 GB.

This means that if I do not want to use more than 4 GB of memory or access host memory, using 64-bit pointers will just waste registers. Thanks.

It is worth pointing out that if you are using a 64-bit host operating system, existing versions of CUDA use 64-bit values for pointers anyway.
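
(A quick way to see this for yourself, assuming a 64-bit host toolchain: the trivial program below, with an illustrative buffer size, reports 8-byte pointers, since device pointers are carried in the host's native pointer width.)

#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    float *d_buf = 0;
    // cudaMalloc stores the device address in an ordinary host pointer,
    // so on a 64-bit build it occupies 8 bytes.
    cudaMalloc((void **)&d_buf, 1024 * sizeof(float));
    printf("sizeof(device pointer) = %u bytes\n", (unsigned)sizeof(d_buf));
    cudaFree(d_buf);
    return 0;
}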

In this case, I’m screwed…

Many thanks anyway for your help.

So don’t use pointers if you have code that needs to build trees or graphs or whatever. Use array indexing and store the indices in a 32-bit type instead.
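
(For example, a node layout along these lines keeps each link at 4 bytes regardless of how the module is compiled; the struct and traversal below are just an illustrative sketch, not code from this thread.)

#define INVALID_INDEX 0xffffffffu   // sentinel meaning "no child"

// Children are 32-bit indices into a flat array of nodes instead of
// 64-bit device pointers, so each link costs 4 bytes either way.
struct TreeNode {
    float value;
    unsigned int left;    // index of left child, or INVALID_INDEX
    unsigned int right;   // index of right child, or INVALID_INDEX
};

// Device-side traversal follows indices rather than dereferencing pointers.
__device__ float find_min(const TreeNode *nodes, unsigned int root)
{
    unsigned int i = root;
    while (nodes[i].left != INVALID_INDEX)
        i = nodes[i].left;
    return nodes[i].value;
}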

Alternatively, use the driver API and compile device code with -m32.
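
(A rough sketch of that route: build a 32-bit cubin separately and load it through the driver API. The file and kernel names below are placeholders, and error checking is omitted for brevity.)

// Build the device code as a 32-bit cubin, e.g.:
//   nvcc -m32 -arch=sm_13 -cubin kernel.cu -o kernel.cubin
#include <cuda.h>
#include <cstdio>

int main()
{
    CUdevice dev;
    CUcontext ctx;
    CUmodule mod;
    CUfunction fn;

    cuInit(0);
    cuDeviceGet(&dev, 0);
    cuCtxCreate(&ctx, 0, dev);

    cuModuleLoad(&mod, "kernel.cubin");        // placeholder file name
    cuModuleGetFunction(&fn, mod, "myKernel"); // placeholder kernel name

    printf("module loaded\n");

    cuCtxDestroy(ctx);
    return 0;
}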

Loading 32-bit kernels in a 64-bit host app does not work with new applications in CUDA 3.2 (for various reasons that should be pretty obvious once it’s out). You shouldn’t be doing that now.

Ok. Thanks.

I’m slightly concerned about this. I have an application with a hand-coded sm_13 cubin targeting G90/G92/GT200. These are all fundamentally 32-bit GPUs, right? They are not, for instance, suddenly going to be able to address more than 4 GB of system memory with CUDA 3.2, are they? Device pointers returned from cudaMalloc() will still be in the 0-4 GB range, won’t they? Given that none of the kernels take pointers as parameters (they take pointers, but cast as 32-bit integers), how will CUDA 3.2 actually know that my cubin is 32-bit when I try to load it from a 64-bit application? I’m not trying to be awkward, but I really can’t afford the expense, in terms of registers, constant memory and instructions, of doing a full 64-bit treatment of pointers on 32-bit GPUs.
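
(For context, the pattern looks roughly like this; the kernel below is an illustrative sketch rather than the actual code, and it only works while device addresses fit in 32 bits.)

// Sketch: the kernel parameter is a plain 32-bit value holding a device
// address, and the kernel reinterprets it as a pointer.
__global__ void scale(unsigned int buf_addr, float factor, int n)
{
    float *buf = (float *)(size_t)buf_addr;   // widen, then reinterpret
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        buf[i] *= factor;
}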

From what I remember of the CUDA Toolkit 3.2 Readiness TechBrief that was recently posted on the developer site, a compatibility layer allows you to keep using 32-bit kernels on 64-bit architectures. The new API features from SDK versions 3.2 and later will not be added to this layer, which means that if you insist on using 32-bit kernels in 64-bit apps, you won’t be getting all the goodies from CUDA SDK 3.2 and above.

Old 64-bit apps using 32-bit kernels will continue to run.

Just out of curiosity: what field of science or technology requires hand-coded cubins? ;) I usually beat nvcc with a stick until it gives me the low register count I need.