Changing from 64-bit to 32-bit addressing mode Is it possible to force my code to use 32-bit address

kleboeuf · October 7, 2011, 8:28pm

Hello,

I was wondering if it is possible to compile my host code to pass 32-bit addresses to my GPU code. nvcc has an option to compile for a 32-bit machine, but that causes all my host code to compile for a 32-bit architecture, instead of simply using a 32-bit address space, which does not solve my problem.

Ultimately I could install a 32-bit OS, but I figured there should be an easier way.

The reason I am interested is that I’m going to end up using several more registers than necessary for my address pointers, even though I will never come close to addressing 4 GB of memory. Also, although not a major concern, the 64-bit addresses are probably going to cost a little more in terms of address arithmetic.

Any thoughts?

njuffa · October 8, 2011, 1:18am

To allow tight integration of host and device code and the portability of data types, CUDA makes the sizes of device data types equal to the corresponding host data types. This affects the sizes of pointers, “long”, and size_t in particular. When you build for a 64-bit platform, all pointers will be 64-bit pointers, on the host and on the device.

The compiler may be able to optimize out some of the 64-bit operations (and free the corresponding registers) in the device code. This mostly helps on sm_1x platforms where the GPU’s device memory is known not to exceed 4 GB. For sm_2x platforms, GPUs with more than 4 GB of device memory exist, and thus there is comparatively little the compiler can do in the way of optimizing the pointer operations.

You are correct that, due to the fundamentally 32-bit nature of the architecture, the use of 64-bit address arithmetic costs additional registers and additional instructions. In my experience the increased register usage typically has a larger impact on performance than the increased dynamic instruction count (on sm_2x relatively few applications are strictly bound by instruction throughput), although it will depend on the specifics of the application.
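To make the cost concrete, here is a minimal host-side C++ sketch of what 64-bit address arithmetic looks like when only 32-bit operations are available. It is an illustration under stated assumptions, not what the compiler actually emits: the `Addr64` type and the `add_offset`, `to_u64`, and `pointer_registers` helpers are hypothetical names, and the register estimate assumes that every pointer stays live and that nothing is optimized away.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// A 64-bit address held as two 32-bit halves, roughly the way a 32-bit
// datapath keeps it in a register pair (hypothetical type for illustration).
struct Addr64 {
    std::uint32_t lo;
    std::uint32_t hi;
};

// Add a 32-bit byte offset to a 64-bit address using only 32-bit operations:
// one add for the low half, then an add-with-carry for the high half.
inline Addr64 add_offset(Addr64 base, std::uint32_t offset) {
    Addr64 r;
    r.lo = base.lo + offset;                          // wraps modulo 2^32
    std::uint32_t carry = (r.lo < offset) ? 1u : 0u;  // wraparound means carry out
    r.hi = base.hi + carry;
    return r;
}

// Reassemble the two halves so the result can be checked against native math.
inline std::uint64_t to_u64(Addr64 a) {
    return (static_cast<std::uint64_t>(a.hi) << 32) | a.lo;
}

// Rough count of 32-bit registers needed to keep `num_pointers` pointers live,
// assuming the compiler optimizes none of them away.
inline std::size_t pointer_registers(std::size_t num_pointers,
                                     std::size_t pointer_bytes) {
    assert(pointer_bytes % 4 == 0);
    return num_pointers * (pointer_bytes / 4);
}
```

Under these assumptions, a kernel that keeps six pointers live needs about `pointer_registers(6, 8)`, i.e. 12 registers, with 64-bit pointers versus 6 with 32-bit pointers, and every pointer increment takes two dependent operations instead of one.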

kleboeuf · October 8, 2011, 2:43pm

Alright, looks like I will be compiling for a 32-bit architecture for now to keep the less expensive 32-bit addressing.

It would be nice if in a future version we’re allowed some nvcc option to disable the 64-bit unified addressing (I’m sure the to-do list isn’t long enough as it is…)

Also, it would be nice if there were a note in the API reference indicating that the data type behind CUdeviceptr is either unsigned int or unsigned long long, depending on which architecture you are compiling for. As it stands, it makes no mention of the 64-bit CUdeviceptr.
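For illustration, the kind of width-dependent type described above can be sketched in plain C++ as follows. This is not the definition from the CUDA headers: `DevicePtrSketch` and `device_ptr_bits` are hypothetical names, and the sketch assumes that the handle width simply follows the host pointer width.

```cpp
#include <climits>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for a device-pointer handle whose width tracks the
// host pointer width. NOT the actual definition from cuda.h.
using DevicePtrSketch =
    std::conditional<sizeof(void*) == 8, unsigned long long, unsigned int>::type;

// Fails to compile if the chosen integer type does not match the host pointer.
static_assert(sizeof(DevicePtrSketch) == sizeof(void*),
              "handle must be as wide as a host pointer");

// Width of the handle in bits for the current build target.
inline std::size_t device_ptr_bits() {
    return sizeof(DevicePtrSketch) * CHAR_BIT;
}
```

Code that stores such a handle in a plain `unsigned int` would silently truncate addresses in a 64-bit build, which is why a note in the reference would help.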

Anyway, thank you for your reply.
