32-bit and 64-bit compiled codes

I have a question about a performance comparison of 32-bit and 64-bit compiled code running on a Tesla V100 32GB GPU.

I first compiled the CUDA 10.0 samples in the Win32 configuration and ran the apps on a Tesla V100 32GB GPU. Then I compiled the same samples in the x64 configuration and ran them on the same GPU.

The performance results disappointed me: the apps compiled in the Win32 configuration run faster than the apps compiled in the x64 configuration. I attached the results of the matrixMul app to this post.

The development environment summary is:
Microsoft Windows 10 Pro (x64) Build 18362.175, MS Visual Studio 2012, CUDA Version 10.0.130, CUDA Driver Version 412.29
Hardware Summary :
Intel Xeon Silver 4114 CPU @ 2.20GHz, 32 GB RAM, Tesla V100 32GB GPU


32-bit app development isn't supported any more in CUDA. (Yes, it is still supported in a limited way with VS2012, but not with any newer toolchain than that.) It may still work in some cases, but there are plenty of use cases that won't work, such as the use of CUDA libraries like cuBLAS.

https://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html#x86-32-bit-support

That has often been the case historically when comparing 32-bit and 64-bit builds of the same app. An expectation of identical performance is unrealistic and has generally never held. To pick one example of a possible difference: 64-bit apps must use 64-bit pointers, which generally causes at least some reduction in performance.

Any sort of serious CUDA development work today needs to acknowledge that 64-bit app development is the only sensible path. There is no use in looking back at 32-bit app development, in spite of whatever benefits there may have been.

There were certainly some downsides to 32-bit app development. For example, it would have been impossible to use more than 4 GB of the 32 GB of memory on that V100 card.