32-bit CUDA WinXP app on WinXP 64-bit Deployment considerations!

Sarnath · November 18, 2008, 9:45am

Guys,

I have a library written on top of CUDA 2.0 on a WinXP 32-bit platform.

I want to deploy this library on a WinXP 64-bit platform with the same CUDA and VS versions. (Keeping the VS version the same rules out any CRT-related issues.)

But I am wondering whether this will work seamlessly, or whether I will hit any CUDA-related issues. I am not sure if the CUDA libraries are linked statically or dynamically. I think they are loaded dynamically… In that case, will a 64-bit CUDA library work fine with my 32-bit DLL?

Can someone help?

Thanks

Best Regards,
Sarnath

Larry35 · November 18, 2008, 10:00am

Hi,

If you get any answers, I’ll be very interested :)

I have a CUDA app compiled on a 32-bit WinXP platform. A user told me that this application runs on a 64-bit WinXP machine, but much more slowly (with roughly the same GPU).

Are there any issues with using a 32-bit CUDA program with the 64-bit version of the display drivers?

Larry

Sarnath · November 18, 2008, 10:05am

This is what I found on the net: “64-bit” can mean two things on Intel:

Intel Xeon 64-bit: can run native 32-bit apps as quickly as any fast CPU.
Intel Itanium: runs 32-bit applications using a hardware-software combo emulator at 400 MHz speed.

I am not interested in Itanium at all…

I am interested to know what kind of dynamic linking problems can be expected at run time, especially CUDA-related ones.

Since my app is linked with the 32-bit CUDA DLL, will it work fine on a 64-bit machine with “CUDA on 64-bit”?
OR
Is it possible to install “CUDA for Windows XP” on a Windows XP 64-bit box? I understand that these 64-bit boxes NEED 64-bit drivers. So, is it possible to install the 64-bit driver for WinXP 64-bit and install the toolkit and SDK from normal WinXP?

Any help is greatly appreciated.

Best Regards,
Sarnath

tmurray · November 19, 2008, 4:56am

Driver API: CUDA will just work seamlessly, no problem.
Runtime API: for now, redistribute cudart.dll and place it in the same directory as the executable. We are making improvements to this.
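A quick sanity check on the target machine is to confirm the bitness of the running process and whether the driver's user-mode library loads into it. The sketch below uses Python with `ctypes` purely for brevity (the code discussed in this thread is C++). The library names `nvcuda.dll` (Windows) and `libcuda.so.1` (Linux) are assumptions about the installed driver, and `process_bits`, `load_cuda_driver` and `driver_status` are hypothetical helper names.

```python
import ctypes
import struct
import sys


def process_bits():
    """Pointer width of the running process: 32 or 64."""
    return struct.calcsize("P") * 8


def load_cuda_driver():
    """Return a handle to the CUDA driver library, or None if it cannot be loaded.

    Library names are assumptions: nvcuda.dll (Windows), libcuda.so.1 (Linux).
    """
    if sys.platform == "win32":
        # The driver API uses the __stdcall convention on Windows.
        loader, name = ctypes.WinDLL, "nvcuda.dll"
    else:
        loader, name = ctypes.CDLL, "libcuda.so.1"
    try:
        return loader(name)
    except OSError:
        return None


def driver_status():
    """Describe what this process found, without raising."""
    bits = process_bits()
    lib = load_cuda_driver()
    if lib is None:
        return "%d-bit process: CUDA driver library not found" % bits
    rc = lib.cuInit(0)  # CUresult cuInit(unsigned int flags); 0 is CUDA_SUCCESS
    return "%d-bit process: cuInit returned %d" % (bits, rc)


if __name__ == "__main__":
    print(driver_status())
```

A 32-bit process can only load 32-bit DLLs, so running this from a 32-bit interpreter on a 64-bit Windows install exercises the same 32-bit-on-64-bit path asked about here.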

Sarnath · November 19, 2008, 5:14am

Wow! Thanks for this! I am delighted to know that the 32-bit CUDART will work seamlessly with the 64-bit driver.

At the moment, this is a boon.

Thank you,

Good luck on the improvements!

Best Regards,

Sarnath

Larry35 · November 19, 2008, 8:53am

Thanks for this answer, tmurray :). And what about performance issues? Has anybody encountered such a thing?

My app is an OpenGL/CUDA program that makes heavy use of functions like cudaGLMapBufferObject. I’ve seen in the forum that this function can be slow. Can it be more or less efficient depending on the GPU used?

My development config : GeForce 8800 GTX → 20 fps
User config : Quadro FX 4600 → less than 3 fps :(

I’m using CUDA 2.0, by the way.

Regards,
Larry

Sarnath · November 20, 2008, 8:35am

My programs are running normally. I use the 32-bit cudart.dll on Windows XP 64-bit.

However, I get memory allocation failures after a few invocations. My app never calls cudaFree at the end, as I was told that the memory is freed automatically when the program exits. This used to work fine before…

But now this is getting to be a messier problem! The exact error I am getting is 30 == “cudaErrorUnknown”.

I am going to use the latest driver from tmurray’s driver update post… Let’s see if that fixes the issue…

By the way, tmurray, can you tell me the reason behind this strange behaviour of cudaMalloc()?

I even tried CUDA apps that allocate very little (not even a MB). It fails after 3 to 4 invocations… Very strange…

Update: I recently noticed that this failure occurs exactly on the 15th time (regardless of the application and the memory request size).

Where can I download the old drivers? I would probably try moving to the 177 series…
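A failure count like this is easier to pin down with a tiny test that does one allocate/free round trip and reports the raw status code. The sketch below is one way to do that, under stated assumptions: it is Python with `ctypes` rather than the C++ runtime API used in this thread, it calls the driver API equivalents of cudaMalloc/cudaFree, and it uses the `_v2` entry point names exported by current drivers (these postdate CUDA 2.0). `alloc_free_cycle` and `find_first_failure` are hypothetical names.

```python
import ctypes
import sys


def load_cuda_driver():
    """Return a handle to the CUDA driver library, or None if it cannot be loaded."""
    if sys.platform == "win32":
        loader, name = ctypes.WinDLL, "nvcuda.dll"  # __stdcall on Windows
    else:
        loader, name = ctypes.CDLL, "libcuda.so.1"
    try:
        return loader(name)
    except OSError:
        return None


def alloc_free_cycle(lib, nbytes):
    """Create a context, allocate, free, destroy the context.

    Returns the first non-zero CUresult seen, or 0 (CUDA_SUCCESS).
    """
    device = ctypes.c_int()
    ctx = ctypes.c_void_p()
    dptr = ctypes.c_size_t()  # CUdeviceptr is pointer-sized in the _v2 API
    rc = lib.cuDeviceGet(ctypes.byref(device), 0)
    if rc != 0:
        return rc
    rc = lib.cuCtxCreate_v2(ctypes.byref(ctx), 0, device)
    if rc != 0:
        return rc
    rc = lib.cuMemAlloc_v2(ctypes.byref(dptr), ctypes.c_size_t(nbytes))
    if rc == 0:
        rc = lib.cuMemFree_v2(dptr)
    rc_destroy = lib.cuCtxDestroy_v2(ctx)  # always tear the context down
    return rc or rc_destroy


def find_first_failure(cycles=20, nbytes=1 << 20):
    """Return (status, cycle, CUresult) for the first failing cycle, if any."""
    lib = load_cuda_driver()
    if lib is None or lib.cuInit(0) != 0:
        return ("unavailable", 0, 0)
    for i in range(1, cycles + 1):
        rc = alloc_free_cycle(lib, nbytes)
        if rc != 0:
            return ("failed", i, rc)
    return ("ok", cycles, 0)


if __name__ == "__main__":
    print(find_first_failure())
```

Note that the failures described here happen across separate program launches, so to mimic that scenario the script would be started repeatedly (for example from a batch loop with `cycles=1`) rather than relying only on the in-process loop.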

Sarnath · November 21, 2008, 6:38am

tmurray or any NVIDIA person,
Can you comment on the issues I have stated in the post above?

Thank you

Sarnath · November 21, 2008, 3:30pm

Well, I see this “cudaMalloc()” problem even with the driver that “tmurray” posted recently (the one with the watchdog fix)…

Hmm… I have no clue what’s going on… Did any of you try the XP 64-bit driver listed on the CUDA website, or the one that tmurray posted recently? Are things going well for you?

If so, this must be a problem with a 32-bit DLL on a 64-bit driver… Hmm… Sigh… Hope it gets resolved soon.

malang · November 25, 2008, 2:39pm

I have the same problem here. Even if I cudaFree all allocated memory, my app stops working after it has been run several times (one run after another, not simultaneously).
cudaMalloc returns NULL and cudaGLMapBufferObject also starts to fail. When this happens, no other CUDA app (even apps from the SDK) works anymore and I have to restart.

I am now going to test it with the CUDA SDK apps, too. However, even if it is a programming mistake in my app, it should not affect processes that are started later, when the faulty process is already gone.

This is a 32-bit Windows app with 32-bit CUDA on a Windows XP x64 platform.

Sarnath · November 25, 2008, 2:44pm

Thank you Malang!!

I hope that at least now the NVIDIA people will wake up and look into this…

I am looking at deploying a test version to a customer in 10 days’ time and most likely a production version in a month.

I would appreciate it if the NVIDIA people looked into this annoying problem!

Best Regards,
Sarnath

jack · November 25, 2008, 7:37pm

If you don’t mind rewriting a bit of your app, you could take a slightly different approach…

I’m mostly a .NET programmer, so I interface all my CUDA DLLs into .NET via Interop Services. I’ve been doing something like the following:

Compile CUDA code to PTX
Embed PTX strings for various kernels into my .NET application.
Load whatever data is necessary onto the card using driver functions + .NET interop services
Call the kernel via the driver API + .NET interop services (there is a function to which you can just pass a string containing PTX instructions; it loads the kernel so that it can be launched)
Retrieve results from card memory via driver API + .NET interop

This makes things a bit simpler, since you don’t need to recompile .NET apps for different platforms, and if you only compile your kernels to PTX code, that is portable as well (across any OS/architecture). You also get the benefit of .NET technology, so you can easily do things like retrieve/store data from a database or SOAP services, etc.
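The load-and-launch part of the steps above can be sketched as follows. This is only an illustration under several assumptions: it uses Python's `ctypes` as a stand-in for .NET Interop Services, it uses today's driver entry points (`cuCtxCreate_v2`, `cuLaunchKernel`), which are newer than the CUDA 2.0 driver discussed in this thread, and `NOOP_PTX` is a hand-written placeholder kernel that does nothing. In a real project the PTX text would come from `nvcc -ptx`. `run_embedded_ptx` is a hypothetical name.

```python
import ctypes
import sys

# Placeholder PTX for a kernel that does nothing (an assumption for this
# sketch). Real PTX would be generated with `nvcc -ptx` and embedded here.
NOOP_PTX = b"""
.version 6.0
.target sm_50
.address_size 64

.visible .entry noop()
{
    ret;
}
"""


def load_cuda_driver():
    """Return a handle to the CUDA driver library, or None if it cannot be loaded."""
    if sys.platform == "win32":
        loader, name = ctypes.WinDLL, "nvcuda.dll"  # __stdcall on Windows
    else:
        loader, name = ctypes.CDLL, "libcuda.so.1"
    try:
        return loader(name)
    except OSError:
        return None


def run_embedded_ptx(ptx=NOOP_PTX, kernel_name=b"noop"):
    """Load PTX text, launch one thread of the named kernel, clean up.

    Returns (step, CUresult): the step that failed, or ("done", 0).
    """
    lib = load_cuda_driver()
    if lib is None:
        return ("load driver", -1)
    device = ctypes.c_int()
    ctx = ctypes.c_void_p()
    module = ctypes.c_void_p()
    func = ctypes.c_void_p()

    rc = lib.cuInit(0)
    if rc != 0:
        return ("cuInit", rc)
    rc = lib.cuDeviceGet(ctypes.byref(device), 0)
    if rc != 0:
        return ("cuDeviceGet", rc)
    rc = lib.cuCtxCreate_v2(ctypes.byref(ctx), 0, device)
    if rc != 0:
        return ("cuCtxCreate", rc)
    try:
        # The PTX is passed as a NUL-terminated string; the driver compiles it.
        rc = lib.cuModuleLoadData(ctypes.byref(module), ctypes.c_char_p(ptx))
        if rc != 0:
            return ("cuModuleLoadData", rc)
        rc = lib.cuModuleGetFunction(ctypes.byref(func), module,
                                     ctypes.c_char_p(kernel_name))
        if rc != 0:
            return ("cuModuleGetFunction", rc)
        # Grid 1x1x1, block 1x1x1, no shared memory, default stream, no args.
        rc = lib.cuLaunchKernel(func, 1, 1, 1, 1, 1, 1, 0, None, None, None)
        if rc != 0:
            return ("cuLaunchKernel", rc)
        rc = lib.cuCtxSynchronize()
        if rc != 0:
            return ("cuCtxSynchronize", rc)
        return ("done", 0)
    finally:
        lib.cuCtxDestroy_v2(ctx)


if __name__ == "__main__":
    print(run_embedded_ptx())
```

Copying inputs to the card and results back would use the driver's `cuMemAlloc`, `cuMemcpyHtoD` and `cuMemcpyDtoH` calls around the launch; they are left out to keep the sketch short.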

Sarnath · November 26, 2008, 1:46am

Thanks for your input. I have a long-standing question on .NET… What is .NET? Why was it introduced? What problem does it solve? Can you give a short summary? I’d appreciate your time on this.

I am not conversant with the driver API. I am more comfortable with the runtime API. It saves time and it’s cool. I won’t mind using CUDART for production code. I don’t see any change in speedups despite cudaMalloc, memcpy etc… But what you have said is a very valid point for beginners.

Do you use GASS for your .NET - CUDA interoperability?

btw, I do the following to please my .NET master… →

My DLL code is in C++ and my customer expects it in C#. So, I ship

C++ DLLs

A C# bridge DLL that bridges my DLL entry points with C# using Interop Services

A C# application

Best Regards,

Sarnath

alex_dubinsky · November 26, 2008, 2:15am

Are Interop Services also portable across Windows/Linux/Mac OS?

alex_dubinsky · November 26, 2008, 2:25am

And what about using Managed C++? It does everything that C# can, and it doesn’t have any problem interfacing to DLLs.

Maybe it’s even possible to get nvcc to compile managed code?

Sarnath · November 26, 2008, 7:42am

I have only just dipped my hands into C++… What is this Managed C++? This is definitely the first time I am hearing of it…

I don’t think C# is portable to Linux and the like… So it’s a pain, I guess… I’d appreciate it if someone could enlighten me.

alex_dubinsky · November 26, 2008, 6:33pm

C# is portable to Linux via the Mono framework. New stuff like WPF doesn’t work, but a lot of things do. I’m just wondering if Interop Services also works.

Managed C++ is a lot like C#, but backward-compatible with C++. You can recompile any ordinary C program as .NET code, and you can add .NET features to a C program. The .NET code can then run on Mono.

It’d be very cool if you could tell nvcc to tell Visual C++ to compile the C code to .NET bytecode instead of x86, and basically automatically create cross-platform CUDA programs.

Sarnath · November 27, 2008, 2:58am

Thanks! Do you know any good online books for C# and .NET?

alex_dubinsky · November 27, 2008, 9:47am

Online books? Not really. But you can bittorrent real books online… :)

Simon_Green · November 27, 2008, 9:53am

We have a bug filed on this issue and are looking into it. Thanks!
