cuda initialization

santyhyammer · February 19, 2007, 7:08pm

Hiya! I’m completely new to CUDA.
I have a GeForce 6800 and WinXP SP2 with VS2005. I am trying to initialize CUDA in “emulation mode” (I’m waiting until March to get a GF8300 because I don’t have the $$$ for an 8800 yet :P).

Can I debug and use CUDA with that configuration?

I also have another problem… I installed the 97.73 drivers, CUDA and the SDK, but when I call cuInit() it returns an error (CUDA_ERROR_NO_DEVICE, and yep, I’m using the DEVICE_EMULATION directive).
How can I program and debug in emulation mode until I get an 8300?

I also have a question… Say I want to write a program that performs 1 million dot products. I write the CPU app using VS2005, then init CUDA and write the .cu that performs the dot products on the GPU. Do I then load the compiled .cu module using cuModuleLoad, execute it, sync threads and read the data back from the GPU to the CPU? Or does my WHOLE program need to be compiled with nvcc.exe?

Another question… I know CUDA defines float4, float3, etc. vector types like HLSL/GLSL. However, I can’t find the built-in intrinsic functions in the docs. Can I use dot(), cross() and normalize()? Do I need BLAS for this? I got some errors with the -, + and += operators…
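For reference on the operator errors: the CUDA headers define the vector types and constructors such as make_float3(), but no arithmetic operators and no dot()/cross()/normalize() for them, so these have to be written by hand (BLAS is not needed for that). A minimal sketch, assuming compilation with nvcc; every helper below is hand-written for illustration and not part of the toolkit:

```cuda
#include <cmath>
#include <cuda_runtime.h>

// Hand-written helpers (illustrative, NOT provided by the CUDA headers).
// __host__ __device__ makes each one callable from CPU and GPU code.
__host__ __device__ inline float3 operator+(float3 a, float3 b)
{
    return make_float3(a.x + b.x, a.y + b.y, a.z + b.z);
}

__host__ __device__ inline float3 operator-(float3 a, float3 b)
{
    return make_float3(a.x - b.x, a.y - b.y, a.z - b.z);
}

__host__ __device__ inline float3& operator+=(float3& a, float3 b)
{
    a = a + b;
    return a;
}

__host__ __device__ inline float dot(float3 a, float3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

__host__ __device__ inline float3 cross(float3 a, float3 b)
{
    return make_float3(a.y * b.z - a.z * b.y,
                       a.z * b.x - a.x * b.z,
                       a.x * b.y - a.y * b.x);
}

// Assumes v is not the zero vector (no division-by-zero guard).
__host__ __device__ inline float3 normalize(float3 v)
{
    float inv_len = 1.0f / sqrtf(dot(v, v));
    return make_float3(v.x * inv_len, v.y * inv_len, v.z * inv_len);
}
```

With these in a shared header, an expression such as cross(a, b) + c compiles both inside a kernel and in ordinary host functions.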

Also an observation… the SDK is a little confusing at the moment. The examples and .h headers really need many more comments. It would be good to add simpler examples, for instance one that performs 1 million sequential dot products and reads the results back to the CPU (the matrix_drv example is good but a little complicated).

thx

nwilt · February 19, 2007, 9:48pm

cuInit() is the initialization function for the driver API, which only supports actual hardware at the moment.

I’d suggest you try compiling one of the samples in device emulation mode. Most of them are written against the CUDA runtime (CUDART), not the driver API - the driver API ones end in _drv, e.g. matrixmul_drv.
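To give an idea of what runtime API code looks like, here is a minimal sketch of the "many dot products" case from the question: one thread per dot product, results copied back to the CPU. The names dot4_kernel and dot4_on_gpu are made up for this example, and the block size of 256 is just an assumption. Only cuda* calls appear, with no cuInit() and no module loading, so the same source should also be usable in an emulation build.

```cuda
#include <cuda_runtime.h>

// One thread computes one dot product of a float4 pair.
__global__ void dot4_kernel(const float4* a, const float4* b, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = a[i].x * b[i].x + a[i].y * b[i].y
               + a[i].z * b[i].z + a[i].w * b[i].w;
    }
}

// Hypothetical host helper: copies the n input pairs to the GPU, launches
// the kernel and copies the n results back. Assumes n > 0.
// Returns false if any CUDA call fails (e.g. no usable device).
bool dot4_on_gpu(const float4* a, const float4* b, float* out, int n)
{
    float4* d_a = 0;
    float4* d_b = 0;
    float* d_out = 0;
    size_t in_bytes = n * sizeof(float4);
    size_t out_bytes = n * sizeof(float);

    bool ok = cudaMalloc((void**)&d_a, in_bytes) == cudaSuccess
           && cudaMalloc((void**)&d_b, in_bytes) == cudaSuccess
           && cudaMalloc((void**)&d_out, out_bytes) == cudaSuccess
           && cudaMemcpy(d_a, a, in_bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemcpy(d_b, b, in_bytes, cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        int threads = 256;                           // assumed block size
        int blocks = (n + threads - 1) / threads;    // round up to cover all n
        dot4_kernel<<<blocks, threads>>>(d_a, d_b, d_out, n);
        // The blocking copy below also waits for the kernel to finish.
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(out, d_out, out_bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
    }

    cudaFree(d_a);      // cudaFree on a null pointer is a no-op
    cudaFree(d_b);
    cudaFree(d_out);
    return ok;
}
```

A main() would fill two host arrays of float4, call dot4_on_gpu() once, and read the results from out.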

santyhyammer · February 19, 2007, 10:10pm

Yep yep! That’s what I want, to run it using the emulation mode.

Oh I understand now… There are two options (correct me if I’m wrong, please):

Use the CUDA driver API (functions start with cuXXXX; needs the actual HW).
Use the CUDA runtime API (functions start with cudaXXXXX; uses cudart.dll to talk to the driver and is easier to use).

Emulation mode is only available through cudart.dll.

The CUDA C code can basically run in two places:

Device: on the GPU hardware (GF8).
Host: on the main CPU (Pentium, etc.).

But there is one thing I don’t understand here… nvcc.exe can compile BOTH the host and the GPU (device) parts. The GPU part is like an HLSL/GLSL shader, but with C features such as pointers, etc.

The question is… do I really need the HOST part? That’s the thing I don’t understand. GCC/VS2005 have C++, CLR/CLI, templates and other things that NVCC could not easily handle. Looking at the examples, I see you use NVCC to compile an OBJ using “custom build steps”. Do you do that to provide emulation mode when a GF8 is not present?

Wouldn’t it be easier to ship a cuda_softwareReferenceDevice.dll with the SDK for developers who have no GF8 installed? I really don’t want to compile my “host” program using NVCC… I just want to use NVCC to compile the GPU part, then call cuLaunch() and read the results back to system memory.

Another thing I find confusing is the proposed synchronization (__shared__, __syncthreads(), etc.) and parallelization (threadIdx.x, blockIdx.x, banks). Have you considered using a standard subset of OpenMP (#pragma omp for / block) to hide all that complexity?

Mark_Harris · February 20, 2007, 10:42am

NVCC is a compiler driver. It can invoke the CUDA compiler or gcc or the MS compiler & linker depending on the options and the source files passed to it. It works much like gcc does in this respect – both are compiler drivers. Please see the NVCC manual for more information.

Yes, you need a host portion of your application in order to load data onto the GPU and to invoke CUDA (GPU) kernels.

We use custom build steps to invoke NVCC to compile the CUDA code for both emulation and device (aka non-emu) configurations.

You have to use the CUDA runtime API to get device emulation. The driver API doesn’t support it. That said, you only have to compile the portion of your application that calls cuda* functions and invokes kernels (<<<>>>) using nvcc. You can separate other portions into other files / compilation units and compile those using the MS compiler or another compiler and then link them with the objects generated by nvcc. For an example of this, see the cppIntegration sample in the SDK.
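The split can be sketched roughly as follows. This is not the cppIntegration sample itself, only an illustration of the same idea; the file names and the functions scale_kernel, scale_on_gpu and double_all are hypothetical. The two parts are shown in one listing for brevity, but the second part contains no CUDA syntax and could live in a file built by the regular host compiler.

```cuda
// ---- scale_gpu.cu (hypothetical name): compiled with nvcc ----
#include <cuda_runtime.h>

__global__ void scale_kernel(float* data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

// Plain C entry point: the only symbol the rest of the application needs.
// Assumes n > 0. Returns 0 on success, nonzero if any CUDA call failed.
extern "C" int scale_on_gpu(float* host_data, int n, float factor)
{
    float* d_data = 0;
    size_t bytes = n * sizeof(float);

    int failed = cudaMalloc((void**)&d_data, bytes) != cudaSuccess
              || cudaMemcpy(d_data, host_data, bytes,
                            cudaMemcpyHostToDevice) != cudaSuccess;
    if (!failed) {
        scale_kernel<<<(n + 255) / 256, 256>>>(d_data, n, factor);
        failed = cudaGetLastError() != cudaSuccess
              || cudaMemcpy(host_data, d_data, bytes,
                            cudaMemcpyDeviceToHost) != cudaSuccess;
    }
    cudaFree(d_data);
    return failed;
}

// ---- app.cpp (hypothetical name): compiled with the host compiler ----
// No CUDA headers and no <<<>>> here, only the wrapper's declaration.
extern "C" int scale_on_gpu(float* host_data, int n, float factor);

// Ordinary C++ code; it never sees any CUDA-specific syntax.
int double_all(float* values, int n)
{
    return scale_on_gpu(values, n, 2.0f);
}
```

The object file produced by nvcc from the first part is then linked with the objects from the host compiler, plus the CUDA runtime library.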

Unfortunately, going with a less-explicit method of exposing parallelism and synchronization would make it harder to access all the performance of the GPU. Not all problems can be parallelized efficiently with OpenMP pragmas. We chose to develop extensions to C that map very closely to the way the hardware works. This has a cost in terms of the learning curve, but we feel the benefits in performance and flexibility are well worth it.
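As an illustration of what the explicit model looks like in practice, here is a minimal sketch of a sum where each block cooperates through __shared__ memory and __syncthreads(). The names block_sum and sum_on_gpu and the block size of 256 are assumptions made for this example, and the code is not tuned for performance.

```cuda
#include <vector>
#include <cuda_runtime.h>

#define BLOCK 256  // threads per block; must be a power of two for the loop below

// Each block sums up to BLOCK input elements and writes one partial sum.
// Must be launched with exactly BLOCK threads per block.
__global__ void block_sum(const float* in, float* partial, int n)
{
    __shared__ float cache[BLOCK];      // on-chip memory shared by the block
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    cache[threadIdx.x] = (i < n) ? in[i] : 0.0f;
    __syncthreads();                    // all loads are done before anyone reads

    // Tree reduction: half as many threads add in each step.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (threadIdx.x < stride)
            cache[threadIdx.x] += cache[threadIdx.x + stride];
        __syncthreads();                // reached by every thread of the block
    }
    if (threadIdx.x == 0) partial[blockIdx.x] = cache[0];
}

// Hypothetical host helper: sums n floats (n > 0) and stores the total in
// *result. The per-block partial sums are added up on the CPU.
bool sum_on_gpu(const float* in, int n, float* result)
{
    int blocks = (n + BLOCK - 1) / BLOCK;
    std::vector<float> partial(blocks);
    float* d_in = 0;
    float* d_partial = 0;

    bool ok = cudaMalloc((void**)&d_in, n * sizeof(float)) == cudaSuccess
           && cudaMalloc((void**)&d_partial, blocks * sizeof(float)) == cudaSuccess
           && cudaMemcpy(d_in, in, n * sizeof(float),
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        block_sum<<<blocks, BLOCK>>>(d_in, d_partial, n);
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(&partial[0], d_partial, blocks * sizeof(float),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(d_in);
    cudaFree(d_partial);

    if (ok) {
        float total = 0.0f;
        for (int b = 0; b < blocks; ++b) total += partial[b];
        *result = total;
    }
    return ok;
}
```

The programmer decides which data goes into shared memory and where the barriers sit, which is the kind of control a pragma-based approach would hide.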

Thanks,

Mark

santyhyammer · February 20, 2007, 12:36pm

thx Mark, all is clear now :)
