Examples in SDK are too difficult.

Skybuck · July 9, 2011, 9:09pm

Hello,

The examples in the GPU Computing SDK 4.0 are way too difficult for programmers trying to use the CUDA driver API from other languages like Pascal/Delphi.

These examples assume that everything is already working, and this assumption is false.

Getting CUDA working from other languages is already difficult enough, especially passing parameters to CUDA kernel calls.

There should be some examples which focus on getting CUDA working.

For example:

Passing in:

Integers

Pointers

Passing out:

Integers

for example:

__global__ void Kernel( int ParameterIn, int *ParameterOut )
{
	*ParameterOut = ParameterIn;
}

is already difficult enough to get working.

^ This is probably the most basic CUDA example that can be made to work fully.

Half of it would also be useful if it weren’t for the compiler optimizing useless code away.

Therefore the code above is probably the most minimal… there must be output, otherwise all code is optimized away.

Furthermore:

What do the “kernel parameters” expect on the host side ?

Pointers to host values ?

Or pointers to cuda pointers ?

It’s a bit confusing.

So far I am seeing CUDA pointers being passed in to my more complex test… it’s strange.

Now I will have to fall back to a really simple test like above to see what’s happening.

I am now unsure what to pass in exactly from the host side… since there are no simple examples from the SDK available.

Just finding the “launch call site” is already difficult enough with all that complex clutter code around it.

I will now try to get this working:

extern "C"
{ // extern c begin

__global__ void Kernel( int ParaIn, int *ParaOut )
{
	*ParaOut = ParaIn;
}

} // extern c end

Skybuck · July 9, 2011, 10:01pm

Very strange…

I don’t know how to pass parameters correctly. It seems to work, but then freeing the CUDA memory doesn’t work anymore ?

Some questions:

Does the parameter pointer array have to be cuda memory perhaps ?
Does the parameter pointer array have to be host memory ?
Can host values be passed for integers ?
Must host pointers be passed for cuda integer references (int arrays) ?
Should perhaps pointers to pointers be used ?!?

^ The way to pass data to CUDA kernels seems to be badly documented, unless somebody can point me to better documentation than the few comments in cuda.h.

A full basic example is needed that shows how to make the above example work without causing the memory free to fail.

So far it seems as if the cuda pointer is being changed which leads to problems ? Very odd.

Perhaps there is a mistake in my code somewhere, but I don’t think so…

There seems to be some complexity with all of this…

what does cuda malloc actually return ?

Is it a host pointer to some cuda pointer ?

The parameter type in C seems to be (void **).

Where can I find more information about all of this ???
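On the (void **) question: in the runtime API, cudaMalloc takes a void ** because the function's return value is an error code, so the device address comes back through an out-parameter. The driver API's cuMemAlloc has the same shape, with a CUdeviceptr * as the out-parameter. The caller owns a host variable, passes its address, and the variable then holds the device address. The sketch below is a host-only model of that pattern, not CUDA code: fake_devptr, fake_mem_alloc, fake_mem_free and alloc_roundtrip are hypothetical names, ordinary malloc stands in for device allocation, and rejecting a size of zero is a choice made for this model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for CUdeviceptr: an integer that holds an address. */
typedef uintptr_t fake_devptr;

/* Out-parameter allocator: the return value is a status code and the
   address is written into the caller's host variable through 'out'. */
int fake_mem_alloc(fake_devptr *out, size_t bytes)
{
    void *p;

    assert(out != NULL);
    if (bytes == 0)
        return 1;                 /* this model rejects empty allocations */
    p = malloc(bytes);            /* host malloc plays the device allocator */
    if (p == NULL)
        return 2;
    *out = (fake_devptr)p;        /* the "device address" lands in host memory */
    return 0;
}

/* Frees by VALUE: takes the address itself, not a pointer to the handle. */
int fake_mem_free(fake_devptr ptr)
{
    free((void *)ptr);
    return 0;
}

/* Allocates and frees once; returns the first non-zero status, else 0. */
int alloc_roundtrip(size_t bytes)
{
    fake_devptr handle = 0;       /* host variable that will hold the address */
    int status = fake_mem_alloc(&handle, bytes);

    if (status != 0)
        return status;
    return fake_mem_free(handle); /* pass the stored address back unchanged */
}
```

In this model alloc_roundtrip(16) returns 0. The point to carry over is that the handle is a host variable whose contents are a device address, which is why both "host pointer" and "cuda pointer" show up when inspecting it.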

Skybuck · July 9, 2011, 10:14pm

Ok,

I see I had a little bug in my code… it was returning false while the result was OK…

code was:

if

which needed to be:

if not

But some more documentation would still help…

Now I can go back to my trial and error runs.

Skybuck · July 9, 2011, 10:28pm

I seem to have figured it out and it goes like this:

There is an inconsistency in the way the parameters are passed to kernels.

The inconsistency is this:

input integers can simply be passed as host memory.
output integers must be passed as cuda memory.

^ Big inconsistency.

It would have been better if input integers must also be cuda memory.

Skybuck · July 9, 2011, 10:35pm

Example:

ParameterCount := 2;
Parameter[0] := vParameterIn.Address; // input integer parameter must be passed as host pointer to host memory.
Parameter[1] := @vParameterOut.Handle; // output integer parameter must be passed as a host pointer to cuda memory pointer.

Address returns host address of host memory.
Handle returns cuda memory pointer.
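One way to read the Pascal example above, sketched in plain C: for cuLaunchKernel, every entry of the kernelParams array is a host pointer to the value of one argument. For an int argument the value is the integer itself. For an int * argument the value is the device address, so the entry points at the host variable that holds that address. Seen this way both parameters follow the same rule. The code below is a host-only model, not driver code: fake_devptr, kernel_body, fake_launch and run_example are hypothetical names, and an ordinary host int stands in for the CUDA allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for CUdeviceptr. In this host-only model the
   "device address" is really a host address. */
typedef uintptr_t fake_devptr;

/* The kernel from the thread, written as a plain C function. */
void kernel_body(int para_in, int *para_out)
{
    *para_out = para_in;
}

/* Model of what a launch does with the parameter array: params[i] is a
   host pointer to the VALUE of argument i, and the launcher copies
   sizeof(argument) bytes from that address. */
void fake_launch(void **params)
{
    int arg0;
    fake_devptr arg1;

    assert(params != NULL && params[0] != NULL && params[1] != NULL);
    memcpy(&arg0, params[0], sizeof arg0);  /* value of the int argument */
    memcpy(&arg1, params[1], sizeof arg1);  /* value of the pointer argument */
    kernel_body(arg0, (int *)arg1);
}

/* Builds the parameter array like the Pascal example and returns what
   the kernel wrote. */
int run_example(int value_in)
{
    int host_in = value_in;     /* plays the role of vParameterIn */
    int device_int = 0;         /* plays the role of the CUDA allocation */
    fake_devptr handle = (fake_devptr)&device_int;  /* like vParameterOut.Handle */
    void *params[2];

    params[0] = &host_in;       /* host pointer to the int value */
    params[1] = &handle;        /* host pointer to the address value */

    fake_launch(params);
    return device_int;          /* real code would copy this back from the device */
}
```

Calling run_example(42) returns 42 in this model. The parameter array itself lives in host memory, and the launcher only reads from it, so the stored handle is not modified by the launch.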

Skybuck · July 9, 2011, 11:03pm

Now I am still having problems with multiple parameters and arrays, so moving on to next somewhat larger example…

Skybuck · July 9, 2011, 11:08pm

Array kernel example:

extern "C"
{ // extern c begin

// para4 is array of 3 integers
// para5 is array of 4 integers
// return some values in them
__global__ void Kernel( int Para1, int Para2, int Para3, int *Para4, int *Para5 )
{
	Para4[0] = 111;
	Para4[1] = 222;
	Para4[2] = 333;

	Para5[0] = Para1;
	Para5[1] = Para2;
	Para5[2] = Para3;
	Para5[3] = 666;
}

} // extern c end

Skybuck · July 9, 2011, 11:17pm

Using the same technique as above now doesn’t work… I wonder why ?!?

Skybuck · July 9, 2011, 11:38pm

Ok, I spotted the problem.

The size parameter to the device-to-host copy function was zero, a little programming mistake in calculating the size somewhere… it wasn’t being assigned/stored.
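A zero byte count copies nothing, so the host buffer simply keeps its old contents and the failure looks like a parameter-passing problem. A minimal guard, sketched in plain C under the assumption that the copy goes through one wrapper: compute the size from the element count in one place and refuse a zero-byte copy. The names int_array_bytes, checked_copy and copy_back_last are hypothetical, memcpy stands in for the device-to-host copy, and a host array stands in for Para5 after the kernel has run.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte count for an array of 'count' ints, computed in exactly one place. */
size_t int_array_bytes(size_t count)
{
    return count * sizeof(int);
}

/* Copy wrapper that reports a zero-byte or NULL request as an error
   instead of silently doing nothing. memcpy plays the device-to-host copy. */
int checked_copy(void *dst, const void *src, size_t bytes)
{
    if (dst == NULL || src == NULL || bytes == 0)
        return 1;
    memcpy(dst, src, bytes);
    return 0;
}

/* Copies 'count' ints back from a stand-in for Para5 and stores the last
   copied element in *last_out. Returns 0 on success, 1 on a bad request. */
int copy_back_last(size_t count, int *last_out)
{
    int device_side[4] = { 10, 20, 30, 666 };  /* Para5 after the kernel ran */
    int host_side[4] = { 0, 0, 0, 0 };

    assert(last_out != NULL);
    if (count > 4)
        return 1;                              /* larger than the array */
    if (checked_copy(host_side, device_side, int_array_bytes(count)) != 0)
        return 1;                              /* catches the size == 0 bug */
    *last_out = host_side[count - 1];
    return 0;
}
```

With count set to 4 the call succeeds and yields 666, while a count of 0 is reported as an error at the call site instead of leaving stale data in the host array.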

These kinds of programming mistakes are hard to spot !

Glad I found it !

Now everything is working with the above techniques !

I was already thinking about giving up on CUDA and trying OpenCL… I’m glad it’s working now with CUDA ! =D
