Pass a function and its parameters to another function to execute later on the GPU using CUDA

I am trying to pass a device function and its parameters to a host function so that I can execute it later on the GPU using CUDA. The idea is to write different device functions and reuse the same host and global functions for all of them. For example:

__device__ void DeviceFunction(int* arr1, int* arr2)
{
    // Do something...
}

int main()
{
    // arr1 and arr2 are device pointers; allocation omitted for brevity
    HostFunction(DeviceFunction, arr1, arr2);
}

Let's assume the host function launches a __global__ function, and the global function in turn calls the device function:

template<typename Tf, typename... T>
__global__ void GlobalFunction(Tf func, T... args)
{
    func(args...);
}

// GlobalFunction must be declared before it is launched here
template<typename Tf, typename... T>
__host__ void HostFunction(Tf func, T... args)
{
    GlobalFunction <<< dimGrid, dimBlock >>> (func, args...);
}

I can achieve this goal with a device function pointer, but the problem is that function-pointer calls perform very poorly on the GPU, since the indirect call cannot be inlined. The next thing I tried was a lambda expression.
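For reference, this is roughly how the function-pointer version looks (a minimal sketch; the names d_funcPtr and h_funcPtr and the example body are mine, and error checking is omitted). The address of a __device__ function is only meaningful on the device, so it is stored in a __device__ variable and copied back with cudaMemcpyFromSymbol:

```cuda
using FuncT = void (*)(int*, int*);

__device__ void DeviceFunction(int* arr1, int* arr2)
{
    arr2[threadIdx.x] = arr1[threadIdx.x]; // example body
}

// Take the device function's address on the device side
__device__ FuncT d_funcPtr = DeviceFunction;

template<typename Tf, typename... T>
__global__ void GlobalFunction(Tf func, T... args)
{
    func(args...); // indirect call: the compiler cannot inline this
}

int main()
{
    FuncT h_funcPtr;
    cudaMemcpyFromSymbol(&h_funcPtr, d_funcPtr, sizeof(FuncT));

    int *arr1, *arr2;
    cudaMalloc(&arr1, 32 * sizeof(int));
    cudaMalloc(&arr2, 32 * sizeof(int));

    GlobalFunction<<<1, 32>>>(h_funcPtr, arr1, arr2);
    cudaDeviceSynchronize();
}
```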

// nvcc rejects this: an extended __device__ lambda
// cannot be defined at namespace scope
auto DeviceFunction = [] __device__ (int* arr1, int* arr2)
{
    // Do something...
};

int main()
{
    HostFunction(DeviceFunction, arr1, arr2); // Not working
}

If I define the lambda expression inside a function it works, but I need to define my device function independently, outside of any other function:

int main()
{
    auto DeviceFunction = [] __device__ (int* arr1, int* arr2)
    {
        // Do something...
    };
    HostFunction(DeviceFunction, arr1, arr2); // Working
}

I am wondering whether there are other ways to achieve the above goal with roughly the same performance as calling the device function directly from the global function, i.e. DeviceFunction(args...) instead of func(args...).
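One alternative I have been considering is a functor defined at namespace scope (a sketch only; the names are mine and I have not benchmarked it). Since the call through operator() is resolved at compile time via the template parameter, the compiler should be able to inline it, unlike the function-pointer version:

```cuda
// Functor at namespace scope; operator() plays the role
// of the device function
struct DeviceFunctor
{
    __device__ void operator()(int* arr1, int* arr2) const
    {
        arr2[threadIdx.x] = arr1[threadIdx.x]; // example body
    }
};

template<typename Tf, typename... T>
__global__ void GlobalFunction(Tf func, T... args)
{
    func(args...); // static call through the functor type: inlinable
}

template<typename Tf, typename... T>
__host__ void HostFunction(Tf func, T... args)
{
    GlobalFunction<<<1, 32>>>(func, args...);
}

int main()
{
    int *arr1, *arr2;
    cudaMalloc(&arr1, 32 * sizeof(int));
    cudaMalloc(&arr2, 32 * sizeof(int));

    HostFunction(DeviceFunctor{}, arr1, arr2);
    cudaDeviceSynchronize();
}
```

Is this functor pattern the right direction, or is there a better approach?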