How to return object in device function?

hadesmajesty · December 11, 2018, 3:03pm

If I want to return an object from a device function, as in the case below, an assignment operator will be called to make a deep copy from A to A2. Is there any way in CUDA, like the rvalue reference mechanism in C++, to avoid this?

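template<typename T>
__host__ __device__
NRmatrix<T> trans(const NRmatrix<T>& M){
	NRmatrix<T> A;	// local result; the parameter is renamed to M so the two names do not clash
	// ... compute A from M ...
	return A;
}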

template<typename T>
__global__ void RunGLS_OnGPU(NRmatrix<T> const& A /*, some arguments */){
	NRmatrix<T> A2=trans(A);
	//or something like
	//NRmatrix<T> &&A2=trans(A);
}

Robert_Crovella · December 11, 2018, 3:29pm

I’d be mildly surprised if whatever you’re wanting to do in C++ doesn’t work in CUDA device code.

For example, && as a C++11 rvalue reference should work fine in CUDA device code.

https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#cpp11-language-features
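
As an illustration of the point about rvalue references, here is a minimal sketch of a move-enabled type whose members can be compiled for both host and device. The definition of NRmatrix is not shown in this thread, so `Mat`, `filled`, `take` and the `HD` macro are hypothetical names made up for the example, not part of NRmatrix or of CUDA. The sketch assumes device-side `new`/`delete` is acceptable for the buffer.

```cpp
#include <cassert>
#include <cstddef>

// Expands to the CUDA qualifiers under nvcc and to nothing in a host-only build.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical stand-in for NRmatrix<T>: owns an n-element heap buffer.
template <typename T>
struct Mat {
    std::size_t n;
    T* data;

    HD explicit Mat(std::size_t n_) : n(n_), data(n_ ? new T[n_]() : nullptr) {}

    // Deep copy: allocates a new buffer and copies every element.
    HD Mat(const Mat& o) : n(o.n), data(o.n ? new T[o.n] : nullptr) {
        for (std::size_t i = 0; i < n; ++i) data[i] = o.data[i];
    }

    // Move: takes over the source buffer and leaves the source empty.
    HD Mat(Mat&& o) noexcept : n(o.n), data(o.data) {
        o.n = 0;
        o.data = nullptr;
    }

    HD ~Mat() { delete[] data; }

    // Bounds-checked element access (assignment operators omitted for brevity).
    HD T& operator[](std::size_t i) {
        assert(i < n);
        return data[i];
    }
};

// Returns a named local by value: the compiler either constructs the result
// in place or uses the move constructor, so no deep copy is made here.
template <typename T>
HD Mat<T> filled(std::size_t n, T value) {
    Mat<T> m(n);
    for (std::size_t i = 0; i < n; ++i) m[i] = value;
    return m;
}

// Explicit ownership transfer; the cast is what std::move does internally.
template <typename T>
HD Mat<T> take(Mat<T>& src) {
    return Mat<T>(static_cast<Mat<T>&&>(src));
}
```

In this sketch, `Mat<int> b = take(a);` leaves `a` empty and gives `b` the original buffer pointer, so the elements themselves are never copied.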

hadesmajesty · December 12, 2018, 9:48am

Thank you. But when I use an rvalue reference in CUDA device code, the run takes 56.8 seconds; without it, the run takes 56.651 seconds, so the rvalue reference version is slightly slower. In the CPU version of my code, using an rvalue reference makes it around 4 seconds faster.
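
One possible reason the two timings are so close is that `NRmatrix<T> A2=trans(A);` initializes A2 rather than assigning to it. For a type with a move constructor, returning a local by value may then involve no deep copy at all, with or without `&&`. The sketch below is a way to check this under that assumption. `Probe` and the helper functions are hypothetical names, and whether NRmatrix behaves the same way depends on its constructors, which are not shown in this thread.

```cpp
#include <cassert>

// Expands to the CUDA qualifiers under nvcc and to nothing in a host-only build.
#ifdef __CUDACC__
#define HD __host__ __device__
#else
#define HD
#endif

// Hypothetical probe type: each object records how many copy and move
// constructions separate it from the object that was originally constructed.
struct Probe {
    int copies;
    int moves;
    HD Probe() : copies(0), moves(0) {}
    HD Probe(const Probe& o) : copies(o.copies + 1), moves(o.moves) {}
    HD Probe(Probe&& o) noexcept : copies(o.copies), moves(o.moves + 1) {}
};

// Same shape as trans(): build a named local and return it by value.
HD inline Probe make_probe() {
    Probe p;
    return p;  // elided or move-constructed; never copied for this type
}

// Copy count observed when the result initializes a plain object.
HD inline int copies_by_value() {
    Probe a = make_probe();
    return a.copies;
}

// Copy count observed when the result is bound to an rvalue reference.
HD inline int copies_by_rvalue_ref() {
    Probe&& a = make_probe();  // the temporary lives as long as the reference
    return a.copies;
}

// Self-check that can be called from host code or from inside a kernel.
HD inline void check_no_copies() {
    assert(copies_by_value() == 0);
    assert(copies_by_rvalue_ref() == 0);
}
```

Both helpers report zero copies for this probe type, so any remaining timing difference would have to come from somewhere other than the return statement.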
