invalid argument error while using cudaDeviceSetLimit in cuda kernel

Laxmi · August 14, 2014, 7:50am

I have a kernel that allocates an array with malloc:

__global__ static void CalcSTLDistance_Kernel(Integer ComputeParticleNumber)
{
	const Integer ID = CudaGetTargetID();

	CDistance NearestDistance;
	Integer NearestID = -1;
	NearestDistance.Magnitude = 1e8;
	NearestDistance.Direction.x = 0;
	NearestDistance.Direction.y = 0;
	NearestDistance.Direction.z = 0; // make_Scalar3(0,0,0);

	Integer TriangleID;
	Integer CIDX, CIDY, CIDZ;
	Integer CID = GetCellID(&CONSTANT_BOUNDINGBOX, &c_daParticlePosition[ID], CIDX, CIDY, CIDZ);
	int len = 0;
	int* td = (int*)malloc(100);
}

I have called this kernel with

cudaDeviceSetLimit(cudaLimitMallocHeapSize, 10*1024*1024);
CalcSTLDistance_Kernel<<<TS.Grids(),TS.Blocks(),0,Stream>>>(ComputeParticleNum);

I want to set the heap size explicitly, so I used cudaDeviceSetLimit, but at runtime it reports an invalid argument error.

Robert_Crovella · August 14, 2014, 1:04pm

Are you calling cudaDeviceSetLimit in a loop, i.e. multiple times in your application?

Laxmi · August 15, 2014, 12:56am

Yes, it is called inside a loop.

Robert_Crovella · August 15, 2014, 3:33am

I guess you should read the documentation.

http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#heap-memory-allocation

“The device memory heap has a fixed size that must be specified before any program using malloc() or free() is loaded into the context.”

“Heap size cannot be changed once a module load has occurred”

So when you call cudaDeviceSetLimit before any kernel call, it will succeed. Once you then run a kernel that does a malloc operation, you can no longer call cudaDeviceSetLimit in your program; if you do, it will return an error.

Try a simple test case and you will see that the description in the documentation is accurate.
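A simple test case along those lines might look like the sketch below. The kernel name HeapUser_Kernel and the sizes are made up for illustration and are not from the original code. It calls cudaDeviceSetLimit once before and once after a kernel that uses in-kernel malloc, and prints the status of each call so the two can be compared. The exact result of the second call may depend on the CUDA version, so no expected output is shown.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical kernel: each thread takes a small block from the device heap.
__global__ void HeapUser_Kernel()
{
    int* p = (int*)malloc(4 * sizeof(int));
    if (p != NULL) {
        p[0] = threadIdx.x;
        free(p);
    }
}

int main()
{
    const size_t heapBytes = 10 * 1024 * 1024;

    // First call: no kernel that uses malloc() has run yet.
    cudaError_t err = cudaDeviceSetLimit(cudaLimitMallocHeapSize, heapBytes);
    printf("first  cudaDeviceSetLimit: %s\n", cudaGetErrorString(err));

    // Run a kernel that allocates from the device heap.
    HeapUser_Kernel<<<1, 32>>>();
    err = cudaDeviceSynchronize();
    printf("kernel:                    %s\n", cudaGetErrorString(err));

    // Second call: made after the malloc-using kernel has run.
    err = cudaDeviceSetLimit(cudaLimitMallocHeapSize, heapBytes);
    printf("second cudaDeviceSetLimit: %s\n", cudaGetErrorString(err));

    return 0;
}
```

Checking the return value of every runtime call, as done here, is what makes it visible which of the two calls is the one that fails.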

So decide what size you want the heap to be, taking into account all the needs of all the kernels in your program. Then set it once, at the beginning of the program, before any kernel calls.

After that, you cannot call it again. If you do, it will return an error.
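As a rough sketch of that advice, the heap limit can be moved out of the loop and set a single time at startup. The kernel HeapUser_Kernel, the 10 MB figure, and the loop count are placeholders for illustration. The real size should cover the combined malloc needs of every kernel in the program.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical stand-in for a kernel that allocates from the device heap.
__global__ void HeapUser_Kernel()
{
    int* td = (int*)malloc(100);
    if (td != NULL) {
        td[0] = threadIdx.x;
        free(td);
    }
}

int main()
{
    // Set the heap size once, before any kernel launch.
    const size_t heapBytes = 10 * 1024 * 1024;
    cudaError_t err = cudaDeviceSetLimit(cudaLimitMallocHeapSize, heapBytes);
    if (err != cudaSuccess) {
        printf("cudaDeviceSetLimit failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Read back the limit that is now in effect.
    size_t currentBytes = 0;
    cudaDeviceGetLimit(&currentBytes, cudaLimitMallocHeapSize);
    printf("malloc heap size: %zu bytes\n", currentBytes);

    // The loop only launches kernels; it never touches the limit again.
    for (int step = 0; step < 3; ++step) {
        HeapUser_Kernel<<<1, 32>>>();
        err = cudaDeviceSynchronize();
        if (err != cudaSuccess) {
            printf("step %d failed: %s\n", step, cudaGetErrorString(err));
            return 1;
        }
    }
    return 0;
}
```

The kernel also checks the pointer returned by malloc and frees it, since in-kernel malloc returns NULL when the heap is exhausted and the heap does not grow on demand.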
