cudaMalloc KILLED on TX2, and the memory cannot really be freed with cudaFree

337229260 · July 17, 2018, 9:08am

Hi everyone!

I ran into an issue while running a small test of cudaMalloc and cudaFree on a TX2.

Each time I allocate memory on the GPU, the call returns “success”. Then I call cudaFree on it, and that also returns “success”. But when I use tegrastats to check the memory usage, I find that the usage does not drop. To be concrete: suppose I allocate 1GB with cudaMalloc and the RAM usage goes up to 1GB; after cudaFree succeeds, the RAM usage is still 1GB, which seems unreasonable to me.

Even worse, if I repeat these two steps (cudaMalloc and then cudaFree of 1GB) 7 times, the program is KILLED.

By the way, when I allocate memory on the CPU with malloc and free, everything behaves correctly.

So my question is: why doesn’t cudaFree really release the memory on the GPU?

Thanks!!!

cbuchner1 · July 17, 2018, 3:21pm

Please post your code so we can assess if your use of cudaMalloc and cudaFree is correct.

337229260 · July 18, 2018, 1:37am

Following is my code. It is just the simplest possible test.

#define DATASIZE 1048576
#define blocks 1024
#define threads 8   // the post shows "blocks" twice; "threads" is assumed from the kernel launch below

int main()
{
    cudaError_t err;
    err = cudaMalloc((void**) &gpudata, sizeof(float) * DATASIZE);
    err = cudaFree(gpudata);
    // Datanumber<<<blocks, threads>>>(gpudata);
}

__global__ void Datanumber(int* num)
{
    int tid = threadIdx.x;
    int bid = blockIdx.x;
    for (int i = bid * blockDim.x + tid; i < DATASIZE; i += blocks * threads)
    {
        num[i] = i;
    }
}

I am sure that cudaFree succeeded, because if I launch a kernel after it, “gpudata” can no longer be used. But the GPU memory usage did not drop, and the freed region is still unavailable. After repeating this several times, the system runs out of memory.
(The platform I use is Nsight Eclipse 9.2, and all data was obtained in its debug mode. My GPU is an NVIDIA Jetson TX2.)
Thank you!

cbuchner1 · July 18, 2018, 8:54am

I assume you declared gpudata as int* gpudata; ? Otherwise the kernel call would complain about incompatible pointer types in the arguments.

Just a minor nitpick: you need sizeof(int) * DATASIZE for the allocation.

There could be platforms where sizeof(int) is 8 bytes, but for CUDA I am pretty sure sizeof(int) is 4, same as sizeof(float).

So the code should not cause undefined behavior when invoked repeatedly.
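
The size assumption above can be checked at compile time. This is only a minimal sketch; the assertion message and the printed line are illustrative and not part of the original code.

```cuda
#include <cstdio>

// Compile-time check that an int buffer and a float buffer with the same
// element count occupy the same number of bytes on this platform.
static_assert(sizeof(int) == sizeof(float),
              "sizeof(int) differs from sizeof(float) on this platform");

int main()
{
    std::printf("sizeof(int) = %zu, sizeof(float) = %zu\n",
                sizeof(int), sizeof(float));
    return 0;
}
```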

Do you get the process killed when repeatedly starting the binary from a console, or do you have to put a for() loop around the cudaMalloc/cudaFree() part to make it crash?
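
For the loop variant, a minimal sketch might look like the following. It assumes a 1 GiB allocation as in the first post and an arbitrary iteration count of 8; the helper named check is hypothetical. It uses cudaMemGetInfo to print the free memory reported by the CUDA runtime before and after each cudaMalloc/cudaFree pair, which can be compared against what tegrastats shows.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
static void check(cudaError_t err, const char* what)
{
    if (err != cudaSuccess) {
        std::fprintf(stderr, "%s failed: %s\n", what, cudaGetErrorString(err));
        std::exit(EXIT_FAILURE);
    }
}

int main()
{
    const size_t bytes = size_t(1) << 30;  // 1 GiB per allocation (assumed, as in the report)
    const int iterations = 8;              // arbitrary; the report failed around the 7th repeat

    for (int i = 0; i < iterations; ++i) {
        float* gpudata = nullptr;
        size_t freeBefore = 0, freeAfter = 0, total = 0;

        check(cudaMemGetInfo(&freeBefore, &total), "cudaMemGetInfo");
        check(cudaMalloc((void**)&gpudata, bytes), "cudaMalloc");
        check(cudaFree(gpudata), "cudaFree");
        check(cudaMemGetInfo(&freeAfter, &total), "cudaMemGetInfo");

        // If cudaFree really releases the memory, the two values should be close.
        std::printf("iteration %d: free before %zu MiB, free after %zu MiB\n",
                    i, freeBefore >> 20, freeAfter >> 20);
    }
    return 0;
}
```

Running the same binary from a console and under the debugger would show whether the behavior depends on how the process is started.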

Christian

337229260 · July 18, 2018, 9:10am

Thank you for your answer! Sorry for the trouble: the code above is not my actual source code. Please use the following code instead.
I must say that when I simply RUN the following code, there is NO error. The problem shows up only in DEBUG mode, when I use STEP OVER.

#include <cuda_runtime.h>
#include <helper_cuda.h>   // checkCudaErrors; assumed to be the helper from the CUDA samples

#define DATASIZE (16384*8192)

int main()
{
    float *gpudata1, *gpudata2, *gpudata3, *gpudata4,
          *gpudata5, *gpudata6, *gpudata7, *gpudata8;

    // Each pair allocates sizeof(float) * DATASIZE * 2 bytes and frees it again.
    checkCudaErrors(cudaMalloc((void**) &gpudata1, sizeof(float) * DATASIZE * 2));
    checkCudaErrors(cudaFree(gpudata1));
    checkCudaErrors(cudaMalloc((void**) &gpudata2, sizeof(float) * DATASIZE * 2));
    checkCudaErrors(cudaFree(gpudata2));
    checkCudaErrors(cudaMalloc((void**) &gpudata3, sizeof(float) * DATASIZE * 2));
    checkCudaErrors(cudaFree(gpudata3));
    checkCudaErrors(cudaMalloc((void**) &gpudata4, sizeof(float) * DATASIZE * 2));
    checkCudaErrors(cudaFree(gpudata4));
    checkCudaErrors(cudaMalloc((void**) &gpudata5, sizeof(float) * DATASIZE * 2));
    checkCudaErrors(cudaFree(gpudata5));
    checkCudaErrors(cudaMalloc((void**) &gpudata6, sizeof(float) * DATASIZE * 2));
    checkCudaErrors(cudaFree(gpudata6));
    checkCudaErrors(cudaMalloc((void**) &gpudata7, sizeof(float) * DATASIZE * 2)); // SIGKILL
    checkCudaErrors(cudaFree(gpudata7));
    checkCudaErrors(cudaMalloc((void**) &gpudata8, sizeof(float) * DATASIZE * 2));
    checkCudaErrors(cudaFree(gpudata8));
    return 0;
}

When I step to the line “cudaMalloc(gpudata7)”, my GPU memory usage has already reached 7197MB, and I can’t step over it, because the display turns off due to insufficient memory.
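
For reference, the size of each request in the code above can be worked out with a few lines of host code. This is only a sketch; the function name bytes_per_allocation is made up for illustration.

```cuda
#include <cstdio>

#define DATASIZE (16384*8192)

// Bytes requested by each cudaMalloc call in the code above:
// sizeof(float) * DATASIZE * 2.
constexpr unsigned long long bytes_per_allocation()
{
    return sizeof(float) * static_cast<unsigned long long>(DATASIZE) * 2;
}

int main()
{
    const unsigned long long mib = bytes_per_allocation() >> 20;
    std::printf("per allocation:    %llu MiB\n", mib);
    std::printf("six allocations:   %llu MiB\n", 6 * mib);
    std::printf("seven allocations: %llu MiB\n", 7 * mib);
    return 0;
}
```

Each call requests 1024 MiB, so six allocations that were never released would account for 6144 MiB of the 7197MB seen before the seventh cudaMalloc. The remainder is presumably the system's baseline usage, which is an interpretation rather than something measured here. A seventh 1 GiB request on top of that would not fit in the 8 GB of memory that the TX2 shares between CPU and GPU.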

cbuchner1 · July 18, 2018, 9:18am

This is bizarre. You could file a bug at https://developer.nvidia.com/ to have NVIDIA look into it.

They will probably need to know the exact driver and CUDA toolkit versions you are using, plus the above repro code and a description of how to reproduce it exactly (i.e. which compiler arguments you used).
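
One way to collect the version numbers for such a report is to query them from the CUDA runtime. This is a minimal sketch; the output format is arbitrary, and it does not cover the Jetson system software release, which would have to be looked up separately.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    int driverVersion = 0;
    int runtimeVersion = 0;

    cudaError_t err = cudaDriverGetVersion(&driverVersion);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaDriverGetVersion failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    err = cudaRuntimeGetVersion(&runtimeVersion);
    if (err != cudaSuccess) {
        std::fprintf(stderr, "cudaRuntimeGetVersion failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Versions are encoded as 1000 * major + 10 * minor, e.g. 9020 for 9.2.
    std::printf("CUDA version supported by driver: %d.%d\n",
                driverVersion / 1000, (driverVersion % 1000) / 10);
    std::printf("CUDA runtime version:             %d.%d\n",
                runtimeVersion / 1000, (runtimeVersion % 1000) / 10);
    return 0;
}
```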

337229260 · July 18, 2018, 9:21am

Thanks! I will ask them for some advice.

cbuchner1 · July 18, 2018, 9:26am

The exact bug report URL is this: https://developer.nvidia.com/nvidia_bug/add

“File attachments are currently not supported on this form - please send as attachments via email to NVSDKIssues@nvidia.com referencing the Bug ID listed on the My Bugs section of My Account.”

What strikes me as odd is this statement:

For Jetson Platform issues post your question on the NVIDIA Developer Forums.

Does this mean Jetson support issues are generally not handled on developer.nvidia.com?

The FAQ for Jetson states this:

How can I get support for my Jetson Developer Kit or module?

See this link for available support: https://developer.nvidia.com/embedded/support

Christian

337229260 · July 18, 2018, 9:34am

OK, I’ll try. Thanks!
