I have an issue when running a small CUDA test of “cudaMalloc” and “cudaFree” on a TX2.
Each time I allocate memory on the GPU, the call returns “success”. I then call “cudaFree” on it, and that also returns “success”. But when I use “tegrastats” to check the usage, it has not gone down, which suggests the memory I just freed has not become available again. To be concrete: suppose I allocate 1GB with “cudaMalloc” and the RAM usage is 1GB; after “cudaFree” succeeds, the RAM usage is still 1GB, which seems wrong to me.
Even worse, if I repeat these two steps (cudaMalloc, then cudaFree, of 1GB) 7 times, the program is KILLED.
By the way, when I allocate memory on the CPU with “malloc” and release it with “free”, everything behaves correctly.
So my question is: why doesn’t cudaFree actually release the memory on the GPU?
Following is my code. Thanks!!!
#define DATASIZE 1048576
#define blocks 1024
#define threads 8
err = cudaMalloc((void**) &gpudata, sizeof(float) * DATASIZE);
err = cudaFree(gpudata);
// Datanumber<<<blocks, threads>>>(gpudata);
__global__ void Datanumber(int *num)
{
    int tid = threadIdx.x;
    int bid = blockIdx.x;
    for (int i = bid * blockDim.x + tid; i < DATASIZE; i += blocks * threads)
        num[i] = i;
}
I am sure that cudaFree succeeded, because if I launch a kernel after it, “gpudata” can no longer be used. But the GPU usage didn’t go down, and the freed memory is still unavailable.
(The platform I use is Nsight Eclipse 9.2, and all data was obtained in its debug mode. My GPU is an NVIDIA Jetson TX2.)
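To make the malloc/free cycle described above easier to reproduce, here is a minimal standalone sketch that repeats it and prints what the CUDA runtime itself reports through cudaMemGetInfo after each step. The 1GB size and the 7 iterations are taken from my description; the helper name reportMemory and the output format are just illustrative choices, and I am not assuming that the runtime's numbers will match what tegrastats shows on a Jetson.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: print free/total memory as seen by the CUDA runtime.
static void reportMemory(const char *label)
{
    size_t freeBytes = 0, totalBytes = 0;
    cudaError_t err = cudaMemGetInfo(&freeBytes, &totalBytes);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMemGetInfo failed: %s\n", cudaGetErrorString(err));
        exit(EXIT_FAILURE);
    }
    printf("%-13s free = %zu MiB, total = %zu MiB\n",
           label, freeBytes >> 20, totalBytes >> 20);
}

int main()
{
    const size_t kBytes = (size_t)1 << 30;  // 1 GiB per allocation (assumed from the post)
    const int kIterations = 7;              // number of repeats mentioned in the post

    reportMemory("at start:");
    for (int i = 0; i < kIterations; ++i) {
        float *gpudata = nullptr;

        cudaError_t err = cudaMalloc((void **)&gpudata, kBytes);
        if (err != cudaSuccess) {
            fprintf(stderr, "iteration %d: cudaMalloc failed: %s\n",
                    i, cudaGetErrorString(err));
            return EXIT_FAILURE;
        }
        reportMemory("after malloc:");

        err = cudaFree(gpudata);
        if (err != cudaSuccess) {
            fprintf(stderr, "iteration %d: cudaFree failed: %s\n",
                    i, cudaGetErrorString(err));
            return EXIT_FAILURE;
        }
        reportMemory("after free:");
    }
    return EXIT_SUCCESS;
}
```

Building this with nvcc and running it outside the debugger, with tegrastats open in another terminal, would show whether the runtime's free-memory figure recovers after each cudaFree even when the tegrastats RAM figure does not.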