VRam Leak when calling npp functions

oippoo · December 29, 2023, 12:36pm

I am calling functions such as:

nppiCopyWrapBorder_16u_C1R_Ctx
nppiFilterMedian_16u_C1R_Ctx
nppiFilterGauss_32f_C1R_Ctx
…
from different threads, and I set the same Ctx for every thread.
I found that some functions return success but do not produce the output I want, and then I noticed the leak.

With compute-sanitizer, I just got CUDA ERROR #719 (using FilterMedian kernel #2): unspecified launch failure.

I am running my code on an RTX A4000 with driver 522.06, CUDA 11.8, and ECC on.

Robert_Crovella · December 29, 2023, 7:10pm

Running your code under compute-sanitizer and getting error 719 means you are doing something wrong, perhaps in the arrangement of the arguments you are passing to the function.

Given that, declaring a leak or not isn’t sensible, in my opinion. If your code is performing illegal behavior, you should fix that first.

oippoo · December 30, 2023, 1:29am

Thanks, but without compute-sanitizer, all functions return NPP_NO_ERROR, and the results are correct.
What should I do then?

Robert_Crovella · December 30, 2023, 2:22am

That seems to contradict your earlier statement: "I found some some functions return with success but not what i want".

I have to admit I’m not really sure what your situation is. You’ve given no indication of how you are determining there is a leak, and you seem to be making contradictory statements about whether you are getting the results you expect, or not.

In any event, if you run a particular npp function call, and under compute-sanitizer you get an error 719, then you should double-check all arguments you are passing for correctness.
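
As an illustration of what double-checking the arguments could look like for a border-copy call, here is a minimal host-side sketch. The `ImageArgs` struct and `checkCopyBorderArgs` function are hypothetical helpers, not part of NPP, and the rules they enforce (each step covers at least one row, the destination ROI is at least the source ROI plus the top and left borders, each allocation covers step times height) are assumptions about what a consistent call needs rather than NPP's documented validation.

```cpp
#include <cstddef>
#include <string>

// Hypothetical description of one image buffer; not an NPP type.
struct ImageArgs {
    std::size_t allocBytes;  // bytes actually allocated for the buffer
    std::size_t stepBytes;   // row pitch in bytes, as passed for nSrcStep / nDstStep
    int width;               // ROI width in pixels
    int height;              // ROI height in pixels
};

// Returns an empty string when the arguments look consistent,
// otherwise a short description of the first problem found.
std::string checkCopyBorderArgs(const ImageArgs& src, const ImageArgs& dst,
                                int top, int left, std::size_t pixelBytes) {
    if (src.width <= 0 || src.height <= 0 || dst.width <= 0 || dst.height <= 0)
        return "non-positive ROI size";
    if (top < 0 || left < 0)
        return "negative border size";
    if (src.stepBytes < static_cast<std::size_t>(src.width) * pixelBytes)
        return "source step smaller than one row";
    if (dst.stepBytes < static_cast<std::size_t>(dst.width) * pixelBytes)
        return "destination step smaller than one row";
    if (dst.width < src.width + left)
        return "destination ROI too narrow for the left border";
    if (dst.height < src.height + top)
        return "destination ROI too short for the top border";
    if (src.allocBytes < src.stepBytes * static_cast<std::size_t>(src.height))
        return "source allocation smaller than step * height";
    if (dst.allocBytes < dst.stepBytes * static_cast<std::size_t>(dst.height))
        return "destination allocation smaller than step * height";
    return "";
}
```

A check like this only catches inconsistent sizes on the host. It says nothing about whether the pointers are valid device pointers or whether the stream in the context is still alive.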

oippoo · December 31, 2023, 3:24am

Here is what I did:

uint16_t *x,*y;
int h;
int w;
int step;
NppStatus s;
...

// malloc with cudaMalloc
cudaMalloc((void**)&x, h*w*sizeof(uint16_t));
cudaMalloc((void**)&y, (h+2)*w*sizeof(uint16_t));
s = nppiCopyWrapBorder_16u_C1R_Ctx(x, w*sizeof(uint16_t), {w,h}, y, w*sizeof(uint16_t), {w,h+2}, 1, 0, Ctx);

// malloc with nppiMalloc
x = nppiMalloc_16u_C1(w, h, &step);
y = nppiMalloc_16u_C1(w, h+2, &step);
s = nppiCopyWrapBorder_16u_C1R_Ctx(x, step, {w,h}, y, step, {w,h+2}, 1, 0, Ctx);

When I allocate with cudaMalloc, y is what I want,
but when I allocate with nppiMalloc, y is not what I want.
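
One way to pin down what "not what I want" means is to compare y against a CPU reference that honors the row step. The sketch below is a hypothetical helper, not NPP code, and it assumes that "wrap" means periodic tiling of the source image. The point it illustrates is that with a padded step, each row must be addressed by byte offset (row * step), so a buffer from nppiMalloc copied back and read as if rows were tightly packed will look wrong even when the NPP call itself behaved.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical CPU reference for a 16-bit single-channel wrap-border copy.
// Assumption: border pixels are taken from the source as if it were tiled periodically.
// srcStepBytes / dstStepBytes are row pitches in bytes and may include padding.
void copyWrapBorderRef(const std::uint16_t* src, std::size_t srcStepBytes, int srcW, int srcH,
                       std::uint16_t* dst, std::size_t dstStepBytes, int dstW, int dstH,
                       int top, int left) {
    const unsigned char* srcBytes = reinterpret_cast<const unsigned char*>(src);
    unsigned char* dstBytes = reinterpret_cast<unsigned char*>(dst);
    for (int dy = 0; dy < dstH; ++dy) {
        // Map the destination row back into the source, wrapping around.
        const int sy = ((dy - top) % srcH + srcH) % srcH;
        const std::uint16_t* srcRow =
            reinterpret_cast<const std::uint16_t*>(srcBytes + sy * srcStepBytes);
        std::uint16_t* dstRow =
            reinterpret_cast<std::uint16_t*>(dstBytes + dy * dstStepBytes);
        for (int dx = 0; dx < dstW; ++dx) {
            const int sx = ((dx - left) % srcW + srcW) % srcW;
            dstRow[dx] = srcRow[sx];  // padding beyond dstW is left untouched
        }
    }
}
```

To use it for comparison, the device buffer would be copied back with the same step it was allocated with (for example with a pitched copy) and then checked element by element against the reference output.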
I check the free memory with cudaMemGetInfo before and after calling nppiCopyWrapBorder_16u_C1R_Ctx.
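
A single before/after reading can be misleading, since a drop in free memory after the first call to a library function may reflect one-time allocations rather than a leak. A more telling measurement repeats the call many times after a warm-up and looks at whether free memory keeps falling. The sketch below shows that idea in host-only form; `sampleFreeMemory` and `netLossBytes` are hypothetical names, and `queryFree` stands in for a wrapper around cudaMemGetInfo that returns the free byte count.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: runs `work` a few times as warm-up, then records the
// free-memory reading before the measured loop and after every iteration.
// In a real program `work` would issue the NPP call and synchronize, and
// `queryFree` would wrap cudaMemGetInfo (both are assumptions of this sketch).
std::vector<std::size_t> sampleFreeMemory(const std::function<void()>& work,
                                          const std::function<std::size_t()>& queryFree,
                                          int warmup, int iterations) {
    for (int i = 0; i < warmup; ++i) work();  // absorb one-time allocations
    std::vector<std::size_t> samples;
    samples.reserve(static_cast<std::size_t>(iterations) + 1);
    samples.push_back(queryFree());
    for (int i = 0; i < iterations; ++i) {
        work();
        samples.push_back(queryFree());
    }
    return samples;
}

// Bytes of free memory lost between the first and last sample (0 if none lost).
std::size_t netLossBytes(const std::vector<std::size_t>& samples) {
    if (samples.size() < 2) return 0;
    return samples.front() > samples.back() ? samples.front() - samples.back() : 0;
}
```

A net loss that grows with the iteration count is worth reporting together with the loop that produced it. A loss that disappears once the warm-up is excluded points to initialization cost instead.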

Thanks for your help

Robert_Crovella · December 31, 2023, 7:08pm

I wouldn’t be able to comment further without a complete example.
