How to share CUDA memory between 2 threads?

I am using CUDA to decode video. So that it does not affect my main process, I do the decoding in a separate thread with a while loop. In my case, I allocate some GPU memory with cudaMalloc in the main thread and use it in the decode thread to store the result, which I then consume in the main thread. The results are correct, but GPU memory keeps increasing even though there is no malloc operation in the loop. I have used the cuda-memcheck tool to look for a leak, and it reports none. I have searched a lot, and it seems CUDA cannot be used this way. Is that right?
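To make the setup concrete, here is a stripped-down sketch of what I mean (the kernel, frame size, and names are placeholders, not my real decoder):

```cpp
// Sketch of the pattern: the main thread allocates a device buffer once, a
// worker thread writes into it in a loop, and the main thread reads the result.
// Since CUDA 4.0 the runtime API shares one context per device across host
// threads, so a pointer from cudaMalloc in one thread is valid in the others.
#include <cuda_runtime.h>
#include <atomic>
#include <thread>
#include <vector>
#include <cstdio>

__global__ void fakeDecode(unsigned char* frame, size_t n)
{
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n) frame[i] = (unsigned char)(i & 0xFF);   // stand-in for real decoded pixels
}

int main()
{
    const size_t frameBytes = 1920 * 1080 * 3 / 2;     // e.g. one NV12 frame
    unsigned char* d_frame = nullptr;
    cudaMalloc(&d_frame, frameBytes);                  // allocated once, in the main thread

    std::atomic<bool> stop{false};

    // "Decode" thread: reuses the same device buffer every iteration, no further cudaMalloc.
    std::thread decoder([&] {
        while (!stop.load()) {
            fakeDecode<<<(unsigned)((frameBytes + 255) / 256), 256>>>(d_frame, frameBytes);
            cudaDeviceSynchronize();
        }
    });

    // Main thread: copies the latest result back from the same pointer and uses it.
    std::vector<unsigned char> h_frame(frameBytes);
    for (int i = 0; i < 10; ++i)
        cudaMemcpy(h_frame.data(), d_frame, frameBytes, cudaMemcpyDeviceToHost);

    stop = true;
    decoder.join();
    cudaFree(d_frame);
    printf("first byte of last frame: %d\n", h_frame[0]);
    return 0;
}
```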

CUDA can be used that way. You’re doing something in the decode thread that is allocating GPU memory. It may be a library call, or another routine such as cudaEventCreate or cudaStreamCreate. You’ll need to discover what it is.
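One way to narrow it down is to sample free device memory with cudaMemGetInfo() around each suspect call in the decode loop and log any drop. A rough sketch, where decodeOneFrame() is just a hypothetical stand-in for whatever your loop actually calls:

```cpp
#include <cuda_runtime.h>
#include <cstdio>

// Returns the current amount of free device memory in bytes.
static size_t freeDeviceBytes()
{
    size_t freeB = 0, totalB = 0;
    cudaMemGetInfo(&freeB, &totalB);
    return freeB;
}

// Stand-in for the real per-frame work; here it "accidentally" allocates and
// never frees, which is the kind of thing the check below catches.
static void decodeOneFrame()
{
    void* leak = nullptr;
    cudaMalloc(&leak, 1 << 20);
}

int main()
{
    for (int frame = 0; frame < 5; ++frame) {
        size_t before = freeDeviceBytes();
        decodeOneFrame();
        size_t after = freeDeviceBytes();
        if (after < before)
            printf("frame %d: free memory dropped by %zu bytes\n",
                   frame, before - after);
    }
    return 0;
}
```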

Thanks, I just found a bug in my code