CUDA Multiple Host threads

Hi,

I have an application that captures images from several cameras, with one thread per camera.

I’m trying to apply a median filter with nppiFilterMedian_16u_C1R() to each frame, which works fine until I shut down the application.
When I call cudaFree() on the image buffer for one camera, the subsequent filter calls for the other cameras fail with error code 700, i.e. illegal memory access.

Pseudo code

camera thread:

cudaMalloc(&imageBufferPointer, frameSize);

while(running) {
  fetchImage(imageBufferPointer);
  applyFilter(imageBufferPointer);
}

cudaFree(imageBufferPointer);

applyFilter():

std::mutex mtx;
void applyFilter(uint16_t *imageBufferPointer) {
  // The guard needs a name; "std::scoped_lock(mtx);" would not lock mtx.
  std::scoped_lock lock(mtx);
  nppiFilterMedian_16u_C1R(params);
}
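
A note on the lock guard in applyFilter(): the guard has to be a named variable. Written as "std::scoped_lock(mtx);", the statement is parsed as a declaration of a new local variable called mtx that wraps no mutex at all, so the global mutex is never taken. The sketch below is only an illustration of the named form. TracingMutex, heldDuringFilter() and heldAfterFilter() are hypothetical names, and the toy mutex merely records whether it is held so the guard's lifetime can be observed without threads or CUDA.

```cpp
#include <mutex>

// Hypothetical stand-in for std::mutex: it only records whether it is held,
// so the lifetime of the guard can be observed without spawning threads.
struct TracingMutex {
    bool held = false;
    void lock() { held = true; }
    void unlock() { held = false; }
};

TracingMutex mtx;

// Hypothetical stand-in for applyFilter(): returns whether mtx was held
// at the point where the NPP call would run.
bool heldDuringFilter() {
    std::scoped_lock lock(mtx);  // named guard: released at the closing brace
    return mtx.held;             // the filter call would go here
}

// Runs the function above and reports whether mtx is still held afterwards.
bool heldAfterFilter() {
    heldDuringFilter();
    return mtx.held;
}
```

With a real std::mutex the same pattern keeps the filter call serialized across the camera threads. It does not by itself order the filter call against a cudaFree() issued outside the lock.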

I’m struggling to see why freeing the imageBufferPointer in one thread affects the nppiFilterMedian_16u_C1R call in another.

Thanks

Hi,

Please check our example below:

The NPP image should be released by nppiFree.
Do you wrap it from a CUDA array?

Are you able to share a complete source for us to check?

Thanks.

Hi @AastaLLL

I was passing the imageBufferPointer directly to the nppiFilterMedian_16u_C1R() function.
So like:

cudaMalloc(&imageBufferPointer, frameSize);
cudaMalloc(&dstPointer, frameSize);
nppiFilterMedian_16u_C1R(imageBufferPointer, pitch, dstPointer, dstPitch, ....);

But I should probably have used npp::ImageNPP_16u_C1 in some way, and then nppiFree().

I’ll look into it again when I’ve got time to do so.

Thanks

Hi,

Sorry, we just double-checked the API: the buffer type should be Npp16u rather than ImageNPP_16u_C1.

https://docs.nvidia.com/cuda/archive/11.4.0/npp/group__image__filter__median.html#ga6065eede18df92ca7f0e4329513b5d06

NppStatus nppiFilterMedian_16u_C1R(const Npp16u *pSrc,
                                   Npp32s        nSrcStep,
                                   Npp16u       *pDst,
                                   Npp32s        nDstStep,
                                   NppiSize      oSizeROI,
                                   NppiSize      oMaskSize,
                                   NppiPoint     oAnchor,
                                   Npp8u        *pBuffer);

Do you get the expected result before closing the app?
It looks as if releasing one buffer somehow also frees the memory of the other buffer.

Could you update your data type from uint16_t to unsigned short and try it again?
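
Whether that change can make any difference depends on the toolchain, and a compile-time check is enough to find out. The sketch below assumes that NPP defines Npp16u as unsigned short. Npp16uLike is a local stand-in alias so the snippet compiles without the CUDA toolkit, and uint16MatchesNpp16u() is a hypothetical helper name.

```cpp
#include <cstdint>
#include <type_traits>

// Local alias standing in for Npp16u (assumed to be unsigned short),
// so this sketch does not need the NPP headers.
using Npp16uLike = unsigned short;

// True when uint16_t and the NPP-style alias name the very same type,
// in which case switching between them changes nothing for the compiler.
constexpr bool uint16MatchesNpp16u() {
    return std::is_same_v<std::uint16_t, Npp16uLike>;
}
```

If the check is true on your platform, a uint16_t* buffer can be passed where Npp16u* is expected without a cast, and the cause of the error is probably elsewhere.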

Thanks.
