I have an application that requires histograms of 4-channel data, and rather than code it up myself, I thought I might try to use NPP, in particular the nppiHistogramEven_16u_C4R function. But this is the first time I've used these functions, and the example code for histogram equalization only handles single-channel images, which have a somewhat simpler calling model.
In particular, I'm having difficulty figuring out how to allocate the histogram arrays (pHist in the prototypes). They are declared as Npp32s *pHist[4], and my understanding is that all the arguments passed to NPP functions must live in device memory. But I've tried a few things, and I can't seem to get it defined properly.
I’ve tried:
- defining a buffer like:
__device__ Npp32s *histData[4];
and then doing a loop to call cudaMalloc() for each entry…
- defining a buffer like:
__device__ Npp32s histData[4][256];
- using cudaMalloc() to allocate device space for 4 pointers, and then looping, calling cudaMalloc() for each of the four entries.
None of these seems to work. Can someone point me at a working example of multi-channel histograms?
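For anyone comparing notes, here is a minimal sketch of the layout that matches the prototype as I now read it. It rests on an assumption I have not seen spelled out in the examples: the pHist array itself is an ordinary host array, and only the four pointers stored in it refer to device memory. The level counts and lower/upper bounds are likewise plain host arrays, and only the image, the bins, and the scratch buffer live on the device. The helper names histogramSketch and scratchBytes, the 64x32 synthetic image, and the 257-level setup are made up for illustration, and error checking is left out to keep it short.

```cuda
#include <cstddef>
#include <vector>
#include <cuda_runtime.h>
#include <npp.h>

const int kLevels = 257;        // 257 evenly spaced level boundaries
const int kBins = kLevels - 1;  // gives 256 bins per channel

// The buffer-size query takes int* in older toolkits and (I believe) size_t*
// in newer ones, so let the compiler deduce the type rather than hard-code it.
template <typename SizeT>
size_t scratchBytes(NppStatus (*query)(NppiSize, int *, SizeT *),
                    NppiSize roi, int *levels) {
    SizeT n = 0;
    query(roi, levels, &n);
    return static_cast<size_t>(n);
}

// Hypothetical helper: histograms a synthetic 4-channel 16-bit image and
// stores the sum of all bins for each channel in totals[0..3].
NppStatus histogramSketch(long long totals[4]) {
    const int width = 64, height = 32, channels = 4;
    NppiSize roi = {width, height};

    // Synthetic interleaved image, copied into pitched device memory.
    std::vector<Npp16u> hostImg(width * height * channels);
    for (size_t i = 0; i < hostImg.size(); ++i)
        hostImg[i] = static_cast<Npp16u>((i * 37) % 65536);
    const size_t rowBytes = width * channels * sizeof(Npp16u);
    Npp16u *dImg = nullptr;
    size_t pitch = 0;
    cudaMallocPitch(reinterpret_cast<void **>(&dImg), &pitch, rowBytes, height);
    cudaMemcpy2D(dImg, pitch, hostImg.data(), rowBytes, rowBytes, height,
                 cudaMemcpyHostToDevice);

    // Key point (assumption): dHist is a HOST array; its entries are DEVICE pointers.
    Npp32s *dHist[4] = {nullptr, nullptr, nullptr, nullptr};
    for (int c = 0; c < 4; ++c)
        cudaMalloc(reinterpret_cast<void **>(&dHist[c]), kBins * sizeof(Npp32s));

    // Level counts and value ranges are plain host arrays as well.
    int levels[4] = {kLevels, kLevels, kLevels, kLevels};
    Npp32s lower[4] = {0, 0, 0, 0};
    Npp32s upper[4] = {65536, 65536, 65536, 65536};

    // Device scratch buffer, sized by the matching GetBufferSize query.
    Npp8u *dScratch = nullptr;
    cudaMalloc(reinterpret_cast<void **>(&dScratch),
               scratchBytes(nppiHistogramEvenGetBufferSize_16u_C4R, roi, levels));

    NppStatus status = nppiHistogramEven_16u_C4R(
        dImg, static_cast<int>(pitch), roi, dHist, levels, lower, upper, dScratch);
    cudaDeviceSynchronize();

    // Copy each channel's bins back to the host and add them up.
    std::vector<Npp32s> hostHist(kBins);
    for (int c = 0; c < 4; ++c) {
        cudaMemcpy(hostHist.data(), dHist[c], kBins * sizeof(Npp32s),
                   cudaMemcpyDeviceToHost);
        totals[c] = 0;
        for (int b = 0; b < kBins; ++b) totals[c] += hostHist[b];
        cudaFree(dHist[c]);
    }
    cudaFree(dScratch);
    cudaFree(dImg);
    return status;
}
```

Calling histogramSketch from main and printing totals is a quick sanity check: since the level range covers every 16-bit value, I would expect each channel's total to equal the pixel count. On recent toolkits this probably needs to be linked against the NPP statistics and core libraries (something like -lnppist -lnppc), though the library split differs between toolkit versions.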