Hi there.
I started with the histogram part (or at least tried to)… and have failed in 2 different ways:
1st approach:
I reused the histogram code example from the SDK: I adapted it to my problem, compiled it and ran it. The problem is that the example computes the 256-bin histogram from an “image” (random values, actually) that is like a grayscale image (continuous values). I have an RGB color image, and I have failed to adapt the code to it. Why? Because when I compare the GPU and CPU results, the values are very different. This is how it is being done:
inline __device__ void addByte(volatile uint *s_WarpHist, uint dataR, uint dataG, uint dataB, uint threadTag)
{
    uint count, H, S, V, quantiz;
    // Normalization of bins
    int factor = 0, ibinwert;
    float binwert;
    factor = 0x7ff; // NoBitsProBin=11, factor = 2047 (decimal)
    do
    {
        RGB_To_HSV(dataR, dataG, dataB, &H, &S, &V); // convert the given RGB values to HSV, as it is done on the CPU
        quantiz = QuantScalableUniform1(H, S, V); // quantize the HSV values to get the intended bin index
        cuPrintf("[H S V]: [%d %d %d]\n", H, S, V); // actually, this is not printing anything at the moment :S
        count = s_WarpHist[quantiz] & TAG_MASK;
        count = threadTag | (count + 1);
        s_WarpHist[quantiz] = count;
    } while (s_WarpHist[quantiz] != count);
}
And in
__global__ void histogram256Kernel(uint *d_PartialHistograms, uint *d_Data, uint dataCount)
I'm trying to do:
....
....
for (uint pos = UMAD(blockIdx.x, blockDim.x, threadIdx.x); pos < dataCount; pos += UMUL(blockDim.x, gridDim.x))
{
    uint dataR = d_Data[pos];
    uint dataG = d_Data[pos+1];
    uint dataB = d_Data[pos+2];
    addWord(s_WarpHist, dataR, dataG, dataB, tag);
}
.....
.....
The omitted code is exactly the SAME as in the SDK histogram example. How can I change it to compute a correct RGB histogram?
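One thing worth checking is the indexing. If d_Data holds interleaved R, G, B values (one value per element), pixel number pos starts at element 3 * pos, whereas the loop above reads pos, pos+1 and pos+2, so neighbouring threads would read overlapping values. Below is a minimal host-side sketch under that layout assumption, usable as a CPU reference to compare the GPU result against. quantizeRGB is a hypothetical stand-in for RGB_To_HSV + QuantScalableUniform1 (their code is not shown here), and histogram256 is likewise an illustrative name.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for RGB_To_HSV + QuantScalableUniform1 (not shown in
// the post): packs 3 bits of R, 3 bits of G and 2 bits of B into an index 0..255.
inline unsigned quantizeRGB(std::uint8_t r, std::uint8_t g, std::uint8_t b)
{
    return ((r >> 5) << 5) | ((g >> 5) << 2) | (b >> 6);
}

// Reference 256-bin histogram over an interleaved RGB buffer (R,G,B,R,G,B,...).
// pixelCount is the number of pixels, so the buffer holds 3 * pixelCount bytes.
std::vector<unsigned> histogram256(const std::uint8_t* rgb, std::size_t pixelCount)
{
    std::vector<unsigned> hist(256, 0u);
    for (std::size_t pos = 0; pos < pixelCount; ++pos) {
        const std::uint8_t* p = rgb + 3 * pos;  // pixel pos starts at 3 * pos
        ++hist[quantizeRGB(p[0], p[1], p[2])];
    }
    return hist;
}
```

In a kernel the same idea would mean treating dataCount as the number of pixels and reading d_Data[3 * pos], d_Data[3 * pos + 1] and d_Data[3 * pos + 2], provided the device buffer really uses this layout.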
2nd approach:
OK. Since I failed to get the 256-bin histogram from a video frame converted to RGB, I tried to use the NPP histogram… once again, without luck. Why? Because the given example is, once again, for grayscale images!
In that example, the image is read from a file:
// declare a host image object for an 8-bit grayscale image
npp::ImageCPU_8u_C1 oHostSrc;
// load gray-scale image from disk
npp::loadImage(fileName, oHostSrc);
// declare a device image and copy construct from the host image,
// i.e. upload host to device
npp::ImageNPP_8u_C1 oDeviceSrc(oHostSrc);
As my image is already in memory, I can't call npp::loadImage(...). Another problem is that I'll probably have to use the ImageNPP_8u_C3 type (since it's RGB), but I only have nppiHistogramEven_8u_C1R or nppiHistogramEven_8u_C4R, and nothing in between!
The steps I'm taking are:
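Since the frame is already in memory, nothing has to go through npp::loadImage; the pixels only need to be in a layout that the chosen NPP function accepts. Here is a minimal sketch, assuming the frame is packed 8-bit RGB with a row stride given in bytes (ffmpeg exposes this as linesize[0] next to data[0]). It pads every pixel to 4 channels so that a C4R variant could be used. rgb24ToRgba is a hypothetical helper, not part of NPP or ffmpeg, and the resulting buffer would still have to be copied to device memory.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: expands packed RGB rows (srcStride bytes apart) into a
// tightly packed RGBA buffer of width * height * 4 bytes, alpha set to 255.
std::vector<std::uint8_t> rgb24ToRgba(const std::uint8_t* src, int srcStride,
                                      int width, int height)
{
    std::vector<std::uint8_t> dst(static_cast<std::size_t>(width) * height * 4);
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* row = src + static_cast<std::size_t>(y) * srcStride;
        std::uint8_t* out = dst.data() + static_cast<std::size_t>(y) * width * 4;
        for (int x = 0; x < width; ++x) {
            out[4 * x + 0] = row[3 * x + 0];  // R
            out[4 * x + 1] = row[3 * x + 1];  // G
            out[4 * x + 2] = row[3 * x + 2];  // B
            out[4 * x + 3] = 255;             // padding channel
        }
    }
    return dst;
}
```

The row pitch of the packed result is width * 4 bytes, which is the value a pitch/step argument would need if this buffer is uploaded as is.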
npp::ImageCPU_8u_C1 oDeviceSrc((unsigned int)*pFrameRGB->data[0],2u);
// pFrameRGB->data[0] is my RGB image source (coming from ffmpeg linux usage)
NppiSize oSizeROI = {oDeviceSrc.width(), oDeviceSrc.height()};
int nDeviceBufferSize;
nppiHistogramEvenGetBufferSize_8u_C1R(oSizeROI, levelCount ,&nDeviceBufferSize);
Npp8u * pDeviceBuffer;
NPP_CHECK_CUDA(cudaMalloc((void **)&pDeviceBuffer, nDeviceBufferSize));
// compute levels values on host
Npp32s levelsHost[levelCount];
NPP_CHECK_NPP(nppiEvenLevelsHost_32s(levelsHost, levelCount, 0, binCount));
// compute the histogram
NPP_CHECK_NPP(nppiHistogramEven_8u_C1R(oDeviceSrc.data(), oDeviceSrc.pitch(), oSizeROI,
                                       histDevice, levelCount, 0, binCount,
                                       pDeviceBuffer));
// copy histogram and levels to host memory
Npp32s histHost[binCount];
NPP_CHECK_CUDA(cudaMemcpy(histHost, histDevice, binCount * sizeof(Npp32s), cudaMemcpyDeviceToHost));
After this, “histHost” is different from the host version. :S Any tips / help, please?
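Part of the difference may come from what is being compared: the nppiHistogramEven_* functions count intensity values per channel in evenly spaced bins, which is not the same thing as a histogram over quantized HSV indices. Below is a minimal host-side sketch of a per-channel even histogram for interleaved 8-bit data, assuming values from lower (inclusive) to upper (exclusive) are split into binCount bins of (nearly) equal width. evenHistogram is a hypothetical name, and the exact bin boundaries NPP uses should be checked against the levels it returns.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference: histogram of one channel of an interleaved 8-bit
// image. Values in [lower, upper) are split into binCount bins of (nearly)
// equal width; values outside that range are ignored.
std::vector<int> evenHistogram(const std::uint8_t* data, std::size_t pixelCount,
                               int channels, int channel,
                               int lower, int upper, int binCount)
{
    std::vector<int> hist(binCount, 0);
    for (std::size_t i = 0; i < pixelCount; ++i) {
        const int v = data[i * channels + channel];
        if (v < lower || v >= upper) continue;
        const int bin = (v - lower) * binCount / (upper - lower);
        ++hist[bin];
    }
    return hist;
}
```

With lower = 0, upper = 256 and binCount = 256, each bin simply counts one intensity value, which gives a host result that a single-channel GPU histogram can be compared against.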
[EDIT]:
I’ve already solved the first attempt at computing the histogram, using the example code from the SDK. The trick was to convert the image (a 2D structure) into a 1D vector. After that (and after paying attention to the data types), I got it working.
But I would seriously like to accomplish the same with the NVIDIA Performance Primitives. Can ANYONE help, please? I can't find proper examples, because there are only a few of them and… it seems that not everyone is able to explain NPP properly.