I am using the CUDA SDK’s histogram256 function to calculate a 256-bin histogram. For example, the histogram I expect looks like this:
0 234 0 2 34 345 0 0 4 …
But the SDK’s histogram function gives me results like this:
234 2 34 345 4 0 0 0 0 …
which means the code only writes the bins with a value greater than zero to the output array, packed together at the front. But I need all 256 bins, each at its own index, including the empty ones. Is there a way to deactivate this rearrangement of bins?
I am not aware of this problem with the SDK’s histogram code. You can have a look at the histogram whitepaper written by V. Podlozhnyuk, which provides some details about the implementation in the SDK: CUDA Toolkit Documentation
Maybe that will help you to understand the code and repair the problem.
Otherwise, you could also try out a new method of histogramming, which is presented on the website: PARsE | Research | Algorithms and tools (scroll down to the bottom)
It will probably not contain the same bug and might give you a speed increase, but since it is part of a research project it may be a little trickier to get working for arbitrary inputs.
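To check whether the bins really are being rearranged, it may help to compare the GPU output against a plain CPU reference on the same input. Below is a minimal sketch, assuming the input is an array of 8-bit values. The names histogram256_reference and first_mismatch are made up for this example and are not part of the SDK.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of the SDK): count byte values into 256 bins.
// Each bin keeps its own index, so empty bins stay in the output as zeros.
std::array<unsigned int, 256> histogram256_reference(const std::uint8_t* data,
                                                     std::size_t count)
{
    assert(data != nullptr || count == 0);
    std::array<unsigned int, 256> bins{};  // value-initialized: every bin starts at 0
    for (std::size_t i = 0; i < count; ++i) {
        ++bins[data[i]];
    }
    return bins;
}

// Hypothetical helper: index of the first bin where the reference and the
// GPU result (256 counters copied back to the host) differ, or -1 if equal.
int first_mismatch(const std::array<unsigned int, 256>& ref,
                   const unsigned int* gpu_bins)
{
    for (int b = 0; b < 256; ++b) {
        if (ref[b] != gpu_bins[b]) {
            return b;
        }
    }
    return -1;
}
```

If first_mismatch returns -1 for your data, the GPU bins are in the right places and the problem is more likely in how the result is copied back or printed. If it returns an index, that bin is a good place to start debugging.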