This is my problem:
I have a very long array A of float values (e.g. 1 million or 10 million elements). All values are randomly distributed in the range [0…N], where N is about 1000 to 4000.
I have a second, much smaller array B with just N floats. Every value of the first array has to be “put into” the second array with linear weighting.
For example:
One element of array A is 100.2, so element 100 of array B is incremented by 0.8 and element 101 is incremented by 0.2.
Another element of array A is 132.5, so element 132 of array B is incremented by 0.5 and element 133 is incremented by 0.5.
This continues until all 10 million elements of array A have been processed.
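To make the weighting concrete, here is a minimal serial sketch of the operation in C++. It is only a CPU reference for checking results, not a CUDA solution. The function name scatter_linear and the boundary handling are my own assumptions: values outside [0, N] are skipped, and a share that would land past the end of B is dropped.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Serial reference: add each value of A into B with linear weighting.
// Assumption: B has N elements and every value of A lies in [0, N].
void scatter_linear(const std::vector<float>& A, std::vector<float>& B) {
    const std::size_t n = B.size();
    for (float x : A) {
        // Skip NaN, negative, or out-of-range values (assumed not to occur).
        if (!(x >= 0.0f) || x > static_cast<float>(n)) continue;

        const float lower = std::floor(x);                  // e.g. 100 for 100.2
        const std::size_t i = static_cast<std::size_t>(lower);
        const float frac = x - lower;                       // e.g. 0.2 for 100.2

        // The lower bin gets (1 - frac), the upper bin gets frac.
        if (i < n)     B[i]     += 1.0f - frac;
        if (i + 1 < n) B[i + 1] += frac;                    // dropped at the edge
    }
}
```

Calling scatter_linear(A, B) with B zero-initialized to N elements should reproduce the two examples above, up to float rounding.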
Does anyone have a good idea how to do this quickly with CUDA?
Thanks for any help!