Dynamic memory allocation problem

Hi,

I’m new to CUDA programming. I want to store some <key, value> pairs in GPU memory, but I don’t know the number of distinct keys in advance. While processing the input data my kernel will discover the keys, and I want to maintain a count of the data elements belonging to each key (similar to using an std::map container if I implemented it on the CPU). Is there any way of doing this on the GPU?

Thanks

The problem is generally known as stream compaction.

If you don’t produce many results, the most efficient approach is probably to allocate a fixed-size array with plenty of space, and restart the kernel if the space turns out to be insufficient. You’d then use atomicAdd() to allocate slots inside the array.

Thanks for the response.

I looked into stream compaction. As I understand it, you’re suggesting I allocate space for all possible key values, use atomicAdd() to count the elements for each key, and then use stream compaction to extract only the keys with a non-zero count.

The problem with that approach is that my key range is very large (e.g. 1 to 10^10), so I cannot allocate space for every possible key. Is there a way to maintain only the keys that have a non-zero count (using Thrust, perhaps)? I’m only interested in the number of elements per key; I don’t need to store the elements themselves.

For example, if my input data is a set of locations (mapped to integer keys), the desired output is the number of records for each location:

input : 1, 3, 10, 1, 10, 10, 2, 400, 3, 850, 1, 400, 400, 5000

output: key, value
1 3
3 2
10 3
2 1
400 3
850 1
5000 1

One option is to just allocate a “reasonable” amount of memory, and re-run the kernel if that turns out not to be enough.