Check if element in buffer exists in any hit program

Hello,

I have created a 1D input/output Buffer to store elements of a custom struct.
I need to fill this buffer inside the any hit program in a multi-thread save way, such that no elements will be overwritten. Currently I am solving this problem by using an Optix Variable hit_counter, which I increment every time the any_hit program is executed (unsigned int index = atomicAdd(&any_hit_counter, 1)). This index is then unique and will be used for storing the data in a save way in the buffer.

My goal is to decrease the necessary buffer size based on the fact, that most of the entries in the buffer are duplicates after tracing the rays. This is currently solved by deleting those duplicates afterwards on the CPU. As you can see, this way I depend on a creation of a way to large buffer.

What I am instead searching for is a (relatively fast and multi-thread save) way to figure out, if the entry is already stored in the buffer. If it is, do nothing. If it isn’t, store (I think the atomicAdd function will still be useful to figure out the correct index then).

I tried to solve this problem by the use of a hash function with the data to be stored being the keys. For that I added an additional hash_buffer, where the current hash is added (and the data is stored in the data buffer), if the hash doesn’t already exist in the hash_buffer. The problem is, that I can not find any fast (enough) way to determine, if the hash already exists in hash_buffer. Just iterating over the buffer and comparing every hash with the current hash takes way to much time because the time it takes is proportional to the number of hashes in the hash_buffer.

I’m happy for any advice. Thank you very much in advance!

Please give some more information about your custom struct.
How complex (how much values/floats/bytes) is your custom struct? What needs to be compared? The entire struct or portions of it?
How deep is your max ray depth level? How many average different new results within one ray?

And let me know why a large buffer is a problem. To decrease the buffer size, you could split the launches into multi tiles, see: “Bidirectional Path Tracing Implementation using NVIDIA OptiX” page 52 by Kalampokis

So is higher speed the goal (accepting higher memory usage) ?
Or is lower memory buffer size (accepting lower speed) the goal?