How can I use __syncthreads() in a ray generation program?

568246876 · February 6, 2019, 8:37am

I’m trying to implement a bitonic sort in my application, but calling __syncthreads() causes a run-time error. How should I synchronize the threads? Or can I just use CUDA with my OptiX context to sort an OptiX buffer?
Here is my code:

RT_PROGRAM void bitonic_sort() {
	output_buffer[launch_index] = intput_buffer[launch_index];
	__syncthreads();
	int sum = intput_buffer.size();

	for (unsigned int i = 2; i <= sum; i <<= 1) {
		for (unsigned int j = i >> 1; j>0; j >>= 1) {
			unsigned int tid_comp = launch_index ^ j;
			if (tid_comp < sum) {
				if (tid_comp > launch_index) {
					if ((launch_index & i) == 0) { //ascending
						if (output_buffer[launch_index]>output_buffer[tid_comp]) {
							swap(output_buffer[launch_index], output_buffer[tid_comp]);
						}
					}
					else { // descending
						if (output_buffer[launch_index]<output_buffer[tid_comp]) {
							swap(output_buffer[launch_index], output_buffer[tid_comp]);
						}
					}
				}
			}
			__syncthreads();
		}
	}
}

Can someone give me a hand?

droettger · February 6, 2019, 9:04am

No, __syncthreads() is not allowed in OptiX device code. OptiX abstracts all parts of the internal scheduling.
http://raytracing-docs.nvidia.com/optix/guide/index.html#caveats#caveats

You could do that with a standard CUDA kernel outside of OptiX instead.
You would either need to copy the buffer to CUDA or use CUDA-OptiX interop to get the output buffer’s device pointer.
See here: http://raytracing-docs.nvidia.com/optix/guide/index.html#cuda#interoperability-with-cuda
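
A minimal sketch of the standalone CUDA route suggested above, not a tested OptiX integration. It assumes the buffer holds int values, that the element count is a power of two (bitonic sort needs this unless the input is padded), and that the data is reachable through a plain device pointer. The names bitonic_step, bitonic_sort_device and sort_with_cuda are made up for illustration. Each compare-exchange step is its own kernel launch: launches on the same stream run in order, so the boundary between two launches acts as the grid-wide barrier. __syncthreads() would only synchronize the threads of one block, which is not enough to sort a whole buffer inside a single kernel.

```cuda
#include <cuda_runtime.h>
#include <stdexcept>
#include <vector>

// One compare-exchange step of the bitonic network.
// i: length of the bitonic sequences being merged, j: distance to the partner.
__global__ void bitonic_step(int* data, unsigned int n, unsigned int i, unsigned int j)
{
    unsigned int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;

    unsigned int partner = tid ^ j;
    // Only the lower index of each pair does the work,
    // so no element is written by two threads in the same step.
    if (partner > tid && partner < n) {
        bool ascending = (tid & i) == 0;
        int a = data[tid];
        int b = data[partner];
        if (ascending ? (a > b) : (a < b)) {
            data[tid] = b;
            data[partner] = a;
        }
    }
}

// Sorts n ints in device memory in ascending order.
// Assumption: n is a power of two.
cudaError_t bitonic_sort_device(int* d_data, unsigned int n)
{
    const unsigned int threads = 256;
    const unsigned int blocks = (n + threads - 1) / threads;

    for (unsigned int i = 2; i <= n; i <<= 1) {
        for (unsigned int j = i >> 1; j > 0; j >>= 1) {
            // Each launch starts only after the previous one on this stream has finished.
            bitonic_step<<<blocks, threads>>>(d_data, n, i, j);
        }
    }
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) return err;
    return cudaDeviceSynchronize();
}

// Host-side helper for the "copy the buffer to CUDA" variant:
// upload, sort, download.
std::vector<int> sort_with_cuda(const std::vector<int>& host)
{
    std::vector<int> result(host);
    if (result.size() < 2) return result;

    int* d_data = nullptr;
    const size_t bytes = result.size() * sizeof(int);
    cudaError_t err = cudaMalloc(&d_data, bytes);
    if (err != cudaSuccess) throw std::runtime_error(cudaGetErrorString(err));

    err = cudaMemcpy(d_data, result.data(), bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess)
        err = bitonic_sort_device(d_data, static_cast<unsigned int>(result.size()));
    if (err == cudaSuccess)
        err = cudaMemcpy(result.data(), d_data, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_data);
    if (err != cudaSuccess) throw std::runtime_error(cudaGetErrorString(err));
    return result;
}
```

With interop you would skip sort_with_cuda and call bitonic_sort_device directly on the buffer's device pointer (for example one obtained with rtBufferGetDevicePointer) between OptiX launches. Check the interop section linked above for the exact rules on sharing that pointer.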