Weird behavior of atomicAdd in OptiX

OptiX is said to use a per-ray programming model, so in the examples the __raygen__rg() function calls are nicely isolated in their memory accesses by pixel position.
I'm trying to implement some global statistics. I changed the image buffer into a stats buffer; the accesses now overlap, so I clear the buffer on creation and use atomicAdd(float*, float) calls.
This leads to weird results, though: I get NaNs, that is, until I try checking the incoming values explicitly using isnan(), at which point it magically starts working fine. It looks like some kind of race condition. Is there some obvious mistake that I'm making?
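Roughly, the pattern in my raygen program looks like this (a sketch with placeholder names, not my exact code; computeStatistic and computeBin stand in for my actual per-ray logic):

```cuda
extern "C" __global__ void __raygen__rg()
{
    const uint3 idx = optixGetLaunchIndex();

    // ... trace the ray, then derive a per-ray contribution ...
    const float        value = computeStatistic(idx); // hypothetical helper
    const unsigned int bin   = computeBin(idx);       // hypothetical helper

    // Many launch indices may map to the same bin, so the overlapping
    // writes go through atomicAdd instead of a plain store.
    atomicAdd(&params.stats[bin], value);
}
```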

How exactly are you clearing that stats buffer on the device?
With an explicit kernel launch, one of the cuMemset calls (recommended), or a host-to-device memcpy?

Are you using an asynchronous call for that?
Is that using the same CUDA stream as the following optixLaunch?

If both are on the same stream, the following optixLaunch should find the initialized values inside the device buffer, and the first atomicAdd should always see the defined cleared value.
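The intended ordering looks like this sketch (illustrative names; CUDA_CHECK and OPTIX_CHECK stand for whatever error-checking macros you use):

```cuda
// Clearing and launching on the SAME stream: stream ordering guarantees the
// launch only starts after the memset completed, so no explicit sync is
// needed between the two calls.
CUDA_CHECK(cudaMemsetAsync(d_stats, 0, statsSizeInBytes, stream));

OPTIX_CHECK(optixLaunch(pipeline, stream,
                        d_params, sizeof(Params),
                        &sbt, width, height, /* depth = */ 1));

// Synchronize only before reading the accumulated results on the host.
CUDA_CHECK(cudaStreamSynchronize(stream));
```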

This should usually work. I’ve done this before for scattered color accumulations.

It will obviously not work if buffer entries are cleared inside the same optixLaunch call that does the atomicAdd, because the single-ray programming model doesn't allow any assumptions about neighboring launch indices.
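That anti-pattern would look like this sketch (placeholder names again; bin() and value() are hypothetical):

```cuda
extern "C" __global__ void __raygen__rg()
{
    const uint3 idx = optixGetLaunchIndex();

    // WRONG: launch indices execute concurrently in no defined order, so
    // this clear can race with (and wipe out) another thread's atomicAdd
    // to the same bin.
    params.stats[bin(idx)] = 0.0f;

    atomicAdd(&params.stats[bin(idx)], value(idx));
}
```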

Some more information would be required to determine what could have gone wrong.

What is your system configuration?
OS version, installed GPU(s), VRAM amount, display driver version, OptiX (major.minor.micro) version, CUDA toolkit version (major.minor) used to generate the input PTX, host compiler version.

How did you allocate that stats buffer?
Where does it reside (device memory, pinned host memory, etc.)?
Are there multiple GPUs involved?

Maybe just post the exact code excerpts which create and initialize the buffer and all device code accessing the buffer.

I do cudaMemset(m_device_pixels, 0, size) before calling optixLaunch.

Ubuntu 22.04, 2070, driver version 520.56.06, CUDA version 11.8

What looks especially weird is that I tried checking the values, and while the check is active the problem disappears, which makes it look even more like a race.

That's not enough information to be able to help further.
I understand that there might be some race condition; I just can't say yet why that would happen, which is why I'm asking all these questions.

Please answer all my questions (including the OptiX version number), or provide a minimal and complete reproducer in the failing state, ideally by modifying one of the OptiX SDK examples.

Is the optixLaunch using a different stream than the default?
Does the behavior change if you put a synchronization call between the cudaMemset and the optixLaunch?
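As a diagnostic, something like this sketch would rule out stream-ordering issues (illustrative names; CUDA_CHECK/OPTIX_CHECK are assumed error-checking macros):

```cuda
// cudaMemset is ordered on the default stream; if optixLaunch runs on a
// different, non-blocking stream, the clear and the launch could overlap.
CUDA_CHECK(cudaMemset(d_stats, 0, statsSizeInBytes));

// Forcing full completion here separates the clear from the launch.
// If this makes the NaNs disappear, the two were racing across streams.
CUDA_CHECK(cudaDeviceSynchronize());

OPTIX_CHECK(optixLaunch(pipeline, stream,
                        d_params, sizeof(Params),
                        &sbt, width, height, /* depth = */ 1));
```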

My deepest apologies. The strange behavior was caused by a miscalculation on my part that produced a local buffer overrun, which led to the weird results. Thanks again for your swift feedback!