I’ve been observing some behaviour concerning
RT_BUFFER_INPUT_OUTPUT | RT_BUFFER_GPU_LOCAL which I struggle to explain, given what I found in related posts:
- Progressive photon mapping sample with multiple GPUs - #2 by yashiz
- CUDA/Optix GPU Utilisation - #3 by abcthomas
To quote the explanation in the posts and the documentation, the above flags result in the following behaviour wrt buffers:
the host to only write, and the device to read and write data
I had assumed this meant that the host can write into the buffer between the map/unmap calls. The issue we're having at the moment is what I can only vaguely (sorry!) describe as incorrect rendering, which happens simply when placing
map/unmap calls: nothing in between, just these two calls that, as far as I understand, shouldn't be doing anything.
To give a better example, I tried to do the same for the
optixWhitted rendering sample (optixWhitted.cpp attached) that comes with the OptiX 6.5 SDK. Namely, the changes I introduced are as follows:
- line 85: move accum_buffer into global scope, i.e. simply declare it there
- line 164: remove Buffer in front of accum_buffer
- the buffer itself is defined (didn’t change anything here) as:
accum_buffer = context->createBuffer( RT_BUFFER_INPUT_OUTPUT | RT_BUFFER_GPU_LOCAL, RT_FORMAT_FLOAT4, width, height );
- inside glutDisplay() function:
** line 421: add a map() call on accum_buffer
** line 422: add the matching unmap() call on accum_buffer
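As a rough sketch, the two lines added to glutDisplay() boil down to the helper below. The names mapAndUnmap and FakeBuffer are hypothetical and not part of the SDK: FakeBuffer is only a stand-in so the snippet compiles without OptiX, and the exact form of the map/unmap calls (no arguments) is an assumption about what was added.

```cpp
// Hypothetical stand-in for optix::Buffer so this sketch builds without the
// OptiX SDK. It only counts calls; the real buffer is the accum_buffer handle
// created with RT_BUFFER_INPUT_OUTPUT | RT_BUFFER_GPU_LOCAL.
struct FakeBuffer {
    int maps = 0;
    int unmaps = 0;
    void* map()   { ++maps; return nullptr; }  // real map() returns a host pointer
    void  unmap() { ++unmaps; }
};

// Map the buffer and immediately unmap it without touching its contents.
// Works with any handle that exposes map()/unmap() through operator->.
template <class BufferHandle>
void mapAndUnmap(BufferHandle& buffer)
{
    buffer->map();    // line 421 in the modified sample (assumed form)
    buffer->unmap();  // line 422 in the modified sample (assumed form)
}
```

In the modified optixWhitted.cpp this corresponds to calling mapAndUnmap(accum_buffer) at the point where the two lines were added.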
The screen turns dark (a little bit more extreme than our incorrect rendering), but renders as usual when I move the camera around, and then turns dark again when I stop.
Could you please let me know what I am missing? In particular, I was wondering where my understanding is lacking with respect to the fact that the host should be able to write into such buffers (I assume in between the map/unmap calls).
And another related question: I realized that we have a couple of buffers defined with
RT_BUFFER_INPUT_OUTPUT | RT_BUFFER_GPU_LOCAL flags, which we are both writing to (from the host) and reading from (into the host) using a CUDA pointer obtained with
buffer->getDevicePointer(optix_device_ordinal). There don't appear to be any issues there. Are there any caveats in reading from such a buffer using the above-mentioned approach in a single-GPU configuration?
I would greatly appreciate any pointers in the right direction. Thanks a lot for your time.
Setup: Windows 10, NVIDIA Quadro P4000, driver 451.48, OptiX 6.5/6.0 (tested both)