Hi,
I’ve been observing some behaviour concerning RT_BUFFER_INPUT_OUTPUT | RT_BUFFER_GPU_LOCAL
which I struggle to explain, given what I found in related posts:
- Progressive photon mapping sample with multiple GPUs - #2 by yashiz
- CUDA/Optix GPU Utilisation - #3 by abcthomas
To quote the explanation in the posts and the documentation, the above flags result in the following behaviour wrt buffers:
the host to only write, and the device to read and write data
Which I had assumed meant that it should be possible to write into the buffer between the map/unmap calls on the host side. The issue we're having at the moment is what I can only vaguely (sorry!) describe as incorrect rendering, which happens simply when placing map/unmap calls: nothing in between, just these two calls, which as far as I understand shouldn't be doing anything.
To give a better example, I tried to do the same for the optixWhitted rendering sample that comes with the OptiX 6.5 SDK (modified file attached: optixWhitted.cpp, 20.1 KB). Namely, the changes I introduced are as follows:
- pull accum_buffer into global scope at line 85, i.e. simply define Buffer accum_buffer;
- line 164: remove Buffer in front of accum_buffer
- the buffer itself is defined (didn’t change anything here) as:
accum_buffer = context->createBuffer( RT_BUFFER_INPUT_OUTPUT | RT_BUFFER_GPU_LOCAL, RT_FORMAT_FLOAT4, width, height );
- inside glutDisplay() function:
** line 421: add accum_buffer->map();
** line 422: add accum_buffer->unmap();
The screen turns dark (a little bit more extreme than our incorrect rendering), but renders as usual when I move the camera around, and then turns dark again when I stop.
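For what it's worth, the symptom looks to me like what would happen if the accumulation buffer were cleared before every launch. This is only a guess on my part, not something I have confirmed. Below is a minimal host-side sketch of that idea in plain C++ (no OptiX involved). The blend formula is my assumption about how the sample's camera program accumulates frames, the frame counter is assumed to reset to 0 when the camera moves, and `accumulate`, `simulate` and `zero_before_launch` are hypothetical names made up for illustration.

```cpp
#include <vector>

// Progressive accumulation for one pixel: frame 0 overwrites the stored value,
// later frames blend the new sample in with weight 1 / (frame + 1).
// ASSUMPTION: this mirrors the sample's camera program; it is not copied from the SDK.
float accumulate(float accum, float sample, unsigned frame) {
    if (frame == 0) return sample;
    const float a = 1.0f / static_cast<float>(frame + 1);
    return accum + (sample - accum) * a;  // lerp(accum, sample, a)
}

// Simulates `frames` launches of a pixel whose sample value is constant and
// returns the value displayed after each launch.
// ASSUMPTION: when `zero_before_launch` is true, the stored value is reset to 0
// before every launch, standing in for the guessed effect of the empty
// map()/unmap() pair.
std::vector<float> simulate(float sample, unsigned frames, bool zero_before_launch) {
    std::vector<float> shown;
    float accum = 0.0f;
    for (unsigned frame = 0; frame < frames; ++frame) {
        if (zero_before_launch) accum = 0.0f;
        accum = accumulate(accum, sample, frame);
        shown.push_back(accum);
    }
    return shown;
}
```

Calling `simulate(0.8f, 5, false)` keeps the pixel at 0.8 on every frame, while `simulate(0.8f, 5, true)` starts at 0.8 on frame 0 and then falls off as 0.8 / (frame + 1). That would match what I see: normal output while the camera moves (frame counter at 0), then progressively darker once it stops.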
Could you please let me know what I am missing? In particular, I was wondering where my understanding is lacking with respect to the fact that the host should be able to write into such buffers (I assume in between map/unmap calls).
And another related question. I realized that we have a couple of buffers defined with the RT_BUFFER_INPUT_OUTPUT | RT_BUFFER_GPU_LOCAL flags, which we are both writing to (from the host) and reading from (into the host) using a CUDA pointer obtained with buffer->getDevicePointer(optix_device_ordinal). There don't appear to be any issues there. Are there any caveats in reading from such a buffer using the above-mentioned approach for a single-GPU configuration?
I would greatly appreciate any pointers in the right direction. Thanks a lot for your time.
Setup: Windows 10, NVIDIA Quadro P4000, driver 451.48, OptiX 6.5/6.0 (tested both)