Using cudaMemcpy() to copy data from host to OptiX buffer object

Hi,
I found some odd behavior with CUDA/OptiX interop.
I got a device pointer from rtBufferGetDevicePointer() and then tried to use cudaMemcpy() to copy data from the host to the OptiX buffer, but the buffer doesn’t seem to be updated. The code looks like this:

optix::Buffer buf;
float data[numElements]; // set up somewhere
float *d_ptr;
float *h_ptr;
rtBufferGetDevicePointer( buf->get(), 0, (void**)&d_ptr);  // the API takes a void**, so cast
cudaMemcpy( d_ptr, data, sizeof(float)*numElements, cudaMemcpyHostToDevice);
h_ptr = (float *)buf->map();  // contents are not the same as data

Is this kind of operation legitimate?
What kind of CUDA buffer does OptiX use?
Is it efficient to update a buffer constantly through buffer mapping?

OptiX doesn’t know that the data on the device is dirty, so it isn’t going to copy the data back to the host until you either tell OptiX that you dirtied the buffer (see rtBufferMarkDirty) or perhaps after a launch.

Thank you for the response.

I found out that I need to create the OptiX buffer as RT_BUFFER_INPUT_OUTPUT for the cudaMemcpy() copy from host to device to work. It was previously RT_BUFFER_INPUT. I don’t know if I misread the documentation, but doesn’t it say: “RT_BUFFER_INPUT - Only the host may write to the buffer. Data is transferred from host to device and device access is restricted to be read-only.”?

As for rtBufferMarkDirty(), I thought the term “dirty” applied in the sense of a multi-GPU environment: when a buffer is marked dirty, it is copied across devices. Does it also affect how the host gets the updated data?

Finally, are the following two code blocks equivalent?

// Block 1: update through buffer mapping
memcpy( buf->map(), data, sizeof(float)*numElements);
buf->unmap();

// Block 2: update through the device pointer
rtBufferGetDevicePointer( buf->get(), 0, (void**)&d_ptr);
cudaMemcpy( d_ptr, data, sizeof(float)*numElements, cudaMemcpyHostToDevice);
buf->markDirty();  // if needed

Thank you!

Oh, yes, if you have input-only buffers, the data on the device is never copied back to the host. You can map the host buffer, but you will only get what you last put in it.

Marking something dirty simply tells OptiX that you changed the data on the device outside of OptiX, and you are right that this has the most impact for multi-GPU setups where the buffer from one device needs to be copied to the other devices.

Your two code blocks are generally equivalent from the device’s point of view. Are you seeing behavior you weren’t expecting?
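
For anyone reading along, the host/device bookkeeping described in the replies above can be pictured with a small toy model. This is only a sketch in plain C++ and is not OptiX code: MirroredBuffer, BufferType, writeDeviceDirectly() and the two helper functions are hypothetical names made up for illustration. It assumes the behavior is as described in this thread (map() refreshes the host copy only for an input/output buffer that has been marked dirty) and it assumes unmap() uploads immediately, which a real runtime may well defer until the next launch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: NOT OptiX code. All names below are hypothetical.
enum class BufferType { Input, InputOutput };

class MirroredBuffer {
public:
    MirroredBuffer(BufferType type, std::size_t count)
        : type_(type), host_(count, 0.0f), device_(count, 0.0f) {}

    // Stands in for cudaMemcpy() through a raw device pointer:
    // the device copy changes, but the runtime is not informed.
    void writeDeviceDirectly(const std::vector<float>& src) {
        assert(src.size() == device_.size());
        device_ = src;
    }

    // Stands in for markDirty(): the runtime now knows the device copy is newer.
    void markDirty() { deviceIsNewer_ = true; }

    // Stands in for map(): the host copy is refreshed only when the buffer
    // is host-readable (InputOutput) and the device copy is known to be newer.
    std::vector<float>& map() {
        if (type_ == BufferType::InputOutput && deviceIsNewer_) {
            host_ = device_;
            deviceIsNewer_ = false;
        }
        return host_;
    }

    // Stands in for unmap(). Assumption: the upload happens immediately here,
    // whereas a real runtime may defer it until the next launch.
    void unmap() { device_ = host_; }

    const std::vector<float>& deviceCopy() const { return device_; }

private:
    BufferType type_;
    std::vector<float> host_;
    std::vector<float> device_;
    bool deviceIsNewer_ = false;
};

// Writes data straight to the device copy, optionally marks the buffer
// dirty, then reports whether a subsequent map() returns that data.
bool mapSeesDeviceWrite(BufferType type, bool callMarkDirty) {
    const std::vector<float> data = {1.0f, 2.0f, 3.0f};
    MirroredBuffer buf(type, data.size());
    buf.writeDeviceDirectly(data);
    if (callMarkDirty) buf.markDirty();
    const bool same = (buf.map() == data);
    buf.unmap();
    return same;
}

// Compares the two update paths from the thread: map/copy/unmap versus
// a direct device write followed by markDirty().
bool updatePathsAgreeOnDevice(BufferType type) {
    const std::vector<float> data = {1.0f, 2.0f, 3.0f};
    MirroredBuffer viaMap(type, data.size());
    viaMap.map() = data;
    viaMap.unmap();
    MirroredBuffer viaDevice(type, data.size());
    viaDevice.writeDeviceDirectly(data);
    viaDevice.markDirty();
    return viaMap.deviceCopy() == viaDevice.deviceCopy();
}
```

In this model, mapSeesDeviceWrite(BufferType::Input, ...) is always false, which mirrors the point that an input-only buffer is never copied back to the host, while updatePathsAgreeOnDevice() returns true because both paths leave the same data in the device copy.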