Hi Guys,
I am working on a project to simulate cameras capturing projector light. We've written the simulation code using OptiX, but we're finding the throughput of the output from the GPU to be unacceptably slow.
I'm working with the sample code in OptiX 7.2, running the optixHello program with a buffer width of 2560 and height of 1920, which produces a single-color image that's 191 KB.
Adding timestamps to each stage of the program shows that the output step takes >400 ms to pull the data from the GPU, and we've seen similar times when storing the images in memory. This happens on both an NVIDIA Quadro P1000 and a GeForce RTX 2080.
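To make sure I'm timing only the copy itself (and not, say, an implicit synchronization from the launch), I've been measuring the device-to-host transfer with CUDA events. This is just a standalone sketch, assuming the output lives in a plain cudaMalloc'd buffer of uchar4 pixels; the buffer names are illustrative:

```cpp
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 2560ull * 1920ull * sizeof(uchar4);
    uchar4* d_out = nullptr;
    cudaMalloc(&d_out, bytes);

    // Pageable host memory, as in our current readback path.
    std::vector<uchar4> h_out(2560ull * 1920ull);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    // Time just the device-to-host copy.
    cudaEventRecord(start);
    cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("D2H copy: %.3f ms (%.2f GB/s)\n", ms, (bytes / 1e9) / (ms / 1e3));

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d_out);
    return 0;
}
```

On a PCIe 3.0 x16 link I'd expect this copy to take a few milliseconds at most, so if the event timing agrees with the >400 ms wall-clock number, the copy itself is the problem rather than synchronization overhead.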
I am working on implementing the suggestions in these documents:
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#maximize-memory-throughput
https://developer.nvidia.com/blog/how-optimize-data-transfers-cuda-cc/
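Following the second link, the main change I'm trying is reading back into pinned (page-locked) host memory instead of pageable memory, so the driver can DMA directly into the destination. A minimal sketch of that approach, with illustrative names and the same 2560x1920 uchar4 buffer size:

```cpp
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 2560ull * 1920ull * sizeof(uchar4);
    uchar4* d_out = nullptr;
    cudaMalloc(&d_out, bytes);

    // Pinned host memory: avoids the extra staging copy through the
    // driver's internal pinned buffer that pageable memory requires.
    uchar4* h_out = nullptr;
    cudaMallocHost(&h_out, bytes);

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // With pinned memory the copy can also be asynchronous, so the next
    // frame's optixLaunch could overlap with readback of this frame.
    cudaMemcpyAsync(h_out, d_out, bytes, cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);

    cudaStreamDestroy(stream);
    cudaFreeHost(h_out);
    cudaFree(d_out);
    return 0;
}
```

I haven't confirmed yet whether this closes the gap on our hardware; it's just the pattern the blog post describes.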
Is there anything else I can try?