First, you know exactly how much memory OptiX will use for acceleration structures, because you allocate that device memory yourself based on the sizes returned by optixAccelComputeMemoryUsage.
The same goes for every other CUDA memory allocation your application makes.
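As a sketch of that first point (error checking omitted; `context`, `accelOptions`, and `buildInput` are assumed to be set up beforehand), the sizes you pass to cudaMalloc come straight from the query:

```cpp
// Ask OptiX how much device memory this acceleration structure build needs.
OptixAccelBufferSizes bufferSizes = {};
optixAccelComputeMemoryUsage(context, &accelOptions, &buildInput,
                             1, // number of build inputs
                             &bufferSizes);

// You allocate exactly those amounts yourself, so the AS memory
// usage is fully visible to the application.
CUdeviceptr d_temp = 0, d_output = 0;
cudaMalloc(reinterpret_cast<void**>(&d_temp),   bufferSizes.tempSizeInBytes);
cudaMalloc(reinterpret_cast<void**>(&d_output), bufferSizes.outputSizeInBytes);

OptixTraversableHandle handle = 0;
optixAccelBuild(context, /*stream=*/0, &accelOptions, &buildInput, 1,
                d_temp,   bufferSizes.tempSizeInBytes,
                d_output, bufferSizes.outputSizeInBytes,
                &handle, nullptr, 0);

cudaFree(reinterpret_cast<void*>(d_temp)); // temp buffer is only needed during the build
```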
The only OptiX memory requirement the developer cannot know a priori is the OptiX-internal stack size, which depends on various parameters of the modules, programs, pipeline, recursion depth, traversal depth, and the underlying GPU. The more cores, the more memory is required.
(Note that calculating the OptiX stack size explicitly is mandatory when using callable programs, and it's always recommended anyway.)
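A sketch of that explicit stack size calculation using the helpers from the OptiX SDK's optix_stack_size.h (exact helper signatures vary slightly between OptiX versions; `pipeline`, `programGroups`, and the depth values are assumptions for illustration):

```cpp
#include <optix_stack_size.h>

// Accumulate the stack requirements of all program groups in the pipeline.
OptixStackSizes stackSizes = {};
for (OptixProgramGroup pg : programGroups)
    optixUtilAccumulateStackSizes(pg, &stackSizes);

// Derive the three stack sizes from the maximum trace depth and the
// maximum continuation/direct callable invocation depths.
unsigned int dcStackSizeFromTraversal = 0;
unsigned int dcStackSizeFromState     = 0;
unsigned int continuationStackSize    = 0;
optixUtilComputeStackSizes(&stackSizes,
                           /*maxTraceDepth=*/2,
                           /*maxCCDepth=*/0,
                           /*maxDCDepth=*/2,
                           &dcStackSizeFromTraversal,
                           &dcStackSizeFromState,
                           &continuationStackSize);

optixPipelineSetStackSize(pipeline,
                          dcStackSizeFromTraversal,
                          dcStackSizeFromState,
                          continuationStackSize,
                          /*maxTraversableGraphDepth=*/2);
```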
On top of that, CUDA allocates some device memory for its internal resource management when creating a CUDA context.
If you just want overall memory usage statistics, you can use the CUDA call cudaMemGetInfo() (or cuMemGetInfo() in the CUDA Driver API).
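A minimal sketch of that query (requires a CUDA-capable GPU to run):

```cpp
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    size_t freeBytes = 0, totalBytes = 0;
    // Reports free and total device memory of the current CUDA device.
    cudaMemGetInfo(&freeBytes, &totalBytes);
    std::printf("GPU memory: %zu MiB used of %zu MiB total\n",
                (totalBytes - freeBytes) >> 20, totalBytes >> 20);
    return 0;
}
```

Note that this reports usage for the whole device, including allocations made by other processes, not just your application.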
Another simple method is to run nvidia-smi in a command prompt, which prints out memory statistics like this:
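If you only need the memory columns, nvidia-smi also has a query mode (these flags are part of its documented --query-gpu interface):

```shell
# Print total/used/free device memory as CSV, e.g. for logging over time.
nvidia-smi --query-gpu=memory.total,memory.used,memory.free --format=csv
```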