I am building an application based on LaneNet in DriveWorks. In this application, I wish to control the number of CUDA cores dynamically. Is there a way to allocate a certain number of threads/CUDA cores to the process?
I searched the DriveWorks API but couldn’t find anything relevant.
Any help/advice would be greatly appreciated.
Dear Vishak Nathan,
DW is a high-level API and does not expose control over CUDA kernel launch parameters; it already launches its kernels with optimal parameters internally. However, if you write your own CUDA kernel, you can control its launch parameters and use that kernel inside a DW sample. What is your use case? May I know why you want to control CUDA kernel launch parameters?
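To illustrate the "write your own kernel" route: launch parameters are chosen by the caller at the `<<<grid, block>>>` call site. A minimal sketch, assuming an illustrative element-wise kernel (`scaleKernel` and all names here are hypothetical, not part of any DriveWorks API):

```cuda
#include <cuda_runtime.h>

// Illustrative kernel: scales a device buffer of n floats in place.
__global__ void scaleKernel(float* data, int n, float factor) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) data[idx] *= factor;
}

void launchScale(float* devData, int n, float factor, cudaStream_t stream) {
    // Unlike the kernels inside DW, these launch parameters are yours to tune:
    // blockSize threads per block, enough blocks to cover all n elements.
    int blockSize = 256;                          // application-chosen
    int gridSize  = (n + blockSize - 1) / blockSize;
    scaleKernel<<<gridSize, blockSize, 0, stream>>>(devData, n, factor);
}
```

If you pass in the same CUDA stream your DW pipeline uses, the custom kernel serializes correctly with the DW work on that stream.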
Thanks for your reply.
I am currently doing a research project in which I am developing a scenario-based controller that uses a different number of cameras in each scenario.
I am trying to vary the number of CUDA kernels used based on the workload (scenario), so that each scenario gets an optimal mapping of work to the GPU.
The TensorRT model that DW uses for inference is already optimal: its CUDA kernels are launched with tuned parameters. We provide high-level APIs in DW with all of the necessary optimizations already performed. If you want to control the CUDA kernel launch parameters yourself, you will have to write your own CUDA kernel, as I said earlier.
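For kernels you write yourself, the CUDA occupancy API can suggest launch parameters in the same spirit as "optimal" here. A sketch using the real runtime call `cudaOccupancyMaxPotentialBlockSize` (the kernel itself is an illustrative placeholder):

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Placeholder kernel so the occupancy query has something to analyze.
__global__ void myKernel(float* data, int n) {
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) data[idx] += 1.0f;
}

void launchWithOccupancy(float* devData, int n) {
    int minGridSize = 0, blockSize = 0;
    // Ask the runtime for the block size that maximizes occupancy
    // for this specific kernel on the current device.
    cudaOccupancyMaxPotentialBlockSize(&minGridSize, &blockSize, myKernel, 0, 0);
    int gridSize = (n + blockSize - 1) / blockSize;
    printf("suggested block=%d grid=%d\n", blockSize, gridSize);
    myKernel<<<gridSize, blockSize>>>(devData, n);
}
```

This gives your own kernels a per-device tuned launch configuration, which is roughly what TensorRT does internally for its kernels.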
How do I see the optimal parameters the model uses? I wish to see the number of CUDA cores my application uses.
The kernel launch parameters used are not exposed via the APIs. If you want to know the GPU utilization, you can use the Tegrastats tool (https://docs.nvidia.com/drive/active/188.8.131.52L/nvvib_docs/index.html#page/NVIDIA%2520DRIVE%2520Linux%2520SDK%2520Development%2520Guide%2FUtilities%2FAppendixTegraStats.html%23).
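While the launch parameters of the TensorRT/DW kernels are not exposed, the GPU's total resources are queryable through the CUDA runtime. Note that "CUDA cores" is not reported directly: the runtime reports SM count, and cores = SMs × cores-per-SM, where cores-per-SM depends on the architecture. A small sketch:

```cuda
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);  // device 0
    // The runtime exposes SM counts and thread limits, not a core count;
    // multiply multiProcessorCount by your architecture's cores-per-SM.
    printf("GPU: %s\n", prop.name);
    printf("SMs: %d\n", prop.multiProcessorCount);
    printf("Max threads per SM: %d\n", prop.maxThreadsPerMultiProcessor);
    printf("Max threads per block: %d\n", prop.maxThreadsPerBlock);
    return 0;
}
```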
Just for clarification: even the CUDA API does not provide a way to set GPU affinity for your process. All of this is handled dynamically by the GPU scheduler.
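The closest knob the CUDA runtime does offer is relative stream priority, which is only a hint to the scheduler and does not reserve cores for a process. A sketch of the real APIs involved (stream names are illustrative):

```cuda
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    int lo = 0, hi = 0;
    // Query the valid priority range; 'hi' holds the numerically
    // smallest value, which is the highest priority.
    cudaDeviceGetStreamPriorityRange(&lo, &hi);

    cudaStream_t critical, background;
    cudaStreamCreateWithPriority(&critical,   cudaStreamNonBlocking, hi);
    cudaStreamCreateWithPriority(&background, cudaStreamNonBlocking, lo);

    // Kernels launched on 'critical' are preferred when blocks from both
    // streams are ready to run, but no SMs or cores are reserved.
    printf("stream priority range: low=%d high=%d\n", lo, hi);

    cudaStreamDestroy(critical);
    cudaStreamDestroy(background);
    return 0;
}
```

This influences scheduling between your own streams; it still cannot pin a fixed number of CUDA cores to the process.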