Profiling tools for Triton Inference Server
We would like to monitor the following metrics on an A100 running Triton Inference Server:
- Whether Tensor Cores are being used
- Total (end-to-end) execution time
- GPU execution time
- Memory utilization
Are there tools that can monitor these metrics?
We know DLProf cannot be used with Triton, so we are looking for an alternative.