How do I allocate specific resources when profiling an engine model?

I have two engine models and want to compare their inference times through profiling. However, I have noticed that the profiling results fluctuate widely between runs, so I would like to allocate specific resources for each run. Is there a way to do this? If not, it is difficult to compare the two engine models using profiling.


Sorry for the delayed response.
You can set `CUDA_VISIBLE_DEVICES` to control which GPUs are used.
For example, you can make a single GPU visible to the TensorRT NGC container and run the profiling multiple times.
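A minimal sketch of this, assuming a Linux host with `trtexec` available and an engine file at a placeholder path (`model.engine` and the iteration count are illustrative, not from this thread):

```shell
# Make only GPU 0 visible to the process, so every profiling run
# uses the same device regardless of what else is on the machine.
CUDA_VISIBLE_DEVICES=0 trtexec --loadEngine=model.engine --iterations=100
```

Repeating this with the same device pinned for both engines removes GPU selection as a source of run-to-run variation.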

Thank you.


You can customize “CUDA_VISIBLE_DEVICES” for GPUs.

Yes, I know about this option. What I mean is: is there any way to allocate a specific amount of RAM and a specific number of CPU cores when profiling?
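One way to constrain CPU cores and RAM (not confirmed by this thread; a sketch using standard Linux and Docker tooling, with placeholder engine path, container tag, and limits) is `taskset` on a bare host, or the Docker resource flags when profiling inside the NGC container:

```shell
# Pin the profiling process to CPU cores 0-3 on the host.
taskset -c 0-3 trtexec --loadEngine=model.engine

# Or, when running inside a container: pin GPU 0, cores 0-3, and cap RAM at 8 GB.
# Replace <xx.xx> with a real TensorRT NGC container tag.
docker run --rm --gpus '"device=0"' --cpuset-cpus=0-3 --memory=8g \
  nvcr.io/nvidia/tensorrt:<xx.xx>-py3 \
  trtexec --loadEngine=/workspace/model.engine
```

Fixing the CPU set and memory limit this way should make runs more comparable, though GPU clock variation (boost clocks, thermals) can still cause fluctuation.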