Maximizing Deep Learning Inference Performance with NVIDIA Model Analyzer

Originally published at: https://developer.nvidia.com/blog/maximizing-deep-learning-inference-performance-with-nvidia-model-analyzer/

Figure 1. Screenshot of Model Analyzer.

You’ve built your deep learning inference models and deployed them to NVIDIA Triton Inference Server to maximize model performance. How can you speed up the running of your models further? Enter NVIDIA Model Analyzer, the soon-to-be-released tool for gathering the compute requirements of your models. Without this information, there…

We’re really excited to share this work with you; we know it will help you get the most out of your inference models! We plan to make it open source shortly and will update this post when that happens.

If you have any questions or comments, or want to share how you’re using this for your models, please let us know!

Hi,

The project is so exciting and looks very promising! That said, I wanted to try it out by building the Model Analyzer Docker container using the Dockerfile on its GitHub page. However, I have a couple of problems using it.

I am currently trying to run Model Analyzer following the steps mentioned in this link: Maximizing Deep Learning Inference Performance with NVIDIA Model Analyzer

However, when I download both the chest_xray and segmentation_liver models from NGC and give their paths to Model Analyzer, it throws the following error:

Failed to load chest_xray on inference server: skipping model
Failed to load segmentation_liver on inference server: skipping model

Here is the Docker command that I use to start Model Analyzer:

docker run --gpus all -v /var/run/docker.sock:/var/run/docker.sock \
-v /home/$USER/server/docs/examples/model_repository:/home/models \
-v /home/$USER/results:/results --net=host model-analyzer:latest \
--batch 1,2,4 \
--concurrency 1,2,4 \
--model-names chest_xray,segmentation_liver \
--triton-version 20.02-py3 \
--model-folder /home/models \
--export --export-path /results/

How can I use this tool properly? The documentation is not enough.

Hi Doruk,

Thanks for giving Model Analyzer a try and asking for clarification!

Since Model Analyzer is specifically meant to be used on models prepared for Triton, it expects them in the same format as Triton does.

If you’re looking to try it with pre-trained Clara models from NGC, the best bet is to install Clara Deploy and pull that model’s pipeline. That installs a folder with the model to your chosen directory, which you can then point Model Analyzer at. Alternatively, you can download the model directory directly from the pipeline in NGC (e.g. app_chestxray-model_v1.zip under the chest x-ray pipeline).

Also, please make sure the model directory name matches the model name in its configuration file (config.pbtxt). For example, I believe the chest x-ray model you are using is named classification_chestxray_v1. Hope that helps!
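
For reference, Model Analyzer expects the standard Triton repository layout: one directory per model, containing a config.pbtxt and one or more numbered version subdirectories that hold the model file. The layout below is only an illustration (I’m assuming a TensorFlow GraphDef model here; the exact file name depends on your model’s framework):

model_repository/
    classification_chestxray_v1/
        config.pbtxt
        1/
            model.graphdef

The model directory name and the name field inside config.pbtxt need to match.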

Hi David,

Thanks for your reply. I actually wanted to try out Model Analyzer on Triton Server’s example models first, but I got the same error with those models too. Since you already have a post where you try out Model Analyzer on Clara models, I wanted to give those a shot. So Model Analyzer supports models in the following formats, right?

  • model.plan for TensorRT models
  • model.graphdef for TensorFlow GraphDef models
  • model.savedmodel for TensorFlow SavedModel models
  • model.onnx for ONNX Runtime ONNX models
  • model.pt for PyTorch TorchScript models
  • model.netdef and init_model.netdef for Caffe2 Netdef models

However, when I try to give two of Triton Server’s example models to Model Analyzer using the following two methods, it throws the same error.

Method 1
Renaming the model files and the config.pbtxt files to the model’s own name and gathering all of them in the same directory.

Method 2
Leaving the file names as they are and separating them into their respective model directories.

(Screenshots omitted.)

Neither of the two methods above is working for me right now. I am using the command below to run Model Analyzer’s Docker image.

docker run --gpus all -v /var/run/docker.sock:/var/run/docker.sock \
-v /home/$USER/models:/home/models \
-v /home/$USER/results:/results --net=host \
model-analyzer:latest \
--batch 1,4,8 --concurrency 2,4,8 \
--model-names inception_graphdef,resnet50_netdef \
--triton-version 20.02-py3 \
--model-folder /home/models \
--export --export-path /results/

And this is the final output:

Hi Doruk,

Thanks for providing such detail. Your method 2 is correct. Essentially, if the repository works with Triton Inference Server via the --model-repository flag, it works with Model Analyzer. So a good test is to load it into Triton.
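
For example, a quick test along these lines should bring the repository up in Triton (the host path and image tag are just illustrative; substitute whatever you are using):

docker run --gpus all --rm --net=host \
-v /home/$USER/models:/models \
nvcr.io/nvidia/tritonserver:20.03.1-py3 \
tritonserver --model-repository=/models

If the models come up as ready in Triton’s log, the same directory should work with Model Analyzer.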

I went through the steps you provided. The issue is that the model folder is not mounted at the same absolute path inside the container as on your host. Because Model Analyzer launches the Triton container through the host’s Docker socket (the /var/run/docker.sock mount), any path it passes to that container is resolved on the host. The Triton container therefore looks for /home/models on your local machine, cannot find the models there, and skips them.

Apologies if the sample command did not make this point explicit. We are updating the documentation as feedback is provided.

If you run the command below, it should work.

docker run --gpus all -v /var/run/docker.sock:/var/run/docker.sock \
-v /home/$USER/models:/home/$USER/models \
-v /home/$USER/results:/results --net=host \
model-analyzer:latest \
--batch 1,4,8 --concurrency 2,4,8 \
--model-names inception_graphdef,resnet50_netdef \
--triton-version 20.02-py3 \
--model-folder /home/$USER/models \
--export --export-path /results/

Hi all,

Thanks for providing useful information and instructions.

I built an image from the Dockerfile on GitHub.

Steps

git clone https://github.com/NVIDIA/model-analyzer
docker build -f Dockerfile -t model-analyzer .

Then I used this command and got an error.

docker run -v /var/run/docker.sock:/var/run/docker.sock \
-v /home/chieh/model_repository:/home/chieh/model_repository \
-v /home/chieh/results:/workspace/results --net=host \
model-analyzer:latest --batch 1,4,8 --concurrency 2,4,8 \
--model-names model_od_onnx,model_onnx \
--triton-version 20.03.1-py3 \
--model-folder /home/chieh/model_repository \
--export --export-path /workspace/results/

Error message:

Error: Failed to initialize NVML
Unhandled exception. System.TypeInitializationException: The type initializer for 'ModelAnalyzer.Metrics.GpuMetrics' threw an exception.
 ---> System.InvalidOperationException: Error starting embedded DCGM engine. DCGM initialization error.
   at ModelAnalyzer.Metrics.GpuMetrics..cctor()
   --- End of inner exception stack trace ---
   at ModelAnalyzer.Metrics.GpuMetrics..ctor()
   at ModelAnalyzer.MetricsCollector..ctor(MetricsCollectorConfig config)
   at ModelAnalyzer.Program.<>c__DisplayClass7_0.<Main>b__1(CLIOptions options)
   at CommandLine.ParserResultExtensions.MapResult[T1,T2,TResult](ParserResult`1 result, Func`2 parsedFunc1, Func`2 parsedFunc2, Func`2 notParsedFunc)
   at ModelAnalyzer.Program.Main(String[] args)

Here is my normal TRTIS command:

docker run --runtime nvidia \
    --rm --shm-size=1g \
    --ulimit memlock=-1 \
    --ulimit stack=67108864 \
    -p 8000:8000 -p 8001:8001 -p 8002:8002 \
    --name trt_serving7 \
    -v  /home/chieh/model_repository:/models \
    nvcr.io/nvidia/tritonserver:20.03.1-py3 \
    tritonserver --model-store=/models

It works very well. (Inference from the client side also works.)

Questions:

  1. If I launch Model Analyzer, can I also launch Triton Inference Server at the same time? (BTW, I had not launched TRTIS when I got the error message above.)
  2. How can I solve this issue? Do you have any hints?

Thank you so much!

BR,
Chieh

Hi Chieh,

Apologies, I’m seeing this a few weeks late.

This error means that your system is not compatible with DCGM. Model Analyzer uses DCGM under the hood to capture the metrics. Please find system requirements in the documentation here: https://docs.nvidia.com/datacenter/dcgm/latest/dcgm-user-guide/getting-started.html#supported-platforms
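
As a quick sanity check (a general one, not specific to Model Analyzer), you can confirm that a GPU-enabled container can reach the driver at all; if this fails, DCGM will not initialize either. The image tag below is simply the Triton image you already pulled:

docker run --rm --gpus all nvcr.io/nvidia/tritonserver:20.03.1-py3 nvidia-smi

Also note that the Model Analyzer container itself needs --gpus all, as in the earlier commands in this thread; without it, the container has no access to the GPUs or NVML.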

Kind regards,
David

Dear @david.yastremsky,

Thanks for your reply and the important information!
I saw the link you provided in your previous comment; I will study it further.

Thank you again!

Sincerely,
Chieh
