Are there binary utilities for TensorRT Engine?

Description

I couldn’t find any binary utilities for TensorRT engines.
Is there any plan to release them?

Environment

TensorRT Version:
GPU Type:
Nvidia Driver Version:
CUDA Version:
CUDNN Version:
Operating System + Version:
Python Version (if applicable):
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag):

Relevant Files

Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)

Steps To Reproduce

Please include:

  • Exact steps/commands to build your repro
  • Exact steps/commands to run your repro
  • Full traceback of errors encountered

Hi,
Please check the link below, as it might answer your concerns.

Thanks!

Dear AakankshaS

Thank you for the information.
I checked the link. However, I couldn’t find anything there about binary utilities.

Thanks!

Hi @Hiromitsu.Matsuura ,
Let me check on this and get back to you.
Thank you for your patience.

Dear @AakankshaS ,

Do you have any update?

Regards,
hiro

Yes, there are binary utilities available for working with TensorRT engines. TensorRT is a deep learning inference optimizer and runtime library developed by NVIDIA, commonly used to optimize and deploy deep learning models for inference on NVIDIA GPUs.

TensorRT provides several binary utilities that can be useful at different stages of the TensorRT workflow. Here are a few notable ones (a short runnable sketch follows this list):

  1. trtexec: This utility lets you build and run TensorRT engines from the command line. It provides options to specify the network model, input data, precision modes, batch sizes, and other runtime configurations.
  2. trtexec (with --onnx): The same tool converts trained models exported to ONNX from popular deep learning frameworks (such as TensorFlow and PyTorch) into TensorRT engines, which simplifies optimizing models for efficient inference on NVIDIA GPUs.
  3. trtexec (with --saveEngine): This option serializes a TensorRT engine to a binary file (usually with the .engine extension). The serialized engine can later be loaded and executed directly, without re-optimization, which is particularly useful when deploying the model in production environments.
  4. tritonserver: Triton Inference Server (formerly TensorRT Inference Server, trtserver) is a scalable, production-ready inference serving solution from NVIDIA. It provides server infrastructure for hosting and serving TensorRT engines over the network, allowing multiple clients to make inference requests simultaneously.
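To make the serialized-engine idea concrete, here is a minimal Python sketch of loading and running a prebuilt engine file. It assumes a TensorRT 8.x installation with pycuda, a hypothetical engine file named model.engine, and static input shapes; it is an illustration under those assumptions, not the only way to do this.

```python
import numpy as np
import pycuda.autoinit  # noqa: F401 -- creates and activates a CUDA context
import pycuda.driver as cuda
import tensorrt as trt

ENGINE_PATH = "model.engine"  # hypothetical file name for this sketch

# Deserialize the engine from its binary (.engine) form.
logger = trt.Logger(trt.Logger.WARNING)
runtime = trt.Runtime(logger)
with open(ENGINE_PATH, "rb") as f:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Allocate one host/device buffer pair per binding (TensorRT 8.x
# bindings API; assumes static shapes, i.e. no -1 dimensions).
bindings, buffers = [], []
for i in range(engine.num_bindings):
    shape = tuple(engine.get_binding_shape(i))
    dtype = trt.nptype(engine.get_binding_dtype(i))
    host = np.zeros(shape, dtype=dtype)
    device = cuda.mem_alloc(host.nbytes)
    bindings.append(int(device))
    buffers.append((host, device, engine.binding_is_input(i)))

# Fill the inputs with dummy data and copy them to the GPU.
for host, device, is_input in buffers:
    if is_input:
        host[...] = np.random.random_sample(host.shape).astype(host.dtype)
        cuda.memcpy_htod(device, host)

# Run synchronous inference, then copy the outputs back.
context.execute_v2(bindings)
for host, device, is_input in buffers:
    if not is_input:
        cuda.memcpy_dtoh(host, device)
        print("output:", host.shape, host.dtype)
```

Because the .engine file already contains the fully optimized plan, this load-and-run path skips the (often slow) builder step entirely, which is the main reason to serialize engines in the first place.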

Dear @matiashayes03,

Thank you for your information.
Unfortunately, that is not what I am looking for.

I believe there are the following four CUDA binary utilities:

  • cuobjdump
  • nvdisasm
  • cu++filt
  • nvprune

CUDA Binary Utilities (nvidia.com)

And I would like to know whether similar tools exist for TensorRT engines.

Regards,
hiro

Yeah, maybe. I tried to give a detailed overview of binary utilities and thought it might be helpful for you.

Hi @Hiromitsu.Matsuura ,
Yes, we do have Polygraphy for TensorRT.
You can find the details here.
https://docs.nvidia.com/deeplearning/tensorrt/polygraphy/docs/index.html
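For example, Polygraphy's `inspect model` subcommand can print a summary of an engine's contents, and its Python API can load and run an engine directly. A rough sketch, assuming a serialized engine named model.engine whose single input tensor is named "input" (both names are placeholders; match them to your engine):

```python
import numpy as np
from polygraphy.backend.common import BytesFromPath
from polygraphy.backend.trt import EngineFromBytes, TrtRunner

# Lazily load a serialized engine from disk (placeholder file name).
load_engine = EngineFromBytes(BytesFromPath("model.engine"))

with TrtRunner(load_engine) as runner:
    # Placeholder input name and shape; adjust to your engine.
    feed = {"input": np.random.rand(1, 3, 224, 224).astype(np.float32)}
    outputs = runner.infer(feed_dict=feed)
    for name, array in outputs.items():
        print(name, array.shape, array.dtype)
```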

Thanks.

Dear @AakankshaS,

Thank you for your information.
I will check it.

Regards,
hiro