Trtexec vs builder.build_serialized_network

Description

I converted a custom model for a multimodal service using TensorRT-LLM, and I built the engine (vision model only) in two ways:

  1. trtexec (from TensorRT 10.0.0)
  • The resulting engine can be run with the async v3 functions (e.g. execute_async_v3)
  • It works fine.
  2. builder.build_serialized_network (from TensorRT-LLM 0.10.0)
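For context, both paths end at the same builder call. A minimal sketch of what trtexec does internally when building from ONNX, using the raw TensorRT Python API, might look like the following (API names are from TensorRT 10.x; the ONNX path and FP16 flag are placeholders, and TensorRT-LLM's build path layers its own network construction and plugins on top of this same builder, which is one likely source of differences):

```python
# Sketch only: what trtexec roughly does internally when given --onnx.
# Assumes TensorRT 10.x Python API; "model.onnx" is a placeholder path.
def build_engine(onnx_path: str, fp16: bool = True):
    import tensorrt as trt  # deferred so the sketch parses without TensorRT installed

    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    network = builder.create_network(0)  # explicit batch is the default in TRT 10
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))
    config = builder.create_builder_config()
    if fp16:
        config.set_flag(trt.BuilderFlag.FP16)  # analogous to trtexec --fp16
    # Returns the serialized engine (IHostMemory), the same artifact that
    # trtexec writes out with --saveEngine.
    return builder.build_serialized_network(network, config)

# Roughly equivalent CLI invocation:
#   trtexec --onnx=model.onnx --fp16 --saveEngine=model.engine
```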

So my question is: what is the difference between trtexec and builder.build_serialized_network? The two engines produce almost the same outputs, but I don't understand why I can't use the same execution functions with both. Please help me understand why they behave differently.

And if I want to get exactly the same output from TensorRT while giving up only a small amount of performance, is that possible?
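One caveat worth noting on "perfectly same output": the TensorRT builder auto-tunes kernels (tactics) per build, so two separately built engines are not in general guaranteed to be bitwise identical, and different kernels can round differently. The usual check is numerical closeness within a tolerance; a sketch with placeholder arrays:

```python
# Sketch: comparing outputs from two engine builds with a tolerance instead
# of expecting bitwise equality. The arrays below are made-up placeholders.
import numpy as np

def outputs_match(a: np.ndarray, b: np.ndarray,
                  rtol: float = 1e-3, atol: float = 1e-3) -> bool:
    # FP16 engines typically need looser tolerances than FP32 ones.
    return np.allclose(a, b, rtol=rtol, atol=atol)

# Hypothetical outputs from the trtexec-built and Python-built engines:
out_trtexec = np.array([0.1234, 0.5678], dtype=np.float32)
out_builder = np.array([0.1235, 0.5677], dtype=np.float32)
print(outputs_match(out_trtexec, out_builder))  # small FP drift still counts as a match
```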

Thanks.

Environment

TensorRT Version: 10.1.0
TensorRT-LLM Version: 0.10.0 (Stable)
GPU Type: L40s
Nvidia Driver Version: 550.90.07
CUDA Version: 12.4
Operating System + Version: Ubuntu
Python Version (if applicable): 3.10.12
Baremetal or Container (if container which image + tag): nvidia/cuda:12.4.0-devel-ubuntu22.04

Relevant Files

Please attach or include links to any models, data, files, or scripts necessary to reproduce your issue. (Github repo, Google Drive, Dropbox, etc.)

Steps To Reproduce

Please include:

  • Exact steps/commands to build your repro
  • Exact steps/commands to run your repro
  • Full traceback of errors encountered

Hi @xjavalov,
Please raise the issue here.