OutOfMemory when running engine_from_bytes(bytes_from_path(self.engine_path)) on an L40 with 1 GPU and the flux-dev model
Hi @2623875587 ,
Can you please share more details, like model, logs, repro setup?
Thanks
The model is the transformer of the flux-dev model; the engine file is transformer.plan (23G).
@AakankshaS Hi, I ran the flux-dev demo from the NVIDIA/TensorRT GitHub repository on an L40S. The TensorRT engine files (clip.plan, t5.plan, transformer.plan, vae.plan) are generated, but the issue occurs when loading transformer.plan.
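For anyone reproducing this, here is a minimal sketch of a pre-load check that compares the on-disk size of the engine files with the free GPU memory. The helper names `plan_sizes` and `fits_in_memory` are hypothetical and not part of the demo. The free-memory value is passed in by the caller (for example from `torch.cuda.mem_get_info()`, assuming PyTorch is available), and file size is only a rough lower bound on what an engine needs once loaded.

```python
import os

# Engine file names as listed in this thread.
PLAN_NAMES = ("clip.plan", "t5.plan", "transformer.plan", "vae.plan")


def plan_sizes(engine_dir, names=PLAN_NAMES):
    """Return {file name: size in bytes} for the engine files found in engine_dir."""
    sizes = {}
    for name in names:
        path = os.path.join(engine_dir, name)
        if os.path.isfile(path):
            sizes[name] = os.path.getsize(path)
    return sizes


def fits_in_memory(sizes, free_bytes):
    """Rough check: do the serialized engines together fit in free_bytes?

    Assumption: on-disk size approximates the weight memory of each engine.
    Activation memory for execution contexts is NOT included, so a True
    result does not guarantee that loading succeeds.
    """
    return sum(sizes.values()) <= free_bytes
```

If the check fails with all four engines but passes for transformer.plan alone, that would suggest the other engines already resident on the GPU are what leaves too little room, though this is only a hint and not a diagnosis.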