Flux model: engine_from_bytes(bytes_from_path(self.engine_path)) fails with OutOfMemory

2623875587 · October 17, 2024, 10:50am

Running engine_from_bytes(bytes_from_path(self.engine_path)) fails with OutOfMemory on a single L40 GPU with the flux-dev model.

AakankshaS · October 21, 2024, 3:33pm

Hi @2623875587 ,
Can you please share more details, like model, logs, repro setup?

Thanks

2623875587 · October 22, 2024, 9:33am

The model is the transformer of flux-dev; its engine file, transformer.plan, is 23G.

2623875587 · October 22, 2024, 9:39am

@AakankshaS Hi, I ran the flux-dev demo from the NVIDIA/TensorRT GitHub repository (the open source components of TensorRT) on an L40S. The TensorRT engine files (clip.plan, t5.plan, transformer.plan, vae.plan) are generated, but the issue occurs when loading transformer.plan.
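
To help with the details requested earlier in the thread, here is a minimal sketch for reporting the size of each engine file and comparing the total against the GPU memory that is free. The helper names (`plan_sizes_gib`, `fits_in_memory`) and the `engine_dir` argument are hypothetical, not part of the TensorRT demo. The free-memory figure is passed in by hand (for example, read from `nvidia-smi`). The serialized size is only a rough lower bound, since a loaded engine also needs memory for activations and workspace.

```python
import os


def plan_sizes_gib(engine_dir, names=("clip", "t5", "transformer", "vae")):
    """Return {name: size in GiB} for each <name>.plan found in engine_dir.

    The file names follow the ones listed in this thread; files that are
    missing are skipped rather than treated as an error.
    """
    sizes = {}
    for name in names:
        path = os.path.join(engine_dir, f"{name}.plan")
        if os.path.isfile(path):
            sizes[name] = os.path.getsize(path) / 2**30
    return sizes


def fits_in_memory(sizes, free_gib):
    """Rough check: do the serialized engines alone fit in free_gib?

    This ignores activation and workspace memory, so True does not
    guarantee that loading succeeds; False strongly suggests it will not.
    """
    return sum(sizes.values()) <= free_gib
```

For example, `fits_in_memory(plan_sizes_gib("engine"), free_gib=40.0)` gives a first indication of whether all four engines can be resident at once; here `"engine"` and `40.0` are placeholder values to replace with the actual directory and the free memory reported on the machine.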
