I want to split a network into two parts: the first part runs in INT8 mode and the second part runs in float mode. This requires two engines, created something like this:
    engine_1 = trt.lite.Engine(framework="caffe",
                               deployfile="model_1.prototxt",
                               modelfile="model_1.caffemodel",
                               max_batch_size=1,
                               max_workspace_size=(10 << 25),
                               input_nodes={"data_1": (CHANNEL, HEIGHT, WIDTH)},
                               output_nodes=["prob_1"],
                               data_type=trt.infer.DataType.INT8,
                               calibrator=int8_calibrator,
                               logger_severity=trt.infer.LogSeverity.INFO)

    engine_2 = trt.lite.Engine(framework="caffe",
                               deployfile="model_2.prototxt",
                               modelfile="model_2.caffemodel",
                               max_batch_size=1,
                               max_workspace_size=(10 << 25),
                               input_nodes={"data_2": (CHANNEL, HEIGHT, WIDTH)},
                               output_nodes=["prob_2"],
                               logger_severity=trt.infer.LogSeverity.INFO)
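
For context, the idea is that the output of the first part is fed to the second part as its input. Here is a minimal sketch of that chaining, using plain Python callables as hypothetical stand-ins for the two engines. It makes no TensorRT calls, and run_pipeline, stage_1, and stage_2 are illustrative names only, not part of any library:

```python
from typing import Callable, List, Sequence

Tensor = List[float]
Stage = Callable[[Tensor], Tensor]


def run_pipeline(stages: Sequence[Stage], data: Tensor) -> Tensor:
    """Run data through each stage in order; each output becomes the next input."""
    for stage in stages:
        data = stage(data)
    return data


# Hypothetical stand-ins: in the real setup these would wrap inference on
# engine_1 (INT8) and engine_2 (float). The arithmetic is a placeholder only.
def stage_1(x: Tensor) -> Tensor:
    return [v * 2.0 for v in x]


def stage_2(x: Tensor) -> Tensor:
    return [v + 1.0 for v in x]


result = run_pipeline([stage_1, stage_2], [1.0, 2.0, 3.0])
print(result)
```

The only point of the sketch is the ordering: the second stage consumes whatever the first stage produces, so the two parts must agree on the shape of the intermediate data.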
Creating the two engines this way results in a segmentation fault.
Is it possible to create two engines like this?
If not, what is the proper way to support mixed precision?
Thanks!