ONNX and TensorRT: ERROR: Network must have at least one output

Hi xtao,
This may be caused by another Long (INT64) type size.
It looks like it is used by an Unsqueeze node, so I think you should try to find where it comes from in your PyTorch code.
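
(One way to hunt for the offending tensor is to scan the exported ONNX graph directly; a minimal sketch, with the file name as a placeholder:)

import onnx

# Hypothetical file name; list every Unsqueeze node and every INT64 initializer
# so the Long-typed tensor can be traced back to the PyTorch code.
model = onnx.load('model.onnx')
for node in model.graph.node:
    if node.op_type == 'Unsqueeze':
        print('Unsqueeze:', node.name, list(node.input))
for init in model.graph.initializer:
    if init.data_type == onnx.TensorProto.INT64:
        print('INT64 initializer:', init.name)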

Hi, I get the same error. I convert my PyTorch model to an ONNX file, then build it to get a TRT engine file. I think there is a problem with the depthwise convolution layer, since everything worked before I changed the convolution layer's 'groups' attribute.
For a 3D feature map with shape [channels, h, w], my depthwise convolution layer should have weights of shape [channels, kernel_size, kernel_size]. Is that the same in TensorRT?
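
(For reference, PyTorch itself stores depthwise weights with an extra singleton input-channel dimension; a quick sketch with an arbitrary channel count:)

import torch

# Hypothetical example: an 8-channel depthwise conv (groups == channels).
conv = torch.nn.Conv2d(in_channels=8, out_channels=8, kernel_size=3, groups=8)

# PyTorch's weight layout is [out_channels, in_channels // groups, kH, kW],
# so a depthwise layer reports [8, 1, 3, 3] rather than [8, 3, 3].
print(conv.weight.shape)  # torch.Size([8, 1, 3, 3])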

How do I check whether my PyTorch layers contain INT64?

I have the same error. I made a very simple model that has only one Conv3d layer.

class Model(torch.nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.conv = torch.nn.Conv3d(in_channels=1, out_channels=16, kernel_size=3, stride=1, padding=1)

    def forward(self, x):
        return self.conv(x)

Then I generate an ONNX model without any errors.

def generate_onnx():
    model = Model().cuda()
    dummy_input = torch.randn(1, 1, 112, 112, 112, device='cuda:0', dtype=torch.float)
    torch.onnx.export(model, dummy_input, 'easy_model.onnx', verbose=True, input_names=['input'], output_names=['output'])

After that, I run the following code, which is just the same as the sample code in onnx_resnet50.py from introductory_parser_samples.

# The Onnx path is used for Onnx models.
def build_engine_onnx(model_file):
    with trt.Builder(TRT_LOGGER) as builder, builder.create_network() as network, trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_workspace_size = common.GiB(1)
        # Load the Onnx model and parse it in order to populate the TensorRT network.
        with open(model_file, 'rb') as model:
            parser.parse(model.read())
        return builder.build_cuda_engine(network)

if __name__ == '__main__':
    model_path = './easy_model.onnx'
    build_engine_onnx(model_path)

I get [TensorRT] ERROR: Network must have at least one output. Please help. I use PyTorch 1.3 and TensorRT 6.0.1.
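
(One thing worth noting: parser.parse returns False on failure and leaves the network empty, which is what then triggers the "must have at least one output" error. A minimal sketch of surfacing the parser's own messages, assuming the same objects as in build_engine_onnx above:)

# Assumes `parser` and `model_file` from build_engine_onnx above.
with open(model_file, 'rb') as model:
    if not parser.parse(model.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))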

I could not get any of these solutions to work. I installed a previous version of TensorRT and it is working again.

Hi,

I use this repository https://github.com/bpinaya/AlexNetRT as a minimal example of using TRT for classification.
In Debug mode the code runs without problems, but when I switch to Release I suddenly get

ERROR: Network must have at least one output

I tried to use:

network->markOutput(*network->getLayer(network->getNbLayers()-1)->getOutput(0));

But it hasn't helped.
Could you help me to resolve the issue?

Regards

This specific issue is arising because the ONNX parser isn't currently compatible with ONNX models exported from PyTorch 1.3. If you downgrade to PyTorch 1.2, this issue should go away.
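
(If you are unsure which exporter produced a given ONNX file, the model records it; a minimal sketch, with the file name as a placeholder:)

import onnx

# Hypothetical file name; print the exporter metadata stored in the model.
model = onnx.load('easy_model.onnx')
print(model.producer_name, model.producer_version)  # e.g. "pytorch 1.3"
print([op.version for op in model.opset_import])    # target opset version(s)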

Hi,
You can use a couple of approaches here:

  • Iterating through the tensors and checking each one's dtype, e.g. print(tensor.dtype); see the sketch below.
  • Using visualization tools like TensorBoardX.
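
(A minimal sketch of the first approach, assuming a PyTorch model object named model:)

import torch

# Walk every parameter and buffer and report any INT64 (Long) tensors.
for name, tensor in list(model.named_parameters()) + list(model.named_buffers()):
    if tensor.dtype == torch.int64:
        print(name, tensor.dtype)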

Thanks

Hi,

We have the same problem with an object detection model converted from TensorFlow:

(tensorflow1.15env) svetlana@svetlana-desktop:/srv/ai-scripts$ python3 getplan.py 
[TensorRT] WARNING: onnx2trt_utils.cpp:217: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[TensorRT] WARNING: onnx2trt_utils.cpp:243: One or more weights outside the range of INT32 was clamped
[TensorRT] ERROR: Equal__2701:0: shape tensor must have type Int32.
[TensorRT] ERROR: Equal__2846:0: shape tensor must have type Int32.
[TensorRT] ERROR: Equal__1023:0: shape tensor must have type Int32.
[TensorRT] ERROR: Equal__1032:0: shape tensor must have type Int32.
[TensorRT] ERROR: Equal__1039:0: shape tensor must have type Int32.
[TensorRT] ERROR: Equal__1071:0: shape tensor must have type Int32.
[TensorRT] ERROR: Builder failed while analyzing shapes.
Traceback (most recent call last):
  File "getplan.py", line 20, in <module>
    eng.save_engine(engine, engine_name) 
  File "/srv/demo/Demos/scripts/tensorrt_uff/ai-scripts/engine.py", line 26, in save_engine
    buf = engine.serialize()
AttributeError: 'NoneType' object has no attribute 'serialize'

I have added the suggested line

network.mark_output(network.get_layer(network.num_layers - 1).get_output(0))

but that did not help.
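
(One way to locate the offending nodes from the log above; a minimal sketch using the onnx package, with the file name as a placeholder:)

import onnx

# Hypothetical file name; list the Equal nodes named in the TensorRT errors
# together with their inputs, to trace where the INT64 shape tensors come from.
model = onnx.load('model.onnx')
for node in model.graph.node:
    if node.op_type == 'Equal':
        print(node.name, list(node.input))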

Any ideas? Thank you

Svetlana

My setup is onnx 1.7.0, torch 1.5.1 + CUDA 10.1, and TensorRT 6.0.1.5. Converting from ONNX to TensorRT fails with "network must have at least one output". What is the reason? The code is as follows:

import pycuda.autoinit
import numpy as np
import pycuda.driver as cuda
import tensorrt as trt
import torch
import os
import time
from PIL import Image
import cv2
import torchvision

filename = 'bus.jpg'
max_batch_size = 1
onnx_model_path = 'resnet50.onnx'
TRT_LOGGER = trt.Logger()  # This logger is required to build an engine

def get_img_np_nchw(filename):
    image = cv2.imread(filename)
    image_cv = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image_cv = cv2.resize(image_cv, (224, 224))
    miu = np.array([0.485, 0.456, 0.406])
    std = np.array([0.229, 0.224, 0.225])
    img_np = np.array(image_cv, dtype=float) / 255.
    r = (img_np[:, :, 0] - miu[0]) / std[0]
    g = (img_np[:, :, 1] - miu[1]) / std[1]
    b = (img_np[:, :, 2] - miu[2]) / std[2]
    img_np_t = np.array([r, g, b])
    img_np_nchw = np.expand_dims(img_np_t, axis=0)
    return img_np_nchw

class HostDeviceMem(object):
    def __init__(self, host_mem, device_mem):
        """Within this context, host_mem means the CPU memory and device means the GPU memory."""
        self.host = host_mem
        self.device = device_mem

def __str__(self):
    return "Host:\n" + str(self.host) + "\nDevice:\n" + str(self.device)

def __repr__(self):
    return self.__str__()

def allocate_buffers(engine):
    inputs = []
    outputs = []
    bindings = []
    stream = cuda.Stream()
    for binding in engine:
        size = trt.volume(engine.get_binding_shape(binding)) * engine.max_batch_size
        dtype = trt.nptype(engine.get_binding_dtype(binding))
        # Allocate host and device buffers
        host_mem = cuda.pagelocked_empty(size, dtype)
        device_mem = cuda.mem_alloc(host_mem.nbytes)
        # Append the device buffer to device bindings.
        bindings.append(int(device_mem))
        # Append to the appropriate list.
        if engine.binding_is_input(binding):
            inputs.append(HostDeviceMem(host_mem, device_mem))
        else:
            outputs.append(HostDeviceMem(host_mem, device_mem))
    return inputs, outputs, bindings, stream

def get_engine(max_batch_size=1, onnx_file_path="", engine_file_path="",
               fp16_mode=False, int8_mode=False, save_engine=False):
    """Attempts to load a serialized engine if available, otherwise builds a new TensorRT engine and saves it."""

def build_engine(max_batch_size, save_engine):
    """Takes an ONNX file and creates a TensorRT engine to run inference with"""
    with trt.Builder(TRT_LOGGER) as builder, \
            builder.create_network() as network, \
            trt.OnnxParser(network, TRT_LOGGER) as parser:

        builder.max_workspace_size = 1 << 30  # Your workspace size
        builder.max_batch_size = max_batch_size
        # pdb.set_trace()
        builder.fp16_mode = fp16_mode  # Default: False
        builder.int8_mode = int8_mode  # Default: False
        if int8_mode:
            # To be updated
            raise NotImplementedError

        # Parse model file
        if not os.path.exists(onnx_file_path):
            quit('ONNX file {} not found'.format(onnx_file_path))

        print('Loading ONNX file from path {}...'.format(onnx_file_path))
        with open(onnx_file_path, 'rb') as model:
            print('Beginning ONNX file parsing')
            parser.parse(model.read())
        last_layer = network.get_layer(network.num_layers - 1)
        network.mark_output(last_layer.get_output(0))
        print('Completed parsing of ONNX file')
        print('Building an engine from file {}; this may take a while...'.format(onnx_file_path))

        engine = builder.build_cuda_engine(network)
        print("Completed creating Engine")

        if save_engine:
            with open(engine_file_path, "wb") as f:
                f.write(engine.serialize())
        return engine

    if os.path.exists(engine_file_path):
        # If a serialized engine exists, load it instead of building a new one.
        print("Reading engine from file {}".format(engine_file_path))
        with open(engine_file_path, "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
            return runtime.deserialize_cuda_engine(f.read())
    else:
        return build_engine(max_batch_size, save_engine)

def do_inference(context, bindings, inputs, outputs, stream, batch_size=1):
    # Transfer data from CPU to the GPU.
    [cuda.memcpy_htod_async(inp.device, inp.host, stream) for inp in inputs]
    # Run inference.
    context.execute_async(batch_size=batch_size, bindings=bindings, stream_handle=stream.handle)
    # Transfer predictions back from the GPU.
    [cuda.memcpy_dtoh_async(out.host, out.device, stream) for out in outputs]
    # Synchronize the stream
    stream.synchronize()
    # Return only the host outputs.
    return [out.host for out in outputs]

def postprocess_the_outputs(h_outputs, shape_of_output):
    h_outputs = h_outputs.reshape(*shape_of_output)
    return h_outputs

img_np_nchw = get_img_np_nchw(filename)
img_np_nchw = img_np_nchw.astype(dtype=np.float32)

fp16_mode = False
int8_mode = False
trt_engine_path = 'resnet50.trt'
engine = get_engine(max_batch_size, onnx_model_path, trt_engine_path, fp16_mode, int8_mode)
context = engine.create_execution_context()
inputs, outputs, bindings, stream = allocate_buffers(engine) # input, output: host # bindings
shape_of_output = (max_batch_size, 1000)
inputs[0].host = img_np_nchw.reshape(-1)
t1 = time.time()
trt_outputs = do_inference(context, bindings=bindings, inputs=inputs, outputs=outputs, stream=stream)
t2 = time.time()
feat = postprocess_the_outputs(trt_outputs[0], shape_of_output)

print('TensorRT ok')

model = torchvision.models.resnet50(pretrained=True).cuda()
resnet_model = model.eval()

input_for_torch = torch.from_numpy(img_np_nchw).cuda()
t3 = time.time()
feat_2= resnet_model(input_for_torch)
t4 = time.time()
feat_2 = feat_2.cpu().data.numpy()
print('Pytorch ok!')

mse = np.mean((feat - feat_2)**2)
print("Inference time with the TensorRT engine: {}".format(t2 - t1))
print("Inference time with the PyTorch model: {}".format(t4 - t3))
print('MSE Error = {}'.format(mse))

print('All completed!')
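
(One thing to check before building: whether the exported file is itself a valid ONNX graph. A minimal sketch using the onnx package, with the file name matching the script above:)

import onnx

# Validate the exported file before handing it to TensorRT;
# check_model raises a ValidationError if the graph is malformed.
model = onnx.load('resnet50.onnx')
onnx.checker.check_model(model)
print('ONNX model check passed')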

Hi,
I am converting my custom model from ONNX to TRT.

I am using the code below for the conversion:

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
trt_runtime = trt.Runtime(TRT_LOGGER)

def build_engine(onnx_path, shape=[1, 1, 224, 224]):
    with trt.Builder(TRT_LOGGER) as builder, builder.create_network(1) as network, trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_workspace_size = (256 << 20)
        with open(onnx_path, 'rb') as model:
            parser.parse(model.read())
        network.get_input(0).shape = shape
        engine = builder.build_cuda_engine(network)
        return engine

def save_engine(engine, file_name):
    buf = engine.serialize()
    with open(file_name, 'wb') as f:
        f.write(buf)

def load_engine(trt_runtime, plan_path):
    with open(plan_path, 'rb') as f:
        engine_data = f.read()
    engine = trt_runtime.deserialize_cuda_engine(engine_data)
    return engine

from onnx import ModelProto

engine_name = "xyz.plan"
onnx_path = "xyz.onnx"

model = ModelProto()
with open(onnx_path, "rb") as f:
    model.ParseFromString(f.read())

engine = build_engine(onnx_path)
save_engine(engine, engine_name)

The error I am facing is:
[TensorRT] ERROR: Network must have at least one output
[TensorRT] ERROR: Network validation failed

TensorRT version: 7.0.0-1
CUDA version: 10.2

May I know how to get rid of this issue? It would be a great help!

Thanks
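
(With TensorRT 7, create_network(1) already requests explicit batch, so the most likely culprit is a silent parse failure that leaves the network without outputs. A minimal sketch of surfacing the parser errors, not a verified fix:)

import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

def build_engine(onnx_path, shape=[1, 1, 224, 224]):
    # Equivalent to create_network(1), but spelled out via the named flag.
    explicit_batch = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    with trt.Builder(TRT_LOGGER) as builder, \
            builder.create_network(explicit_batch) as network, \
            trt.OnnxParser(network, TRT_LOGGER) as parser:
        builder.max_workspace_size = (256 << 20)
        with open(onnx_path, 'rb') as model:
            if not parser.parse(model.read()):
                # Print what actually failed instead of building an empty network.
                for i in range(parser.num_errors):
                    print(parser.get_error(i))
                return None
        network.get_input(0).shape = shape
        return builder.build_cuda_engine(network)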