TensorRT produces all-zero outputs when running in reduced precision

Description

Hi,

I’m trying to run mobilenet_v2 with reduced precision on an emulated Jetson Nano 4GB.

The model works when I run it in full precision. However, if I convert it to TensorRT using the --best flag (or the --int8 or --fp16 flags), which enables lower precision, the model stops working.

Environment

TensorRT Version: 8.4.1.5-1+cuda11.4
GPU Type: Jetson Orin 4GB
Nvidia Driver Version: 35.1.0
CUDA Version: 11.4.14-1
CUDNN Version: 8.4.1.50-1+cuda11.4
Operating System + Version: Linux 5.10.104-tegra
Python Version (if applicable): 3.8.10
TensorFlow Version (if applicable):
PyTorch Version (if applicable): 1.11.0 (NVIDIA build with CUDA for ARM)
Baremetal or Container (if container which image + tag): baremetal (Jetson Orin developer board, emulated as a 4GB Nano)

Steps To Reproduce

I’m creating the model like this:

from torchvision.models import mobilenet_v2
import torch

net = mobilenet_v2().cuda()
net.eval()
x = torch.randn(1, 3, 224, 224, requires_grad=True).cuda()
torch_out = net(x)

# Export the model
torch.onnx.export(net,                      # model being run
                  x,                        # model input (or a tuple for multiple inputs)
                  "mobilenet_v2.onnx",      # where to save the model (can be a file or file-like object)
                  export_params=True,       # store the trained parameter weights inside the model file
                  opset_version=10,         # the ONNX version to export the model to
                  do_constant_folding=True, # whether to execute constant folding for optimization
                  input_names=['input'],    # the model's input names
                  output_names=['output'],  # the model's output names
                  dynamic_axes={'input': {0: 'batch_size'},    # variable length axes
                                'output': {0: 'batch_size'}})
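
For reference, a quick way to sanity-check the export against torch_out is to run the ONNX file through ONNX Runtime. A minimal sketch, assuming onnxruntime is installed (this is not part of my original script):

import numpy as np
import onnxruntime as ort

# Sketch: compare the ONNX export against the PyTorch output computed above.
sess = ort.InferenceSession("mobilenet_v2.onnx", providers=["CPUExecutionProvider"])
onnx_out = sess.run(None, {"input": x.detach().cpu().numpy()})[0]
np.testing.assert_allclose(torch_out.detach().cpu().numpy(), onnx_out, rtol=1e-3, atol=1e-5)
print("ONNX export matches PyTorch output")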

And converting it like this:

trtexec --onnx=mobilenet_v2.onnx --saveEngine=mobilenet_v2.trt

Which seems to work fine.

However, if I convert using a flag to enable reduced precision:

trtexec --onnx=mobilenet_v2.onnx --saveEngine=mobilenet_v2.trt --best

Then the model outputs only zeros.

I also see this when using the --fp16 or --int8 flags instead of --best.

I have also tried folding the model with polygraphy surgeon, which doesn’t seem to change this behaviour.
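
In case it’s useful, Polygraphy can also compare the reduced-precision TensorRT result against ONNX Runtime directly; something like the following (a sketch, not the exact command I ran):

polygraphy run mobilenet_v2.onnx --trt --fp16 --onnxrt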

I’m running the model like this:

import tensorrt as trt
from datetime import datetime, timedelta
import pycuda.driver as cuda
import pycuda.autoinit
import numpy as np
import os
from PIL import Image
import random

def load_dataset(path):
    dataset = []
    for i, f in enumerate(os.listdir(path)):
        if f.endswith(".jpg"):
            im = Image.open(os.path.join(path, f))
            im = np.asarray(im)
            im = im.astype(np.float32)/255
            im = np.moveaxis(im, 2, 0)
            im = np.ascontiguousarray(np.expand_dims(im, axis=0))
            dataset.append(im)
    return dataset

def main(dataset, model_name, PRECISION):
    print("STATS FOR: " + str(PRECISION))
    BATCH_SIZE = 1

    dummy_input_batch = dataset[0]
    # NOTE: this dtype must match the engine's output binding dtype; engines built with
    # --fp16/--int8 normally still use FP32 bindings unless I/O formats are set explicitly.
    output = np.empty([BATCH_SIZE, 1000], dtype=PRECISION)

    start = datetime.now()
    with open(model_name, "rb") as f:
        runtime = trt.Runtime(trt.Logger(trt.Logger.INFO))
        engine = runtime.deserialize_cuda_engine(f.read())
    context = engine.create_execution_context()

    # allocate device memory
    d_input = cuda.mem_alloc(1 * dummy_input_batch.nbytes)
    d_output = cuda.mem_alloc(1 * output.nbytes)

    bindings = [int(d_input), int(d_output)]

    stream = cuda.Stream()

    def predict(batch): # result gets copied into output
        # transfer input data to device
        cuda.memcpy_htod_async(d_input, batch, stream)
        # execute model
        context.execute_async_v2(bindings, stream.handle, None)
        # transfer predictions back
        cuda.memcpy_dtoh_async(output, d_output, stream)
        # synchronize threads
        stream.synchronize()

        return output

    out = predict(dataset[0])
    print(out)
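
For reference, the engine's binding dtypes and shapes can be inspected like this (a small sketch assuming the engine and context created above; engines built with --fp16/--int8 normally still expose FP32 input/output bindings unless I/O formats are set explicitly):

import tensorrt as trt

# Sketch: print each binding's name, direction, dtype and shape so the host buffers can be sized to match.
for i in range(engine.num_bindings):
    name = engine.get_binding_name(i)
    dtype = trt.nptype(engine.get_binding_dtype(i))
    shape = context.get_binding_shape(i)
    print(name, "input" if engine.binding_is_input(i) else "output", dtype, shape)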

Any ideas?

Thanks!

Hi, please refer to the links below to perform inference in INT8.

Thanks!

Hi,

Could you please share the ONNX model that reproduces the issue, along with the commands/logs, so we can try it on our side and debug further?
Thank you.

Sure - looking at the docs linked in the other comment, it looks like the issue on my end may be that I’m not running INT8 calibration. (Is calibration required to get any results at all, or does it simply help produce more accurate results?)

Either way, I’ve implemented a script to run the calibration:

import os
from random import shuffle

import numpy as np
import onnx
import pycuda.autoinit
import pycuda.driver as cuda
import tensorrt as trt
from PIL import Image

# (CHANNEL, HEIGHT, WIDTH, CALIBRATION_DATASET_LOC and ONNX_FILE_PATH are defined elsewhere in my script.)

class EntropyCalibrator(trt.tensorrt.IInt8EntropyCalibrator2):
    def __init__(self, input_layers, output_layers, stream):
        trt.tensorrt.IInt8EntropyCalibrator2.__init__(self)
        self.input_layers = input_layers
        self.output_layers = output_layers
        self.stream = stream
        self.d_input = cuda.mem_alloc(self.stream.calibration_data.nbytes)
        stream.reset()

    def get_batch_size(self):
        return self.stream.batch_size

    def get_batch(self, names):
        # TensorRT passes the list of input tensor names here; return a list of device
        # pointers (one per input), or None when the calibration data is exhausted.
        batch = self.stream.next_batch()
        if not batch.size:
            return None

        cuda.memcpy_htod(self.d_input, batch)
        return [int(self.d_input)]

    def read_calibration_cache(self):
        return None

    def write_calibration_cache(self, memview):
        with open('calibration_cache.bin', 'wb') as f:
            f.write(memview.tobytes())
        return None

class ImageBatchStream():
    def __init__(self, batch_size, calibration_files, preprocessor=None):
        self.batch_size = batch_size
        self.max_batches = (len(calibration_files) // batch_size) + \
            (1 if (len(calibration_files) % batch_size) else 0)
        self.files = calibration_files
        self.calibration_data = np.zeros((batch_size, CHANNEL, HEIGHT, WIDTH), dtype=np.float32)
        self.batch = 0
        self.preprocessor = preprocessor

    @staticmethod
    def read_image_chw(path):
        img = Image.open(path).resize((WIDTH, HEIGHT), Image.Resampling.NEAREST)
        im = np.array(img, dtype=np.float32, order='C')
        im = im[:, :, ::-1]           # reverses the channel order (RGB -> BGR)
        im = im.transpose((2, 0, 1))  # HWC -> CHW
        # Note: unlike the inference script above, this neither keeps RGB order nor divides
        # by 255; calibration preprocessing should match the inference preprocessing.
        return im

    def reset(self):
        self.batch = 0

    def next_batch(self):
        if self.batch < self.max_batches:
            imgs = []
            files_for_batch = self.files[self.batch_size * self.batch : self.batch_size * (self.batch + 1)]
            for f in files_for_batch:
                img = ImageBatchStream.read_image_chw(f)
                if self.preprocessor:
                    img = self.preprocessor(img)
                imgs.append(img)
            for i in range(len(imgs)):
                self.calibration_data[i] = imgs[i]
            self.batch += 1
            return np.ascontiguousarray(self.calibration_data, dtype=np.float32)
        else:
            return np.array([])
    
    def get_batch_size(self):
        return self.batch_size

def build_engine(onnx_file_path):
    # Create logger to capture errors, warnings, and other information during the build and inference phases
    TRT_LOGGER = trt.Logger()
    # initialize TensorRT engine and parse ONNX model
    builder = trt.Builder(TRT_LOGGER)

    network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, TRT_LOGGER)

    # Create optimization profile
    profile = builder.create_optimization_profile()
    profile.set_shape('input', (1, 3, 224, 224),  (1, 3, 224, 224), (1, 3, 224, 224))

    # parse ONNX
    with open(onnx_file_path, 'rb') as model:
        print('Parsing ONNX file')
        if not parser.parse(model.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))

    config = builder.create_builder_config()

    # Set workspace size (note: 3 GiB exceeds the ~2.7 GiB allocation cap reported for this device;
    # max_workspace_size is deprecated in 8.4 in favour of set_memory_pool_limit)
    workspace_size = 3  # GiB
    config.max_workspace_size = workspace_size * (1 << 30)
    config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, workspace_size * (1 << 30))

    # Create calibration 
    NUM_IMAGES_PER_BATCH = 1
    calibration_files = create_calibration_dataset()
    batchstream = ImageBatchStream(NUM_IMAGES_PER_BATCH, calibration_files)
    Int8_calibrator = EntropyCalibrator(["input"], ["output"], batchstream)

    # Set calibration
    config.set_flag(trt.BuilderFlag.INT8)
    config.int8_calibrator = Int8_calibrator
    config.add_optimization_profile(profile)

    # generate TensorRT engine optimized for the target platform
    print('Building an engine...')
    engine = builder.build_serialized_network(network, config)
    print('Built an engine...')
    context = engine.create_execution_context()
    print("Completed creating Engine")
 
    return engine, context

def create_calibration_dataset():
    calibration_files = [os.path.join(CALIBRATION_DATASET_LOC, f) for f in os.listdir(CALIBRATION_DATASET_LOC)]
    shuffle(calibration_files)
    return calibration_files[:100]

def main():
    print(ONNX_FILE_PATH)
    onnx_model = onnx.load(ONNX_FILE_PATH)
    onnx.checker.check_model(onnx_model)
    engine, context = build_engine(ONNX_FILE_PATH)

The issue I’m having now is that I’m running out of memory. I’ve tried setting the workspace size and adding a swapfile, but neither of those seemed to help.
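
(For reference, a smaller workspace cap can also be requested; this is only a sketch that reuses the config object from build_engine above, and the 1 GiB value is just a guess for a 4 GB device:)

# Sketch: cap the builder workspace at 1 GiB instead of 3 GiB.
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 1 << 30)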

The warning logs I’m getting are:

[11/28/2022-16:17:04] [TRT] [W] Calibration Profile is not defined. Running calibration with Profile 0
[11/28/2022-16:17:05] [TRT] [W] Unknown embedded device detected. Using 2779MiB as the allocation cap for memory on embedded devices.
[11/28/2022-16:34:01] [TRT] [W] Missing scale and zero-point for tensor input.416, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[11/28/2022-16:34:01] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer* 99) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[11/28/2022-16:34:01] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer* 100) [Matrix Multiply]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[11/28/2022-16:34:01] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer* 101) [Constant]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[11/28/2022-16:34:01] [TRT] [W] Missing scale and zero-point for tensor (Unnamed Layer* 102) [Shuffle]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[11/28/2022-16:35:03] [TRT] [W] Tactic Device request: 204MB Available: 174MB. Device memory is insufficient to use tactic.
[11/28/2022-16:35:03] [TRT] [W] Skipping tactic 13 due to insufficient memory on requested size of 204 detected for tactic 0x0000000000000074.
[... repeated "Unknown embedded device detected. Using 2779MiB as the allocation cap for memory on embedded devices." warnings omitted ...]
[11/28/2022-16:35:05] [TRT] [W] Tactic Device request: 338MB Available: 156MB. Device memory is insufficient to use tactic.
[11/28/2022-16:35:05] [TRT] [W] Skipping tactic 13 due to insufficient memory on requested size of 338 detected for tactic 0x0000000000000074.
[11/28/2022-16:35:14] [TRT] [W] Tactic Device request: 340MB Available: 312MB. Device memory is insufficient to use tactic.
[11/28/2022-16:35:14] [TRT] [W] Skipping tactic 3 due to insufficient memory on requested size of 340 detected for tactic 0x0000000000000004.
[11/28/2022-16:35:14] [TRT] [W] Tactic Device request: 340MB Available: 311MB. Device memory is insufficient to use tactic.
[11/28/2022-16:35:14] [TRT] [W] Skipping tactic 8 due to insufficient memory on requested size of 340 detected for tactic 0x000000000000003c.
[11/28/2022-16:35:14] [TRT] [W] Tactic Device request: 340MB Available: 313MB. Device memory is insufficient to use tactic.
[11/28/2022-16:35:14] [TRT] [W] Skipping tactic 13 due to insufficient memory on requested size of 340 detected for tactic 0x0000000000000074.
[11/28/2022-16:35:16] [TRT] [W] Tactic Device request: 678MB Available: 314MB. Device memory is insufficient to use tactic.
[11/28/2022-16:35:16] [TRT] [W] Skipping tactic 3 due to insufficient memory on requested size of 678 detected for tactic 0x0000000000000004.
[11/28/2022-16:35:16] [TRT] [W] Tactic Device request: 678MB Available: 314MB. Device memory is insufficient to use tactic.
[11/28/2022-16:35:16] [TRT] [W] Skipping tactic 8 due to insufficient memory on requested size of 678 detected for tactic 0x000000000000003c.
[11/28/2022-16:35:16] [TRT] [W] Tactic Device request: 678MB Available: 312MB. Device memory is insufficient to use tactic.
[11/28/2022-16:35:16] [TRT] [W] Skipping tactic 13 due to insufficient memory on requested size of 678 detected for tactic 0x0000000000000074.
[11/28/2022-16:35:18] [TRT] [W] Tactic Device request: 902MB Available: 312MB. Device memory is insufficient to use tactic.
[11/28/2022-16:35:18] [TRT] [W] Skipping tactic 3 due to insufficient memory on requested size of 902 detected for tactic 0x0000000000000004.
[11/28/2022-16:35:18] [TRT] [W] Tactic Device request: 902MB Available: 313MB. Device memory is insufficient to use tactic.
[11/28/2022-16:35:18] [TRT] [W] Skipping tactic 8 due to insufficient memory on requested size of 902 detected for tactic 0x000000000000003c.
[11/28/2022-16:35:18] [TRT] [W] Tactic Device request: 902MB Available: 313MB. Device memory is insufficient to use tactic.
[11/28/2022-16:35:18] [TRT] [W] Skipping tactic 13 due to insufficient memory on requested size of 902 detected for tactic 0x0000000000000074.
[11/28/2022-16:35:20] [TRT] [W] Tactic Device request: 2820MB Available: 310MB. Device memory is insufficient to use tactic.
[11/28/2022-16:35:20] [TRT] [W] Skipping tactic 3 due to insufficient memory on requested size of 2820 detected for tactic 0x0000000000000004.
[11/28/2022-16:35:20] [TRT] [W] Tactic Device request: 2820MB Available: 301MB. Device memory is insufficient to use tactic.
[11/28/2022-16:35:20] [TRT] [W] Skipping tactic 8 due to insufficient memory on requested size of 2820 detected for tactic 0x000000000000003c.
[11/28/2022-16:35:20] [TRT] [W] Tactic Device request: 2820MB Available: 299MB. Device memory is insufficient to use tactic.
[11/28/2022-16:35:20] [TRT] [W] Skipping tactic 13 due to insufficient memory on requested size of 2820 detected for tactic 0x0000000000000074.
[11/28/2022-16:35:23] [TRT] [W] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
[11/28/2022-16:35:23] [TRT] [W] The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
Built an engine...

AttributeError: 'tensorrt.tensorrt.IHostMemory' object has no attribute 'create_execution_context'

(I skipped a lot of the identical warning messages).
It seems like the calibration profile I’m defining may not be used here?

I get a similar set of warnings when I use trtexec, but it does complete the conversion. Is this an issue with the Python API only?
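
(For completeness, a sketch of the two changes that look necessary in build_engine above, based on the "Calibration Profile is not defined" warning and the AttributeError; untested here, against the TensorRT 8.x Python API:)

# Register the optimization profile as the calibration profile as well.
config.set_calibration_profile(profile)

# build_serialized_network() returns a serialized plan (IHostMemory), not an ICudaEngine,
# so deserialize it before creating an execution context.
plan = builder.build_serialized_network(network, config)
runtime = trt.Runtime(TRT_LOGGER)
engine = runtime.deserialize_cuda_engine(plan)
context = engine.create_execution_context()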

Hi,

Please refer to the following similar post; hopefully it will help you.

Thank you.

Thank you! Do you know when 8.5 will be released?

Hi,

We are moving this post to the Jetson Orin Nano forum to get a better answer for the above query.

Thank you.

Hi,

There is a DP version if you want to give it a try.

However, the model should trigger the fallback path (using 95% of total memory), so it’s expected to work.
Could you share the mobilenet_v2.onnx with us so we can give it a try?

Also, have you tried the model on the Orin devkit without emulation?
Thanks.
