Get Input Layer Dimensions - Specifically Channels

Description

I use PyTorch to train my model, then convert it to ONNX and then to TRT. Everything works great, but when I launch my program I want to know at runtime whether the model was trained with color or grayscale images. I am using dynamic inputs like [-1, 1, -1, -1] or [-1, 3, -1, -1] and trtexec to make the engine file.

I am able to do this by deserializing the engine and getting the channels from the binding shapes, but this puts the model on the GPU. Is there a way to get this information from the engine file without loading it onto the GPU? If not, is there a way to stop/delete it from the GPU? I tried engine.__del__ and runtime.__del__ and it still shows up when I run nvidia-smi.

Environment

TensorRT Version: 7.0.0.11
GPU Type: T4
Nvidia Driver Version: 440.33.01
CUDA Version: 10.2
CUDNN Version:
Operating System + Version:
Python Version (if applicable): 3.6
TensorFlow Version (if applicable):
PyTorch Version (if applicable):
Baremetal or Container (if container which image + tag): Baremetal

Steps To Reproduce

This is what I am doing now but would prefer to NOT load to GPU at all if possible
weights = my engine file

    TRT_LOGGER = trt.Logger(trt.Logger.ERROR)
    runtime = trt.Runtime(TRT_LOGGER)
    with open(weights, 'rb') as f:
        with runtime.deserialize_cuda_engine(f.read()) as engine:
            num_channels_in = engine.get_binding_shape(0)[1]

    engine.__del__
    runtime.__del__

Hi @thomas.p.16,

Are you trying to deserialize the engine file?
As the engine file is a binary file, it's not possible.

Thank you.

Hello, thank you for the reply. Which part is not possible? I can deserialize it but it goes on the GPU. Is it not possible to destroy it after I get the information that I need? This would be an acceptable solution as well.

Hi @thomas.p.16,

Looks like there are a couple of issues. Firstly, we would recommend serializing the metadata before the engine data, e.g.:

with open(weights, "wb") as f:
    # f.write() needs bytes, so store the channel count as a 4-byte header
    f.write(num_channels.to_bytes(4, "little"))
    f.write(engine.serialize())

Then the engine does not need to be read or deserialized to determine the number of channels. However, if we want to deserialize the engine, we should use a context manager for the runtime too:
with trt.Runtime(TRT_LOGGER) as runtime, ...

If we want to use __del__, it needs to be called as a method, since it is a function: engine.__del__() rather than engine.__del__

We should not use both __del__ and the context manager. We would recommend using the context manager instead.
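
To illustrate the metadata-first layout suggested above, here is a minimal sketch of writing and reading such a file. The header format (a single little-endian unsigned 32-bit integer) and the helper names write_engine_with_header, read_num_channels and read_engine_bytes are assumptions made for this example, not part of TensorRT:

```python
import struct

# Assumed layout: one little-endian uint32 holding the channel count,
# followed by the serialized engine bytes.
HEADER = struct.Struct("<I")


def write_engine_with_header(path, num_channels, engine_bytes):
    """Write the channel count first, then the serialized engine bytes."""
    with open(path, "wb") as f:
        f.write(HEADER.pack(num_channels))
        f.write(engine_bytes)


def read_num_channels(path):
    """Read only the header; the engine bytes are not read or deserialized."""
    with open(path, "rb") as f:
        (num_channels,) = HEADER.unpack(f.read(HEADER.size))
    return num_channels


def read_engine_bytes(path):
    """Skip the header and return the raw engine bytes."""
    with open(path, "rb") as f:
        f.seek(HEADER.size)
        return f.read()
```

Here engine_bytes would be the result of engine.serialize(), and the output of read_engine_bytes is what would be passed to runtime.deserialize_cuda_engine. Since read_num_channels only touches the first four bytes, it does not involve the GPU. An engine file written directly by trtexec has no such header, so it would need to be rewritten once in this layout before the check can be used.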

Thank you.