Inference with Python scripts using .trt TensorRT on JetsonTX2

philip.mizuro · February 28, 2022, 12:09pm

• Hardware: JetsonTX2
• Network Type
- EmotionNet
- HeartRateNet
- GazeEstimation
• TLT Version: v3.21.11-tf1.15.5
• Jetpack 4.4

I have managed to convert the gaze model, EmotionNet and HeartRateNet, as well as the face detection and facial landmarks models, from model.etlt to TensorRT engine (.trt) files.

After the conversion, my idea was to use the .trt models from Python scripts. I have found a way to load FaceDetect and FacialLandmarks and run inference with them, and that works pretty well.

I have continued with EmotionNet, but I am still not sure how to preprocess the facial landmarks as input to that model. I tried normalizing them, but I still seem to get all zeros out of the model.

I would be interested in any useful scripts that preprocess the data for these TensorRT models (GazeNet, EmotionNet, HeartRateNet), such as an inference script. I have found the Jupyter notebooks, but there is nothing in them that helps with understanding how the models actually expect the data.

Best regards.

Morganh · February 28, 2022, 12:53pm

Currently, only the applications below are available for reference.

philip.mizuro · February 28, 2022, 1:22pm

At the moment I am approaching this with the Python script below, but I still get a vector of zeros from inference on the TensorRT model.

import time
import itertools
import numpy as np
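import pycuda.autoinit  # noqa: F401  (creates a CUDA context on import; drop this if a context is already created elsewhere)
import pycuda.driver as cuda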
import tensorrt as trt


class HostDeviceMem(object):
    def __init__(self, host_mem, device_mem):
        self.host = host_mem
        self.device = device_mem

    def __str__(self):
        return "Host:\n" + str(self.host) + "\nDevice:\n" + str(self.device)

    def __repr__(self):
        return self.__str__()
    
    
class EmotionDetectNet(object):
    def __init__(self, trt_path, batch_size=1):
        self.trt_path = trt_path
        self.batch_size = batch_size

        TRT_LOGGER = trt.Logger(trt.Logger.WARNING)
        trt_runtime = trt.Runtime(TRT_LOGGER)
        self.trt_engine = self._load_engine(trt_runtime, self.trt_path)

        self.inputs, self.outputs, self.bindings, self.stream = \
            self._allocate_buffers()

        self.context = self.trt_engine.create_execution_context()
        self.list_output = None

    def _load_engine(self, trt_runtime, engine_path):
        with open(engine_path, "rb") as f:
            engine_data = f.read()
        engine = trt_runtime.deserialize_cuda_engine(engine_data)
        return engine

    def _allocate_buffers(self):
        inputs = []
        outputs = []
        bindings = []
        stream = cuda.Stream()

        binding_to_type = {
            "input_landmarks:0": np.float32,
            "softmax/Softmax:0": np.float32
        }

        for binding in self.trt_engine:
            print("Binding: {}".format(binding))
            print("Binding shape: {}".format(self.trt_engine.get_binding_shape(binding)))
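            # A dynamic batch dimension is reported as -1, which makes the
            # volume negative, so take the absolute value before scaling.
            size = abs(trt.volume(self.trt_engine.get_binding_shape(binding))) \
                   * self.batch_size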
            dtype = binding_to_type[str(binding)]
            host_mem = cuda.pagelocked_empty(size, dtype)
            device_mem = cuda.mem_alloc(host_mem.nbytes)
            bindings.append(int(device_mem))
            if self.trt_engine.binding_is_input(binding):
                inputs.append(HostDeviceMem(host_mem, device_mem))
            else:
                outputs.append(HostDeviceMem(host_mem, device_mem))

        return inputs, outputs, bindings, stream

    def _do_inference(self, context, bindings, inputs,
                      outputs, stream):
        # Copy the inputs from host to device
        for inp in inputs:
            cuda.memcpy_htod_async(inp.device, inp.host, stream)
        context.execute_async(
            batch_size=self.batch_size, bindings=bindings,
            stream_handle=stream.handle)

        # Copy the outputs from device back to host
        for out in outputs:
            cuda.memcpy_dtoh_async(out.host, out.device, stream)

        stream.synchronize()
        return [out.host for out in outputs]
        

    def predict(self, facial_landmarks):
        # facial_landmarks holds one list of landmarks per face.
        # Each list has 80 (x, y) points. The points beyond the first 68 are for the eye pupil centers;
        # only the classical 68 facial landmark points are needed to estimate emotion.
        for landmarks in facial_landmarks:
            
            input_landmarks = np.array(list(itertools.chain(*landmarks)))
            # Keep only the first 68 (x, y) points, i.e. (68, 2) flattened into a 136-element input vector
            input_landmarks = np.array(input_landmarks[0:136])
            # Min-max normalize to the [0, 1] range
            input_landmarks = (input_landmarks - np.min(input_landmarks)) / (np.max(input_landmarks) - np.min(input_landmarks))
            
            np.copyto(self.inputs[0].host, input_landmarks.ravel())
            t_time = 0
            for i in range(1):
                t1 = time.perf_counter()
                emotion_class = self._do_inference(
                    self.context, bindings=self.bindings, inputs=self.inputs,
                    outputs=self.outputs, stream=self.stream)
                t2 = time.perf_counter()
                t_time += (t2 - t1)
            print('Emotion detect inference time:', t_time)

            # _do_inference returns one host array per output binding; take the first (and only) one
            emotion_class = emotion_class[0]
            
            # Emotion Class order:
            # 0 - Neutral
            # 1 - Happy
            # 2 - Surprise
            # 3 - Squint
            # 4 - Disgust
            # 5 - Scream
            for e in emotion_class:
                print("Emotion: {}".format(e))
           
        return emotion_class
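
One thing worth double-checking in _allocate_buffers is the element count of each buffer. The following is only a minimal sketch, not a confirmed fix: it assumes the engine reports the batch dimension as -1 and that this is the only dynamic dimension. The shapes used are made up for illustration and are not taken from the actual engine, and buffer_elements is a hypothetical helper name. For an implicit-batch engine whose shape contains no -1, the count would still have to be multiplied by the batch size, as the script above does.

```python
from functools import reduce
from operator import mul


def buffer_elements(shape, batch_size=1):
    """Number of elements to allocate for one binding.

    Every dimension reported as -1 (dynamic) is replaced by batch_size.
    Assumption: the only dynamic dimension is the batch dimension.
    """
    dims = [batch_size if d == -1 else d for d in shape]
    if any(d <= 0 for d in dims):
        raise ValueError("unresolved dimension in shape {}".format(tuple(shape)))
    return reduce(mul, dims, 1)


# Hypothetical binding shapes, for illustration only
print(buffer_elements((-1, 1, 136, 1)))         # 136
print(buffer_elements((-1, 6), batch_size=2))   # 12
```

Compared with multiplying the volume by -1, this stays correct if a shape happens to contain no -1 at all, and it fails loudly on a shape it cannot resolve instead of requesting a negative buffer size.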

Morganh · February 28, 2022, 1:29pm

EmotionNet is actually a classification network.
Officially, you can refer to GitHub - NVIDIA-AI-IOT/tao-toolkit-triton-apps: Sample app code for deploying TAO Toolkit trained models to Triton
You can also search the TAO forum for related topics. For example:

Inferring resnet18 classification etlt model with python - #40 by Morganh
Error while running inference, model generated through TLT using Opencv-Python - #3 by Morganh
TAO tensorRT model inferencing using python
…
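
Since this is a classification network with a softmax output, the result can be decoded with a plain argmax. The sketch below is an illustration under assumptions, not official sample code: decode_emotion is a hypothetical helper, and the label order is simply the one listed in the comments of the script above. It also flags an all-zero vector, because a real softmax output sums to 1 and can never be all zeros. Getting only zeros therefore suggests that the output buffer was never written, which points at the buffer setup or the execution call rather than at the landmark normalization.

```python
import numpy as np

# Class order as listed in the comments of the script above
EMOTION_LABELS = ["Neutral", "Happy", "Surprise", "Squint", "Disgust", "Scream"]


def decode_emotion(probabilities):
    """Return (label, score) for a 6-element softmax output.

    Returns (None, 0.0) for an all-zero vector, which is not a valid
    softmax result and usually means the output was never written.
    """
    probs = np.asarray(probabilities, dtype=np.float32).reshape(-1)
    if probs.size != len(EMOTION_LABELS):
        raise ValueError("expected {} scores, got {}".format(
            len(EMOTION_LABELS), probs.size))
    if not np.any(probs):
        return None, 0.0
    idx = int(np.argmax(probs))
    return EMOTION_LABELS[idx], float(probs[idx])


# Made-up scores, for illustration only
print(decode_emotion([0.05, 0.7, 0.1, 0.05, 0.05, 0.05]))
print(decode_emotion(np.zeros(6)))
```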

philip.mizuro · February 28, 2022, 1:40pm

Thank you for providing me with links.

I mean, I have a working example for face detection and facial landmarks, and I understand that it is a classification network, but the output should be something other than a vector of zeros. At the moment I am feeding values from the Facial Landmark model (Facial Landmarks Estimation | NVIDIA NGC) as input to the Emotion Detection model. I have also tried the points in their original coordinates inside the 80x80 face box, and the same points scaled to the [0, 1] range. Neither of these approaches seems to give correct results.
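
To make the variants I tried concrete, here is a minimal NumPy sketch of them. The helper names are hypothetical and the landmarks are synthetic. Which of these (if any) EmotionNet actually expects is not confirmed anywhere in this thread; the point is only to compare the candidates side by side as float32 vectors of length 136.

```python
import numpy as np


def landmarks_to_vector(landmarks, num_points=68):
    """Flatten the first num_points (x, y) pairs into a float32 vector."""
    pts = np.asarray(landmarks, dtype=np.float32)[:num_points]
    return pts.reshape(-1)  # [x0, y0, x1, y1, ...], length 2 * num_points


def minmax_joint(vec):
    """Scale with one min/max over all x and y values (as in the script above)."""
    lo, hi = vec.min(), vec.max()
    if hi == lo:
        # Degenerate input: avoid dividing by zero
        return np.zeros_like(vec)
    return (vec - lo) / (hi - lo)


def scale_by_box(vec, box_size=80.0):
    """Divide pixel coordinates by the face-crop size (80x80 here)."""
    return vec / np.float32(box_size)


# Synthetic landmarks, for illustration only: 80 points inside an 80x80 crop
rng = np.random.RandomState(0)
fake_landmarks = rng.uniform(0, 80, size=(80, 2))

vec = landmarks_to_vector(fake_landmarks)
candidates = [
    ("raw pixels", vec),
    ("joint min-max", minmax_joint(vec)),
    ("box-scaled", scale_by_box(vec)),
]
for name, v in candidates:
    print(name, v.shape, v.dtype, float(v.min()), float(v.max()))
```

All three candidates keep the float32 dtype and the 136-element length that the page-locked input buffer in the script is sized for, so np.copyto accepts any of them without a silent cast.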

Morganh · February 28, 2022, 2:23pm

The input should be a face image. Could you first try running inference with “tao inference”, to check whether your EmotionNet TensorRT engine works?
https://docs.nvidia.com/tao/tao-toolkit/text/emotion_classification/emotion_classification.html#run-inference-on-the-model

system · March 14, 2022, 2:24pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
