Custom LPR

• GeForce GTX 1660 Ti Mobile
• DeepStream 6.1
• Ubuntu 20.04
• TensorRT 8.2.5.1
• GStreamer 1.16.2
• NVIDIA driver 510.47.03
• CUDA 11.6 Update 1
• Container nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.5-py3

I trained my custom LPRNet model. It reached an accuracy of 4885/4893 (0.9984).
I want to use this model with the Python API. I exported the model with:
lprnet export -m /workspace/tao-experiments/lprnet/weights/lprnet_epoch-15.tlt -k nvidia_tlt -e /workspace/tao-experiments/lprnet/tutorial_spec.txt

After this I generated the .engine file by starting my app (the LPR module follows deepstream-lpr-python-version). It runs, but it recognizes the wrong symbols.
For example, for the number plate “C496AT58” the LPR result is “MCM4M9M6MAMJM5M8”.
LPR inserts an extra “M” between characters, and “T” is recognized as “J”, even though “J” is not in my labels. I suspect a labels mistake, but I already replaced the old “US” labels with my custom ones.
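A quick way to rule out a character-set mismatch is to check that the characters list used for training covers exactly the characters that occur in the ground-truth labels, and contains nothing extra. A minimal sketch, assuming the usual LPRNet layout of one label .txt per image (all paths below are hypothetical):

# Sanity check (hypothetical paths): compare the training characters list against
# the characters that actually occur in the ground-truth label files.
import glob

CHAR_LIST = "/workspace/tao-experiments/lprnet/specs/ru_lp_characters.txt"  # assumed path
LABEL_DIR = "/workspace/tao-experiments/data/custom/train/label"            # assumed path

with open(CHAR_LIST) as f:
    trained_chars = {line.strip() for line in f if line.strip()}

label_chars = set()
for path in glob.glob(f"{LABEL_DIR}/*.txt"):
    with open(path) as f:
        label_chars.update(f.read().strip())

print("In labels but not in characters list:", label_chars - trained_chars)
print("In characters list but never in labels:", trained_chars - label_chars)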

What is the result of "lprnet inference xxx" with your lprnet_epoch-15.tlt?
Refer to LPRNet — TAO Toolkit 3.22.05 documentation

I have no name!@68ecda1e05cc:/workspace$ lprnet inference -m /workspace/tao-experiments/lprnet/weights/lprnet_epoch-05.tlt -i /workspace/tao-experiments/data/custom/val/image/A001BO92.png -e /workspace/tao-experiments/lprnet/tutorial_spec.txt 
Matplotlib created a temporary config/cache directory at /tmp/matplotlib-tr6021w0 because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
/usr/local/lib/python3.6/dist-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.26.5) or chardet (3.0.4) doesn't match a supported version!
  RequestsDependencyWarning)
Using TensorFlow backend.
INFO: Log file already exists at /workspace/tao-experiments/lprnet/weights/status.json
INFO: Starting LPRNet Inference.
INFO: Merging specification from /workspace/tao-experiments/lprnet/tutorial_spec.txt
Traceback (most recent call last):
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/magnet/build_wheel.runfiles/ai_infra/magnet/encoding/encoding.py", line 112, in decode
TypeError: descriptor 'encode' requires a 'str' object but received a 'NoneType'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/lprnet/scripts/inference.py", line 192, in <module>
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/lprnet/scripts/inference.py", line 188, in main
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/lprnet/scripts/inference.py", line 132, in inference
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/lprnet/utils/model_io.py", line 37, in load_model
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/magnet/build_wheel.runfiles/ai_infra/magnet/encoding/encoding.py", line 114, in decode
TypeError: encode_key must be passed as a byte object

Please add “-k yourkey”.

I have no name!@a63ebd2fc13e:/workspace$ lprnet inference -m /workspace/tao-experiments/lprnet/weights/lprnet_epoch-15.tlt -i /workspace/tao-experiments/data/test/ -e /workspace/tao-experiments/lprnet/tutorial_spec.txt -k nvidia_tlt
Matplotlib created a temporary config/cache directory at /tmp/matplotlib-pmwqq4je because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
/usr/local/lib/python3.6/dist-packages/requests/__init__.py:91: RequestsDependencyWarning: urllib3 (1.26.5) or chardet (3.0.4) doesn't match a supported version!
  RequestsDependencyWarning)
Using TensorFlow backend.
INFO: Log file already exists at /workspace/tao-experiments/lprnet/weights/status.json
INFO: Starting LPRNet Inference.
INFO: Merging specification from /workspace/tao-experiments/lprnet/tutorial_spec.txt
Using TLT model for inference, setting batch size to the one in eval_config: 1
/workspace/tao-experiments/data/test/A001BO92.png:A001BO92 
INFO: Inference finished successfully.

So the result is correct.

OK, so the inference result is good when running “tao lprnet inference”.

Then you can deploy the model in any of the ways below.

  1. Officially we provide a Triton app for running LPRNet inference. You can try GitHub - NVIDIA-AI-IOT/tao-toolkit-triton-apps: Sample app code for deploying TAO Toolkit trained models to Triton, then replace the existing model with your .etlt model to run inference.
  2. There is also another official inference way. See LPRNet — TAO Toolkit 3.22.02 documentation and GitHub - NVIDIA-AI-IOT/deepstream_lpr_app: Sample app code for LPR deployment on DeepStream.
  3. Some forum users have also written standalone inference scripts that you can try, for example the topic “Python run LPRNet with TensorRT show pycuda._driver.MemoryError: cuMemHostAlloc failed: out of memory”; a minimal sketch of such a standalone script is shown after this list.
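For option 3, here is a minimal standalone sketch of what such a script can look like with TensorRT 8.x and pycuda. It is not an official sample: the engine path and character set are placeholders, the input layout (1x3x48x96 RGB scaled by 1/255) follows the LPR sample configs, the output tensor name is the one the TAO LPRNet model exposes (tf_op_layer_ArgMax), and an explicit-batch engine (e.g. built with tao-converter -p) is assumed.

# Hedged standalone LPRNet inference sketch (TensorRT 8.x + pycuda).
# ENGINE_PATH and CHARS are placeholders; use your own engine and the exact
# character list the model was trained with (blank is assumed to be the last class).
import cv2
import numpy as np
import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
import pycuda.driver as cuda
import tensorrt as trt

ENGINE_PATH = "lprnet_custom.engine"            # hypothetical
CHARS = list("0123456789ABCEHKMOPTXY")          # hypothetical character set


def load_engine(path):
    logger = trt.Logger(trt.Logger.WARNING)
    with open(path, "rb") as f, trt.Runtime(logger) as runtime:
        return runtime.deserialize_cuda_engine(f.read())


def preprocess(image_path):
    # LPRNet input: 1x3x48x96, RGB, scaled to [0, 1]
    img = cv2.resize(cv2.imread(image_path), (96, 48))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    return np.ascontiguousarray(img.transpose(2, 0, 1)[None])


def infer(engine, batch):
    with engine.create_execution_context() as ctx:
        bindings, outputs = [], {}
        for i in range(engine.num_bindings):
            if engine.binding_is_input(i):
                ctx.set_binding_shape(i, batch.shape)
        for i in range(engine.num_bindings):
            dtype = trt.nptype(engine.get_binding_dtype(i))
            if engine.binding_is_input(i):
                host = np.ascontiguousarray(batch.astype(dtype))
            else:
                host = np.empty(tuple(ctx.get_binding_shape(i)), dtype=dtype)
            dev = cuda.mem_alloc(host.nbytes)
            bindings.append(int(dev))
            if engine.binding_is_input(i):
                cuda.memcpy_htod(dev, host)
            else:
                outputs[engine.get_binding_name(i)] = (host, dev)
        ctx.execute_v2(bindings)
        for host, dev in outputs.values():
            cuda.memcpy_dtoh(host, dev)
        return {name: host for name, (host, dev) in outputs.items()}


def greedy_decode(indices, blank):
    # Collapse repeated indices and drop the blank symbol (greedy CTC-style decode)
    text, prev = [], -1
    for idx in indices:
        if idx != prev and idx != blank:
            text.append(CHARS[idx])
        prev = idx
    return "".join(text)


if __name__ == "__main__":
    engine = load_engine(ENGINE_PATH)
    out = infer(engine, preprocess("plate.png"))
    seq = out["tf_op_layer_ArgMax"].reshape(-1).astype(int)
    print(greedy_decode(seq, blank=len(CHARS)))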

OK, I started deepstream_lpr_app and it works fine. But I want to use it from Python.
Everything works as long as I use the standard model to recognize US license plates.
But if I switch to my custom model, the result is “MCM4M9M6MAMJM5M8” instead of “C496AT58”.
I only changed the LPR module to my custom one, nothing else.
My script:

import sys

sys.path.append('/opt/nvidia/deepstream/deepstream-6.1/sources/deepstream_python_apps/apps/')
import gi
import configparser

gi.require_version('Gst', '1.0')
from gi.repository import GLib, Gst
import sys
import math
from common.is_aarch_64 import is_aarch64
from common.bus_call import bus_call
from common.FPS import PERF_DATA

import pyds

fps_streams = {}

OSD_PROCESS_MODE = 0
OSD_DISPLAY_TEXT = 1
TILED_OUTPUT_WIDTH = 1920
TILED_OUTPUT_HEIGHT = 1080
perf_data = None
pgie_classes_str = ["lpd"]


def osd_sink_pad_buffer_probe(pad, info, u_data):
    frame_number = 0
    num_rects = 0
    gst_buffer = info.get_buffer()
    if not gst_buffer:
        print("Unable to get GstBuffer ")
        return

    lp_dict = {}
    # Retrieve batch metadata from the gst_buffer
    # Note that pyds.gst_buffer_get_nvds_batch_meta() expects the
    # C address of gst_buffer as input, which is obtained with hash(gst_buffer)
    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        try:
            # Note that l_frame.data needs a cast to pyds.NvDsFrameMeta
            # The casting is done by pyds.NvDsFrameMeta.cast()
            # The casting also keeps ownership of the underlying memory
            # in the C code, so the Python garbage collector will leave
            # it alone.
            frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        except StopIteration:
            break

        '''
        print("Frame Number is ", frame_meta.frame_num)
        print("Source id is ", frame_meta.source_id)
        print("Batch id is ", frame_meta.batch_id)
        print("Source Frame Width ", frame_meta.source_frame_width)
        print("Source Frame Height ", frame_meta.source_frame_height)
        print("Num object meta ", frame_meta.num_obj_meta)
        '''
        frame_number = frame_meta.frame_num
        l_obj = frame_meta.obj_meta_list
        num_rects = frame_meta.num_obj_meta

        while l_obj is not None:
            try:
                # Casting l_obj.data to pyds.NvDsObjectMeta
                obj_meta = pyds.NvDsObjectMeta.cast(l_obj.data)
            except StopIteration:
                break

            # no ROI
            l_class = obj_meta.classifier_meta_list

            while l_class is not None:
                try:
                    class_meta = pyds.NvDsClassifierMeta.cast(l_class.data)
                except StopIteration:
                    break

                l_label = class_meta.label_info_list

                while l_label is not None:
                    try:
                        label_info = pyds.NvDsLabelInfo.cast(l_label.data)
                    except StopIteration:
                        break

                    print(label_info.result_label)

                    try:
                        l_label = l_label.next
                    except StopIteration:
                        break
                try:
                    l_class = l_class.next
                except StopIteration:
                    break

            try:
                l_obj = l_obj.next
            except StopIteration:
                break

        # Get meta data from NvDsAnalyticsFrameMeta
        l_user = frame_meta.frame_user_meta_list
        while l_user:
            try:
                user_meta = pyds.NvDsUserMeta.cast(l_user.data)
                if user_meta.base_meta.meta_type == pyds.nvds_get_user_meta_type("NVIDIA.DSANALYTICSFRAME.USER_META"):
                    user_meta_data = pyds.NvDsAnalyticsFrameMeta.cast(user_meta.user_meta_data)
                    # if user_meta_data.objInROIcnt: print("Objs in ROI: {0}".format(user_meta_data.objInROIcnt))
                    if user_meta_data.objLCCumCnt: print(
                        "Linecrossing Cumulative: {0}".format(user_meta_data.objLCCumCnt))
                    # if user_meta_data.objLCCurrCnt: print("Linecrossing Current Frame: {0}".format(user_meta_data.objLCCurrCnt))
                    # if user_meta_data.ocStatus: print("Overcrowding status: {0}".format(user_meta_data.ocStatus))
            except StopIteration:
                break
            try:
                l_user = l_user.next
            except StopIteration:
                break

        # Get frame rate through this probe
        try:
            l_frame = l_frame.next
        except StopIteration:
            break

    return Gst.PadProbeReturn.OK


def cb_newpad(decodebin, decoder_src_pad, data):
    print("In cb_newpad\n")
    caps = decoder_src_pad.get_current_caps()
    gststruct = caps.get_structure(0)
    gstname = gststruct.get_name()
    source_bin = data
    features = caps.get_features(0)

    # Need to check if the pad created by the decodebin is for video and not
    # audio.
    print("gstname=", gstname)
    if (gstname.find("video") != -1):
        # Link the decodebin pad only if decodebin has picked nvidia
        # decoder plugin nvdec_*. We do this by checking if the pad caps contain
        # NVMM memory features.
        print("features=", features)
        if features.contains("memory:NVMM"):
            # Get the source bin ghost pad
            bin_ghost_pad = source_bin.get_static_pad("src")
            if not bin_ghost_pad.set_target(decoder_src_pad):
                sys.stderr.write("Failed to link decoder src pad to source bin ghost pad\n")
        else:
            sys.stderr.write(" Error: Decodebin did not pick nvidia decoder plugin.\n")


def decodebin_child_added(child_proxy, Object, name, user_data):
    print("Decodebin child added:", name, "\n")
    if (name.find("decodebin") != -1):
        Object.connect("child-added", decodebin_child_added, user_data)
    if name.find("nvv4l2decoder") != -1:
        if is_aarch64():
            print("Seting bufapi_version\n")
            Object.set_property("bufapi-version", True)


def create_source_bin(index, uri):
    print("Creating source bin")

    # Create a source GstBin to abstract this bin's content from the rest of the
    # pipeline
    bin_name = "source-bin-%02d" % index
    print(bin_name)
    nbin = Gst.Bin.new(bin_name)
    if not nbin:
        sys.stderr.write(" Unable to create source bin \n")

    # Source element for reading from the uri.
    # We will use decodebin and let it figure out the container format of the
    # stream and the codec and plug the appropriate demux and decode plugins.
    uri_decode_bin = Gst.ElementFactory.make("uridecodebin", "uri-decode-bin")
    if not uri_decode_bin:
        sys.stderr.write(" Unable to create uri decode bin \n")
    # We set the input uri to the source element
    uri_decode_bin.set_property("uri", uri)
    # Connect to the "pad-added" signal of the decodebin which generates a
    # callback once a new pad for raw data has been created by the decodebin
    uri_decode_bin.connect("pad-added", cb_newpad, nbin)
    uri_decode_bin.connect("child-added", decodebin_child_added, nbin)

    # We need to create a ghost pad for the source bin which will act as a proxy
    # for the video decoder src pad. The ghost pad will not have a target right
    # now. Once the decode bin creates the video decoder and generates the
    # cb_newpad callback, we will set the ghost pad target to the video decoder
    # src pad.
    Gst.Bin.add(nbin, uri_decode_bin)
    bin_pad = nbin.add_pad(Gst.GhostPad.new_no_target("src", Gst.PadDirection.SRC))
    if not bin_pad:
        sys.stderr.write(" Failed to add ghost pad in source bin \n")
        return None
    return nbin


def main(args):
    global perf_data
    perf_data = PERF_DATA(len(args))
    # Check input arguments
    if len(args) < 2:
        sys.stderr.write("usage: %s <uri1> [uri2] ... [uriN]\n" % args[0])
        sys.exit(1)

    number_sources = len(args) - 1

    # Standard GStreamer initialization
    Gst.init(None)

    # Create gstreamer elements */
    # Create Pipeline element that will form a connection of other elements
    print("Creating Pipeline \n ")
    pipeline = Gst.Pipeline()
    is_live = False

    if not pipeline:
        sys.stderr.write(" Unable to create Pipeline \n")
    print("Creating streamux \n ")

    # Create nvstreammux instance to form batches from one or more sources.
    streammux = Gst.ElementFactory.make("nvstreammux", "Stream-muxer")
    if not streammux:
        sys.stderr.write(" Unable to create NvStreamMux \n")

    pipeline.add(streammux)
    for i in range(number_sources):
        print("Creating source_bin ", i, " \n ")
        uri_name = args[i + 1]
        if uri_name.find("rtsp://") == 0:
            is_live = True
        source_bin = create_source_bin(i, uri_name)
        if not source_bin:
            sys.stderr.write("Unable to create source bin \n")
        pipeline.add(source_bin)
        padname = "sink_%u" % i
        sinkpad = streammux.get_request_pad(padname)
        if not sinkpad:
            sys.stderr.write("Unable to create sink pad bin \n")
        srcpad = source_bin.get_static_pad("src")
        if not srcpad:
            sys.stderr.write("Unable to create src pad bin \n")
        srcpad.link(sinkpad)
    queue1 = Gst.ElementFactory.make("queue", "queue1")
    queue2 = Gst.ElementFactory.make("queue", "queue2")
    queue3 = Gst.ElementFactory.make("queue", "queue3")
    queue4 = Gst.ElementFactory.make("queue", "queue4")
    queue5 = Gst.ElementFactory.make("queue", "queue5")
    queue6 = Gst.ElementFactory.make("queue", "queue6")
    queue7 = Gst.ElementFactory.make("queue", "queue7")
    queue8 = Gst.ElementFactory.make("queue", "queue8")
    queue9 = Gst.ElementFactory.make("queue", "queue9")
    pipeline.add(queue1)
    pipeline.add(queue2)
    pipeline.add(queue3)
    pipeline.add(queue4)
    pipeline.add(queue5)
    pipeline.add(queue6)
    pipeline.add(queue7)
    pipeline.add(queue8)
    pipeline.add(queue9)

    print("Creating Pgie \n ")
    pgie = Gst.ElementFactory.make("nvinfer", "primary-inference")
    if not pgie:
        sys.stderr.write(" Unable to create pgie \n")
    print("Creating tiler \n ")

    sgie1 = Gst.ElementFactory.make("nvinfer", "secondary1-nvinference-engine")
    if not sgie1:
        sys.stderr.write(" Unable to create sgie1 \n")

    sgie2 = Gst.ElementFactory.make("nvinfer", "secondary2-nvinference-engine")
    if not sgie2:
        sys.stderr.write(" Unable to make sgie2 \n")

    print("Creating nvtracker \n ")
    tracker = Gst.ElementFactory.make("nvtracker", "tracker")
    if not tracker:
        sys.stderr.write(" Unable to create tracker \n")

    print("Creating tiler \n ")
    tiler = Gst.ElementFactory.make("nvmultistreamtiler", "nvtiler")
    if not tiler:
        sys.stderr.write(" Unable to create tiler \n")

    print("Creating nvvidconv \n ")
    nvvidconv = Gst.ElementFactory.make("nvvideoconvert", "convertor")
    if not nvvidconv:
        sys.stderr.write(" Unable to create nvvidconv \n")

    print("Creating nvosd \n ")
    nvosd = Gst.ElementFactory.make("nvdsosd", "onscreendisplay")
    if not nvosd:
        sys.stderr.write(" Unable to create nvosd \n")
    # nvosd.set_property('process-mode', OSD_PROCESS_MODE)
    # nvosd.set_property('display-text', OSD_DISPLAY_TEXT)

    if (is_aarch64()):
        print("Creating transform \n ")
        transform = Gst.ElementFactory.make("nvegltransform", "nvegl-transform")
        if not transform:
            sys.stderr.write(" Unable to create transform \n")
    print("Creating EGLSink \n")
    sink = Gst.ElementFactory.make("nveglglessink", "nvvideo-renderer")
    sink.set_property('sync', 0)
    sink.set_property('async', 1)
    if not sink:
        sys.stderr.write(" Unable to create egl sink \n")

    if is_live:
        print("Atleast one of the sources is live")
        streammux.set_property('live-source', 1)

    streammux.set_property('live-source', 1)
    streammux.set_property('width', TILED_OUTPUT_WIDTH)
    streammux.set_property('height', TILED_OUTPUT_HEIGHT)
    streammux.set_property('batch-size', number_sources)
    streammux.set_property('batched-push-timeout', 4000000)

    pgie.set_property('config-file-path', "trafficamnet_config.txt")
    pgie_batch_size = pgie.get_property("batch-size")
    if (pgie_batch_size != number_sources):
        print("WARNING: Overriding infer-config batch-size", pgie_batch_size, " with number of sources ",
              number_sources, " \n")
        pgie.set_property("batch-size", number_sources)

    config = configparser.ConfigParser()
    config.read('./config/tracker_config.txt')
    config.sections()

    for key in config['tracker']:
        if key == 'tracker-width':
            tracker_width = config.getint('tracker', key)
            tracker.set_property('tracker-width', tracker_width)
        if key == 'tracker-height':
            tracker_height = config.getint('tracker', key)
            tracker.set_property('tracker-height', tracker_height)
        if key == 'gpu-id':
            tracker_gpu_id = config.getint('tracker', key)
            tracker.set_property('gpu_id', tracker_gpu_id)
        if key == 'll-lib-file':
            tracker_ll_lib_file = config.get('tracker', key)
            tracker.set_property('ll-lib-file', tracker_ll_lib_file)
        if key == 'll-config-file':
            tracker_ll_config_file = config.get('tracker', key)
            tracker.set_property('ll-config-file', tracker_ll_config_file)
        if key == 'enable-batch-process':
            tracker_enable_batch_process = config.getint('tracker', key)
            tracker.set_property('enable_batch_process', tracker_enable_batch_process)
        if key == 'enable-past-frame':
            tracker_enable_past_frame = config.getint('tracker', key)
            tracker.set_property('enable_past_frame', tracker_enable_past_frame)

    sgie1.set_property('config-file-path', "lpd_us_config.txt")
    sgie1.set_property('process-mode', 2)

    sgie2.set_property('config-file-path', "lpr_config_sgie_us.txt")
    sgie2.set_property('process-mode', 2)

    tiler_rows = int(math.sqrt(number_sources))
    tiler_columns = int(math.ceil((1.0 * number_sources) / tiler_rows))
    tiler.set_property("rows", tiler_rows)
    tiler.set_property("columns", tiler_columns)
    tiler.set_property("width", TILED_OUTPUT_WIDTH)
    tiler.set_property("height", TILED_OUTPUT_HEIGHT)
    sink.set_property("qos", 0)
    sink.set_property("sync", 0)

    print("Adding elements to Pipeline \n")

    pipeline.add(pgie)
    pipeline.add(tracker)
    pipeline.add(sgie1)
    pipeline.add(sgie2)
    pipeline.add(tiler)
    pipeline.add(nvvidconv)
    pipeline.add(nvosd)
    if is_aarch64():
        pipeline.add(transform)
    pipeline.add(sink)

    print("Linking elements in the Pipeline \n")
    streammux.link(queue1)
    queue1.link(pgie)
    pgie.link(queue2)
    queue2.link(tracker)
    tracker.link(queue3)
    queue3.link(sgie1)
    sgie1.link(queue4)
    queue4.link(sgie2)
    sgie2.link(queue9)
    queue9.link(tiler)
    tiler.link(queue5)
    queue5.link(nvvidconv)
    nvvidconv.link(queue6)
    queue6.link(nvosd)
    if is_aarch64():
        nvosd.link(queue7)
        queue7.link(transform)
        transform.link(sink)
    else:
        nvosd.link(queue7)
        queue7.link(sink)

    # create an event loop and feed gstreamer bus mesages to it
    loop = GLib.MainLoop()
    bus = pipeline.get_bus()
    bus.add_signal_watch()
    bus.connect("message", bus_call, loop)
    GLib.timeout_add(5000, perf_data.perf_print_callback)

    # List the sources
    print("Now playing...")
    for i, source in enumerate(args):
        if i != 0:
            print(i, ": ", source)

    # Lets add probe to get informed of the meta data generated, we add probe to
    # the sink pad of the osd element, since by that time, the buffer would have
    # had got all the metadata.
    osdsinkpad = nvosd.get_static_pad("sink")
    if not osdsinkpad:
        sys.stderr.write(" Unable to get sink pad of nvosd \n")
    osdsinkpad.add_probe(Gst.PadProbeType.BUFFER, osd_sink_pad_buffer_probe, 0)

    print("Starting pipeline \n")
    # start playback and listen to events
    pipeline.set_state(Gst.State.PLAYING)
    try:
        loop.run()
    except:
        pass
    # cleanup
    print("Exiting app\n")
    pipeline.set_state(Gst.State.NULL)


if __name__ == '__main__':
    sys.exit(main(sys.argv))

LPR config:

[property]
gpu-id=0
model-engine-file=/home/codeinside/test_num/LPR/lprnet_epoch-15.etlt_b16_gpu0_fp16.engine
labelfile-path=/home/codeinside/test_num/LPR/labels_ru.txt
tlt-encoded-model=/home/codeinside/test_num/LPR/lprnet_epoch-15.etlt
tlt-model-key=nvidia_tlt
batch-size=16
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=3
gie-unique-id=2
output-blob-names=output_bbox/BiasAdd;output_cov/Sigmoid
#0=Detection 1=Classifier 2=Segmentation
network-type=1
parse-classifier-func-name=NvDsInferParseCustomNVPlate
custom-lib-path=/home/codeinside/deepstream_lpr_app_/nvinfer_custom_lpr_parser/libnvdsinfer_custom_impl_lpr.so
process-mode=2
operate-on-gie-id=2
operate-on-class-ids=0
net-scale-factor=0.00392156862745098
#net-scale-factor=1.0
#0=RGB 1=BGR 2=GRAY
model-color-format=0

[class-attrs-all]
threshold=0.5

To narrow down, could you please try to use one of the 3 ways I mentioned above to run inference and check the result?

I can use my custom model with GitHub - NVIDIA-AI-IOT/deepstream_lpr_app: Sample app code for LPR deployment on DeepStream (way 2).
I tried to start the Triton server to check way 1 and got the following error:

-------------- The current device memory allocations dump as below --------------
[0]:38735824 :GpuGlob deserialization in load: at runtime/deserialization/safeDeserialize.cpp: 361 idx: 24 time: 0.000434705
[0x7f73c6800000]:5348352 :GPU per-runner memory in ExecutionContext: at runtime/api/executionContext.cpp: 163 idx: 22 time: 8.7112e-05
[0x7f73ce000000]:344030208 :GPU context memory in ExecutionContext: at runtime/api/executionContext.cpp: 214 idx: 20 time: 0.000344276
[0x7f73e4000000]:28546560 :GPU per-runner memory in ExecutionContext: at runtime/api/executionContext.cpp: 163 idx: 19 time: 0.000120516
[0x7f73e8000000]:877363712 :GPU context memory in ExecutionContext: at runtime/api/executionContext.cpp: 214 idx: 17 time: 0.000743221
[0x7f741e800000]:7108096 :GPU per-runner memory in ExecutionContext: at runtime/api/executionContext.cpp: 163 idx: 16 time: 7.863e-05
[0x7f7426000000]:120637440 :GPU context memory in ExecutionContext: at runtime/api/executionContext.cpp: 214 idx: 14 time: 0.000134599
[0x7f75ea000000]:161578256 :GpuGlob deserialization in load: at runtime/deserialization/safeDeserialize.cpp: 361 idx: 0 time: 0.00025622
[0x7f73c6e00000]:10135552 :GPU context memory in ExecutionContext: at runtime/api/executionContext.cpp: 214 idx: 23 time: 7.9746e-05
[0x7f7430000000]:60285440 :GPU context memory in ExecutionContext: at runtime/api/executionContext.cpp: 214 idx: 11 time: 0.000124286
[0x7f7482000000]:1935926272 :GPU context memory in ExecutionContext: at runtime/api/executionContext.cpp: 214 idx: 2 time: 0.00169543
[0x7f7603000000]:5616236 :GpuGlob deserialization in load: at runtime/deserialization/safeDeserialize.cpp: 361 idx: 3 time: 0.000134975
[0x7f7480000000]:5594112 :GPU per-runner memory in ExecutionContext: at runtime/api/executionContext.cpp: 163 idx: 4 time: 9.8495e-05
[0x7f73e6000000]:29591520 :GpuGlob deserialization in load: at runtime/deserialization/safeDeserialize.cpp: 361 idx: 18 time: 0.000156094
[0x7f742e800000]:3928576 :GPU per-runner memory in ExecutionContext: at runtime/api/executionContext.cpp: 163 idx: 13 time: 6.0643e-05
[0x7f746fc00000]:3686912 :GPU context memory in ExecutionContext: at runtime/api/executionContext.cpp: 214 idx: 8 time: 7.6502e-05
[0x7f7435400000]:3938000 :GpuGlob deserialization in load: at runtime/deserialization/safeDeserialize.cpp: 361 idx: 12 time: 0.000105956
[0x7f7478000000]:125337600 :GPU context memory in ExecutionContext: at runtime/api/executionContext.cpp: 214 idx: 5 time: 0.00017115
[0x7f7604000000]:133241344 :GPU per-runner memory in ExecutionContext: at runtime/api/executionContext.cpp: 163 idx: 1 time: 0.000212526
[0x7f7434000000]:20427264 :GPU per-runner memory in ExecutionContext: at runtime/api/executionContext.cpp: 163 idx: 10 time: 0.000102739
[0x7f746e000000]:28244992 :GPU per-runner memory in ExecutionContext: at runtime/api/executionContext.cpp: 163 idx: 7 time: 0.000110324
[0x7f73e2a00000]:5369048 :GpuGlob deserialization in load: at runtime/deserialization/safeDeserialize.cpp: 361 idx: 21 time: 0.000132726
[0x7f742ec00000]:7263040 :GpuGlob deserialization in load: at runtime/deserialization/safeDeserialize.cpp: 361 idx: 15 time: 0.000135892
[0x7f7470000000]:28249932 :GpuGlob deserialization in load: at runtime/deserialization/safeDeserialize.cpp: 361 idx: 6 time: 0.000170561
[0x7f7436000000]:54207744 :GpuGlob deserialization in load: at runtime/deserialization/safeDeserialize.cpp: 361 idx: 9 time: 0.000200663
W0627 16:06:38.789688 104 logging.cc:46] Requested amount of GPU memory (38735824 bytes) could not be allocated. There may not be enough free memory for allocation to succeed.
E0627 16:06:38.789749 104 logging.cc:43] 2: [safeDeserialize.cpp::load::361] Error Code 2: OutOfMemory (no further information)
E0627 16:06:38.789767 104 logging.cc:43] 4: [runtime.cpp::deserializeCudaEngine::50] Error Code 4: Internal Error (Engine deserialization failed.)
I0627 16:06:38.792774 104 tensorrt.cc:5343] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0627 16:06:38.792800 104 tensorrt.cc:5282] TRITONBACKEND_ModelFinalize: delete model state
E0627 16:06:38.792827 104 model_repository_manager.cc:1152] failed to load 'yolov3_tao' version 1: Internal: unable to create TensorRT engine
I0627 16:06:38.792876 104 server.cc:522] 
+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0627 16:06:38.792923 104 server.cc:549] 
+-------------+-------------------------------------------------------------------------+--------+
| Backend     | Path                                                                    | Config |
+-------------+-------------------------------------------------------------------------+--------+
| pytorch     | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so                 | {}     |
| tensorflow  | /opt/tritonserver/backends/tensorflow1/libtriton_tensorflow1.so         | {}     |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so         | {}     |
| openvino    | /opt/tritonserver/backends/openvino_2021_2/libtriton_openvino_2021_2.so | {}     |
| tensorrt    | /opt/tritonserver/backends/tensorrt/libtriton_tensorrt.so               | {}     |
+-------------+-------------------------------------------------------------------------+--------+

I0627 16:06:38.792992 104 server.cc:592] 
+------------------------------+---------+---------------------------------------------------------+
| Model                        | Version | Status                                                  |
+------------------------------+---------+---------------------------------------------------------+
| dashcamnet_tao               | 1       | READY                                                   |
| lprnet_tao                   | 1       | READY                                                   |
| multitask_classification_tao | 1       | READY                                                   |
| peoplenet_tao                | 1       | READY                                                   |
| peoplesegnet_tao             | 1       | READY                                                   |
| pose_classification_tao      | 1       | READY                                                   |
| retinanet_tao                | 1       | READY                                                   |
| vehicletypenet_tao           | 1       | READY                                                   |
| yolov3_tao                   | 1       | UNAVAILABLE: Internal: unable to create TensorRT engine |
+------------------------------+---------+---------------------------------------------------------+

I0627 16:06:38.804797 104 metrics.cc:623] Collecting metrics for GPU 0: NVIDIA GeForce GTX 1660 Ti
I0627 16:06:38.805013 104 tritonserver.cc:1932] 
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                               |
| server_version                   | 2.19.0                                                                                                                                                               |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tens |
|                                  | or_data statistics trace                                                                                                                                             |
| model_repository_path[0]         | /model_repository                                                                                                                                                    |
| model_control_mode               | MODE_NONE                                                                                                                                                            |
| strict_model_config              | 1                                                                                                                                                                    |
| rate_limit                       | OFF                                                                                                                                                                  |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                            |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                                                                             |
| response_cache_byte_size         | 0                                                                                                                                                                    |
| min_supported_compute_capability | 6.0                                                                                                                                                                  |
| strict_readiness                 | 1                                                                                                                                                                    |
| exit_timeout                     | 30                                                                                                                                                                   |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+

I0627 16:06:38.805032 104 server.cc:252] Waiting for in-flight requests to complete.
I0627 16:06:38.805040 104 model_repository_manager.cc:1026] unloading: retinanet_tao:1
I0627 16:06:38.805074 104 model_repository_manager.cc:1026] unloading: pose_classification_tao:1
I0627 16:06:38.805110 104 model_repository_manager.cc:1026] unloading: peoplesegnet_tao:1
I0627 16:06:38.805156 104 model_repository_manager.cc:1026] unloading: vehicletypenet_tao:1
I0627 16:06:38.805190 104 tensorrt.cc:5343] TRITONBACKEND_ModelInstanceFinalize: delete instance stateI0627 16:06:38.805200 104 model_repository_manager.cc:1026] unloading: peoplenet_tao:1

I0627 16:06:38.805235 104 model_repository_manager.cc:1026] unloading: multitask_classification_tao:1
I0627 16:06:38.805245 104 tensorrt.cc:5343] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0627 16:06:38.805267 104 model_repository_manager.cc:1026] unloading: lprnet_tao:1
I0627 16:06:38.805278 104 tensorrt.cc:5343] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0627 16:06:38.805292 104 model_repository_manager.cc:1026] unloading: dashcamnet_tao:1
I0627 16:06:38.805295 104 tensorrt.cc:5343] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0627 16:06:38.805310 104 tensorrt.cc:5343] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0627 16:06:38.805324 104 server.cc:267] Timeout 30: Found 8 live models and 0 in-flight non-inference requests
I0627 16:06:38.805311 104 tensorrt.cc:5343] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0627 16:06:38.805346 104 tensorrt.cc:5343] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0627 16:06:38.805397 104 tensorrt.cc:5343] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0627 16:06:38.811185 104 tensorrt.cc:5282] TRITONBACKEND_ModelFinalize: delete model state
I0627 16:06:38.811464 104 model_repository_manager.cc:1132] successfully unloaded 'vehicletypenet_tao' version 1
I0627 16:06:38.812239 104 tensorrt.cc:5282] TRITONBACKEND_ModelFinalize: delete model state
I0627 16:06:38.813655 104 model_repository_manager.cc:1132] successfully unloaded 'multitask_classification_tao' version 1
I0627 16:06:38.813879 104 tensorrt.cc:5282] TRITONBACKEND_ModelFinalize: delete model state
I0627 16:06:38.814230 104 tensorrt.cc:5282] TRITONBACKEND_ModelFinalize: delete model state
I0627 16:06:38.814649 104 model_repository_manager.cc:1132] successfully unloaded 'dashcamnet_tao' version 1
I0627 16:06:38.814711 104 tensorrt.cc:5282] TRITONBACKEND_ModelFinalize: delete model state
I0627 16:06:38.815175 104 model_repository_manager.cc:1132] successfully unloaded 'peoplenet_tao' version 1
I0627 16:06:38.815916 104 tensorrt.cc:5282] TRITONBACKEND_ModelFinalize: delete model state
I0627 16:06:38.816231 104 model_repository_manager.cc:1132] successfully unloaded 'retinanet_tao' version 1
I0627 16:06:38.818954 104 model_repository_manager.cc:1132] successfully unloaded 'peoplesegnet_tao' version 1
I0627 16:06:38.819458 104 tensorrt.cc:5282] TRITONBACKEND_ModelFinalize: delete model state
I0627 16:06:38.819510 104 model_repository_manager.cc:1132] successfully unloaded 'lprnet_tao' version 1
I0627 16:06:38.824714 104 tensorrt.cc:5282] TRITONBACKEND_ModelFinalize: delete model state
I0627 16:06:38.825259 104 model_repository_manager.cc:1132] successfully unloaded 'pose_classification_tao' version 1
I0627 16:06:39.805505 104 server.cc:267] Timeout 29: Found 0 live models and 0 in-flight non-inference requests
error: creating server: Internal - failed to load all models
W0627 16:06:39.807078 104 metrics.cc:401] Unable to get power limit for GPU 0. Status:Success, value:0.000000
W0627 16:06:40.807679 104 metrics.cc:401] Unable to get power limit for GPU 0. Status:Success, value:0.000000

It is due to running out of GPU memory on your machine.

Since you are only running lprnet, you can use the steps below to start the server with only that model.

Steps:
$  mv model_repository/ model_repository_bak
$  mkdir model_repository/
$  cp -r model_repository_bak/lprnet_tao   model_repository/lprnet_tao


$  mv scripts/download_and_convert.sh  scripts/download_and_convert.sh.bak
$  vim scripts/download_and_convert.sh   
#!/bin/bash

# Generate an LPRnet model.
echo "Converting the LPRNet model"
mkdir -p /model_repository/lprnet_tao/1
tao-converter /tao_models/lprnet_model/us_lprnet_baseline18_deployable.etlt \
              -k nvidia_tlt \
              -p image_input,1x3x48x96,4x3x48x96,16x3x48x96 \
              -t fp16 \
              -e /model_repository/lprnet_tao/1/model.plan

/opt/tritonserver/bin/tritonserver --model-store /model_repository


$ bash scripts/start_server.sh
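
Once the server comes up with only lprnet_tao loaded, you can also query it directly with the Triton Python HTTP client instead of tao_client.py. A minimal sketch, assuming the image_input (3x48x96, FP32) input and tf_op_layer_ArgMax output that this model exposes; the character set and image path are placeholders:

# Hedged minimal Triton HTTP client for the lprnet_tao model.
import cv2
import numpy as np
import tritonclient.http as httpclient

CHARS = list("0123456789ABCEHKMOPTXY")  # hypothetical character set
BLANK = len(CHARS)                      # blank assumed to be the last class


def preprocess(image_path):
    # 1x3x48x96, RGB, scaled to [0, 1]
    img = cv2.resize(cv2.imread(image_path), (96, 48))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
    return img.transpose(2, 0, 1)[None]


client = httpclient.InferenceServerClient(url="localhost:8000")
batch = preprocess("A001BO92.png")  # placeholder image

inp = httpclient.InferInput("image_input", list(batch.shape), "FP32")
inp.set_data_from_numpy(batch)
result = client.infer(model_name="lprnet_tao", inputs=[inp],
                      outputs=[httpclient.InferRequestedOutput("tf_op_layer_ArgMax")])
indices = result.as_numpy("tf_op_layer_ArgMax").reshape(-1).astype(int)

# Greedy CTC-style decode: collapse repeats, drop blanks
plate, prev = [], -1
for idx in indices:
    if idx != prev and idx != BLANK:
        plate.append(CHARS[idx])
    prev = idx
print("".join(plate))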

It works:

codeinside@CI-1442:~/tao-toolkit-triton-apps/tao_triton/python/entrypoints$ python3 tao_client.py /home/codeinside/tao-experiments/data/test/ -m lprnet_tao -x 1 -b 1 --mode LPRNet -i https -u localhost:8000 --async --output_path /home/codeinside/tao-experiments/data/test/
16 image_input ['tf_op_layer_ArgMax', 'tf_op_layer_Max'] 3 48 96 2 FP32
2022-06-28 11:03:33,155 [INFO] __main__: Sending inference request for batches of data
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 63.48it/s]
2022-06-28 11:03:33,172 [INFO] __main__: Gathering responses from the server and post processing the inferenced outputs.
  0%|                                                                                                                                                                                | 0/1 [00:00<?, ?it/s]/home/codeinside/tao-experiments/data/test/A001BO92.png
inference result: ['A001BO92']

100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 3705.22it/s]
2022-06-28 11:03:33,172 [INFO] __main__: PASS

I solved my problem.
Here is my current config file:

[property]
gpu-id=0
labelfile-path=./labels.txt
tlt-encoded-model=./lprnet_epoch-15.etlt
model-engine-file=./lprnet_epoch-15.etlt_b16_gpu0_fp16.engine
tlt-model-key=nvidia_tlt
batch-size=16
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
#num-detected-classes=3
gie-unique-id=3
output-blob-names=output_bbox/BiasAdd;output_cov/Sigmoid
#0=Detection 1=Classifier 2=Segmentation
network-type=1
parse-classifier-func-name=NvDsInferParseCustomNVPlate
custom-lib-path=/home/codeinside/deepstream_lpr_app_/nvinfer_custom_lpr_parser/libnvdsinfer_custom_impl_lpr.so
process-mode=2
operate-on-gie-id=2
#operate-on-class-ids=0
net-scale-factor=0.00392156862745098
#net-scale-factor=1.0
#0=RGB 1=BGR 2=GRAY
model-color-format=0

[class-attrs-all]
threshold=0.5

I had a problem with the labels. The custom LPR parser picked up extra labels and ignored my custom labels.txt.
I solved it by adding a dict.txt (with my custom labels), and it started working correctly.
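
For anyone hitting the same issue: the custom parser library from deepstream_lpr_app (nvinfer_custom_lpr_parser) reads its character set from a file named dict.txt (in the sample code, a relative path resolved against the directory the app is started from), not from the labelfile-path in the nvinfer config, so the two can silently disagree. A small helper sketch (the source path is hypothetical) that regenerates dict.txt from the same characters list used for TAO training keeps them in sync:

# Hypothetical helper: regenerate dict.txt for the DeepStream LPR parser from the
# characters list used for TAO training, so the two files can never drift apart.
SRC = "/workspace/tao-experiments/lprnet/specs/ru_lp_characters.txt"  # assumed path
DST = "dict.txt"  # the sample parser opens this name from the working directory

with open(SRC) as f:
    chars = [line.strip() for line in f if line.strip()]

with open(DST, "w") as f:
    f.write("\n".join(chars) + "\n")

print(f"Wrote {len(chars)} characters to {DST}")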
