Deepstream secondary classification error, all results point to the same wrong category!

tms2003 · December 30, 2022, 7:09am

Please provide complete information as applicable to your setup.

• Hardware Platform (Jetson / GPU)
1080 Ti
• DeepStream Version
Deepstream 6.0
• JetPack Version (valid for Jetson only)
• TensorRT Version
TensorRT 8.0.3.4
• NVIDIA GPU Driver Version (valid for GPU only)
470.161.03
• Issue Type( questions, new requirements, bugs)

questions

• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing)
I want to experiment with a custom secondary classification model in DeepStream. The primary network is the default person/vehicle detection network; after detection, a classification network performs secondary classification on the detected objects. Starting from deepstream-test2-app.c, I replaced the model in the original dstest2_sg1_config.txt with my own ResNet-50 model (the ImageNet-1k classifier taken from torchvision).

However, when I run the program, every object is assigned the same category. They all point to "spotlight, spot", whose idx is 819. This result is wrong. Where is the problem, and how can I get the correct result? I list all the steps below. Please help, thanks!

• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description)

Step 1: I exported an ONNX file with softmax:

import torch
import torchvision.models as models
import cv2
import numpy as np

class ResNet50_wSoftmax(torch.nn.Module):
    # Merge the softmax post-processing into the model and export both together as ONNX.
    # I also tested without softmax, and the result was still wrong.
    def __init__(self):
        super().__init__()
        self.base_model = models.resnet50(pretrained=True)
        self.softmax = torch.nn.Softmax(dim=1)

    def forward(self, x):
        y = self.base_model(x)
        prob = self.softmax(y)
        return prob

def preprocessing(img):
    # pre image
    IMAGENET_MEAN = [0.485, 0.456, 0.406]
    IMAGENET_STD = [0.229, 0.224, 0.225]
    img = img[:, :, ::-1]
    img = cv2.resize(img, (224, 224))
    img = img / 255.0
    img = (img - IMAGENET_MEAN) / IMAGENET_STD
    img = img.transpose(2, 0, 1).astype(np.float32)
    tensor_img = torch.from_numpy(img)[None]
    return tensor_img

if __name__ == '__main__':
   
    #132,American egret, great white heron, Egretta albus
    image_path = '/home/incar/data/bird5_cutOut/0_bailu/2021-03-25-12-36-10_6.jpg'
    img = cv2.imread(image_path)
    tensor_img = preprocessing(img)
    model = ResNet50_wSoftmax()   
    model.eval()
    pred = model(tensor_img)[0]
    max_idx = torch.argmax(pred)
    print(f"test_image: {image_path}, max_idx: {max_idx}, max_logit: {pred[max_idx].item()}")

    dummy_input = torch.zeros(1, 3, 224, 224)
    torch.onnx.export(
            model, dummy_input, 'resnet50_wSoftmax.onnx',
            input_names=['input'],
            # 'ouput' (sic): this exact name appears as the output binding in the trtexec log
            output_names=['ouput'],
            opset_version=11,
    )

Step 2: I converted the ONNX file to a TensorRT engine with trtexec:
trtexec --onnx=resnet50_wSoftmax.onnx --saveEngine=resnet50_wSoftmax.engine --explicitBatch

[12/30/2022-14:23:48] [I] === Model Options ===
[12/30/2022-14:23:48] [I] Format: ONNX
[12/30/2022-14:23:48] [I] Model: resnet50_wSoftmax.onnx
[12/30/2022-14:23:48] [I] Output:
[12/30/2022-14:23:48] [I] === Build Options ===
[12/30/2022-14:23:48] [I] Max batch: explicit
[12/30/2022-14:23:48] [I] Workspace: 16 MiB
[12/30/2022-14:23:48] [I] minTiming: 1
[12/30/2022-14:23:48] [I] avgTiming: 8
[12/30/2022-14:23:48] [I] Precision: FP32
[12/30/2022-14:23:48] [I] Calibration: 
[12/30/2022-14:23:48] [I] Refit: Disabled
[12/30/2022-14:23:48] [I] Sparsity: Disabled
[12/30/2022-14:23:48] [I] Safe mode: Disabled
[12/30/2022-14:23:48] [I] Restricted mode: Disabled
[12/30/2022-14:23:48] [I] Save engine: resnet50_wSoftmax.engine
[12/30/2022-14:23:48] [I] Load engine: 
[12/30/2022-14:23:48] [I] NVTX verbosity: 0
[12/30/2022-14:23:48] [I] Tactic sources: Using default tactic sources
[12/30/2022-14:23:48] [I] timingCacheMode: local
[12/30/2022-14:23:48] [I] timingCacheFile: 
[12/30/2022-14:23:48] [I] Input(s)s format: fp32:CHW
[12/30/2022-14:23:48] [I] Output(s)s format: fp32:CHW
[12/30/2022-14:23:48] [I] Input build shapes: model
[12/30/2022-14:23:48] [I] Input calibration shapes: model
[12/30/2022-14:23:48] [I] === System Options ===
[12/30/2022-14:23:48] [I] Device: 0
[12/30/2022-14:23:48] [I] DLACore: 
[12/30/2022-14:23:48] [I] Plugins:
[12/30/2022-14:23:48] [I] === Inference Options ===
[12/30/2022-14:23:48] [I] Batch: Explicit
[12/30/2022-14:23:48] [I] Input inference shapes: model
[12/30/2022-14:23:48] [I] Iterations: 10
[12/30/2022-14:23:48] [I] Duration: 3s (+ 200ms warm up)
[12/30/2022-14:23:48] [I] Sleep time: 0ms
[12/30/2022-14:23:48] [I] Streams: 1
[12/30/2022-14:23:48] [I] ExposeDMA: Disabled
[12/30/2022-14:23:48] [I] Data transfers: Enabled
[12/30/2022-14:23:48] [I] Spin-wait: Disabled
[12/30/2022-14:23:48] [I] Multithreading: Disabled
[12/30/2022-14:23:48] [I] CUDA Graph: Disabled
[12/30/2022-14:23:48] [I] Separate profiling: Disabled
[12/30/2022-14:23:48] [I] Time Deserialize: Disabled
[12/30/2022-14:23:48] [I] Time Refit: Disabled
[12/30/2022-14:23:48] [I] Skip inference: Disabled
[12/30/2022-14:23:48] [I] Inputs:
[12/30/2022-14:23:48] [I] === Reporting Options ===
[12/30/2022-14:23:48] [I] Verbose: Disabled
[12/30/2022-14:23:48] [I] Averages: 10 inferences
[12/30/2022-14:23:48] [I] Percentile: 99
[12/30/2022-14:23:48] [I] Dump refittable layers:Disabled
[12/30/2022-14:23:48] [I] Dump output: Disabled
[12/30/2022-14:23:48] [I] Profile: Disabled
[12/30/2022-14:23:48] [I] Export timing to JSON file: 
[12/30/2022-14:23:48] [I] Export output to JSON file: 
[12/30/2022-14:23:48] [I] Export profile to JSON file: 
[12/30/2022-14:23:48] [I] 
[12/30/2022-14:23:48] [I] === Device Information ===
[12/30/2022-14:23:48] [I] Selected Device: NVIDIA GeForce GTX 1080 Ti
[12/30/2022-14:23:48] [I] Compute Capability: 6.1
[12/30/2022-14:23:48] [I] SMs: 28
[12/30/2022-14:23:48] [I] Compute Clock Rate: 1.62 GHz
[12/30/2022-14:23:48] [I] Device Global Memory: 11177 MiB
[12/30/2022-14:23:48] [I] Shared Memory per SM: 96 KiB
[12/30/2022-14:23:48] [I] Memory Bus Width: 352 bits (ECC disabled)
[12/30/2022-14:23:48] [I] Memory Clock Rate: 5.505 GHz
[12/30/2022-14:23:48] [I] 
[12/30/2022-14:23:48] [I] TensorRT version: 8003
[12/30/2022-14:23:49] [I] [TRT] [MemUsageChange] Init CUDA: CPU +157, GPU +0, now: CPU 164, GPU 583 (MiB)
[12/30/2022-14:23:49] [I] Start parsing network model
[12/30/2022-14:23:49] [I] [TRT] ----------------------------------------------------------------
[12/30/2022-14:23:49] [I] [TRT] Input filename:   resnet50_wSoftmax.onnx
[12/30/2022-14:23:49] [I] [TRT] ONNX IR version:  0.0.6
[12/30/2022-14:23:49] [I] [TRT] Opset version:    11
[12/30/2022-14:23:49] [I] [TRT] Producer name:    pytorch
[12/30/2022-14:23:49] [I] [TRT] Producer version: 1.11.0
[12/30/2022-14:23:49] [I] [TRT] Domain:           
[12/30/2022-14:23:49] [I] [TRT] Model version:    0
[12/30/2022-14:23:49] [I] [TRT] Doc string:       
[12/30/2022-14:23:49] [I] [TRT] ----------------------------------------------------------------
[12/30/2022-14:23:49] [I] Finish parsing network model
[12/30/2022-14:23:49] [I] [TRT] [MemUsageChange] Init CUDA: CPU +0, GPU +0, now: CPU 262, GPU 583 (MiB)
[12/30/2022-14:23:49] [I] [TRT] [MemUsageSnapshot] Builder begin: CPU 262 MiB, GPU 583 MiB
[12/30/2022-14:23:49] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +249, GPU +104, now: CPU 511, GPU 687 (MiB)
[12/30/2022-14:23:50] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +148, GPU +106, now: CPU 659, GPU 793 (MiB)
[12/30/2022-14:23:50] [W] [TRT] Detected invalid timing cache, setup a local cache instead
[12/30/2022-14:23:53] [I] [TRT] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[12/30/2022-14:24:19] [I] [TRT] Detected 1 inputs and 1 output network tensors.
[12/30/2022-14:24:19] [I] [TRT] Total Host Persistent Memory: 111952
[12/30/2022-14:24:19] [I] [TRT] Total Device Persistent Memory: 135909376
[12/30/2022-14:24:19] [I] [TRT] Total Scratch Memory: 0
[12/30/2022-14:24:19] [I] [TRT] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 9 MiB, GPU 4 MiB
[12/30/2022-14:24:19] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 846, GPU 1017 (MiB)
[12/30/2022-14:24:19] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 846, GPU 1025 (MiB)
[12/30/2022-14:24:19] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 846, GPU 1009 (MiB)
[12/30/2022-14:24:19] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 846, GPU 991 (MiB)
[12/30/2022-14:24:19] [I] [TRT] [MemUsageSnapshot] Builder end: CPU 846 MiB, GPU 991 MiB
[12/30/2022-14:24:20] [I] [TRT] Loaded engine size: 150 MB
[12/30/2022-14:24:20] [I] [TRT] [MemUsageSnapshot] deserializeCudaEngine begin: CPU 996 MiB, GPU 843 MiB
[12/30/2022-14:24:20] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +10, now: CPU 996, GPU 1005 (MiB)
[12/30/2022-14:24:20] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 996, GPU 1013 (MiB)
[12/30/2022-14:24:20] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 996, GPU 995 (MiB)
[12/30/2022-14:24:20] [I] [TRT] [MemUsageSnapshot] deserializeCudaEngine end: CPU 996 MiB, GPU 995 MiB
[12/30/2022-14:24:21] [I] Engine built in 33.0023 sec.
[12/30/2022-14:24:21] [I] [TRT] [MemUsageSnapshot] ExecutionContext creation begin: CPU 747 MiB, GPU 995 MiB
[12/30/2022-14:24:21] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +10, now: CPU 747, GPU 1005 (MiB)
[12/30/2022-14:24:21] [I] [TRT] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 747, GPU 1013 (MiB)
[12/30/2022-14:24:21] [I] [TRT] [MemUsageSnapshot] ExecutionContext creation end: CPU 747 MiB, GPU 1151 MiB
[12/30/2022-14:24:21] [I] Created input binding for input with dimensions 1x3x224x224
[12/30/2022-14:24:21] [I] Created output binding for ouput with dimensions 1x1000
[12/30/2022-14:24:21] [I] Starting inference
[12/30/2022-14:24:25] [I] Warmup completed 81 queries over 200 ms
[12/30/2022-14:24:25] [I] Timing trace has 1332 queries over 3.00683 s
[12/30/2022-14:24:25] [I] 
[12/30/2022-14:24:25] [I] === Trace details ===
[12/30/2022-14:24:25] [I] Trace averages of 10 runs:
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.20907 ms - Host latency: 2.29739 ms (end to end 4.19025 ms, enqueue 0.523633 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.26898 ms - Host latency: 2.35808 ms (end to end 4.30988 ms, enqueue 0.483008 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.22997 ms - Host latency: 2.31949 ms (end to end 4.24657 ms, enqueue 0.469675 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24472 ms - Host latency: 2.33289 ms (end to end 4.25959 ms, enqueue 0.471857 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24348 ms - Host latency: 2.32968 ms (end to end 4.25486 ms, enqueue 0.477417 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24522 ms - Host latency: 2.33264 ms (end to end 4.26584 ms, enqueue 0.483603 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23818 ms - Host latency: 2.32494 ms (end to end 4.25003 ms, enqueue 0.489542 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.27727 ms - Host latency: 2.3649 ms (end to end 4.33156 ms, enqueue 0.476386 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24297 ms - Host latency: 2.33018 ms (end to end 4.26861 ms, enqueue 0.487863 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.20662 ms - Host latency: 2.29342 ms (end to end 4.20544 ms, enqueue 0.48533 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.26376 ms - Host latency: 2.35044 ms (end to end 4.3168 ms, enqueue 0.49404 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24512 ms - Host latency: 2.33291 ms (end to end 4.27097 ms, enqueue 0.457834 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24566 ms - Host latency: 2.33215 ms (end to end 4.28366 ms, enqueue 0.482858 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24287 ms - Host latency: 2.32934 ms (end to end 4.2914 ms, enqueue 0.474963 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24686 ms - Host latency: 2.33294 ms (end to end 4.27695 ms, enqueue 0.491754 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.27717 ms - Host latency: 2.36486 ms (end to end 4.29584 ms, enqueue 0.484131 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23569 ms - Host latency: 2.32209 ms (end to end 4.28945 ms, enqueue 0.477893 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24575 ms - Host latency: 2.33508 ms (end to end 4.27363 ms, enqueue 0.484839 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24758 ms - Host latency: 2.33482 ms (end to end 4.23656 ms, enqueue 0.498279 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23549 ms - Host latency: 2.32222 ms (end to end 4.30212 ms, enqueue 0.486066 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23896 ms - Host latency: 2.32686 ms (end to end 4.28463 ms, enqueue 0.460004 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.28721 ms - Host latency: 2.37082 ms (end to end 4.14486 ms, enqueue 0.44361 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24349 ms - Host latency: 2.32888 ms (end to end 4.2661 ms, enqueue 0.469183 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.25475 ms - Host latency: 2.34227 ms (end to end 4.28607 ms, enqueue 0.496283 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.28137 ms - Host latency: 2.36852 ms (end to end 4.36055 ms, enqueue 0.464813 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23611 ms - Host latency: 2.32234 ms (end to end 4.24703 ms, enqueue 0.477344 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.20355 ms - Host latency: 2.28891 ms (end to end 4.18228 ms, enqueue 0.476562 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.26816 ms - Host latency: 2.35424 ms (end to end 4.31978 ms, enqueue 0.479181 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23141 ms - Host latency: 2.31669 ms (end to end 4.23441 ms, enqueue 0.470813 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23777 ms - Host latency: 2.32572 ms (end to end 4.24656 ms, enqueue 0.478674 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.22211 ms - Host latency: 2.30917 ms (end to end 4.2414 ms, enqueue 0.472858 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23468 ms - Host latency: 2.32471 ms (end to end 4.25876 ms, enqueue 0.50108 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24 ms - Host latency: 2.32748 ms (end to end 4.2644 ms, enqueue 0.507831 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.27318 ms - Host latency: 2.40325 ms (end to end 4.36337 ms, enqueue 0.522241 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24392 ms - Host latency: 2.34196 ms (end to end 4.29952 ms, enqueue 0.542639 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23631 ms - Host latency: 2.32303 ms (end to end 4.22749 ms, enqueue 0.467786 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23977 ms - Host latency: 2.32667 ms (end to end 4.30138 ms, enqueue 0.481445 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23354 ms - Host latency: 2.32162 ms (end to end 4.2537 ms, enqueue 0.487463 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23787 ms - Host latency: 2.32523 ms (end to end 4.25651 ms, enqueue 0.484094 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23303 ms - Host latency: 2.32041 ms (end to end 4.25693 ms, enqueue 0.463 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24727 ms - Host latency: 2.33463 ms (end to end 4.2587 ms, enqueue 0.477515 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.25353 ms - Host latency: 2.34137 ms (end to end 4.28 ms, enqueue 0.47439 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.25812 ms - Host latency: 2.345 ms (end to end 4.3193 ms, enqueue 0.505701 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.20419 ms - Host latency: 2.29163 ms (end to end 4.16111 ms, enqueue 0.481653 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.22618 ms - Host latency: 2.31272 ms (end to end 4.21229 ms, enqueue 0.473889 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24081 ms - Host latency: 2.3306 ms (end to end 4.26354 ms, enqueue 0.48562 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24161 ms - Host latency: 2.32739 ms (end to end 4.27383 ms, enqueue 0.476489 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.22761 ms - Host latency: 2.31061 ms (end to end 4.30924 ms, enqueue 0.433826 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23192 ms - Host latency: 2.31355 ms (end to end 4.32189 ms, enqueue 0.461426 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23976 ms - Host latency: 2.32518 ms (end to end 4.26497 ms, enqueue 0.470715 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.26088 ms - Host latency: 2.34644 ms (end to end 4.29346 ms, enqueue 0.47876 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.25955 ms - Host latency: 2.34529 ms (end to end 4.30167 ms, enqueue 0.480652 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.20537 ms - Host latency: 2.29346 ms (end to end 4.19354 ms, enqueue 0.49574 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23569 ms - Host latency: 2.32277 ms (end to end 4.23483 ms, enqueue 0.483191 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23085 ms - Host latency: 2.31672 ms (end to end 4.24729 ms, enqueue 0.489929 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.22761 ms - Host latency: 2.31368 ms (end to end 4.24891 ms, enqueue 0.47467 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24014 ms - Host latency: 2.32617 ms (end to end 4.25647 ms, enqueue 0.47533 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.26456 ms - Host latency: 2.35023 ms (end to end 4.29652 ms, enqueue 0.496326 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.26509 ms - Host latency: 2.37063 ms (end to end 4.31626 ms, enqueue 0.528516 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24003 ms - Host latency: 2.36788 ms (end to end 4.3374 ms, enqueue 0.496899 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23978 ms - Host latency: 2.36389 ms (end to end 4.32738 ms, enqueue 0.475647 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23317 ms - Host latency: 2.36337 ms (end to end 4.28123 ms, enqueue 0.509143 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23899 ms - Host latency: 2.36937 ms (end to end 4.28611 ms, enqueue 0.502771 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.2405 ms - Host latency: 2.37678 ms (end to end 4.27936 ms, enqueue 0.504126 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.26941 ms - Host latency: 2.40005 ms (end to end 4.33647 ms, enqueue 0.482117 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.20251 ms - Host latency: 2.32878 ms (end to end 4.27311 ms, enqueue 0.49834 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23759 ms - Host latency: 2.35935 ms (end to end 4.30154 ms, enqueue 0.510437 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.2837 ms - Host latency: 2.37113 ms (end to end 4.3427 ms, enqueue 0.538635 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.22665 ms - Host latency: 2.31399 ms (end to end 4.25674 ms, enqueue 0.486291 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23491 ms - Host latency: 2.32011 ms (end to end 4.28578 ms, enqueue 0.458655 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.22781 ms - Host latency: 2.31483 ms (end to end 4.24868 ms, enqueue 0.482654 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23594 ms - Host latency: 2.3214 ms (end to end 4.26541 ms, enqueue 0.462476 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23835 ms - Host latency: 2.32363 ms (end to end 4.27173 ms, enqueue 0.461951 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.26882 ms - Host latency: 2.35352 ms (end to end 4.31442 ms, enqueue 0.465039 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.20283 ms - Host latency: 2.29061 ms (end to end 4.18779 ms, enqueue 0.482202 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23838 ms - Host latency: 2.32443 ms (end to end 4.26239 ms, enqueue 0.474207 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.26908 ms - Host latency: 2.35459 ms (end to end 4.33544 ms, enqueue 0.463953 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23539 ms - Host latency: 2.32285 ms (end to end 4.25945 ms, enqueue 0.469275 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.22572 ms - Host latency: 2.31083 ms (end to end 4.22483 ms, enqueue 0.470142 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24708 ms - Host latency: 2.33378 ms (end to end 4.27649 ms, enqueue 0.495642 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.2361 ms - Host latency: 2.32383 ms (end to end 4.25802 ms, enqueue 0.469214 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24493 ms - Host latency: 2.33043 ms (end to end 4.29318 ms, enqueue 0.465063 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.26113 ms - Host latency: 2.34788 ms (end to end 4.31179 ms, enqueue 0.481934 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.20432 ms - Host latency: 2.29045 ms (end to end 4.19277 ms, enqueue 0.49292 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.22385 ms - Host latency: 2.31062 ms (end to end 4.24583 ms, enqueue 0.462061 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.26653 ms - Host latency: 2.35173 ms (end to end 4.32502 ms, enqueue 0.463013 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24016 ms - Host latency: 2.32729 ms (end to end 4.24348 ms, enqueue 0.475439 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.2395 ms - Host latency: 2.32607 ms (end to end 4.25613 ms, enqueue 0.471729 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23633 ms - Host latency: 2.32173 ms (end to end 4.24744 ms, enqueue 0.477246 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23892 ms - Host latency: 2.32571 ms (end to end 4.2562 ms, enqueue 0.489233 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.2457 ms - Host latency: 2.33235 ms (end to end 4.26201 ms, enqueue 0.466089 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.27427 ms - Host latency: 2.36052 ms (end to end 4.32622 ms, enqueue 0.462769 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.20342 ms - Host latency: 2.2897 ms (end to end 4.17219 ms, enqueue 0.502075 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24248 ms - Host latency: 2.33113 ms (end to end 4.26455 ms, enqueue 0.496436 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.26936 ms - Host latency: 2.35735 ms (end to end 4.3075 ms, enqueue 0.483936 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23357 ms - Host latency: 2.31973 ms (end to end 4.23652 ms, enqueue 0.473608 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.28398 ms - Host latency: 2.36812 ms (end to end 4.3135 ms, enqueue 0.471606 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.20295 ms - Host latency: 2.28948 ms (end to end 4.22996 ms, enqueue 0.482837 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23259 ms - Host latency: 2.3186 ms (end to end 4.24988 ms, enqueue 0.469995 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.27068 ms - Host latency: 2.35911 ms (end to end 4.32979 ms, enqueue 0.482227 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24565 ms - Host latency: 2.33179 ms (end to end 4.26689 ms, enqueue 0.505151 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.20442 ms - Host latency: 2.29082 ms (end to end 4.18738 ms, enqueue 0.472119 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.28159 ms - Host latency: 2.36763 ms (end to end 4.31555 ms, enqueue 0.495703 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.22891 ms - Host latency: 2.31467 ms (end to end 4.30391 ms, enqueue 0.469995 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24353 ms - Host latency: 2.32952 ms (end to end 4.26111 ms, enqueue 0.487085 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24905 ms - Host latency: 2.33601 ms (end to end 4.2854 ms, enqueue 0.490674 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.20352 ms - Host latency: 2.29058 ms (end to end 4.18093 ms, enqueue 0.504395 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23745 ms - Host latency: 2.32563 ms (end to end 4.25728 ms, enqueue 0.497192 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.28003 ms - Host latency: 2.36719 ms (end to end 4.36123 ms, enqueue 0.485059 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.2353 ms - Host latency: 2.32129 ms (end to end 4.24128 ms, enqueue 0.490527 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.20181 ms - Host latency: 2.28652 ms (end to end 4.1802 ms, enqueue 0.474048 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.27432 ms - Host latency: 2.39465 ms (end to end 4.35486 ms, enqueue 0.5396 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23333 ms - Host latency: 2.36357 ms (end to end 4.30889 ms, enqueue 0.496558 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24529 ms - Host latency: 2.37998 ms (end to end 4.30935 ms, enqueue 0.500147 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.27681 ms - Host latency: 2.41082 ms (end to end 4.3594 ms, enqueue 0.496631 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.20339 ms - Host latency: 2.33174 ms (end to end 4.20864 ms, enqueue 0.493335 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.22449 ms - Host latency: 2.3571 ms (end to end 4.25181 ms, enqueue 0.4875 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.25591 ms - Host latency: 2.37466 ms (end to end 4.31016 ms, enqueue 0.547778 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.24424 ms - Host latency: 2.33044 ms (end to end 4.27273 ms, enqueue 0.492261 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.20598 ms - Host latency: 2.29263 ms (end to end 4.20706 ms, enqueue 0.494971 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.27053 ms - Host latency: 2.35852 ms (end to end 4.3312 ms, enqueue 0.490942 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23293 ms - Host latency: 2.31953 ms (end to end 4.26013 ms, enqueue 0.488379 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.31204 ms - Host latency: 2.39712 ms (end to end 4.38572 ms, enqueue 0.468066 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.20251 ms - Host latency: 2.28765 ms (end to end 4.22263 ms, enqueue 0.487939 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.20251 ms - Host latency: 2.29014 ms (end to end 4.17424 ms, enqueue 0.503467 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.27876 ms - Host latency: 2.36323 ms (end to end 4.30574 ms, enqueue 0.471167 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23135 ms - Host latency: 2.31685 ms (end to end 4.27527 ms, enqueue 0.480273 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23149 ms - Host latency: 2.31978 ms (end to end 4.23577 ms, enqueue 0.47478 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.26538 ms - Host latency: 2.35183 ms (end to end 4.30144 ms, enqueue 0.477222 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.20432 ms - Host latency: 2.29194 ms (end to end 4.20798 ms, enqueue 0.484937 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.23801 ms - Host latency: 2.32449 ms (end to end 4.25493 ms, enqueue 0.489404 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.26489 ms - Host latency: 2.35122 ms (end to end 4.31753 ms, enqueue 0.479688 ms)
[12/30/2022-14:24:25] [I] Average on 10 runs - GPU latency: 2.22673 ms - Host latency: 2.31384 ms (end to end 4.23433 ms, enqueue 0.485156 ms)
[12/30/2022-14:24:25] [I] 
[12/30/2022-14:24:25] [I] === Performance summary ===
[12/30/2022-14:24:25] [I] Throughput: 442.991 qps
[12/30/2022-14:24:25] [I] Latency: min = 2.27661 ms, max = 2.76099 ms, mean = 2.33359 ms, median = 2.29205 ms, percentile(99%) = 2.7052 ms
[12/30/2022-14:24:25] [I] End-to-End Host Latency: min = 2.30627 ms, max = 4.84204 ms, mean = 4.27034 ms, median = 4.19855 ms, percentile(99%) = 4.67358 ms
[12/30/2022-14:24:25] [I] Enqueue Time: min = 0.331787 ms, max = 0.686279 ms, mean = 0.484204 ms, median = 0.475586 ms, percentile(99%) = 0.597778 ms
[12/30/2022-14:24:25] [I] H2D Latency: min = 0.0662231 ms, max = 0.181885 ms, mean = 0.0883145 ms, median = 0.0834961 ms, percentile(99%) = 0.138428 ms
[12/30/2022-14:24:25] [I] GPU Compute Time: min = 2.19238 ms, max = 2.65527 ms, mean = 2.2417 ms, median = 2.20459 ms, percentile(99%) = 2.61328 ms
[12/30/2022-14:24:25] [I] D2H Latency: min = 0.00250244 ms, max = 0.0101318 ms, mean = 0.00357835 ms, median = 0.00317383 ms, percentile(99%) = 0.00537109 ms
[12/30/2022-14:24:25] [I] Total Host Walltime: 3.00683 s
[12/30/2022-14:24:25] [I] Total GPU Compute Time: 2.98595 s
[12/30/2022-14:24:25] [I] Explanations of the performance metrics are printed in the verbose logs.
[12/30/2022-14:24:25] [I] 
&&&& PASSED TensorRT.trtexec [TensorRT v8003] # trtexec --onnx=resnet50_wSoftmax.onnx --saveEngine=resnet50_wSoftmax.engine --explicitBatch
[12/30/2022-14:24:25] [I] [TRT] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 747, GPU 1003 (MiB)

The conversion seems fine, so I wrote the config file config_infer_secondary_1k.txt:


[property]
gpu-id=0
net-scale-factor=1
model-engine-file=/home/incar/tms/source/gb/resnet50_wSoftmax.engine
#infer-dims=3;330;330
#int8-calib-file=/home/incar/tms/deepstream-6.0/samples/models/Secondary_CarColor/cal_trt.bin
mean-file=/home/incar/tms/source/gb/224.ppm
labelfile-path=/home/incar/tms/source/gb/alllable.txt
force-implicit-batch-dim=1
batch-size=1
model-color-format=1
process-mode=2
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0
is-classifier=1
#output-blob-names=output
classifier-async-mode=0
classifier-threshold=0.51
input-object-min-width=20
input-object-min-height=20
operate-on-gie-id=1
operate-on-class-ids=0;1;2
classifier-type=carcolor
gie-unique-id=2
num-detected-classes=1000


#scaling-filter=0
#scaling-compute-hw=0

Whatever I do, the wrong result is shown in the runtime window: the class is always "spotlight". Could you give me some advice?

Fiona.Chen · January 3, 2023, 12:39am

tms2003:

def preprocessing(img):
    # pre image
    IMAGENET_MEAN = [0.485, 0.456, 0.406]
    IMAGENET_STD = [0.229, 0.224, 0.225]
    img = img[:, :, ::-1]
    img = cv2.resize(img, (224, 224))
    img = img / 255.0
    img = (img - IMAGENET_MEAN) / IMAGENET_STD
    img = img.transpose(2, 0, 1).astype(np.float32)
    tensor_img = torch.from_numpy(img)[None]
    return tensor_img

You need to translate this preprocessing into the corresponding preprocessing parameters in the gst-nvinfer configuration file. See "Gst-nvinfer — DeepStream 6.3 Release documentation".

For example, img = img / 255.0 means the scaling factor is 1/255, so you need to set “net-scale-factor=0.0039215686” in config_infer_secondary_1k.txt.
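As a rough illustration of that mapping: the Gst-nvinfer documentation describes the per-pixel transform as y = net-scale-factor * (x - mean), with a single scalar scale and per-channel means supplied through `offsets` or a mean file. The sketch below derives candidate values from the Python preprocessing quoted above. Note that 1/255 covers only the division by 255; folding the standard deviation into the scale changes the value. Because the scale is a scalar, the three per-channel standard deviations cannot be reproduced exactly. Averaging them is an assumption made here, not something stated in this thread, and the helper name `nvinfer_params` is hypothetical.

```python
# Sketch: map (x/255 - mean) / std onto nvinfer's y = scale * (x - offset).
IMAGENET_MEAN = [0.485, 0.456, 0.406]  # RGB order, from the post above
IMAGENET_STD = [0.229, 0.224, 0.225]


def nvinfer_params(mean, std):
    """Return (net_scale_factor, offsets) approximating the PyTorch preprocessing."""
    # (x/255 - m) / s == (x - 255*m) / (255*s), so each offset is 255*m.
    offsets = [255.0 * m for m in mean]
    # nvinfer accepts one scalar scale, so average the stds (an approximation).
    avg_std = sum(std) / len(std)
    scale = 1.0 / (255.0 * avg_std)
    return scale, offsets


scale, offsets = nvinfer_params(IMAGENET_MEAN, IMAGENET_STD)
print(f"net-scale-factor={scale:.10f}")
print("offsets=" + ";".join(f"{o:.3f}" for o in offsets))
```

The printed lines are meant to be compared against the [property] section of the config. If `offsets` is used, the existing `mean-file` entry would likely need to be dropped so that only one source of mean values is configured.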

Please work through the code and the algorithm yourself. The parameters in the nvinfer configuration file are explained in "Gst-nvinfer — DeepStream 6.3 Release documentation" and "DeepStream SDK FAQ - Intelligent Video Analytics / DeepStream SDK - NVIDIA Developer Forums".
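One possible way to start that comparison is a small script that reads the preprocessing-related keys from the config and flags the ones that disagree with the Python code. This is only a sketch: the function name `preprocessing_warnings` is hypothetical, the excerpt contains just three keys copied from the config in the question, and the two checks reflect an assumption about what the model expects (RGB input, scaled pixel values), based on the `img[:, :, ::-1]` and `img / 255.0` lines in the quoted preprocessing.

```python
import configparser

# Preprocessing-related keys copied from the config in the question.
CONFIG_TEXT = """
[property]
net-scale-factor=1
mean-file=/home/incar/tms/source/gb/224.ppm
model-color-format=1
"""


def preprocessing_warnings(config_text):
    """Flag nvinfer keys that disagree with the RGB, /255 preprocessing."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    prop = parser["property"]
    warnings = []

    # net-scale-factor=1 passes 0..255 pixel values to the network unscaled.
    if float(prop.get("net-scale-factor", "1")) == 1.0:
        warnings.append("net-scale-factor is 1, but the model was fed scaled inputs")

    # nvinfer: 0=RGB, 1=BGR, 2=GRAY. The Python code flips OpenCV's BGR to RGB.
    if prop.get("model-color-format", "0") != "0":
        warnings.append("model-color-format is not 0 (RGB)")

    return warnings


for message in preprocessing_warnings(CONFIG_TEXT):
    print("WARNING:", message)
```

For the excerpt above, both checks fire. Whether fixing these two keys is sufficient depends on the rest of the config, which this sketch does not inspect.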

system · January 17, 2023, 12:39am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
