Bpnet sample code error

Hi,

I am trying the bpnet notebook in cv_samples_v1.4.0.
Everything is OK except this one step.
I got the error below and I'm not sure what it means. Thx
In[61]
[ERROR] 1: Unexpected exception _Map_base::at
[ERROR] Unable to create engine

9.4. Generate TensorRT engine

# Convert to TensorRT engine(INT8).
!tao converter $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.etlt \
                -k $KEY \
                -t int8 \
                -c $USER_EXPERIMENT_DIR/models/exp_m1_final/calibration.$IN_HEIGHT.$IN_WIDTH.bin \
                -e $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.$IN_HEIGHT.$IN_WIDTH.int8.engine \
                -p ${INPUT_NAME},1x$INPUT_SHAPE,${OPT_BATCH_SIZE}x$INPUT_SHAPE,${MAX_BATCH_SIZE}x$INPUT_SHAPE
2022-09-29 12:31:55,854 [INFO] root: Registry: ['nvcr.io']
2022-09-29 12:31:55,916 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.5-py3
2022-09-29 12:31:55,943 [WARNING] tlt.components.docker_handler.docker_handler: 
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/nvidia/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[INFO] [MemUsageChange] Init CUDA: CPU +337, GPU +0, now: CPU 348, GPU 549 (MiB)
[INFO] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 348 MiB, GPU 549 MiB
[INFO] [MemUsageSnapshot] End constructing builder kernel library: CPU 483 MiB, GPU 581 MiB
[INFO] ----------------------------------------------------------------
[INFO] Input filename:   /tmp/filef6Fr17
[INFO] ONNX IR version:  0.0.5
[INFO] Opset version:    10
[INFO] Producer name:    tf2onnx
[INFO] Producer version: 1.9.2
[INFO] Domain:           
[INFO] Model version:    0
[INFO] Doc string:       
[INFO] ----------------------------------------------------------------
[INFO] Detected input dimensions from the model: (-1, -1, -1, 3)
[INFO] Model has dynamic shape. Setting up optimization profiles.
[INFO] Using optimization profile min shape: (1, 288, 384, 3) for input: input_1:0
[INFO] Using optimization profile opt shape: (1, 288, 384, 3) for input: input_1:0
[INFO] Using optimization profile max shape: (1, 288, 384, 3) for input: input_1:0
[INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +518, GPU +226, now: CPU 1057, GPU 807 (MiB)
[INFO] [MemUsageChange] Init cuDNN: CPU +114, GPU +52, now: CPU 1171, GPU 859 (MiB)
[INFO] Timing cache disabled. Turning it on will improve builder speed.
[WARNING] Calibration Profile is not defined. Running calibration with Profile 0
[INFO] Detected 1 inputs and 2 output network tensors.
[INFO] Total Host Persistent Memory: 9120
[INFO] Total Device Persistent Memory: 0
[INFO] Total Scratch Memory: 0
[INFO] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 0 MiB, GPU 56 MiB
[INFO] [BlockAssignment] Algorithm ShiftNTopDown took 2.00965ms to assign 5 blocks to 71 nodes requiring 58524160 bytes.
[INFO] Total Activation Memory: 58524160
[INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 1694, GPU 1139 (MiB)
[INFO] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 1694, GPU 1147 (MiB)
[INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +10, now: CPU 1693, GPU 1123 (MiB)
[INFO] [MemUsageChange] Init cuDNN: CPU +1, GPU +8, now: CPU 1694, GPU 1131 (MiB)
[INFO] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +55, now: CPU 0, GPU 111 (MiB)
[INFO] Starting Calibration.
[INFO]   Post Processing Calibration data in 5.61e-07 seconds.
[ERROR] 1: Unexpected exception _Map_base::at
[ERROR] Unable to create engine
2022-09-29 12:31:59,034 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

Please try to set a lower "-m" (maximum batch size).

Hi,

May I know the full command line? This one does not work…

Thanks

# Convert to TensorRT engine(INT8).
!tao converter $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.etlt \
                -k $KEY \
                -t int8 \
                -c $USER_EXPERIMENT_DIR/models/exp_m1_final/calibration.$IN_HEIGHT.$IN_WIDTH.bin \
                -e $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.$IN_HEIGHT.$IN_WIDTH.int8.engine \
                -p ${INPUT_NAME},1x$INPUT_SHAPE,${OPT_BATCH_SIZE}x$INPUT_SHAPE,${MAX_BATCH_SIZE}x$INPUT_SHAPE -m 1
2022-09-29 15:54:45,393 [INFO] root: Registry: ['nvcr.io']
2022-09-29 15:54:45,454 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.5-py3
2022-09-29 15:54:45,479 [WARNING] tlt.components.docker_handler.docker_handler: 
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/nvidia/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[INFO] [MemUsageChange] Init CUDA: CPU +337, GPU +0, now: CPU 348, GPU 551 (MiB)
[INFO] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 348 MiB, GPU 551 MiB
[INFO] [MemUsageSnapshot] End constructing builder kernel library: CPU 483 MiB, GPU 583 MiB
[INFO] ----------------------------------------------------------------
[INFO] Input filename:   /tmp/fileb8RINI
[INFO] ONNX IR version:  0.0.5
[INFO] Opset version:    10
[INFO] Producer name:    tf2onnx
[INFO] Producer version: 1.9.2
[INFO] Domain:           
[INFO] Model version:    0
[INFO] Doc string:       
[INFO] ----------------------------------------------------------------
[INFO] Detected input dimensions from the model: (-1, -1, -1, 3)
[INFO] Model has dynamic shape. Setting up optimization profiles.
[INFO] Using optimization profile min shape: (1, 288, 384, 3) for input: input_1:0
[INFO] Using optimization profile opt shape: (1, 288, 384, 3) for input: input_1:0
[INFO] Using optimization profile max shape: (1, 288, 384, 3) for input: input_1:0
[INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +518, GPU +226, now: CPU 1057, GPU 809 (MiB)
[INFO] [MemUsageChange] Init cuDNN: CPU +114, GPU +52, now: CPU 1171, GPU 861 (MiB)
[INFO] Timing cache disabled. Turning it on will improve builder speed.
[WARNING] Calibration Profile is not defined. Running calibration with Profile 0
[INFO] Detected 1 inputs and 2 output network tensors.
[INFO] Total Host Persistent Memory: 9120
[INFO] Total Device Persistent Memory: 0
[INFO] Total Scratch Memory: 0
[INFO] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 0 MiB, GPU 56 MiB
[INFO] [BlockAssignment] Algorithm ShiftNTopDown took 2.00978ms to assign 5 blocks to 71 nodes requiring 58524160 bytes.
[INFO] Total Activation Memory: 58524160
[INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 1694, GPU 1141 (MiB)
[INFO] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 1694, GPU 1149 (MiB)
[INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +10, now: CPU 1693, GPU 1125 (MiB)
[INFO] [MemUsageChange] Init cuDNN: CPU +1, GPU +8, now: CPU 1694, GPU 1133 (MiB)
[INFO] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +55, now: CPU 0, GPU 111 (MiB)
[INFO] Starting Calibration.
[INFO]   Post Processing Calibration data in 5.2e-07 seconds.
[ERROR] 1: Unexpected exception _Map_base::at
[ERROR] Unable to create engine
2022-09-29 15:54:48,563 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

According to below info, please try to increase the workspace size.

root@df5f93481c11:/workspace# converter -h
usage: converter [-h] [-e ENGINE_FILE_PATH]
        [-k ENCODE_KEY] [-c CACHE_FILE]
        [-o OUTPUTS] [-d INPUT_DIMENSIONS]
        [-b BATCH_SIZE] [-m MAX_BATCH_SIZE]
        [-w MAX_WORKSPACE_SIZE] [-t DATA_TYPE]
        [-i INPUT_ORDER] [-s] [-u DLA_CORE]
        input_file

Generate TensorRT engine from exported model

positional arguments:
  input_file            Input file (.etlt exported model).

required flag arguments:
  -d            comma separated list of input dimensions(not required for TLT 3.0 new models).
  -k            model encoding key.

optional flag arguments:
  -b            calibration batch size (default 8).
  -c            calibration cache file (default cal.bin).
  -e            file the engine is saved to (default saved.engine).
  -i            input dimension ordering -- nchw, nhwc, nc (default nchw).
  -m            maximum TensorRT engine batch size (default 16). If meet with out-of-memory issue, please decrease the batch size accordingly.
  -o            comma separated list of output node names (default none).
  -p            comma separated list of optimization profile shapes in the format <input_name>,<min_shape>,<opt_shape>,<max_shape>, where each shape has `x` as delimiter, e.g., NxC, NxCxHxW, NxCxDxHxW, etc. Can be specified multiple times if there are multiple input tensors for the model. This argument is only useful in dynamic shape case.
  -s            TensorRT strict_type_constraints flag for INT8 mode(default false).
  -t            TensorRT data type -- fp32, fp16, int8 (default fp32).
  -u            Use DLA core N for layers that support DLA(default = -1, which means no DLA core will be utilized for inference. Note that it'll always allow GPU fallback).
  -w            maximum workspace size of TensorRT engine (default 1<<30). If meet with out-of-memory issue, please increase the workspace size accordingly.
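For example, the workspace can be enlarged via the `-w` flag (value in bytes). This is a hypothetical sketch only, reusing the same paths and variables from the notebook cell above; the exact size needed depends on your model and GPU:

```shell
# Sketch: same converter call with the workspace raised to 2 GB
# (-w takes bytes; the default is 1<<30 = 1 GB, so 2147483648 = 1<<31 doubles it).
!tao converter $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.etlt \
                -k $KEY \
                -t int8 \
                -c $USER_EXPERIMENT_DIR/models/exp_m1_final/calibration.$IN_HEIGHT.$IN_WIDTH.bin \
                -e $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.$IN_HEIGHT.$IN_WIDTH.int8.engine \
                -p ${INPUT_NAME},1x$INPUT_SHAPE,${OPT_BATCH_SIZE}x$INPUT_SHAPE,${MAX_BATCH_SIZE}x$INPUT_SHAPE \
                -w 2147483648
```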


Hi,

I had already set “-m 1” in my previous message; same error…

Hi,

It does not seem like a memory issue… I did try both -m 1 and -m 16.

Hi,

I have tried -m 512; same error.

# Convert to TensorRT engine(INT8).
!tao converter $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.etlt \
                -k $KEY \
                -t int8 \
                -c $USER_EXPERIMENT_DIR/models/exp_m1_final/calibration.$IN_HEIGHT.$IN_WIDTH.bin \
                -e $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.$IN_HEIGHT.$IN_WIDTH.int8.engine \
                -p ${INPUT_NAME},1x$INPUT_SHAPE,${OPT_BATCH_SIZE}x$INPUT_SHAPE,${MAX_BATCH_SIZE}x$INPUT_SHAPE \
                -m 512

2022-09-29 16:46:08,588 [INFO] root: Registry: ['nvcr.io']
2022-09-29 16:46:08,649 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.5-py3
2022-09-29 16:46:08,676 [WARNING] tlt.components.docker_handler.docker_handler: 
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/nvidia/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[INFO] [MemUsageChange] Init CUDA: CPU +337, GPU +0, now: CPU 348, GPU 556 (MiB)
[INFO] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 348 MiB, GPU 556 MiB
[INFO] [MemUsageSnapshot] End constructing builder kernel library: CPU 483 MiB, GPU 588 MiB
[INFO] ----------------------------------------------------------------
[INFO] Input filename:   /tmp/filegKLAQ5
[INFO] ONNX IR version:  0.0.5
[INFO] Opset version:    10
[INFO] Producer name:    tf2onnx
[INFO] Producer version: 1.9.2
[INFO] Domain:           
[INFO] Model version:    0
[INFO] Doc string:       
[INFO] ----------------------------------------------------------------
[INFO] Detected input dimensions from the model: (-1, -1, -1, 3)
[INFO] Model has dynamic shape. Setting up optimization profiles.
[INFO] Using optimization profile min shape: (1, 288, 384, 3) for input: input_1:0
[INFO] Using optimization profile opt shape: (1, 288, 384, 3) for input: input_1:0
[INFO] Using optimization profile max shape: (1, 288, 384, 3) for input: input_1:0
[INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +518, GPU +226, now: CPU 1057, GPU 814 (MiB)
[INFO] [MemUsageChange] Init cuDNN: CPU +114, GPU +52, now: CPU 1171, GPU 866 (MiB)
[INFO] Timing cache disabled. Turning it on will improve builder speed.
[WARNING] Calibration Profile is not defined. Running calibration with Profile 0
[INFO] Detected 1 inputs and 2 output network tensors.
[INFO] Total Host Persistent Memory: 9120
[INFO] Total Device Persistent Memory: 0
[INFO] Total Scratch Memory: 0
[INFO] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 0 MiB, GPU 56 MiB
[INFO] [BlockAssignment] Algorithm ShiftNTopDown took 2.01691ms to assign 5 blocks to 71 nodes requiring 58524160 bytes.
[INFO] Total Activation Memory: 58524160
[INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 1694, GPU 1146 (MiB)
[INFO] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 1694, GPU 1154 (MiB)
[INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +10, now: CPU 1693, GPU 1130 (MiB)
[INFO] [MemUsageChange] Init cuDNN: CPU +1, GPU +8, now: CPU 1694, GPU 1138 (MiB)
[INFO] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +55, now: CPU 0, GPU 111 (MiB)
[INFO] Starting Calibration.
[INFO]   Post Processing Calibration data in 5.33e-07 seconds.
[ERROR] 1: Unexpected exception _Map_base::at
[ERROR] Unable to create engine
2022-09-29 16:46:11,855 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

According to the hint in Convert model to Jetson Error during model export step in TAO notebook - #20 by Morganh, please make sure the cal.bin file is available:

! tao bpnet run ls $USER_EXPERIMENT_DIR/models/exp_m1_final/calibration.$IN_HEIGHT.$IN_WIDTH.bin

Hi,

This command can run now, but with lots of warnings… Is that normal? Thx

# Convert to TensorRT engine(INT8).
!tao converter $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.etlt \
                -k $KEY \
                -t int8 \
                -c $USER_EXPERIMENT_DIR/models/exp_m1_final/calibration.$IN_HEIGHT.$IN_WIDTH.deploy.bin \
                -e $USER_EXPERIMENT_DIR/models/exp_m1_final/bpnet_model.$IN_HEIGHT.$IN_WIDTH.int8.engine \
                -p ${INPUT_NAME},1x$INPUT_SHAPE,${OPT_BATCH_SIZE}x$INPUT_SHAPE,${MAX_BATCH_SIZE}x$INPUT_SHAPE
   
2022-09-30 17:47:36,909 [INFO] root: Registry: ['nvcr.io']
2022-09-30 17:47:36,967 [INFO] tlt.components.instance_handler.local_instance: Running command in container: nvcr.io/nvidia/tao/tao-toolkit-tf:v3.22.05-tf1.15.5-py3
2022-09-30 17:47:36,999 [WARNING] tlt.components.docker_handler.docker_handler: 
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/nvidia/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
[INFO] [MemUsageChange] Init CUDA: CPU +337, GPU +0, now: CPU 348, GPU 599 (MiB)
[INFO] [MemUsageSnapshot] Begin constructing builder kernel library: CPU 348 MiB, GPU 599 MiB
[INFO] [MemUsageSnapshot] End constructing builder kernel library: CPU 483 MiB, GPU 631 MiB
[INFO] ----------------------------------------------------------------
[INFO] Input filename:   /tmp/fileALZYz3
[INFO] ONNX IR version:  0.0.5
[INFO] Opset version:    10
[INFO] Producer name:    tf2onnx
[INFO] Producer version: 1.9.2
[INFO] Domain:           
[INFO] Model version:    0
[INFO] Doc string:       
[INFO] ----------------------------------------------------------------
[INFO] Detected input dimensions from the model: (-1, -1, -1, 3)
[INFO] Model has dynamic shape. Setting up optimization profiles.
[INFO] Using optimization profile min shape: (1, 288, 384, 3) for input: input_1:0
[INFO] Using optimization profile opt shape: (1, 288, 384, 3) for input: input_1:0
[INFO] Using optimization profile max shape: (1, 288, 384, 3) for input: input_1:0
[INFO] Reading Calibration Cache for calibrator: EntropyCalibration2
[INFO] Generated calibration scales using calibration cache. Make sure that calibration cache has latest scales.
[INFO] To regenerate calibration cache, please delete the existing one. TensorRT will generate a new calibration cache.
[WARNING] Missing scale and zero-point for tensor block_1a_conv_1/convolution__93:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__389:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__393:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__397:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__401:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__405:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__409:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__413:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__417:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__421:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__425:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__429:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__433:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__449:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__453:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__457:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__485:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__489:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__469:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__473:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__493:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__497:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__501:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__505:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__509:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__537:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__541:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__521:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__525:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__545:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__549:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__553:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__557:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__561:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__581:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__585:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__573:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor Conv__577:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor paf_out/BiasAdd:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +518, GPU +226, now: CPU 1057, GPU 857 (MiB)
[INFO] [MemUsageChange] Init cuDNN: CPU +115, GPU +52, now: CPU 1172, GPU 909 (MiB)
[INFO] Local timing cache in use. Profiling results in this builder pass will not be stored.

[INFO] Some tactics do not have sufficient workspace memory to run. Increasing workspace size may increase performance, please check verbose output.
[INFO] Detected 1 inputs and 2 output network tensors.
[INFO] Total Host Persistent Memory: 70464
[INFO] Total Device Persistent Memory: 16681984
[INFO] Total Scratch Memory: 0
[INFO] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 15 MiB, GPU 998 MiB
[INFO] [BlockAssignment] Algorithm ShiftNTopDown took 0.650256ms to assign 4 blocks to 40 nodes requiring 11280384 bytes.
[INFO] Total Activation Memory: 11280384
[INFO] [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +8, now: CPU 1705, GPU 1149 (MiB)
[INFO] [MemUsageChange] Init cuDNN: CPU +0, GPU +8, now: CPU 1705, GPU 1157 (MiB)
[INFO] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +14, GPU +16, now: CPU 14, GPU 16 (MiB)
2022-09-30 17:48:07,279 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

Thx
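As a side note, once the `.engine` file exists, it can be sanity-checked independently of TAO with `trtexec` (shipped with TensorRT and available on Jetson). This is a hypothetical sketch; the input name `input_1:0` and the 288x384 shape are taken from the optimization-profile lines in the log above, and the engine path must be adjusted to where the file was actually written:

```shell
# Sketch: load the generated engine and run inference once at the profile shape.
trtexec --loadEngine=bpnet_model.288.384.int8.engine \
        --shapes=input_1:0:1x288x384x3
```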

Hi,
It seems to be working… thx
And when it is done, how do I run the model on Jetson Orin?
I am still trying to find a tutorial link.
'Coz my goal is to retrain bpnet and deploy the model on Jetson.
Also, what software can I use for pose annotation? Sorry for so many follow-up questions…

Thx

See Body Pose Estimation — TAO Toolkit 3.22.05 documentation and then Deepstream-TAO Other apps repository.

Hi, I tried to follow the link, but the build fails:

(base) nvidia@nvidia-desktop:~/deepstream_tao_apps$ make
make -C post_processor
make[1]: Entering directory '/home/nvidia/deepstream_tao_apps/post_processor'
/bin/sh: 1: deepstream-app: not found
g++ -o libnvds_infercustomparser_tao.so nvdsinfer_custombboxparser_tao.cpp -I/opt/nvidia/deepstream/deepstream-/sources/includes -I/usr/local/cuda-11.6/include -Wall -std=c++11 -shared -fPIC -Wl,--start-group -lnvinfer -lnvparsers -L/usr/local/cuda-11.6/lib64 -lcudart -lcublas -Wl,--end-group
nvdsinfer_custombboxparser_tao.cpp:25:10: fatal error: nvdsinfer_custom_impl.h: No such file or directory
   25 | #include "nvdsinfer_custom_impl.h"
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make[1]: *** [Makefile:49: libnvds_infercustomparser_tao.so] Error 1
make[1]: Leaving directory '/home/nvidia/deepstream_tao_apps/post_processor'
make: *** [Makefile:24: all] Error 2
(base) nvidia@nvidia-desktop:~/deepstream_tao_apps$ 

There is no update from you for a period, assuming this is not an issue anymore.
Hence we are closing this topic. If need further support, please open a new one.
Thanks

Please check deepstream_tao_apps/apps/tao_others at master · NVIDIA-AI-IOT/deepstream_tao_apps · GitHub
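Judging from the empty version segment in `-I/opt/nvidia/deepstream/deepstream-/sources/includes` and the `deepstream-app: not found` message in the make log, DeepStream does not appear to be installed (or detected) on the device, which is why `nvdsinfer_custom_impl.h` cannot be found. A hypothetical build sketch, assuming DeepStream 6.1 is installed on the Jetson and CUDA 11.6 as seen in the log (adjust both versions to your actual install):

```shell
# Sketch: verify DeepStream is present, then rebuild deepstream_tao_apps.
ls /opt/nvidia/deepstream/deepstream-6.1/sources/includes  # should contain nvdsinfer_custom_impl.h
export CUDA_VER=11.6                                       # version used by the repo's Makefiles
cd ~/deepstream_tao_apps
make
```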

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.