Hi,
The Triton Inference Server container 23.06 uses Ubuntu 22.04.
Ubuntu 22.04 has moved to libssl 3 and no longer provides libssl1.1.
As a result, tao-converter fails with this error:
tao-converter: error while loading shared libraries: libcrypto.so.1.1: cannot open shared object file: No such file or directory
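If you want to confirm the root cause first, a quick check along these lines (assuming tao-converter is on the PATH inside the container) should list the unresolved library:

# Print the shared libraries tao-converter cannot resolve
ldd "$(which tao-converter)" | grep "not found"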
I forced the installation of libssl1.1 by adding the Ubuntu 20.04 (focal) security repository:
echo "deb http://security.ubuntu.com/ubuntu focal-security main" | tee /etc/apt/sources.list.d/focal-security.list
apt-get update
apt-get install libssl1.1
Then I deleted the focal-security list file that was created:
rm /etc/apt/sources.list.d/focal-security.list
All seems to work.
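If you prefer to bake the workaround into an image instead of patching a running container, a Dockerfile sketch like the one below should do the same thing (the base image tag and the --no-install-recommends flag are my own choices, not something NVIDIA documents for this):

FROM nvcr.io/nvidia/tritonserver:23.06-py3

# Temporarily add the Ubuntu 20.04 (focal) security repo to get libssl1.1,
# install it, then remove the extra source list and the apt cache again
RUN echo "deb http://security.ubuntu.com/ubuntu focal-security main" > /etc/apt/sources.list.d/focal-security.list && \
    apt-get update && \
    apt-get install -y --no-install-recommends libssl1.1 && \
    rm /etc/apt/sources.list.d/focal-security.list && \
    rm -rf /var/lib/apt/lists/*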
The tao-converter output after the fix is below:
[INFO] [MemUsageChange] Init CUDA: CPU +518, GPU +0, now: CPU 523, GPU 253 (MiB)
[INFO] [MemUsageChange] Init builder kernel library: CPU +883, GPU +172, now: CPU 1483, GPU 425 (MiB)
[WARNING] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See "Lazy Loading" section of CUDA documentation https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#lazy-loading
[INFO] ----------------------------------------------------------------
[INFO] Input filename: /tmp/fileb5aqZz
[INFO] ONNX IR version: 0.0.7
[INFO] Opset version: 13
[INFO] Producer name: tf2onnx
[INFO] Producer version: 1.12.0 ddca3a
[INFO] Domain:
[INFO] Model version: 0
[INFO] Doc string:
[INFO] ----------------------------------------------------------------
[WARNING] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
[INFO] Detected input dimensions from the model: (-1, 3, 256, 256)
[INFO] Model has dynamic shape. Setting up optimization profiles.
[INFO] Using optimization profile min shape: (1, 3, 256, 256) for input: input:0
[INFO] Using optimization profile opt shape: (8, 3, 256, 256) for input: input:0
[INFO] Using optimization profile max shape: (16, 3, 256, 256) for input: input:0
[INFO] BuilderFlag::kTF32 is set but hardware does not support TF32. Disabling TF32.
[INFO] Graph optimization time: 0.0314244 seconds.
[INFO] Reading Calibration Cache for calibrator: EntropyCalibration2
[INFO] Generated calibration scales using calibration cache. Make sure that calibration cache has latest scales.
[INFO] To regenerate calibration cache, please delete the existing one. TensorRT will generate a new calibration cache.
[WARNING] Missing scale and zero-point for tensor StatefulPartitionedCall/efficientnet-b0/block1a_se_squeeze/Mean_Squeeze__402:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor StatefulPartitionedCall/efficientnet-b0/block2a_se_squeeze/Mean_Squeeze__398:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor StatefulPartitionedCall/efficientnet-b0/block2b_se_squeeze/Mean_Squeeze__380:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor StatefulPartitionedCall/efficientnet-b0/block3a_se_squeeze/Mean_Squeeze__384:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor StatefulPartitionedCall/efficientnet-b0/block3b_se_squeeze/Mean_Squeeze__388:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor StatefulPartitionedCall/efficientnet-b0/block4a_se_squeeze/Mean_Squeeze__408:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor StatefulPartitionedCall/efficientnet-b0/block4b_se_squeeze/Mean_Squeeze__406:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor StatefulPartitionedCall/efficientnet-b0/block4c_se_squeeze/Mean_Squeeze__404:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor StatefulPartitionedCall/efficientnet-b0/block5a_se_squeeze/Mean_Squeeze__390:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor StatefulPartitionedCall/efficientnet-b0/block5b_se_squeeze/Mean_Squeeze__396:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor StatefulPartitionedCall/efficientnet-b0/block5c_se_squeeze/Mean_Squeeze__386:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor StatefulPartitionedCall/efficientnet-b0/block6a_se_squeeze/Mean_Squeeze__400:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor StatefulPartitionedCall/efficientnet-b0/block6b_se_squeeze/Mean_Squeeze__392:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor StatefulPartitionedCall/efficientnet-b0/block6c_se_squeeze/Mean_Squeeze__378:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor StatefulPartitionedCall/efficientnet-b0/block6d_se_squeeze/Mean_Squeeze__394:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor StatefulPartitionedCall/efficientnet-b0/block7a_se_squeeze/Mean_Squeeze__382:0, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[WARNING] Missing scale and zero-point for tensor (Unnamed Layer* 412) [Softmax]_output, expect fall back to non-int8 implementation for any layer consuming or producing given tensor
[INFO] Graph optimization time: 0.141812 seconds.
[INFO] BuilderFlag::kTF32 is set but hardware does not support TF32. Disabling TF32.
[INFO] Local timing cache in use. Profiling results in this builder pass will not be stored.
[INFO] Detected 1 inputs and 1 output network tensors.
[INFO] Total Host Persistent Memory: 474592
[INFO] Total Device Persistent Memory: 343552
[INFO] Total Scratch Memory: 36352
[INFO] [MemUsageStats] Peak memory usage of TRT CPU/GPU memory allocators: CPU 2 MiB, GPU 65 MiB
[INFO] [BlockAssignment] Started assigning block shifts. This will take 129 steps to complete.
[INFO] [BlockAssignment] Algorithm ShiftNTopDown took 5.78666ms to assign 5 blocks to 129 nodes requiring 16777728 bytes.
[INFO] Total Activation Memory: 16777728
[INFO] (Sparsity) Layers eligible for sparse math:
[INFO] (Sparsity) TRT inference plan picked sparse implementation for layers:
[WARNING] TensorRT encountered issues when converting weights between types and that could affect accuracy.
[WARNING] If this is not the desired behavior, please modify the weights or retrain with regularization to adjust the magnitude of the weights.
[WARNING] Check verbose logs for the list of affected weights.
[WARNING] - 65 weights are affected by this issue: Detected subnormal FP16 values.
[WARNING] - 19 weights are affected by this issue: Detected values less than smallest positive FP16 subnormal value and converted them to the FP16 minimum subnormalized value.
[WARNING] - 2 weights are affected by this issue: Detected finite FP32 values which would overflow in FP16 and converted them to the closest finite FP16 value.
[INFO] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +1, GPU +4, now: CPU 1, GPU 4 (MiB)
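One side note on the log above: the lazy loading warning can be addressed by setting the environment variable described in the "Lazy Loading" section of the CUDA documentation before running the conversion, for example:

# Enable CUDA lazy module loading (available since CUDA 11.7)
export CUDA_MODULE_LOADING=LAZY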