Hi everyone, I am trying to follow this tutorial, but when I run the following command on my ONNX file, it fails with an assertion error.
(tao-toolkit-env) addverb@addverb-MS-7C84:~/kartik/yolov8/models$ python3 -m modelopt.onnx.quantization --onnx_path unet_model_common_opset13.onnx --quantize_mode int8 --output_path unet-common-quantized.onnx
WARNING:root:No custom ops found. If that's not correct, please make sure that the 'tensorrt' python is correctly installed and that the path to 'libcudnn*.so' is in PATH or LD_LIBRARY_PATH. If the custom op is not directly available as a plugin in TensorRT, please also make sure that the path to the compiled '.so' TensorRT plugin is also being given via the '--trt_plugins' flag (requires TRT 10+).
INFO:root:Model unet_model_common_opset13.onnx with opset_version 13 is loaded.
INFO:root:Quantization Mode: int8
INFO:root:Quantizable op types in the model: ['Conv', 'Resize', 'MaxPool']
INFO:root:Building non-residual Add input map ...
INFO:root:Searching for hard-coded patterns like MHA, LayerNorm, etc. to avoid quantization.
INFO:root:Building KGEN/CASK targeted partitions ...
INFO:root:Classifying the partition nodes ...
INFO:root:Total number of nodes: 121
WARNING:root:Please consider to run pre-processing before quantization. Refer to example: https://github.com/microsoft/onnxruntime-inference-examples/blob/main/quantization/image_classification/cpu/ReadMe.md
2025-03-27 11:03:17.316573823 [W:onnxruntime:, transformer_memcpy.cc:74 ApplyImpl] 4 Memcpy nodes are added to the graph main_graph for CUDAExecutionProvider. It might have negative impact on performance (including unable to run CUDA graph). Set session_options.log_severity_level=1 to see the detail logs before this message.
Collecting tensor data and making histogram ...
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 35/35 [00:01<00:00, 26.35it/s]
Finding optimal threshold for each tensor using 'entropy' algorithm ...
Number of tensors : 35
Number of histogram bins : 128 (The number may increase depends on the data it collects)
Number of quantized bins : 128
Traceback (most recent call last):
File "/home/addverb/miniconda3/envs/tao-toolkit-env/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/addverb/miniconda3/envs/tao-toolkit-env/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/addverb/miniconda3/envs/tao-toolkit-env/lib/python3.8/site-packages/modelopt/onnx/quantization/__main__.py", line 228, in <module>
main()
File "/home/addverb/miniconda3/envs/tao-toolkit-env/lib/python3.8/site-packages/modelopt/onnx/quantization/__main__.py", line 201, in main
quantize(
File "/home/addverb/miniconda3/envs/tao-toolkit-env/lib/python3.8/site-packages/modelopt/onnx/quantization/quantize.py", line 310, in quantize
onnx_model = quantize_func(
File "/home/addverb/miniconda3/envs/tao-toolkit-env/lib/python3.8/site-packages/modelopt/onnx/quantization/int8.py", line 220, in quantize
quantize_static(
File "/home/addverb/miniconda3/envs/tao-toolkit-env/lib/python3.8/site-packages/modelopt/onnx/quantization/ort_patching.py", line 547, in _quantize_static
tensors_range = calibrator.compute_data()
File "/home/addverb/miniconda3/envs/tao-toolkit-env/lib/python3.8/site-packages/onnxruntime/quantization/calibrate.py", line 565, in compute_data
return TensorsData(cal, self.collector.compute_collection_result())
File "/home/addverb/miniconda3/envs/tao-toolkit-env/lib/python3.8/site-packages/onnxruntime/quantization/calibrate.py", line 857, in compute_collection_result
return self.compute_entropy()
File "/home/addverb/miniconda3/envs/tao-toolkit-env/lib/python3.8/site-packages/onnxruntime/quantization/calibrate.py", line 922, in compute_entropy
optimal_threshold = self.get_entropy_threshold(histogram, num_quantized_bins)
File "/home/addverb/miniconda3/envs/tao-toolkit-env/lib/python3.8/site-packages/onnxruntime/quantization/calibrate.py", line 1080, in get_entropy_threshold
assert hasattr(optimal_threshold[0], "dtype")
AssertionError
Kindly help me with this issue. I am attaching my ONNX file for reference.
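Also, should I be running the pre-processing step mentioned in the warning above before quantization? Going by the onnxruntime example linked in that warning, I assume it would be something like this (the output file name here is just a placeholder):

python3 -m onnxruntime.quantization.preprocess --input unet_model_common_opset13.onnx --output unet_model_common_opset13_preprocessed.onnx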
Environment
TensorRT Version: v100900
Operating System + Version: Ubuntu 22.04
Relevant Files
Edit: I have also checked that my ONNX file is valid.
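For reference, this is roughly how I checked it (a minimal sketch using onnx.checker on the same model I am attaching):

import onnx

# Load the exported model and run ONNX's structural checker;
# check_model raises an exception if the graph is invalid.
model = onnx.load("unet_model_common_opset13.onnx")
onnx.checker.check_model(model)
print("Model is valid, opset:", model.opset_import[0].version)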