TensorRT 6 dynamic input size does not support INT8 with a calibrator.

The internal interface is only available for INT8 calibration with a fixed input size.

Did you not test it with a dynamic input size before putting it in the developer guide?

Hi,

Can you provide the following information so we can better help?

Provide details on the platforms you are using:
Linux distro and version
GPU type
Nvidia driver version
CUDA version
CUDNN version
Python version [if using python]
Tensorflow version
TensorRT version
If Jetson, OS, hw versions

Files

Include any logs, source, models (.uff, .pb, etc.) that would be helpful to diagnose the problem.

If relevant, please include the full traceback.


Reproducibility

Please provide a minimal test case that reproduces your error.

Linux distro and version: Ubuntu 16.04
GPU type: P40
Nvidia driver version: 396.26
CUDA version: 9.0
CUDNN version: 7.6.3
Python version: 3.6.8
Tensorflow version: 1.12
TensorRT version: 6.0.1.5

import os

import pycuda.autoinit  # noqa: F401 -- creates the CUDA context
import pycuda.driver as cuda
import tensorrt as trt

# CACHE_FOLDER, IMAGE8_PATH, IMAGE8_DIR and read_image() are defined elsewhere.

class MyCalibrator(trt.IInt8EntropyCalibrator2):
    def __init__(self, engine8_path, num_of_samples):
        trt.IInt8EntropyCalibrator2.__init__(self)

        self.num_of_samples = num_of_samples

        # One device buffer per distinct input shape.
        self.device_dict = dict()
        self.load_a_batch = self.get_a_batch()

        basename = os.path.basename(engine8_path)
        self.cache_file = f'{CACHE_FOLDER}/{basename}.cache_{num_of_samples}'

    def copy_data(self, data):
        # Allocate a device buffer for this shape on first use, then copy
        # the host array into it and return the device pointer.
        if data.shape not in self.device_dict:
            nbytes = data.size * trt.float32.itemsize
            self.device_dict[data.shape] = cuda.mem_alloc(nbytes)

        cuda.memcpy_htod(self.device_dict[data.shape], data)
        return int(self.device_dict[data.shape])

    def get_a_batch(self):
        # Generator that yields one device pointer per calibration image.
        count = 0
        with open(IMAGE8_PATH) as f:
            for line in f:
                path = os.path.join(IMAGE8_DIR, line.strip())
                data = read_image(path)
                device_id = self.copy_data(data)

                count += 1
                print(f'[{count:06d}] Reading {path} ok ...')
                yield [device_id]

                if count >= self.num_of_samples:
                    break

    def get_batch(self, names):
        # Called by TensorRT for each calibration batch; returning None
        # signals that the calibration data is exhausted.
        try:
            return next(self.load_a_batch)
        except StopIteration:
            return None

    def get_batch_size(self):
        return 1

    def read_calibration_cache(self):
        if os.path.exists(self.cache_file):
            with open(self.cache_file, "rb") as f:
                return f.read()
        return None

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)

#---------------------------------------------------------------

def build_engine8(weights, calibrator):
    # Explicit-batch network, required for dynamic input shapes.
    flag = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    with trt.Builder(TRT_LOGGER) as builder, \
            builder.create_network(flag) as network, \
            builder.create_builder_config() as config:
        get_network(ModelData, network, weights)

        builder.max_batch_size = calibrator.get_batch_size()
        builder.max_workspace_size = common.GiB(100)

        config.int8_calibrator = calibrator
        config.set_flag(trt.BuilderFlag.INT8)
        config.set_flag(trt.BuilderFlag.STRICT_TYPES)

        # Optimization profile covering the dynamic batch/width range.
        profile = builder.create_optimization_profile()
        profile.set_shape(
                ModelData.INPUT_NAME,
                ModelData.MIN_INPUT_SHAPE,
                ModelData.OPT_INPUT_SHAPE,
                ModelData.MAX_INPUT_SHAPE)
        config.add_optimization_profile(profile)
        # print(config.get_flag(trt.BuilderFlag.INT8))
        # print(config.get_flag(trt.BuilderFlag.STRICT_TYPES))
        return builder.build_engine(network, config)
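
For completeness, the two pieces are wired together roughly as follows (ENGINE8_PATH, the sample count, and load_weights() are placeholders standing in for setup-specific code not shown here):

# Hypothetical wiring of the calibrator and builder above; ENGINE8_PATH
# and load_weights() are placeholders, not real names from my code.
calibrator = MyCalibrator(ENGINE8_PATH, num_of_samples=500)
engine = build_engine8(load_weights(), calibrator)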

When I use the above code to quantize a model, the following error occurs:

[TensorRT] WARNING: Explicit batch network detected and batch size specified, use execute without batch size instead.
[TensorRT] ERROR: Parameter check failed at: engine.cpp::resolveSlots::1024, condition: allInputDimensionsSpecified(routine)

I think the problem is that the internal calibration API used by build_engine isn't designed for dynamic input sizes: the custom calibrator has no way to control how data is fed to the engine, other than returning a batch of data. The user therefore can't set the binding shape, and eventually an allInputDimensionsSpecified error occurs.
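
For comparison, at inference time the application resolves the dynamic dimensions itself before executing. A minimal sketch, assuming engine, data, and bindings already exist:

# At inference time the application resolves the dynamic dims itself
# before executing -- exactly the step that cannot be performed from
# inside get_batch() during calibration.
context = engine.create_execution_context()
context.set_binding_shape(0, data.shape)     # e.g. (1, 1, 48, 640)
assert context.all_binding_shapes_specified  # the check resolveSlots performs
context.execute_v2(bindings)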

I think trt.IInt8EntropyCalibrator2.get_batch should take an additional execution-context argument, so that the user could call context.set_binding_shape(0, data.shape) from inside it; that would solve the problem. A sketch of what I mean is below.
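
This signature does not exist in TensorRT today; it is only a sketch of the proposal:

# Hypothetical, proposed signature -- NOT part of the current TensorRT API.
def get_batch(self, names, context):
    data = self.next_batch()                  # placeholder for batch loading
    context.set_binding_shape(0, data.shape)  # user resolves the dynamic dims
    return [self.copy_data(data)]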

The internal calibration process also seems to use the deprecated enqueue function rather than enqueueV2, given the warning: "Explicit batch network detected and batch size specified, use execute without batch size instead."
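
In the Python API the same distinction shows up as execute_async versus execute_async_v2; a sketch for comparison, assuming context, bindings, and a stream exist:

# Implicit-batch call (enqueue in C++), deprecated for explicit-batch networks:
context.execute_async(batch_size=1, bindings=bindings, stream_handle=stream.handle)

# Explicit-batch call (enqueueV2 in C++), which the warning asks for:
context.execute_async_v2(bindings=bindings, stream_handle=stream.handle)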

@NVES_R

builder.max_workspace_size = common.GiB(100)

Maybe this number is too big?
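
A P40 only has 24 GB of device memory, so a 100 GiB workspace can never actually be granted; something like this would be a more realistic cap:

builder.max_workspace_size = common.GiB(8)  # well under the P40's 24 GB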

Hi yfjiaren,

I’m not sure what input shapes you’re using here:

    profile.set_shape(
            ModelData.INPUT_NAME,
            ModelData.MIN_INPUT_SHAPE,
            ModelData.OPT_INPUT_SHAPE,
            ModelData.MAX_INPUT_SHAPE)

But I see this in the docs, specifically for int8: https://docs.nvidia.com/deeplearning/sdk/tensorrt-archived/tensorrt-601/tensorrt-developer-guide/index.html#rest_dynamic_shapes

“Int8 requires that the channel dimension be a build-time constant.”

Perhaps your channel dimension is dynamic, and that isn’t allowed for Int8?
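
One quick way to check is to print the network input's shape after the network is populated; for INT8, only the batch and spatial dimensions should be -1. A sketch, assuming network has already been built:

# The channel dim (index 1 in NCHW) must be a build-time constant for INT8.
inp = network.get_input(0)
print(inp.name, inp.shape)  # OK for INT8: (-1, 1, 48, -1); not OK: (-1, -1, 48, -1)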

class ModelData:
    MIN_BATCH = 1
    OPT_BATCH = 1
    MAX_BATCH = 8

    BLOCK_SIZE = 8
    FIX_HEIGHT = 48
    MIN_WIDTH = 48
    OPT_WIDTH = 480
    MAX_WIDTH = 1200

    INPUT_NAME = "input"
    INPUT_DTYPE = trt.float32
    INPUT_SHAPE = (-1, 1, FIX_HEIGHT, -1)
    OUTPUT_NAME = "output"
    OUTPUT_SHAPE = (-1, -1, 21502)

    MIN_INPUT_SHAPE = (MIN_BATCH, 1, FIX_HEIGHT, MIN_WIDTH)
    OPT_INPUT_SHAPE = (OPT_BATCH, 1, FIX_HEIGHT, OPT_WIDTH)
    MAX_INPUT_SHAPE = (MAX_BATCH, 1, FIX_HEIGHT, MAX_WIDTH)

This is the definition of the ModelData class.

Hi yfjiaren,

Does this error still occur for you in TensorRT 7? If so, can you share a sample dynamic shape ONNX model for me to play around with?

However, just FYI, there is a separate issue with ONNX + INT8 calibration that may affect you, tracked here: https://github.com/NVIDIA/TensorRT/issues/289. The issue was just fixed internally; I'm not sure when the fix will be publicly available yet.

Yes, the problem still exists in version 7.0.0.11 …