[Urgent] Can't run `tlt-evaluate faster_rcnn` for exported model

I trained faster_rcnn and exported the model on one machine, then copied the model.etlt file to a second machine. On the 2nd machine I ran tlt-evaluate faster_rcnn with model.etlt, using the same spec file. I also tried copying model.tlt and then exporting it to model.etlt on the 2nd machine. In both cases, I get the error below.
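For reference, the evaluate call on the 2nd machine had the same form as the command shown later in this post; only the LOGFILE name differs ($SPECS_DIR points to my specs directory):

%env LOGFILE=results/eval_testset_e150_fp32_conf_0.9_.txt
!tlt-evaluate faster_rcnn -e $SPECS_DIR/resnet18.txt 2>&1 | tee $LOGFILE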

env: LOGFILE=results/eval_testset_e150_fp32_conf_0.9_.txt
2020-11-15 12:30:20.939106: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
Using TensorFlow backend.
2020-11-15 12:30:23,825 [INFO] iva.faster_rcnn.spec_loader.spec_loader: Loading experiment spec at /workspace/specs/resnet18.txt.
2020-11-15 12:30:23.833597: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-11-15 12:30:23.833737: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 12:30:23.834291: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce RTX 2070 with Max-Q Design major: 7 minor: 5 memoryClockRate(GHz): 1.185
pciBusID: 0000:01:00.0
2020-11-15 12:30:23.834314: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-15 12:30:23.834359: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-11-15 12:30:23.835392: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-11-15 12:30:23.835440: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-11-15 12:30:23.836752: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-11-15 12:30:23.837800: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-11-15 12:30:23.837845: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-15 12:30:23.837933: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 12:30:23.838321: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 12:30:23.838633: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-11-15 12:30:23.838657: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-15 12:30:24.384657: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-15 12:30:24.384692: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-11-15 12:30:24.384700: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-11-15 12:30:24.384876: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 12:30:24.385288: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 12:30:24.385644: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 12:30:24.385975: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5827 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2070 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 7.5)
2020-11-15 12:30:24,386 [INFO] iva.faster_rcnn.scripts.test: Running evaluation with TensorRT as backend.
2020-11-15 12:30:24,387 [INFO] iva.faster_rcnn.tensorrt_inference.tensorrt_model: Building TensorRT engine from
                        the etlt model for inference: /workspace/models/exported/frcnn_resnet18.epoch150_fp32.etlt
2020-11-15 12:30:39,949 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Serial augmentation enabled = False
2020-11-15 12:30:39,949 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Pseudo sharding enabled = False
2020-11-15 12:30:39,949 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Max Image Dimensions (all sources): (0, 0)
2020-11-15 12:30:39,949 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: number of cpus: 12, io threads: 24, compute threads: 12, buffered batches: 4
2020-11-15 12:30:39,949 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: total dataset size 45, number of sources: 1, batch size per gpu: 1, steps: 45
Traceback (most recent call last):
  File "/usr/local/bin/tlt-evaluate", line 8, in <module>
    sys.exit(main())
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/magnet_evaluate.py", line 45, in main
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/faster_rcnn/scripts/test.py", line 108, in main
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/faster_rcnn/data_loader/inputs_loader.py", line 79, in __init__
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataloader/drivenet_dataloader.py", line 575, in get_dataset_tensors
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/core/build_wheel.runfiles/ai_infra/moduluspy/modulus/blocks/trainers/multi_task_trainer/data_loader_interface.py", line 77, in __call__
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/core/build_wheel.runfiles/ai_infra/moduluspy/modulus/blocks/data_loaders/multi_source_loader/data_loader.py", line 396, in call
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 1990, in apply
    return DatasetV1Adapter(super(DatasetV1, self).apply(transformation_func))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 1378, in apply
    dataset = transformation_func(self)
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataloader/drivenet_dataloader.py", line 311, in <lambda>
  File "/opt/nvidia/third_party/keras/tensorflow_backend.py", line 345, in new_map
    self, _map_func_set_random_wrapper, num_parallel_calls=num_parallel_calls
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 1909, in map
    MapDataset(self, map_func, preserve_cardinality=False))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 3434, in __init__
    use_legacy_function=use_legacy_function)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 2713, in __init__
    self._function = wrapper_fn._get_concrete_function_internal()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 1853, in _get_concrete_function_internal
    *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 1847, in _get_concrete_function_internal_garbage_collected
    graph_function, _, _ = self._maybe_define_function(args, kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 2147, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 2038, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/func_graph.py", line 915, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 2707, in wrapper_fn
    ret = _wrapper_helper(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 2652, in _wrapper_helper
    ret = autograph.tf_convert(func, ag_ctx)(*nested_args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/impl/api.py", line 237, in wrapper
    raise e.ag_error_metadata.to_exception(e)
StopIteration: in converted code:

    /opt/nvidia/third_party/keras/tensorflow_backend.py:342 _map_func_set_random_wrapper  *
        return map_func(*args, **kwargs)
    /home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataloader/drivenet_dataloader.py:126 __call__
        
    /home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataloader/drivenet_dataloader.py:100 _get_parse_example
        
    /home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataloader/utilities.py:217 extract_tfrecords_features
        

    StopIteration:

tlt-evaluate faster_rcnn runs fine on the first machine.
How can I solve this issue? Please help.
Thanks.

Can you double-check the step you mentioned below? It does not make sense to me; it should work.

I also tried copying model.tlt and then exporting it to model.etlt on the 2nd machine.

Please paste your command and the full log for the above.

Hi @Morganh, thanks for your reply. I just double-checked.

Export command:

!tlt-export faster_rcnn -m $PROJECT_DIR/models/trained/frcnn_resnet18.epoch150.tlt  \
                        -o $PROJECT_DIR/models/exported/frcnn_resnet18.epoch150_int8.etlt \
                        -e $SPECS_DIR/resnet18.txt \
                        -k $KEY \
                        --data_type int8 \
                        --batch_size 1 \
                        --batches 10 \
                        --cal_cache_file $PROJECT_DIR/models/exported/cal.bin

Log:

Using TensorFlow backend.
2020-11-15 20:29:18.282652: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-15 20:29:21.419925: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-11-15 20:29:21.420186: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:29:21.420636: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce RTX 2070 with Max-Q Design major: 7 minor: 5 memoryClockRate(GHz): 1.185
pciBusID: 0000:01:00.0
2020-11-15 20:29:21.420679: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-15 20:29:21.420726: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-11-15 20:29:21.421700: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-11-15 20:29:21.421764: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-11-15 20:29:21.423204: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-11-15 20:29:21.424331: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-11-15 20:29:21.424393: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-15 20:29:21.424478: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:29:21.424908: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:29:21.425261: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-11-15 20:29:21.425305: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-15 20:29:22.145391: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-15 20:29:22.145443: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-11-15 20:29:22.145474: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-11-15 20:29:22.145643: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:29:22.146104: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:29:22.146573: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:29:22.146992: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5925 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2070 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 7.5)
2020-11-15 20:29:33,022 [INFO] iva.faster_rcnn.spec_loader.spec_loader: Loading experiment spec at /workspace/specs/resnet18.txt.
2020-11-15 20:29:42.411422: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:29:42.411867: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce RTX 2070 with Max-Q Design major: 7 minor: 5 memoryClockRate(GHz): 1.185
pciBusID: 0000:01:00.0
2020-11-15 20:29:42.411947: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-15 20:29:42.412050: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-11-15 20:29:42.412099: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-11-15 20:29:42.412126: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-11-15 20:29:42.412188: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-11-15 20:29:42.412228: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-11-15 20:29:42.412265: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-15 20:29:42.412361: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:29:42.412811: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:29:42.413190: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-11-15 20:29:42.413221: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-15 20:29:42.413233: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-11-15 20:29:42.413243: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-11-15 20:29:42.413344: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:29:42.414215: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:29:42.414652: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5925 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2070 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 7.5)
2020-11-15 20:31:17.094879: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:31:17.095351: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce RTX 2070 with Max-Q Design major: 7 minor: 5 memoryClockRate(GHz): 1.185
pciBusID: 0000:01:00.0
2020-11-15 20:31:17.095395: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-15 20:31:17.095433: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-11-15 20:31:17.095458: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-11-15 20:31:17.095476: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-11-15 20:31:17.095495: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-11-15 20:31:17.095515: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-11-15 20:31:17.095534: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-15 20:31:17.095614: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:31:17.096023: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:31:17.096643: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-11-15 20:31:17.096678: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-15 20:31:17.096691: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-11-15 20:31:17.096701: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-11-15 20:31:17.096813: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:31:17.097299: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:31:17.097751: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5925 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2070 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 7.5)
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_image (InputLayer)        (None, 3, 480, 640)  0                                            
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 64, 240, 320) 9408        input_image[0][0]                
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 64, 240, 320) 256         conv1[0][0]                      
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 64, 240, 320) 0           bn_conv1[0][0]                   
__________________________________________________________________________________________________
block_1a_conv_1 (Conv2D)        (None, 64, 120, 160) 36864       activation_1[0][0]               
__________________________________________________________________________________________________
block_1a_bn_1 (BatchNormalizati (None, 64, 120, 160) 256         block_1a_conv_1[0][0]            
__________________________________________________________________________________________________
block_1a_relu_1 (Activation)    (None, 64, 120, 160) 0           block_1a_bn_1[0][0]              
__________________________________________________________________________________________________
block_1a_conv_2 (Conv2D)        (None, 64, 120, 160) 36864       block_1a_relu_1[0][0]            
__________________________________________________________________________________________________
block_1a_conv_shortcut (Conv2D) (None, 64, 120, 160) 4096        activation_1[0][0]               
__________________________________________________________________________________________________
block_1a_bn_2 (BatchNormalizati (None, 64, 120, 160) 256         block_1a_conv_2[0][0]            
__________________________________________________________________________________________________
block_1a_bn_shortcut (BatchNorm (None, 64, 120, 160) 256         block_1a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_1 (Add)                     (None, 64, 120, 160) 0           block_1a_bn_2[0][0]              
                                                                 block_1a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_1a_relu (Activation)      (None, 64, 120, 160) 0           add_1[0][0]                      
__________________________________________________________________________________________________
block_1b_conv_1 (Conv2D)        (None, 64, 120, 160) 36864       block_1a_relu[0][0]              
__________________________________________________________________________________________________
block_1b_bn_1 (BatchNormalizati (None, 64, 120, 160) 256         block_1b_conv_1[0][0]            
__________________________________________________________________________________________________
block_1b_relu_1 (Activation)    (None, 64, 120, 160) 0           block_1b_bn_1[0][0]              
__________________________________________________________________________________________________
block_1b_conv_2 (Conv2D)        (None, 64, 120, 160) 36864       block_1b_relu_1[0][0]            
__________________________________________________________________________________________________
block_1b_conv_shortcut (Conv2D) (None, 64, 120, 160) 4096        block_1a_relu[0][0]              
__________________________________________________________________________________________________
block_1b_bn_2 (BatchNormalizati (None, 64, 120, 160) 256         block_1b_conv_2[0][0]            
__________________________________________________________________________________________________
block_1b_bn_shortcut (BatchNorm (None, 64, 120, 160) 256         block_1b_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_2 (Add)                     (None, 64, 120, 160) 0           block_1b_bn_2[0][0]              
                                                                 block_1b_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_1b_relu (Activation)      (None, 64, 120, 160) 0           add_2[0][0]                      
__________________________________________________________________________________________________
block_2a_conv_1 (Conv2D)        (None, 128, 60, 80)  73728       block_1b_relu[0][0]              
__________________________________________________________________________________________________
block_2a_bn_1 (BatchNormalizati (None, 128, 60, 80)  512         block_2a_conv_1[0][0]            
__________________________________________________________________________________________________
block_2a_relu_1 (Activation)    (None, 128, 60, 80)  0           block_2a_bn_1[0][0]              
__________________________________________________________________________________________________
block_2a_conv_2 (Conv2D)        (None, 128, 60, 80)  147456      block_2a_relu_1[0][0]            
__________________________________________________________________________________________________
block_2a_conv_shortcut (Conv2D) (None, 128, 60, 80)  8192        block_1b_relu[0][0]              
__________________________________________________________________________________________________
block_2a_bn_2 (BatchNormalizati (None, 128, 60, 80)  512         block_2a_conv_2[0][0]            
__________________________________________________________________________________________________
block_2a_bn_shortcut (BatchNorm (None, 128, 60, 80)  512         block_2a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_3 (Add)                     (None, 128, 60, 80)  0           block_2a_bn_2[0][0]              
                                                                 block_2a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_2a_relu (Activation)      (None, 128, 60, 80)  0           add_3[0][0]                      
__________________________________________________________________________________________________
block_2b_conv_1 (Conv2D)        (None, 128, 60, 80)  147456      block_2a_relu[0][0]              
__________________________________________________________________________________________________
block_2b_bn_1 (BatchNormalizati (None, 128, 60, 80)  512         block_2b_conv_1[0][0]            
__________________________________________________________________________________________________
block_2b_relu_1 (Activation)    (None, 128, 60, 80)  0           block_2b_bn_1[0][0]              
__________________________________________________________________________________________________
block_2b_conv_2 (Conv2D)        (None, 128, 60, 80)  147456      block_2b_relu_1[0][0]            
__________________________________________________________________________________________________
block_2b_conv_shortcut (Conv2D) (None, 128, 60, 80)  16384       block_2a_relu[0][0]              
__________________________________________________________________________________________________
block_2b_bn_2 (BatchNormalizati (None, 128, 60, 80)  512         block_2b_conv_2[0][0]            
__________________________________________________________________________________________________
block_2b_bn_shortcut (BatchNorm (None, 128, 60, 80)  512         block_2b_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_4 (Add)                     (None, 128, 60, 80)  0           block_2b_bn_2[0][0]              
                                                                 block_2b_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_2b_relu (Activation)      (None, 128, 60, 80)  0           add_4[0][0]                      
__________________________________________________________________________________________________
block_3a_conv_1 (Conv2D)        (None, 256, 30, 40)  294912      block_2b_relu[0][0]              
__________________________________________________________________________________________________
block_3a_bn_1 (BatchNormalizati (None, 256, 30, 40)  1024        block_3a_conv_1[0][0]            
__________________________________________________________________________________________________
block_3a_relu_1 (Activation)    (None, 256, 30, 40)  0           block_3a_bn_1[0][0]              
__________________________________________________________________________________________________
block_3a_conv_2 (Conv2D)        (None, 256, 30, 40)  589824      block_3a_relu_1[0][0]            
__________________________________________________________________________________________________
block_3a_conv_shortcut (Conv2D) (None, 256, 30, 40)  32768       block_2b_relu[0][0]              
__________________________________________________________________________________________________
block_3a_bn_2 (BatchNormalizati (None, 256, 30, 40)  1024        block_3a_conv_2[0][0]            
__________________________________________________________________________________________________
block_3a_bn_shortcut (BatchNorm (None, 256, 30, 40)  1024        block_3a_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_5 (Add)                     (None, 256, 30, 40)  0           block_3a_bn_2[0][0]              
                                                                 block_3a_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_3a_relu (Activation)      (None, 256, 30, 40)  0           add_5[0][0]                      
__________________________________________________________________________________________________
block_3b_conv_1 (Conv2D)        (None, 256, 30, 40)  589824      block_3a_relu[0][0]              
__________________________________________________________________________________________________
block_3b_bn_1 (BatchNormalizati (None, 256, 30, 40)  1024        block_3b_conv_1[0][0]            
__________________________________________________________________________________________________
block_3b_relu_1 (Activation)    (None, 256, 30, 40)  0           block_3b_bn_1[0][0]              
__________________________________________________________________________________________________
block_3b_conv_2 (Conv2D)        (None, 256, 30, 40)  589824      block_3b_relu_1[0][0]            
__________________________________________________________________________________________________
block_3b_conv_shortcut (Conv2D) (None, 256, 30, 40)  65536       block_3a_relu[0][0]              
__________________________________________________________________________________________________
block_3b_bn_2 (BatchNormalizati (None, 256, 30, 40)  1024        block_3b_conv_2[0][0]            
__________________________________________________________________________________________________
block_3b_bn_shortcut (BatchNorm (None, 256, 30, 40)  1024        block_3b_conv_shortcut[0][0]     
__________________________________________________________________________________________________
add_6 (Add)                     (None, 256, 30, 40)  0           block_3b_bn_2[0][0]              
                                                                 block_3b_bn_shortcut[0][0]       
__________________________________________________________________________________________________
block_3b_relu (Activation)      (None, 256, 30, 40)  0           add_6[0][0]                      
__________________________________________________________________________________________________
rpn_conv1 (Conv2D)              (None, 512, 30, 40)  1180160     block_3b_relu[0][0]              
__________________________________________________________________________________________________
rpn_out_class (Conv2D)          (None, 9, 30, 40)    4617        rpn_conv1[0][0]                  
__________________________________________________________________________________________________
rpn_out_regress (Conv2D)        (None, 36, 30, 40)   18468       rpn_conv1[0][0]                  
__________________________________________________________________________________________________
proposal_1 (Proposal)           (None, 300, 4)       0           rpn_out_class[0][0]              
                                                                 rpn_out_regress[0][0]            
__________________________________________________________________________________________________
crop_and_resize_1 (CropAndResiz (None, 300, 256, 7,  0           block_3b_relu[0][0]              
                                                                 proposal_1[0][0]                 
__________________________________________________________________________________________________
time_distributed_1 (TimeDistrib (None, 300, 512, 7,  1179648     crop_and_resize_1[0][0]          
__________________________________________________________________________________________________
time_distributed_2 (TimeDistrib (None, 300, 512, 7,  2048        time_distributed_1[0][0]         
__________________________________________________________________________________________________
block_4a_relu_1 (Activation)    (None, 300, 512, 7,  0           time_distributed_2[0][0]         
__________________________________________________________________________________________________
time_distributed_3 (TimeDistrib (None, 300, 512, 7,  2359296     block_4a_relu_1[0][0]            
__________________________________________________________________________________________________
time_distributed_5 (TimeDistrib (None, 300, 512, 7,  131072      crop_and_resize_1[0][0]          
__________________________________________________________________________________________________
time_distributed_4 (TimeDistrib (None, 300, 512, 7,  2048        time_distributed_3[0][0]         
__________________________________________________________________________________________________
time_distributed_6 (TimeDistrib (None, 300, 512, 7,  2048        time_distributed_5[0][0]         
__________________________________________________________________________________________________
add_7 (Add)                     (None, 300, 512, 7,  0           time_distributed_4[0][0]         
                                                                 time_distributed_6[0][0]         
__________________________________________________________________________________________________
block_4a_relu (Activation)      (None, 300, 512, 7,  0           add_7[0][0]                      
__________________________________________________________________________________________________
time_distributed_7 (TimeDistrib (None, 300, 512, 7,  2359296     block_4a_relu[0][0]              
__________________________________________________________________________________________________
time_distributed_8 (TimeDistrib (None, 300, 512, 7,  2048        time_distributed_7[0][0]         
__________________________________________________________________________________________________
block_4b_relu_1 (Activation)    (None, 300, 512, 7,  0           time_distributed_8[0][0]         
__________________________________________________________________________________________________
time_distributed_9 (TimeDistrib (None, 300, 512, 7,  2359296     block_4b_relu_1[0][0]            
__________________________________________________________________________________________________
time_distributed_11 (TimeDistri (None, 300, 512, 7,  262144      block_4a_relu[0][0]              
__________________________________________________________________________________________________
time_distributed_10 (TimeDistri (None, 300, 512, 7,  2048        time_distributed_9[0][0]         
__________________________________________________________________________________________________
time_distributed_12 (TimeDistri (None, 300, 512, 7,  2048        time_distributed_11[0][0]        
__________________________________________________________________________________________________
add_8 (Add)                     (None, 300, 512, 7,  0           time_distributed_10[0][0]        
                                                                 time_distributed_12[0][0]        
__________________________________________________________________________________________________
block_4b_relu (Activation)      (None, 300, 512, 7,  0           add_8[0][0]                      
__________________________________________________________________________________________________
time_distributed_13 (TimeDistri (None, 300, 512, 1,  0           block_4b_relu[0][0]              
__________________________________________________________________________________________________
time_distributed_flatten (TimeD (None, 300, 512)     0           time_distributed_13[0][0]        
__________________________________________________________________________________________________
dense_class_td (TimeDistributed (None, 300, 4)       2052        time_distributed_flatten[0][0]   
__________________________________________________________________________________________________
dense_regress_td (TimeDistribut (None, 300, 12)      6156        time_distributed_flatten[0][0]   
==================================================================================================
Total params: 12,753,917
Trainable params: 12,742,269
Non-trainable params: 11,648
__________________________________________________________________________________________________
2020-11-15 20:31:20.883592: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:31:20.884030: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce RTX 2070 with Max-Q Design major: 7 minor: 5 memoryClockRate(GHz): 1.185
pciBusID: 0000:01:00.0
2020-11-15 20:31:20.884101: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-15 20:31:20.884168: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-11-15 20:31:20.884209: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-11-15 20:31:20.884247: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-11-15 20:31:20.884268: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-11-15 20:31:20.884320: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-11-15 20:31:20.884341: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-15 20:31:20.884461: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:31:20.884895: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:31:20.885320: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-11-15 20:31:20.885365: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-15 20:31:20.885376: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-11-15 20:31:20.885385: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-11-15 20:31:20.885486: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:31:20.885872: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:31:20.886330: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5925 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2070 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 7.5)
2020-11-15 20:31:24.473292: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:31:24.473988: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce RTX 2070 with Max-Q Design major: 7 minor: 5 memoryClockRate(GHz): 1.185
pciBusID: 0000:01:00.0
2020-11-15 20:31:24.474080: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-15 20:31:24.474198: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-11-15 20:31:24.474256: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-11-15 20:31:24.474318: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-11-15 20:31:24.474351: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-11-15 20:31:24.474412: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-11-15 20:31:24.474462: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-15 20:31:24.474617: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:31:24.475053: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:31:24.475389: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-11-15 20:31:24.475665: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:31:24.476011: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce RTX 2070 with Max-Q Design major: 7 minor: 5 memoryClockRate(GHz): 1.185
pciBusID: 0000:01:00.0
2020-11-15 20:31:24.476030: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-15 20:31:24.476047: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-11-15 20:31:24.476086: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-11-15 20:31:24.476137: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-11-15 20:31:24.476186: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-11-15 20:31:24.476225: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-11-15 20:31:24.476244: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-15 20:31:24.476348: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:31:24.476824: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:31:24.477198: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-11-15 20:31:24.477222: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-15 20:31:24.477230: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-11-15 20:31:24.477237: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-11-15 20:31:24.477354: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:31:24.477702: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:31:24.478055: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5925 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2070 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 7.5)
NOTE: UFF has been tested with TensorFlow 1.14.0.
WARNING: The version of TensorFlow installed on this system is not guaranteed to work with UFF.
DEBUG: convert reshape to flatten node
Warning: No conversion function registered for layer: CropAndResize yet.
Converting roi_pooling_conv_1/CropAndResize_new as custom op: CropAndResize
Warning: No conversion function registered for layer: Proposal yet.
Converting proposal as custom op: Proposal
DEBUG [/usr/local/lib/python3.6/dist-packages/uff/converters/tensorflow/converter.py:96] Marking ['proposal', 'dense_class_td/Softmax', 'dense_regress_td/BiasAdd'] as outputs
2020-11-15 20:31:27,192 [INFO] iva.faster_rcnn.export.exporter: Using data loader to generate the data for INT8 calibration.
2020-11-15 20:31:27,293 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Serial augmentation enabled = False
2020-11-15 20:31:27,293 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Pseudo sharding enabled = False
2020-11-15 20:31:27,293 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Max Image Dimensions (all sources): (0, 0)
2020-11-15 20:31:27,294 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: number of cpus: 12, io threads: 24, compute threads: 12, buffered batches: 4
2020-11-15 20:31:27,294 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: total dataset size 620, number of sources: 1, batch size per gpu: 1, steps: 620
2020-11-15 20:31:27,435 [INFO] iva.detectnet_v2.dataloader.default_dataloader: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates.
2020-11-15 20:31:27,728 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: shuffle: True - shard 0 of 1
2020-11-15 20:31:27,735 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: sampling 1 datasets with weights:
2020-11-15 20:31:27,735 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: source: 0 weight: 1.000000
2020-11-15 20:31:28,418 [INFO] iva.faster_rcnn.export.faster_rcnn_calibrator: Number of samples in training dataset: 620
2020-11-15 20:31:28.419066: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-15 20:31:28.419095: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      
2020-11-15 20:31:28,824 [INFO] iva.common.export.base_exporter: Calibration takes time especially if number of batches is large.
DEPRECATED: This variant of get_batch is deprecated. Please use the single argument variant described in the documentation instead.
2020-11-15 20:32:02,160 [INFO] iva.common.export.base_calibrator: Saving calibration cache (size 4393) to /workspace/models/exported/cal.bin

evaluation_config:

  evaluation_config {
    #model: "/workspace/models/trained/frcnn_resnet18.epoch150.tlt"
    batch_size: 1
    validation_period_during_training: 1
    labels_dump_dir: '/workspace/data/test_dump_labels'
    rpn_pre_nms_top_N: 6000
    rpn_nms_max_boxes: 300
    rpn_nms_overlap_threshold: 0.7
    classifier_nms_max_boxes: 300
    classifier_nms_overlap_threshold: 0.3
    object_confidence_thres: 0.9
    use_voc07_11point_metric: False

    trt_evaluation {
      etlt_model {
        model: '/workspace/models/exported/frcnn_resnet18.epoch150_int8.etlt'
        calibration_cache: '/workspace/models/exported/cal.bin'
      }
      trt_data_type: 'int8'
      max_workspace_size_MB: 2000
    }
  }
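
For the earlier fp32 attempt, the trt_evaluation block was along these lines (same structure, but pointing at the fp32 etlt; the calibration cache should only be needed for int8):

    trt_evaluation {
      etlt_model {
        model: '/workspace/models/exported/frcnn_resnet18.epoch150_fp32.etlt'
      }
      trt_data_type: 'fp32'
      max_workspace_size_MB: 2000
    }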

Evaluation command:

%env LOGFILE=results/eval_testset_e150_int8_conf_0.9_.txt
!tlt-evaluate faster_rcnn -e $SPECS_DIR/resnet18.txt 2>&1 | tee $LOGFILE

Log:

env: LOGFILE=results/eval_testset_e150_int8_conf_0.9_.txt
2020-11-15 20:42:17.347583: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
Using TensorFlow backend.
2020-11-15 20:42:20,336 [INFO] iva.faster_rcnn.spec_loader.spec_loader: Loading experiment spec at /workspace/specs/resnet18.txt.
2020-11-15 20:42:20.345804: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-11-15 20:42:20.345986: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:42:20.346524: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce RTX 2070 with Max-Q Design major: 7 minor: 5 memoryClockRate(GHz): 1.185
pciBusID: 0000:01:00.0
2020-11-15 20:42:20.346554: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-15 20:42:20.346611: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2020-11-15 20:42:20.347749: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2020-11-15 20:42:20.347810: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2020-11-15 20:42:20.349433: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2020-11-15 20:42:20.350616: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2020-11-15 20:42:20.350671: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-11-15 20:42:20.350795: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:42:20.351330: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:42:20.351767: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-11-15 20:42:20.351817: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2020-11-15 20:42:21.023604: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-11-15 20:42:21.023668: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2020-11-15 20:42:21.023694: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2020-11-15 20:42:21.024052: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:42:21.024733: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:42:21.025317: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:983] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-11-15 20:42:21.025835: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5850 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2070 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 7.5)
2020-11-15 20:42:21,026 [INFO] iva.faster_rcnn.scripts.test: Running evaluation with TensorRT as backend.
2020-11-15 20:42:21,027 [INFO] iva.faster_rcnn.tensorrt_inference.tensorrt_model: Building TensorRT engine from
                        the etlt model for inference: /workspace/models/exported/frcnn_resnet18.epoch150_int8.etlt
2020-11-15 20:48:46,225 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Serial augmentation enabled = False
2020-11-15 20:48:46,225 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Pseudo sharding enabled = False
2020-11-15 20:48:46,225 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Max Image Dimensions (all sources): (0, 0)
2020-11-15 20:48:46,225 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: number of cpus: 12, io threads: 24, compute threads: 12, buffered batches: 4
2020-11-15 20:48:46,225 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: total dataset size 45, number of sources: 1, batch size per gpu: 1, steps: 45
Traceback (most recent call last):
  File "/usr/local/bin/tlt-evaluate", line 8, in <module>
    sys.exit(main())
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/common/magnet_evaluate.py", line 45, in main
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/faster_rcnn/scripts/test.py", line 108, in main
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/faster_rcnn/data_loader/inputs_loader.py", line 79, in __init__
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataloader/drivenet_dataloader.py", line 575, in get_dataset_tensors
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/core/build_wheel.runfiles/ai_infra/moduluspy/modulus/blocks/trainers/multi_task_trainer/data_loader_interface.py", line 77, in __call__
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/core/build_wheel.runfiles/ai_infra/moduluspy/modulus/blocks/data_loaders/multi_source_loader/data_loader.py", line 396, in call
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 1990, in apply
    return DatasetV1Adapter(super(DatasetV1, self).apply(transformation_func))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 1378, in apply
    dataset = transformation_func(self)
  File "/home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataloader/drivenet_dataloader.py", line 311, in <lambda>
  File "/opt/nvidia/third_party/keras/tensorflow_backend.py", line 345, in new_map
    self, _map_func_set_random_wrapper, num_parallel_calls=num_parallel_calls
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 1909, in map
    MapDataset(self, map_func, preserve_cardinality=False))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 3434, in __init__
    use_legacy_function=use_legacy_function)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 2713, in __init__
    self._function = wrapper_fn._get_concrete_function_internal()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 1853, in _get_concrete_function_internal
    *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 1847, in _get_concrete_function_internal_garbage_collected
    graph_function, _, _ = self._maybe_define_function(args, kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 2147, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 2038, in _create_graph_function
    capture_by_value=self._capture_by_value),
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/func_graph.py", line 915, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 2707, in wrapper_fn
    ret = _wrapper_helper(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 2652, in _wrapper_helper
    ret = autograph.tf_convert(func, ag_ctx)(*nested_args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/impl/api.py", line 237, in wrapper
    raise e.ag_error_metadata.to_exception(e)
StopIteration: in converted code:

    /opt/nvidia/third_party/keras/tensorflow_backend.py:342 _map_func_set_random_wrapper  *
        return map_func(*args, **kwargs)
    /home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataloader/drivenet_dataloader.py:126 __call__
        
    /home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataloader/drivenet_dataloader.py:100 _get_parse_example
        
    /home/vpraveen/.cache/dazel/_dazel_vpraveen/715c8bafe7816f3bb6f309cd506049bb/execroot/ai_infra/bazel-out/k8-py3-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataloader/utilities.py:217 extract_tfrecords_features
        

    StopIteration:

I also tried fp32 and fp16 exports and got the same error.
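(Roughly, the exports were done like this; the model path, key, and output path below are placeholders, and the exact options should be confirmed with tlt-export -h:)

```
# Export the trained .tlt model at a given precision (paths and $KEY are placeholders)
tlt-export faster_rcnn \
    -m /workspace/models/frcnn_resnet18.epoch150.tlt \
    -k $KEY \
    -o /workspace/models/exported/frcnn_resnet18.epoch150_fp16.etlt \
    --data_type fp16
# Repeated with --data_type fp32; the int8 export additionally used calibration data
```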

Hi @cogbot,
On your 2nd machine, tlt-export runs well but tlt-evaluate gets stuck.
When you run tlt-evaluate for faster_rcnn, the evaluation dataset is needed.
The dataset is available on the 1st machine, but I'm afraid it is not available on the 2nd machine.
So, please copy the same dataset to the same path on the 2nd machine.
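For example, something along these lines (the host name and paths below are just placeholders; use whatever paths the dataset config in your spec file points to):

```
# Copy the evaluation images and tfrecords to the same absolute paths on machine 2
# (host name and paths are placeholders; match whatever your spec file references)
rsync -av machine1:/workspace/data/testset/ /workspace/data/testset/
rsync -av machine1:/workspace/tfrecords/testset/ /workspace/tfrecords/testset/
```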

Hi @Morganh, thanks again. I don't think that is the case here. If you look at the tlt-evaluate log, just before the error it reports a dataset size of 45, as shown below. That is correct: my test set has 45 images.

2020-11-15 20:48:46,225 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: total dataset size 45, number of sources: 1, batch size per gpu: 1, steps: 45

Anyway, to double-check, I deleted the dataset folder and copied it again. I even regenerated the tfrecords from the KITTI files on machine 2. I still get the same error.
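(For the record, the regeneration on machine 2 was roughly like this; the dataset-export spec path and output prefix are placeholders, and the flags should be confirmed with tlt-dataset-convert -h:)

```
# Rebuild the tfrecords from the KITTI-format test set on machine 2
# (spec path and output prefix are placeholders)
tlt-dataset-convert \
    -d /workspace/specs/kitti_testset_tfrecords.txt \
    -o /workspace/tfrecords/testset/testset
```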

Can you share more detailed info about your two machines? Do they have the same GPU card?

I tried two V100s on my side, but cannot reproduce your case.
Can you double-check:

  1. Whether it really works on the 1st machine
  2. Can you check the folder of tfrecord files? Is there any corrupt or empty tfrecord? (See the quick check below.) Can you also share the full log from when you generated your tfrecords? The spec would also be appreciated.
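A quick way to spot a bad shard (the directory below is a placeholder; point it at your tfrecords folder):

```
# List all tfrecord shards with their sizes
ls -la /workspace/tfrecords/testset/
# Flag any zero-byte (empty) tfrecord files
find /workspace/tfrecords/testset/ -type f -size 0
```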

GPU cards:

Machine 1: 4 GPUs, GeForce RTX 2080 Ti

Machine 2: 1 GPU, GeForce RTX 2070 with Max-Q Design

  1. Whether it really works on the 1st machine

Yes.

  2. Can you check the folder of tfrecord files? Is there any corrupt or empty tfrecord?

One of the test set tfrecords was empty.

The issue was resolved after removing the empty tfrecord file from the folder.
Thanks a lot @Morganh