Hi, thanks for the response.
- What is the minimum image size for custom training data? All the TFRecords I generated from my custom images were zero.
- So I took a KITTI dataset of 297 images, generated the TFRecords, and started training, but I am getting the following error:
INFO:tensorflow:Graph was finalized.
2021-07-02 08:26:05,444 [INFO] tensorflow: Graph was finalized.
INFO:tensorflow:Running local_init_op.
2021-07-02 08:26:07,065 [INFO] tensorflow: Running local_init_op.
INFO:tensorflow:Done running local_init_op.
2021-07-02 08:26:08,202 [INFO] tensorflow: Done running local_init_op.
INFO:tensorflow:Saving checkpoints for step-0.
2021-07-02 08:26:14,671 [INFO] tensorflow: Saving checkpoints for step-0.
Traceback (most recent call last):
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py”, line 1365, in _do_call
return fn(*args)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py”, line 1350, in _run_fn
target_list, run_metadata)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py”, line 1443, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Conv2DCustomBackpropInputOp only supports NHWC.
[[{{node gradients/resnet18_nopool_bn_detectnet_v2/output_bbox/convolution_grad/Conv2DBackpropInput}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File “/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py”, line 843, in <module>
File “/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py”, line 832, in <module>
File “<string>”, line 2, in main
File “/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/utilities/timer.py”, line 46, in wrapped_fn
File “/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py”, line 821, in main
File “/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py”, line 702, in run_experiment
File “/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py”, line 638, in train_gridbox
File “/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py”, line 154, in run_training_loop
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py”, line 754, in run
run_metadata=run_metadata)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py”, line 1360, in run
raise six.reraise(*original_exc_info)
File “/usr/local/lib/python3.6/dist-packages/six.py”, line 696, in reraise
raise value
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py”, line 1345, in run
return self._sess.run(*args, **kwargs)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py”, line 1418, in run
run_metadata=run_metadata)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/monitored_session.py”, line 1176, in run
return self._sess.run(*args, **kwargs)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py”, line 956, in run
run_metadata_ptr)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py”, line 1180, in _run
feed_dict_tensor, options, run_metadata)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py”, line 1359, in _do_run
run_metadata)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py”, line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Conv2DCustomBackpropInputOp only supports NHWC.
[[node gradients/resnet18_nopool_bn_detectnet_v2/output_bbox/convolution_grad/Conv2DBackpropInput (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]
Original stack trace for ‘gradients/resnet18_nopool_bn_detectnet_v2/output_bbox/convolution_grad/Conv2DBackpropInput’:
File “/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py”, line 832, in <module>
File “<string>”, line 2, in main
File “/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/utilities/timer.py”, line 46, in wrapped_fn
File “/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py”, line 821, in main
File “/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py”, line 702, in run_experiment
File “/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py”, line 613, in train_gridbox
File “/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py”, line 468, in build_training_graph
File “/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/model/detectnet_model.py”, line 598, in build_training_graph
File “/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/training/train_op_generator.py”, line 59, in get_train_op
File “/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/training/train_op_generator.py”, line 74, in _get_train_op_without_cost_scaling
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/optimizer.py”, line 419, in minimize
grad_loss=grad_loss)
File “/usr/local/lib/python3.6/dist-packages/horovod/tensorflow/__init__.py”, line 253, in compute_gradients
gradients = self._optimizer.compute_gradients(*args, **kwargs)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/training/optimizer.py”, line 537, in compute_gradients
colocate_gradients_with_ops=colocate_gradients_with_ops)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gradients_impl.py”, line 158, in gradients
unconnected_gradients)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gradients_util.py”, line 703, in _GradientsHelper
lambda: grad_fn(op, *out_grads))
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gradients_util.py”, line 362, in _MaybeCompile
return grad_fn() # Exit early
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gradients_util.py”, line 703, in <lambda>
lambda: grad_fn(op, *out_grads))
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/nn_grad.py”, line 596, in _Conv2DGrad
data_format=data_format),
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gen_nn_ops.py”, line 1407, in conv2d_backprop_input
name=name)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/op_def_library.py”, line 794, in _apply_op_helper
op_def=op_def)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py”, line 513, in new_func
return func(*args, **kwargs)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py”, line 3357, in create_op
attrs, op_def, compute_device)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py”, line 3426, in _create_op_internal
op_def=op_def)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py”, line 1748, in __init__
self._traceback = tf_stack.extract_stack()
…which was originally created as op ‘resnet18_nopool_bn_detectnet_v2/output_bbox/convolution’, defined at:
File “/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py”, line 832, in <module>
[elided 5 identical lines from previous traceback]
File “/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py”, line 468, in build_training_graph
File “/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/model/detectnet_model.py”, line 572, in build_training_graph
File “/usr/local/lib/python3.6/dist-packages/keras/engine/base_layer.py”, line 457, in call
output = self.call(inputs, **kwargs)
File “/usr/local/lib/python3.6/dist-packages/keras/engine/network.py”, line 564, in call
output_tensors, _, _ = self.run_internal_graph(inputs, masks)
File “/usr/local/lib/python3.6/dist-packages/keras/engine/network.py”, line 721, in run_internal_graph
layer.call(computed_tensor, **kwargs))
File “/usr/local/lib/python3.6/dist-packages/keras/layers/convolutional.py”, line 171, in call
dilation_rate=self.dilation_rate)
File “/opt/nvidia/third_party/keras/tensorflow_backend.py”, line 113, in conv2d
data_format=tf_data_format,
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/nn_ops.py”, line 921, in convolution
name=name)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/nn_ops.py”, line 1032, in convolution_internal
name=name)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/gen_nn_ops.py”, line 1071, in conv2d
data_format=data_format, dilations=dilations, name=name)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/op_def_library.py”, line 794, in _apply_op_helper
op_def=op_def)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/util/deprecation.py”, line 513, in new_func
return func(*args, **kwargs)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py”, line 3357, in create_op
attrs, op_def, compute_device)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py”, line 3426, in _create_op_internal
op_def=op_def)
File “/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py”, line 1748, in __init__
self._traceback = tf_stack.extract_stack()
2021-07-02 13:56:38,232 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.
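
Regarding the first bullet (TFRecords that came out as zero), a quick way to tell a zero-byte file from a file with zero records is to count the records directly. The following is only a minimal sketch that uses the Python standard library and assumes the standard TFRecord framing (8-byte little-endian length, 4-byte length CRC, payload, 4-byte payload CRC). The function name `count_tfrecord_records` is hypothetical, and the CRCs are skipped rather than verified.

```python
import struct


def count_tfrecord_records(path):
    """Count records in one TFRecord file by walking its framing.

    Assumes the standard layout per record:
    uint64 length | uint32 length CRC | payload | uint32 payload CRC.
    CRCs are skipped, not verified (a simplification for this sketch).
    """
    count = 0
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if not header:
                break  # clean end of file
            if len(header) < 8:
                raise ValueError(f"{path}: truncated length header")
            (length,) = struct.unpack("<Q", header)
            length_crc = f.read(4)       # CRC of the length field, ignored here
            payload = f.read(length)     # serialized example bytes
            payload_crc = f.read(4)      # CRC of the payload, ignored here
            if len(length_crc) < 4 or len(payload) < length or len(payload_crc) < 4:
                raise ValueError(f"{path}: truncated record")
            count += 1
    return count
```

Calling `count_tfrecord_records` on each generated shard should return 0 for an empty file and a positive number otherwise. If the custom-image shards all report 0 while the KITTI ones do not, the label files or the converter spec are the likelier place to look than the image size itself.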