Object Detection using TAO DetectNet_v2. The category accuracy results are missing

user85575 · December 28, 2021, 8:58am

Please provide the following information when requesting support.

• Hardware (T4/V100/Xavier/Nano/etc) T4
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc) Detectnet_v2
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)
• Training spec file (if you have one, please share it here): detectnet_v2.ipynb
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)
I trained DetectNet_v2 on my own KITTI-format dataset, replacing the pretrained_resnet18 model with the DashCamNet model. My dataset has two categories: pedestrian and car. The results below appeared during training. Only the pedestrian category has a nonzero average precision; the other two categories (car and cyclist) both show 0.

class name      average precision (in %)
------------  --------------------------
car                               0
cyclist                           0
pedestrian                       12.7891

Here is the detailed training log. Please help me.


/usr/local/lib/python3.6/dist-packages/keras/engine/saving.py:292: UserWarning: No training configuration found in save file: the model was *not* compiled. Compile it manually.
  warnings.warn('No training configuration found in save file: '
_________________________________________________________
2021-12-28 07:30:06,779 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Serial augmentation enabled = False
2021-12-28 07:30:06,779 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Pseudo sharding enabled = False
2021-12-28 07:30:06,779 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Max Image Dimensions (all sources): (0, 0)
2021-12-28 07:30:06,779 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: number of cpus: 48, io threads: 96, compute threads: 48, buffered batches: 4
2021-12-28 07:30:06,779 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: total dataset size 114, number of sources: 1, batch size per gpu: 4, steps: 29
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.

2021-12-28 07:30:06,831 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/autograph/converters/directives.py:119: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.

WARNING:tensorflow:Entity <bound method DriveNetTFRecordsParser.__call__ of <iva.detectnet_v2.dataloader.drivenet_dataloader.DriveNetTFRecordsParser object at 0x7f09cbe30fd0>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method DriveNetTFRecordsParser.__call__ of <iva.detectnet_v2.dataloader.drivenet_dataloader.DriveNetTFRecordsParser object at 0x7f09cbe30fd0>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2021-12-28 07:30:06,893 [WARNING] tensorflow: Entity <bound method DriveNetTFRecordsParser.__call__ of <iva.detectnet_v2.dataloader.drivenet_dataloader.DriveNetTFRecordsParser object at 0x7f09cbe30fd0>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method DriveNetTFRecordsParser.__call__ of <iva.detectnet_v2.dataloader.drivenet_dataloader.DriveNetTFRecordsParser object at 0x7f09cbe30fd0>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2021-12-28 07:30:06,922 [INFO] iva.detectnet_v2.dataloader.default_dataloader: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates.
2021-12-28 07:30:07,230 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: shuffle: True - shard 0 of 1
2021-12-28 07:30:07,238 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: sampling 1 datasets with weights:
2021-12-28 07:30:07,238 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: source: 0 weight: 1.000000
WARNING:tensorflow:Entity <bound method Processor.__call__ of <modulus.blocks.data_loaders.multi_source_loader.processors.asset_loader.AssetLoader object at 0x7f09b02cfc50>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method Processor.__call__ of <modulus.blocks.data_loaders.multi_source_loader.processors.asset_loader.AssetLoader object at 0x7f09b02cfc50>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2021-12-28 07:30:07,259 [WARNING] tensorflow: Entity <bound method Processor.__call__ of <modulus.blocks.data_loaders.multi_source_loader.processors.asset_loader.AssetLoader object at 0x7f09b02cfc50>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method Processor.__call__ of <modulus.blocks.data_loaders.multi_source_loader.processors.asset_loader.AssetLoader object at 0x7f09b02cfc50>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2021-12-28 07:30:07,692 [INFO] __main__: Found 114 samples in training set
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/rasterizers/bbox_rasterizer.py:347: The name tf.bincount is deprecated. Please use tf.math.bincount instead.

2021-12-28 07:30:07,832 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/rasterizers/bbox_rasterizer.py:347: The name tf.bincount is deprecated. Please use tf.math.bincount instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/training/training_proto_utilities.py:89: The name tf.train.get_or_create_global_step is deprecated. Please use tf.compat.v1.train.get_or_create_global_step instead.

2021-12-28 07:30:07,981 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/training/training_proto_utilities.py:89: The name tf.train.get_or_create_global_step is deprecated. Please use tf.compat.v1.train.get_or_create_global_step instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/training/training_proto_utilities.py:36: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.

2021-12-28 07:30:08,000 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/training/training_proto_utilities.py:36: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_functions.py:17: The name tf.log is deprecated. Please use tf.math.log instead.

2021-12-28 07:30:08,214 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_functions.py:17: The name tf.log is deprecated. Please use tf.math.log instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_auto_weight_hook.py:235: The name tf.assign_add is deprecated. Please use tf.compat.v1.assign_add instead.

2021-12-28 07:30:08,262 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/cost_function/cost_auto_weight_hook.py:235: The name tf.assign_add is deprecated. Please use tf.compat.v1.assign_add instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/model/detectnet_model.py:587: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead.

2021-12-28 07:30:08,273 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/model/detectnet_model.py:587: The name tf.summary.scalar is deprecated. Please use tf.compat.v1.summary.scalar instead.

2021-12-28 07:30:10,495 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Serial augmentation enabled = False
2021-12-28 07:30:10,495 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Pseudo sharding enabled = False
2021-12-28 07:30:10,495 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Max Image Dimensions (all sources): (0, 0)
2021-12-28 07:30:10,495 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: number of cpus: 48, io threads: 96, compute threads: 48, buffered batches: 4
2021-12-28 07:30:10,495 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: total dataset size 18, number of sources: 1, batch size per gpu: 4, steps: 5
WARNING:tensorflow:Entity <bound method DriveNetTFRecordsParser.__call__ of <iva.detectnet_v2.dataloader.drivenet_dataloader.DriveNetTFRecordsParser object at 0x7f09cbe30828>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method DriveNetTFRecordsParser.__call__ of <iva.detectnet_v2.dataloader.drivenet_dataloader.DriveNetTFRecordsParser object at 0x7f09cbe30828>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2021-12-28 07:30:10,512 [WARNING] tensorflow: Entity <bound method DriveNetTFRecordsParser.__call__ of <iva.detectnet_v2.dataloader.drivenet_dataloader.DriveNetTFRecordsParser object at 0x7f09cbe30828>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method DriveNetTFRecordsParser.__call__ of <iva.detectnet_v2.dataloader.drivenet_dataloader.DriveNetTFRecordsParser object at 0x7f09cbe30828>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2021-12-28 07:30:10,542 [INFO] iva.detectnet_v2.dataloader.default_dataloader: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates.
2021-12-28 07:30:10,823 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: shuffle: False - shard 0 of 1
2021-12-28 07:30:10,829 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: sampling 1 datasets with weights:
2021-12-28 07:30:10,829 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: source: 0 weight: 1.000000
WARNING:tensorflow:Entity <bound method Processor.__call__ of <modulus.blocks.data_loaders.multi_source_loader.processors.asset_loader.AssetLoader object at 0x7f09583e23c8>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method Processor.__call__ of <modulus.blocks.data_loaders.multi_source_loader.processors.asset_loader.AssetLoader object at 0x7f09583e23c8>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2021-12-28 07:30:10,845 [WARNING] tensorflow: Entity <bound method Processor.__call__ of <modulus.blocks.data_loaders.multi_source_loader.processors.asset_loader.AssetLoader object at 0x7f09583e23c8>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to locate the source code of <bound method Processor.__call__ of <modulus.blocks.data_loaders.multi_source_loader.processors.asset_loader.AssetLoader object at 0x7f09583e23c8>>. Note that functions defined in certain environments, like the interactive Python shell do not expose their source code. If that is the case, you should to define them in a .py source file. If you are certain the code is graph-compatible, wrap the call using @tf.autograph.do_not_convert. Original error: could not get source code
2021-12-28 07:30:11,125 [INFO] __main__: Found 18 samples in validation set
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/validation_hook.py:40: The name tf.summary.FileWriterCache is deprecated. Please use tf.compat.v1.summary.FileWriterCache instead.

2021-12-28 07:30:11,861 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/validation_hook.py:40: The name tf.summary.FileWriterCache is deprecated. Please use tf.compat.v1.summary.FileWriterCache instead.

2021-12-28 07:30:13,292 [INFO] __main__: Checkpoint interval: 10
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py:108: The name tf.train.Scaffold is deprecated. Please use tf.compat.v1.train.Scaffold instead.

2021-12-28 07:30:13,293 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/scripts/train.py:108: The name tf.train.Scaffold is deprecated. Please use tf.compat.v1.train.Scaffold instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/common/graph/initializers.py:14: The name tf.local_variables_initializer is deprecated. Please use tf.compat.v1.local_variables_initializer instead.

2021-12-28 07:30:13,293 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/common/graph/initializers.py:14: The name tf.local_variables_initializer is deprecated. Please use tf.compat.v1.local_variables_initializer instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/common/graph/initializers.py:15: The name tf.tables_initializer is deprecated. Please use tf.compat.v1.tables_initializer instead.

2021-12-28 07:30:13,294 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/common/graph/initializers.py:15: The name tf.tables_initializer is deprecated. Please use tf.compat.v1.tables_initializer instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/common/graph/initializers.py:16: The name tf.get_collection is deprecated. Please use tf.compat.v1.get_collection instead.

2021-12-28 07:30:13,295 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/common/graph/initializers.py:16: The name tf.get_collection is deprecated. Please use tf.compat.v1.get_collection instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:59: The name tf.train.LoggingTensorHook is deprecated. Please use tf.estimator.LoggingTensorHook instead.

2021-12-28 07:30:13,298 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:59: The name tf.train.LoggingTensorHook is deprecated. Please use tf.estimator.LoggingTensorHook instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:60: The name tf.train.StopAtStepHook is deprecated. Please use tf.estimator.StopAtStepHook instead.

2021-12-28 07:30:13,298 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:60: The name tf.train.StopAtStepHook is deprecated. Please use tf.estimator.StopAtStepHook instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:73: The name tf.train.StepCounterHook is deprecated. Please use tf.estimator.StepCounterHook instead.

2021-12-28 07:30:13,298 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:73: The name tf.train.StepCounterHook is deprecated. Please use tf.estimator.StepCounterHook instead.

INFO:tensorflow:Create CheckpointSaverHook.
2021-12-28 07:30:13,299 [INFO] tensorflow: Create CheckpointSaverHook.
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:99: The name tf.train.SummarySaverHook is deprecated. Please use tf.estimator.SummarySaverHook instead.

2021-12-28 07:30:13,299 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/tfhooks/utils.py:99: The name tf.train.SummarySaverHook is deprecated. Please use tf.estimator.SummarySaverHook instead.

WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/training/utilities.py:140: The name tf.train.SingularMonitoredSession is deprecated. Please use tf.compat.v1.train.SingularMonitoredSession instead.

2021-12-28 07:30:13,300 [WARNING] tensorflow: From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/training/utilities.py:140: The name tf.train.SingularMonitoredSession is deprecated. Please use tf.compat.v1.train.SingularMonitoredSession instead.

INFO:tensorflow:Graph was finalized.
2021-12-28 07:30:14,536 [INFO] tensorflow: Graph was finalized.
INFO:tensorflow:Running local_init_op.
2021-12-28 07:30:16,823 [INFO] tensorflow: Running local_init_op.
INFO:tensorflow:Done running local_init_op.
2021-12-28 07:30:17,613 [INFO] tensorflow: Done running local_init_op.
INFO:tensorflow:Saving checkpoints for step-0.
2021-12-28 07:30:26,523 [INFO] tensorflow: Saving checkpoints for step-0.
INFO:tensorflow:epoch = 0.0, learning_rate = 4.9999994e-06, loss = 0.09131467, step = 0
2021-12-28 07:30:57,713 [INFO] tensorflow: epoch = 0.0, learning_rate = 4.9999994e-06, loss = 0.09131467, step = 0
2021-12-28 07:30:57,722 [INFO] iva.detectnet_v2.tfhooks.task_progress_monitor_hook: Epoch 0/120: loss: 0.09131 learning rate: 0.00000 Time taken: 0:00:00 ETA: 0:00:00
2021-12-28 07:30:57,722 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 0.439
INFO:tensorflow:global_step/sec: 0.772853
2021-12-28 07:31:00,306 [INFO] tensorflow: global_step/sec: 0.772853
INFO:tensorflow:global_step/sec: 10.915
2021-12-28 07:31:00,489 [INFO] tensorflow: global_step/sec: 10.915
INFO:tensorflow:global_step/sec: 10.8179
2021-12-28 07:31:00,674 [INFO] tensorflow: global_step/sec: 10.8179
INFO:tensorflow:global_step/sec: 10.6503
2021-12-28 07:31:00,862 [INFO] tensorflow: global_step/sec: 10.6503
INFO:tensorflow:global_step/sec: 11.1249
2021-12-28 07:31:01,042 [INFO] tensorflow: global_step/sec: 11.1249
INFO:tensorflow:global_step/sec: 11.0209
2021-12-28 07:31:01,223 [INFO] tensorflow: global_step/sec: 11.0209
INFO:tensorflow:global_step/sec: 10.5862
2021-12-28 07:31:01,412 [INFO] tensorflow: global_step/sec: 10.5862
INFO:tensorflow:global_step/sec: 10.0341
2021-12-28 07:31:01,611 [INFO] tensorflow: global_step/sec: 10.0341
INFO:tensorflow:global_step/sec: 11.1387
2021-12-28 07:31:01,791 [INFO] tensorflow: global_step/sec: 11.1387
INFO:tensorflow:global_step/sec: 9.6168
2021-12-28 07:31:01,999 [INFO] tensorflow: global_step/sec: 9.6168
INFO:tensorflow:global_step/sec: 9.13294
2021-12-28 07:31:02,218 [INFO] tensorflow: global_step/sec: 9.13294
INFO:tensorflow:global_step/sec: 10.1474
2021-12-28 07:31:02,415 [INFO] tensorflow: global_step/sec: 10.1474
2021-12-28 07:31:02,416 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 7.246
INFO:tensorflow:global_step/sec: 10.1296
2021-12-28 07:31:02,612 [INFO] tensorflow: global_step/sec: 10.1296
INFO:tensorflow:epoch = 0.9655172413793103, learning_rate = 7.242517e-06, loss = 0.06743895, step = 28 (5.111 sec)
2021-12-28 07:31:02,824 [INFO] tensorflow: epoch = 0.9655172413793103, learning_rate = 7.242517e-06, loss = 0.06743895, step = 28 (5.111 sec)
INFO:tensorflow:global_step/sec: 9.23476
2021-12-28 07:31:02,829 [INFO] tensorflow: global_step/sec: 9.23476
49da551de758:116:224 [0] NCCL INFO Bootstrap : Using [0]lo:127.0.0.1<0> [1]eth0:172.17.0.7<0>
49da551de758:116:224 [0] NCCL INFO NET/Plugin : No plugin found (libnccl-net.so), using internal implementation
49da551de758:116:224 [0] NCCL INFO NET/IB : No device found.
49da551de758:116:224 [0] NCCL INFO NET/Socket : Using [0]lo:127.0.0.1<0> [1]eth0:172.17.0.7<0>
49da551de758:116:224 [0] NCCL INFO Using network Socket
NCCL version 2.7.8+cuda11.1
49da551de758:116:224 [0] NCCL INFO Channel 00/32 :    0
49da551de758:116:224 [0] NCCL INFO Channel 01/32 :    0
49da551de758:116:224 [0] NCCL INFO Channel 02/32 :    0
....
Median Inference Time: 0.008003
INFO:tensorflow:epoch = 110.0, learning_rate = 1.7969065e-05, loss = 8.230345e-05, step = 3190 (9.965 sec)
2021-12-28 07:37:44,296 [INFO] tensorflow: epoch = 110.0, learning_rate = 1.7969065e-05, loss = 8.230345e-05, step = 3190 (9.965 sec)
INFO:tensorflow:global_step/sec: 0.216576
2021-12-28 07:37:44,298 [INFO] tensorflow: global_step/sec: 0.216576
2021-12-28 07:37:44,302 [INFO] iva.detectnet_v2.tfhooks.task_progress_monitor_hook: Epoch 110/120: loss: 0.00008 learning rate: 0.00002 Time taken: 0:00:11.694104 ETA: 0:01:56.941044
INFO:tensorflow:global_step/sec: 10.2147
....
INFO:tensorflow:global_step/sec: 10.9742
2021-12-28 07:38:07,158 [INFO] tensorflow: global_step/sec: 10.9742
INFO:tensorflow:global_step/sec: 11.0977
2021-12-28 07:38:07,339 [INFO] tensorflow: global_step/sec: 11.0977
INFO:tensorflow:global_step/sec: 10.3461
2021-12-28 07:38:07,532 [INFO] tensorflow: global_step/sec: 10.3461
INFO:tensorflow:global_step/sec: 11.9319
2021-12-28 07:38:07,700 [INFO] tensorflow: global_step/sec: 11.9319
INFO:tensorflow:global_step/sec: 10.8791
2021-12-28 07:38:07,883 [INFO] tensorflow: global_step/sec: 10.8791
2021-12-28 07:38:07,975 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 43.965
INFO:tensorflow:global_step/sec: 10.9462
2021-12-28 07:38:08,066 [INFO] tensorflow: global_step/sec: 10.9462
2021-12-28 07:38:08,166 [INFO] iva.detectnet_v2.tfhooks.task_progress_monitor_hook: Epoch 119/120: loss: 0.00007 learning rate: 0.00001 Time taken: 0:00:02.648905 ETA: 0:00:02.648905
INFO:tensorflow:global_step/sec: 10.6033
2021-12-28 07:38:08,255 [INFO] tensorflow: global_step/sec: 10.6033
INFO:tensorflow:global_step/sec: 10.9142
2021-12-28 07:38:08,438 [INFO] tensorflow: global_step/sec: 10.9142
INFO:tensorflow:global_step/sec: 10.5721
2021-12-28 07:38:08,627 [INFO] tensorflow: global_step/sec: 10.5721
INFO:tensorflow:global_step/sec: 11.1025
2021-12-28 07:38:08,807 [INFO] tensorflow: global_step/sec: 11.1025
INFO:tensorflow:global_step/sec: 11.317
2021-12-28 07:38:08,984 [INFO] tensorflow: global_step/sec: 11.317
INFO:tensorflow:global_step/sec: 10.2896
2021-12-28 07:38:09,178 [INFO] tensorflow: global_step/sec: 10.2896
INFO:tensorflow:global_step/sec: 10.7394
2021-12-28 07:38:09,365 [INFO] tensorflow: global_step/sec: 10.7394
INFO:tensorflow:global_step/sec: 11.306
2021-12-28 07:38:09,542 [INFO] tensorflow: global_step/sec: 11.306
INFO:tensorflow:global_step/sec: 10.4033
2021-12-28 07:38:09,734 [INFO] tensorflow: global_step/sec: 10.4033
INFO:tensorflow:epoch = 119.6551724137931, learning_rate = 5.225487e-06, loss = 0.00011281592, step = 3470 (5.146 sec)
2021-12-28 07:38:09,919 [INFO] tensorflow: epoch = 119.6551724137931, learning_rate = 5.225487e-06, loss = 0.00011281592, step = 3470 (5.146 sec)
INFO:tensorflow:global_step/sec: 10.5473
2021-12-28 07:38:09,924 [INFO] tensorflow: global_step/sec: 10.5473
INFO:tensorflow:global_step/sec: 11.345
2021-12-28 07:38:10,100 [INFO] tensorflow: global_step/sec: 11.345
INFO:tensorflow:global_step/sec: 11.0311
2021-12-28 07:38:10,281 [INFO] tensorflow: global_step/sec: 11.0311
2021-12-28 07:38:10,284 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 43.326
INFO:tensorflow:global_step/sec: 11.2454
2021-12-28 07:38:10,459 [INFO] tensorflow: global_step/sec: 11.2454
INFO:tensorflow:global_step/sec: 11.3238
2021-12-28 07:38:10,636 [INFO] tensorflow: global_step/sec: 11.3238
INFO:tensorflow:Saving checkpoints for step-3480.
2021-12-28 07:38:10,736 [INFO] tensorflow: Saving checkpoints for step-3480.
WARNING:tensorflow:Ignoring: /tmp/tmpi08w9yct; No such file or directory
2021-12-28 07:38:10,979 [WARNING] tensorflow: Ignoring: /tmp/tmpi08w9yct; No such file or directory
2021-12-28 07:38:15,057 [INFO] iva.detectnet_v2.evaluation.evaluation: step 0 / 4, 0.00s/step
Matching predictions to ground truth, class 1/3.: 100%|█| 3282/3282 [00:00<00:00, 24675.21it/s]
Matching predictions to ground truth, class 3/3.: 100%|█| 1341/1341 [00:00<00:00, 15211.67it/s]
Epoch 120/120
=========================

Validation cost: 0.000036
Mean average_precision (in %): 4.2630

class name      average precision (in %)
------------  --------------------------
car                               0
cyclist                           0
pedestrian                       12.7891

Median Inference Time: 0.007898
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:95: The name tf.reset_default_graph is deprecated. Please use tf.compat.v1.reset_default_graph instead.

2021-12-28 07:38:20,386 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:95: The name tf.reset_default_graph is deprecated. Please use tf.compat.v1.reset_default_graph instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:98: The name tf.placeholder_with_default is deprecated. Please use tf.compat.v1.placeholder_with_default instead.

2021-12-28 07:38:20,386 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:98: The name tf.placeholder_with_default is deprecated. Please use tf.compat.v1.placeholder_with_default instead.

2021-12-28 07:38:20,390 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 43.326
Time taken to run __main__:main: 0:08:27.855154.
2021-12-28 15:38:26,442 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

Morganh · December 28, 2021, 12:06pm

Could you share your training spec?

user85575 · December 28, 2021, 12:36pm

The spec file is this: detectnet_v2_train_resnet18_kitti.txt

Morganh · December 28, 2021, 3:55pm

The validation set only contains 18 images.
Could you please check if these images have car and cyclist?

user85575 · December 29, 2021, 2:28am

This is the result of converting the kitti dataset to tfrecords. I see that partition 0 is the validation set, and it contains both pedestrians and cars. Is that right?

Converting Tfrecords for kitti trainval dataset
2021-12-28 20:14:54,030 [INFO] root: Registry: ['nvcr.io']
2021-12-28 20:14:54,860 [WARNING] tlt.components.docker_handler.docker_handler: 
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/root/.tao_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
Using TensorFlow backend.
2021-12-28 12:15:04,243 - iva.detectnet_v2.dataio.build_converter - INFO - Instantiating a kitti converter
2021-12-28 12:15:04,244 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Creating output directory /workspace/tao-experiments/data/tfrecords/kitti_trainval
2021-12-28 12:15:04,244 - iva.detectnet_v2.dataio.kitti_converter_lib - INFO - Num images in
Train: 114	Val: 18
2021-12-28 12:15:04,244 - iva.detectnet_v2.dataio.kitti_converter_lib - INFO - Validation data in partition 0. Hence, while choosing the validationset during training choose validation_fold 0.
2021-12-28 12:15:04,245 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 0
WARNING:tensorflow:From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataio/dataset_converter_lib.py:142: The name tf.python_io.TFRecordWriter is deprecated. Please use tf.io.TFRecordWriter instead.

2021-12-28 12:15:04,245 - tensorflow - WARNING - From /root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/detectnet_v2/dataio/dataset_converter_lib.py:142: The name tf.python_io.TFRecordWriter is deprecated. Please use tf.io.TFRecordWriter instead.

/usr/local/lib/python3.6/dist-packages/iva/detectnet_v2/dataio/kitti_converter_lib.py:283: VisibleDeprecationWarning: Reading unicode strings without specifying the encoding argument is deprecated. Set the encoding, use None for the system default.
2021-12-28 12:15:04,253 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 1
2021-12-28 12:15:04,254 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 2
2021-12-28 12:15:04,256 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 3
2021-12-28 12:15:04,257 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 4
2021-12-28 12:15:04,258 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 5
2021-12-28 12:15:04,259 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 6
2021-12-28 12:15:04,261 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 7
2021-12-28 12:15:04,262 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 8
2021-12-28 12:15:04,263 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 9
2021-12-28 12:15:04,273 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - 
Wrote the following numbers of objects:
b'pedestrian': 21
b'car': 10

2021-12-28 12:15:04,273 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 0
2021-12-28 12:15:04,285 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 1
2021-12-28 12:15:04,298 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 2
2021-12-28 12:15:04,310 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 3
2021-12-28 12:15:04,323 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 4
2021-12-28 12:15:04,336 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 5
2021-12-28 12:15:04,350 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 6
2021-12-28 12:15:04,363 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 7
2021-12-28 12:15:04,377 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 8
2021-12-28 12:15:04,390 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 1, shard 9
2021-12-28 12:15:04,407 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - 
Wrote the following numbers of objects:
b'pedestrian': 222
b'car': 30

2021-12-28 12:15:04,408 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Cumulative object statistics
2021-12-28 12:15:04,408 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - 
Wrote the following numbers of objects:
b'pedestrian': 243
b'car': 40

2021-12-28 12:15:04,408 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Class map. 
Label in GT: Label in tfrecords file 
b'pedestrian': b'pedestrian'
b'car': b'car'
For the dataset_config in the experiment_spec, please use labels in the tfrecords file, while writing the classmap.

2021-12-28 12:15:04,408 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Tfrecords generation complete.
2021-12-28 20:15:06,076 [INFO] tlt.components.docker_handler.docker_handler: Stopping container.

The following is the generated file.
kitti_trainval-fold-000-of-002-shard-00009-of-00010 (5.7 KB)
Is it because the dataset is too small? Please help me.

Morganh · December 29, 2021, 3:34am

So the 18 validation images contain 10 cars and 21 pedestrians.
How about the resolution of the cars? Are they small?

user85575 · December 29, 2021, 6:59am

First, the image resolution is 1920*1040.
This is the kitti label data. The bounding box of the car is 255*234 pixels, which is not very small.
car 0.00 0 0.00 646 261 901 495 0.00 0.00 0.00 0.00 0.00 0.00 0.00
The bounding box of the pedestrian is 120*321 pixels.

pedestrian 0.00 0 0.00 716 485 836 806 0.00 0.00 0.00 0.00 0.00 0.00 0.00
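
The box sizes quoted above can be checked directly from the label lines. The sketch below assumes the standard KITTI field order (class, truncation, occlusion, alpha, then xmin, ymin, xmax, ymax); the helper name `kitti_box_size` is made up for illustration.

```python
def kitti_box_size(label_line):
    """Return (class_name, width, height) in pixels for one KITTI label line."""
    fields = label_line.split()
    # Assumed KITTI layout: fields 4..7 hold the 2D box as xmin, ymin, xmax, ymax.
    xmin, ymin, xmax, ymax = (float(v) for v in fields[4:8])
    return fields[0], xmax - xmin, ymax - ymin


car = "car 0.00 0 0.00 646 261 901 495 0.00 0.00 0.00 0.00 0.00 0.00 0.00"
ped = "pedestrian 0.00 0 0.00 716 485 836 806 0.00 0.00 0.00 0.00 0.00 0.00 0.00"

print(kitti_box_size(car))  # ('car', 255.0, 234.0)
print(kitti_box_size(ped))  # ('pedestrian', 120.0, 321.0)
```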

Morganh · December 29, 2021, 7:03am

Can you upload your training spec file? Thanks.

user85575 · December 29, 2021, 7:10am

detectnet_v2_train_resnet18_kitti.txt (5.5 KB)
I suspect that the data path is not written correctly? Thanks.

Morganh · December 29, 2021, 7:15am

Please modify

output_image_width: 1248
output_image_height: 384

to

output_image_width:  960
output_image_height: 544

And also set

enable_auto_resize: True

Refer to DetectNet_v2 - NVIDIA Docs
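
As a rough illustration of what this resize does to object sizes, here is a minimal sketch. It assumes that the resize scales each axis independently from the 1920*1040 source image to the 960x544 network input; that is an assumption about the behavior, not something confirmed in this thread. The helper `scale_box` is hypothetical.

```python
def scale_box(box_wh, src_wh, dst_wh):
    """Scale a (width, height) box from a source image size to a target size."""
    sx = dst_wh[0] / src_wh[0]  # horizontal scale factor
    sy = dst_wh[1] / src_wh[1]  # vertical scale factor
    return box_wh[0] * sx, box_wh[1] * sy


source = (1920, 1040)  # image resolution reported in the thread
target = (960, 544)    # suggested output_image_width x output_image_height

# Box sizes taken from the two KITTI label lines posted earlier.
for name, box in {"car": (255, 234), "pedestrian": (120, 321)}.items():
    w, h = scale_box(box, source, target)
    print(f"{name}: {w:.1f} x {h:.1f} px at network input")
```

Under that assumption both example objects stay well over 100 pixels on their longer side after resizing.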

user85575 · December 29, 2021, 8:20am

Thank you so much!
You have both solved this problem for me and shown me how to configure the parameters.
However, I still have a few small questions.

Is the width and height of the output image based on the width and height of the input image? Why change it to 960*544 rather than 1920*1040 (my image resolution)?
With this spec file the accuracy is higher.
detectnet_v2_train_resnet18_kitti.txt (4.3 KB)

Epoch 120/120
=========================

Validation cost: 0.000059
Mean average_precision (in %): 51.3095

class name      average precision (in %)
------------  --------------------------
car                              8.33333
pedestrian                      94.2857

Median Inference Time: 0.008700
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:95: The name tf.reset_default_graph is deprecated. Please use tf.compat.v1.reset_default_graph instead.

2021-12-29 07:50:59,392 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:95: The name tf.reset_default_graph is deprecated. Please use tf.compat.v1.reset_default_graph instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:98: The name tf.placeholder_with_default is deprecated. Please use tf.compat.v1.placeholder_with_default instead.

2021-12-29 07:50:59,392 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:98: The name tf.placeholder_with_default is deprecated. Please use tf.compat.v1.placeholder_with_default instead.

2021-12-29 07:50:59,396 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 41.936

With this spec file the accuracy is not as high. Why?
detectnet_v2_train_resnet18_kitti_copy2.txt (5.5 KB)

Epoch 120/120
=========================

Validation cost: 0.000060
Mean average_precision (in %): 22.3850

class name      average precision (in %)
------------  --------------------------
car                               0.1607
cyclist                           0
pedestrian                       66.9942

Median Inference Time: 0.009179
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:95: The name tf.reset_default_graph is deprecated. Please use tf.compat.v1.reset_default_graph instead.

2021-12-29 07:39:32,000 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:95: The name tf.reset_default_graph is deprecated. Please use tf.compat.v1.reset_default_graph instead.

WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:98: The name tf.placeholder_with_default is deprecated. Please use tf.compat.v1.placeholder_with_default instead.

2021-12-29 07:39:32,000 [WARNING] tensorflow: From /usr/local/lib/python3.6/dist-packages/keras/backend/tensorflow_backend.py:98: The name tf.placeholder_with_default is deprecated. Please use tf.compat.v1.placeholder_with_default instead.

2021-12-29 07:39:32,003 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 41.889
Time taken to run __main__:main: 0:09:01.790022.

The first two questions are just for my own understanding; the last one is the problem I want to solve. Why is the accuracy of car still low?

Morganh · December 29, 2021, 9:46am

For example, if you want to train a 960x544 model, you can set
output_image_width: 960
output_image_height: 544

This is because you are using dashcamnet as the pretrained model. If you use NGC’s .hdf5 format pretrained model, or do not use any pretrained model, you can set a different input size. But please make sure it meets the requirement mentioned in (DetectNet_v2 - NVIDIA Docs).

For the last question, the class weighting for car is different between the two experiments. Since there is no “cyclist” in your training dataset, please just run the 1st experiment.
Also, please set car’s class_weight: 4.0 and pedestrian’s class_weight: 1.0.
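
A side note on why the run with the unused "cyclist" class looks worse overall: the mean average precision in the logs above appears to be a plain average over all configured classes, so a class with no data contributes 0 and pulls the mean down. The sketch below only reproduces the arithmetic from the posted numbers; `mean_ap` is an illustrative helper, not a TAO function.

```python
def mean_ap(per_class_ap):
    """Unweighted mean of per-class average precision values (in %)."""
    return sum(per_class_ap.values()) / len(per_class_ap)


# Per-class results copied from the two logs posted above.
three_class_run = {"car": 0.1607, "cyclist": 0.0, "pedestrian": 66.9942}
two_class_run = {"car": 8.33333, "pedestrian": 94.2857}

print(f"{mean_ap(three_class_run):.4f}")  # 22.3850
print(f"{mean_ap(two_class_run):.4f}")    # 51.3095
```

Both values match the "Mean average_precision" lines in the corresponding logs.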

user85575 · December 30, 2021, 2:26am

Thank you very much! Your answer solved all my problems!
I modified the weights as you suggested (car’s class_weight: 4.0 and pedestrian’s class_weight: 1.0), and the accuracy of the car has indeed improved.

Validation cost: 0.000044
Mean average_precision (in %): 56.4574

class name      average precision (in %)
------------  --------------------------
car                              20.1923
pedestrian                       92.7225

Median Inference Time: 0.009450

Can you explain why this change helps? It should be related to the cost function, but I don’t know what the cost function is or what the basis for choosing the weights is.
In addition, the accuracy of car is still very low. Is it because there is too little car data? I have tried setting car’s class_weight to 5, but the accuracy is still very low.

Morganh · December 30, 2021, 6:10am

Yes, please train with more data. You can refer to the jupyter notebook, which uses the public KITTI dataset for training.

Since you have fewer car objects than pedestrian objects, set a larger class weighting for car.

user85575 · December 30, 2021, 6:25am

Is there a limit? How should I choose an appropriate weight? Can you tell me the principle behind the adjustment? Is it possible to set 5, 10, 20, 30, etc.?

Morganh · December 30, 2021, 6:30am

For example, if car has 10 objects and pedestrian has 50 objects, you can roughly set car’s class weighting to 5 and pedestrian’s class weighting to 1.
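
That rule of thumb can be written down as a small helper. This is only a sketch of the heuristic described here (weight each class by the largest object count divided by its own count); `class_weights` is a hypothetical function, and the second call assumes partition 1 in the converter log earlier in the thread is the training split.

```python
def class_weights(object_counts):
    """Weight each class by (largest class count) / (its own count)."""
    largest = max(object_counts.values())
    return {name: largest / count for name, count in object_counts.items()}


# The example from the reply above.
print(class_weights({"car": 10, "pedestrian": 50}))
# {'car': 5.0, 'pedestrian': 1.0}

# Object counts from partition 1 of the tfrecords conversion log (assumed to be training data).
print(class_weights({"pedestrian": 222, "car": 30}))
# {'pedestrian': 1.0, 'car': 7.4}
```

The result is a starting point for `class_weight`, not a guarantee; with only a few dozen car objects, more data is still likely to matter more than the exact weight.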

user85575 · December 30, 2021, 6:48am

Understood, thank you!

system · January 13, 2022, 6:49am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
