NotImplementedError: torch mode doesn't support custom image_mean

xyz · July 28, 2021, 4:48am

• Hardware
A100
• Network Type
Classification
• TLT Version
Configuration of the TLT Instance

dockers:
    nvidia/tlt-streamanalytics:
        docker_registry: nvcr.io
        docker_tag: v3.0-py3
        tasks:
            1. augment
            2. bpnet
            3. classification
            4. detectnet_v2
            5. dssd
            6. emotionnet
            7. faster_rcnn
            8. fpenet
            9. gazenet
            10. gesturenet
            11. heartratenet
            12. lprnet
            13. mask_rcnn
            14. multitask_classification
            15. retinanet
            16. ssd
            17. unet
            18. yolo_v3
            19. yolo_v4
            20. tlt-converter
    nvidia/tlt-pytorch:
        docker_registry: nvcr.io
        docker_tag: v3.0-py3
        tasks:
            1. speech_to_text
            2. speech_to_text_citrinet
            3. text_classification
            4. question_answering
            5. token_classification
            6. intent_slot_classification
            7. punctuation_and_capitalization
format_version: 1.0
tlt_version: 3.0
published_date: 04/16/2021

• Training spec file

NVIDIA-AI-IOT/deepstream_tao_apps/blob/release/tlt3.0/misc/dev_blog/SOTA/classification/darknet/darknet53.txt


• How to reproduce the issue?
Follow the dev blog:
https://developer.nvidia.com/blog/preparing-state-of-the-art-models-for-classification-and-object-detection-with-tlt/
tlt classification train -e /workspace/tlt-experiments/classification/darknet53/spec.txt -r /workspace/tlt-experiments/classification/darknet53 -k nvidia_tlt --gpus 8 --use_amp

Error output:
File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/makenet/utils/mixup_generator.py", line 82, in next
File "/usr/local/lib/python3.6/dist-packages/keras_preprocessing/image/iterator.py", line 116, in next
return self._get_batches_of_transformed_samples(index_array)
File "/usr/local/lib/python3.6/dist-packages/keras_preprocessing/image/iterator.py", line 239, in _get_batches_of_transformed_samples
x = self.image_data_generator.standardize(x)
File "/usr/local/lib/python3.6/dist-packages/keras_preprocessing/image/image_data_generator.py", line 708, in standardize
x = self.preprocessing_function(x)
File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/makenet/utils/preprocess_input.py", line 246, in preprocess_input
File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/makenet/utils/preprocess_input.py", line 69, in _preprocess_numpy_input
NotImplementedError: torch mode doesn't support custom image_mean

Morganh · July 28, 2021, 7:12am

This blog was released in February and targets the TLT 3.0-dp version.
For TLT 3.0, please add "image_mean" to the training spec for caffe mode.
See the NVIDIA TAO Documentation.

xyz · July 28, 2021, 7:20am

image_mean (dict; example: 'b': 103.939, 'g': 116.779, 'r': 123.68): A key/value pair that specifies the image mean values. It is only applicable when preprocess_mode is caffe. If omitted, the ImageNet mean is used for image preprocessing. If set, either the 'r/g/b' or the 'l' key/value pairs must be configured, depending on output_channel.
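
The rules in that documentation entry can be sketched as a small check. This is only an illustration: check_image_mean is a hypothetical helper, not TLT's actual code, and it assumes that output_channel 3 means color ('r/g/b' keys) and output_channel 1 means grayscale ('l' key).

```python
def check_image_mean(preprocess_mode, image_mean, output_channel=3):
    """Hypothetical validator mirroring the documented image_mean rules."""
    if image_mean is None:
        # Nothing to validate: the ImageNet mean is used by default.
        return
    if preprocess_mode != "caffe":
        # A custom mean is documented as applicable to caffe mode only.
        raise NotImplementedError(
            f"{preprocess_mode} mode doesn't support custom image_mean"
        )
    # Assumption: 3 channels -> r/g/b keys, 1 channel -> l key.
    required = {"r", "g", "b"} if output_channel == 3 else {"l"}
    missing = required - set(image_mean)
    if missing:
        raise ValueError(f"image_mean is missing keys: {sorted(missing)}")


# Accepted: caffe mode with all three color keys.
check_image_mean("caffe", {"b": 103.939, "g": 116.779, "r": 123.68})
```

Calling the same helper with preprocess_mode "torch" and a non-empty image_mean raises the NotImplementedError shown in the traceback above.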

What's the difference between caffe mode and torch mode?

Morganh · July 28, 2021, 7:35am

caffe mode: converts the images from RGB to BGR, then zero-centers each color channel with respect to the ImageNet dataset, without scaling.
torch mode: scales pixels to between 0 and 1, then normalizes each channel with respect to the ImageNet dataset.

For torch mode, please delete image_mean.
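
As a rough illustration of the two modes described above, here is a minimal NumPy sketch. It follows the Keras-style ImageNet constants and assumes channels-last RGB input; it is not the actual TLT preprocess_input.py, and the function names are made up for this example.

```python
import numpy as np

# ImageNet statistics as used by Keras-style preprocessing.
CAFFE_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)
TORCH_MEAN_RGB = np.array([0.485, 0.456, 0.406], dtype=np.float32)
TORCH_STD_RGB = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess_caffe(rgb_image):
    """RGB -> BGR, then subtract the per-channel ImageNet mean (no scaling)."""
    bgr = rgb_image.astype(np.float32)[..., ::-1]
    return bgr - CAFFE_MEAN_BGR


def preprocess_torch(rgb_image):
    """Scale to [0, 1], then normalize each RGB channel by ImageNet mean/std."""
    scaled = rgb_image.astype(np.float32) / 255.0
    return (scaled - TORCH_MEAN_RGB) / TORCH_STD_RGB


# One RGB pixel in height x width x channel layout.
pixel = np.array([[[255, 128, 0]]], dtype=np.uint8)
caffe_out = preprocess_caffe(pixel)  # values stay on the 0..255 scale, minus the mean
torch_out = preprocess_torch(pixel)  # values are roughly unit-scaled
```

The key practical difference is the value range: caffe-mode output stays on the 0..255 scale shifted by the mean, while torch-mode output is centered near zero with roughly unit variance.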

xyz · July 28, 2021, 7:40am

In this spec, deepstream_tao_apps/darknet53.txt at master · NVIDIA-AI-IOT/deepstream_tao_apps · GitHub,

there is no image_mean. Why do I still get the error?

Morganh · July 28, 2021, 7:45am

I will check further.

Morganh · July 28, 2021, 8:59am

Unfortunately, there is an issue in TLT 3.0: img_mean is not set correctly before training.
Please use "caffe" mode instead.
We will fix it in the next release.

xyz · July 28, 2021, 9:09am

OK, thanks.

xyz · July 28, 2021, 9:24am

I have another question. YOLOv3 preprocessing scales pixels to between 0 and 1, so is torch mode the best way to get a YOLOv3 ImageNet pretrained model?

Morganh · July 29, 2021, 2:19am

Some clarification.

The TLT yolo_v3 preprocessing does not scale pixels to 0~1.
You can refer to https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps/blob/master/configs/yolov3_tlt/pgie_yolov3_tlt_config.txt
net-scale-factor=1.0
offsets=103.939;116.779;123.68
The blog shows end users how to train a pretrained model on the ImageNet dataset. They can then train a yolo_v3/ssd/faster_rcnn/etc. model using this ImageNet-trained model as pretrained weights.
The blog actually uses TLT classification to train the pretrained model on ImageNet, so we cannot conclude which mode is better. Some spec files do not use torch mode.
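
To see why the net-scale-factor and offsets values quoted above do not produce a 0~1 range, here is a small sketch. It assumes the per-pixel formula from the DeepStream Gst-nvinfer documentation, y = net-scale-factor * (x - mean), and assumes the channel order of the input matches the order of the offsets; the function name is made up for this example.

```python
import numpy as np


def nvinfer_style_preprocess(image, net_scale_factor, offsets):
    """Apply y = net_scale_factor * (x - offsets) per channel (channels-last)."""
    offsets = np.asarray(offsets, dtype=np.float32)
    return net_scale_factor * (image.astype(np.float32) - offsets)


# One mid-gray pixel; the channel order is assumed to match the offsets.
pixel = np.full((1, 1, 3), 128, dtype=np.uint8)
out = nvinfer_style_preprocess(pixel, 1.0, [103.939, 116.779, 123.68])
```

With a scale factor of 1.0 the output is simply the input minus the per-channel mean, which corresponds to caffe-style mean subtraction rather than torch-style scaling.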

VGG16

ResNet50

ResNet101

EfficientNet B0

DarkNet53