NotImplementedError: torch mode doesn't support custom image_mean

• Hardware
A100
• Network Type
Classification
• TLT Version
Configuration of the TLT Instance

dockers:
    nvidia/tlt-streamanalytics:
        docker_registry: nvcr.io
        docker_tag: v3.0-py3
        tasks:
            1. augment
            2. bpnet
            3. classification
            4. detectnet_v2
            5. dssd
            6. emotionnet
            7. faster_rcnn
            8. fpenet
            9. gazenet
            10. gesturenet
            11. heartratenet
            12. lprnet
            13. mask_rcnn
            14. multitask_classification
            15. retinanet
            16. ssd
            17. unet
            18. yolo_v3
            19. yolo_v4
            20. tlt-converter
    nvidia/tlt-pytorch:
        docker_registry: nvcr.io
        docker_tag: v3.0-py3
        tasks:
            1. speech_to_text
            2. speech_to_text_citrinet
            3. text_classification
            4. question_answering
            5. token_classification
            6. intent_slot_classification
            7. punctuation_and_capitalization
format_version: 1.0
tlt_version: 3.0
published_date: 04/16/2021

• Training spec file

• How to reproduce the issue?
Follow the dev blog:
https://developer.nvidia.com/blog/preparing-state-of-the-art-models-for-classification-and-object-detection-with-tlt/
tlt classification train -e /workspace/tlt-experiments/classification/darknet53/spec.txt -r /workspace/tlt-experiments/classification/darknet53 -k nvidia_tlt --gpus 8 --use_amp

Error info:
File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/makenet/utils/mixup_generator.py", line 82, in next
File "/usr/local/lib/python3.6/dist-packages/keras_preprocessing/image/iterator.py", line 116, in next
return self._get_batches_of_transformed_samples(index_array)
File "/usr/local/lib/python3.6/dist-packages/keras_preprocessing/image/iterator.py", line 239, in _get_batches_of_transformed_samples
x = self.image_data_generator.standardize(x)
File "/usr/local/lib/python3.6/dist-packages/keras_preprocessing/image/image_data_generator.py", line 708, in standardize
x = self.preprocessing_function(x)
File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/makenet/utils/preprocess_input.py", line 246, in preprocess_input
File "/opt/tlt/.cache/dazel/_dazel_tlt/2b81a5aac84a1d3b7a324f2a7a6f400b/execroot/ai_infra/bazel-out/k8-fastbuild/bin/magnet/packages/iva/build_wheel.runfiles/ai_infra/iva/makenet/utils/preprocess_input.py", line 69, in _preprocess_numpy_input
NotImplementedError: torch mode doesn't support custom image_mean

The blog was published in February and targets the TLT 3.0-dp version.
For TLT 3.0, please add "image_mean" to the training spec when using caffe mode.
See the NVIDIA TAO documentation:

image_mean (dict), e.g. 'b': 103.939, 'g': 116.779, 'r': 123.68
A key/value pair to specify image mean values. It's only applicable when preprocess_mode is caffe. If omitted, ImageNet mean will be used for image preprocessing. If set, depending on output_channel, either the 'r/g/b' or the 'l' key/value pair must be configured.

What's the difference between caffe mode and torch mode?

caffe mode: converts the images from RGB to BGR, then zero-centers each color channel with respect to the ImageNet dataset, without scaling.
torch mode: scales pixels to between 0 and 1, then normalizes each channel with respect to the ImageNet dataset.
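To make the difference concrete, here is a minimal NumPy sketch of the two modes. It is not the TLT source code (the thread does not show it); the mean and standard deviation constants are the usual ImageNet statistics found in Keras-style preprocessing, and the function names are made up for illustration.

```python
import numpy as np

# Assumption: standard ImageNet statistics, not read from TLT itself.
CAFFE_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)
TORCH_MEAN_RGB = np.array([0.485, 0.456, 0.406], dtype=np.float32)
TORCH_STD_RGB = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess_caffe(rgb):
    """RGB -> BGR, then subtract the per-channel mean. No scaling."""
    bgr = rgb[..., ::-1].astype(np.float32)
    return bgr - CAFFE_MEAN_BGR


def preprocess_torch(rgb):
    """Scale to [0, 1], then normalize each RGB channel by mean and std."""
    scaled = rgb.astype(np.float32) / 255.0
    return (scaled - TORCH_MEAN_RGB) / TORCH_STD_RGB


if __name__ == "__main__":
    # One mid-gray pixel in H x W x C layout.
    img = np.full((1, 1, 3), 128, dtype=np.uint8)
    print("caffe:", preprocess_caffe(img)[0, 0])
    print("torch:", preprocess_torch(img)[0, 0])
```

Note that caffe-mode outputs stay on the 0-255 scale (roughly -124 to 151), while torch-mode outputs are small values around zero, so a model trained with one mode should be fed inputs preprocessed the same way.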

For torch mode, please delete image_mean.

The spec darknet53.txt in the NVIDIA-AI-IOT/deepstream_tao_apps repository on GitHub (master branch) has no image_mean.

So why do I still get the error?

I will check further.

Unfortunately, there is an issue in TLT 3.0: the img_mean is not set correctly before training, so torch mode fails even when image_mean is absent from the spec.
Please use preprocess_mode "caffe" instead.
We will fix it in the next release.


ok
Thanks

I have another question. yolov3 preprocessing scales pixels to between 0 and 1, so is torch mode the best way to get a yolov3 ImageNet pretrained model?

Some clarification.

  • TLT yolo_v3 preprocessing does not scale pixels to 0~1.
    See https://github.com/NVIDIA-AI-IOT/deepstream_tlt_apps/blob/master/configs/yolov3_tlt/pgie_yolov3_tlt_config.txt
    net-scale-factor=1.0
    offsets=103.939;116.779;123.68

  • The blog shows end users how to train a pretrained model on the ImageNet dataset. They can then train a yolo_v3/ssd/faster_rcnn/etc. model using this ImageNet-trained model as pretrained weights.

  • The blog actually uses TLT classification to train that pretrained model on ImageNet, so we cannot conclude from it which mode is better. Some spec files do not use torch mode.
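
As a rough check of the first point, the sketch below applies the per-pixel formula documented for DeepStream's Gst-nvinfer, y = net-scale-factor * (x - offsets), using the two values quoted from the config. The function name is hypothetical, and treating the offsets as per-channel means in the model's channel order is an assumption.

```python
import numpy as np

# Values quoted from pgie_yolov3_tlt_config.txt above.
NET_SCALE_FACTOR = 1.0
OFFSETS = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def nvinfer_style_normalize(pixels):
    """Apply y = net-scale-factor * (x - offsets) per channel (illustrative)."""
    return NET_SCALE_FACTOR * (pixels.astype(np.float32) - OFFSETS)


if __name__ == "__main__":
    black = np.zeros((1, 1, 3), dtype=np.uint8)
    white = np.full((1, 1, 3), 255, dtype=np.uint8)
    # Outputs span roughly -124 to 151, far outside the 0~1 range.
    print(nvinfer_style_normalize(black)[0, 0])
    print(nvinfer_style_normalize(white)[0, 0])
```

With a scale factor of 1.0 and mean subtraction only, this matches the caffe-style zero-centering rather than torch-style 0~1 scaling.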
