TAO Toolkit 5.2 (5.2.0.1-pyt1.14.0:Segformer) - OSError: [Errno 39] Directory not empty: '/results/train/.eval_hook'

There has been no update from you for a while, so we assume this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks.

Hi @IainA, in this case validation_interval is larger than max_iters, so evaluation will not run during training.
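To make the point about validation_interval and max_iters concrete, here is a minimal sketch. It assumes evaluation fires every validation_interval iterations, counted from 1 up to max_iters; that schedule is my assumption for illustration, not something taken from the TAO source. The helper name evaluation_iterations is hypothetical.

```python
from typing import List


def evaluation_iterations(max_iters: int, validation_interval: int) -> List[int]:
    """Return the iterations at which evaluation would run.

    Assumption (not confirmed against TAO): evaluation runs at every
    multiple of validation_interval that is <= max_iters.
    """
    if max_iters < 0 or validation_interval <= 0:
        raise ValueError("max_iters must be >= 0 and validation_interval > 0")
    return list(range(validation_interval, max_iters + 1, validation_interval))


# Values from the spec in this post: evaluation runs once, at the last iteration.
print(evaluation_iterations(max_iters=220, validation_interval=220))

# An interval larger than max_iters (500 is an illustrative value): no evaluation.
print(evaluation_iterations(max_iters=220, validation_interval=500))
```

Under that assumption, any validation_interval larger than max_iters gives an empty list, which matches the behaviour described above.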

Also, I cannot reproduce the '/results/train/.eval_hook' error during training.
Below is my command and spec file.
$ docker run --runtime=nvidia -it --rm -v /home/morganh:/home/morganh nvcr.io/nvidia/tao/tao-toolkit:5.2.0.1-pyt1.14.0 /bin/bash

Then run training.
# segformer train -e /home/morganh/demo_3.0/forum_repro/segformer/spec_isbi.yaml -r /home/morganh/demo_3.0/forum_repro/segformer/result_train -g 1

# cat spec_isbi.yaml
train:
  exp_config:
      manual_seed: 49
  checkpoint_interval: 50
  logging_interval: 50
  max_iters: 220
  resume_training_checkpoint_path: null
  validate: True
  validation_interval: 220
  trainer:
      find_unused_parameters: True
      sf_optim:
        lr: 0.00006
model:
  input_height: 512
  input_width: 512
  pretrained_model_path: null
  backbone:
    type: "mit_b1"
dataset:
  input_type: "grayscale"
  img_norm_cfg:
        mean:
          - 127.5
          - 127.5
          - 127.5
        std:
          - 127.5
          - 127.5
          - 127.5
        to_rgb: True
  data_root: /tlt-pytorch
  train_dataset:
      img_dir:
        - /home/morganh/demo_2.0/unet/data/isbi/images/train
      ann_dir:
        - /home/morganh/demo_2.0/unet/data/isbi/images/train
      pipeline:
        augmentation_config:
          random_crop:
            cat_max_ratio: 0.75
          resize:
            ratio_range:
              - 0.5
              - 2.0
          random_flip:
            prob: 0.5
  val_dataset:
      img_dir: /home/morganh/demo_2.0/unet/data/isbi/images/val
      ann_dir: /home/morganh/demo_2.0/unet/data/isbi/images/val
  palette:
    - seg_class: foreground
      rgb:
        - 0
        - 0
        - 0
      label_id: 0
      mapping_class: foreground
    - seg_class: background
      rgb:
        - 255
        - 255
        - 255
      label_id: 1
      mapping_class: background
  repeat_data_times: 500
  batch_size: 1
  workers_per_gpu: 1

Could you try my steps as well?
Also, to narrow this down, please set -r to a full local path. In my case, it is -r /home/morganh/demo_3.0/forum_repro/segformer/result_train.
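If it helps, a quick check of the results directory before training could look like the sketch below. It only reports whether the path is absolute and whether a non-empty train/.eval_hook directory is already there. The train/.eval_hook layout is inferred from the path in the error message, and the helper name inspect_results_dir is hypothetical; this is a diagnostic only, not a confirmed fix.

```python
import os
from typing import List


def inspect_results_dir(results_dir: str) -> List[str]:
    """Return human-readable notes about a results directory passed to -r."""
    notes = []
    # A relative path is resolved against the current working directory,
    # which may not be where you expect inside the container.
    if not os.path.isabs(results_dir):
        notes.append("not an absolute path: " + results_dir)
    # Assumption: the hook directory sits at <results_dir>/train/.eval_hook,
    # as suggested by the path in the error message.
    hook_dir = os.path.join(results_dir, "train", ".eval_hook")
    if os.path.isdir(hook_dir) and os.listdir(hook_dir):
        notes.append("non-empty directory from an earlier run: " + hook_dir)
    return notes


# Path used in the reproduction above.
for note in inspect_results_dir("/home/morganh/demo_3.0/forum_repro/segformer/result_train"):
    print(note)
```

An empty result means neither condition was found for that path.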