Please provide the following information when requesting support.
• Hardware (T4/V100/Xavier/Nano/etc): NVIDIA RTX PRO 4000
• Network Type (Detectnet_v2/Faster_rcnn/Yolo_v4/LPRnet/Mask_rcnn/Classification/etc): SegFormer
• TLT Version (Please run “tlt info --verbose” and share “docker_tag” here)
• Training spec file(If have, please share here)
• How to reproduce the issue ? (This is for errors. Please share the command line and the detailed log here.)
I want to fine-tune the SegFormer network on my own data (two classes: tumor and background). How can I get the pre-trained model and its spec?
Please refer to https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/pretrained_segformer_imagenet/ .
For example, the pretrained model for fan_base is at https://catalog.ngc.nvidia.com/orgs/nvidia/teams/tao/models/pretrained_segformer_imagenet/files?version=fan_hybrid_base_in22k_1k_384 .
BTW, for nv_dino_v2 models, see for example:
https://catalog.ngc.nvidia.com/orgs/nvaie/models/nv_dinov2_classification_model/files
https://catalog.ngc.nvidia.com/orgs/nvaie/models/imagenet_nv_dinov2/files
You can set a larger input size (change 224 to 512) and use a larger backbone (e.g., fan_base).
Below is an example I ran with an older docker, nvcr.io/nvidia/tao/tao-toolkit:5.5.0-pyt . You can try it as well.
I have a question regarding the fanbase.yaml spec: do I need to modify anything to adapt it to my custom data? For example, my images are 512x512 with just one channel.
In your spec (fanbase.yaml) I see something like:
img_scale:
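On the single-channel question: ImageNet-pretrained backbones are trained on 3-channel input, so one common workaround is to replicate the grayscale channel three times before training. The sketch below is only an illustration under that assumption; it is not confirmed in this thread that TAO SegFormer requires it, and the helper names `gray_to_rgb` and `mask_to_class_ids`, as well as the 0 = background / 1 = tumor labeling, are hypothetical.

```python
import numpy as np


def gray_to_rgb(gray: np.ndarray) -> np.ndarray:
    """Replicate a single-channel (H, W) or (H, W, 1) image into (H, W, 3)."""
    if gray.ndim == 3 and gray.shape[2] == 1:
        gray = gray[:, :, 0]
    if gray.ndim != 2:
        raise ValueError(f"expected a single-channel image, got shape {gray.shape}")
    # Same intensity in all three channels, dtype preserved.
    return np.stack([gray, gray, gray], axis=-1)


def mask_to_class_ids(mask: np.ndarray, threshold: int = 0) -> np.ndarray:
    """Map a mask to class ids (assumed labeling): 0 = background, 1 = tumor."""
    return (mask > threshold).astype(np.uint8)


if __name__ == "__main__":
    # Synthetic 512x512 one-channel image and mask, standing in for real data.
    image = np.random.randint(0, 256, size=(512, 512), dtype=np.uint8)
    mask = np.zeros((512, 512), dtype=np.uint8)
    mask[100:200, 100:200] = 255  # pretend tumor region

    print(gray_to_rgb(image).shape)
    print(np.unique(mask_to_class_ids(mask)))
```

Converted images can then be saved with any image library and referenced from the dataset paths in the spec.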
I couldn’t run your previous experiment due to GPU compatibility:
docker run --gpus all -it --rm \
  -u $(id -u):$(id -g) \
  -v /home/cvig/CVIG/Devel/tao_experiments_segformer_ccg:/workspace/tao_experiments \
  nvcr.io/nvidia/tao/tao-toolkit:5.5.0-pyt /bin/bash
===========================
=== TAO Toolkit PyTorch ===
NVIDIA Release 5.5.0-PyT (build 88113656)
TAO Toolkit Version 5.5.0
Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES. All rights reserved.
This container image and its contents are governed by the TAO Toolkit End User License Agreement.
By pulling and using the container, you accept the terms and conditions of this license:
WARNING: Detected NVIDIA RTX PRO 4000 Blackwell Generation Laptop GPU GPU, which is not yet supported in this version of the container
ERROR: No supported GPU(s) detected to run this container
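When launching containers from a script, the banner above can be checked for this condition before starting a long job. This is a minimal sketch that assumes the warning and error lines keep the exact wording shown in the log above; the function names are hypothetical and not part of TAO.

```python
import re
from typing import Optional

# Patterns copied from the container banner shown above.
UNSUPPORTED_GPU = re.compile(
    r"WARNING: Detected (.+?) GPU, which is not yet supported"
)
NO_GPU_ERROR = "ERROR: No supported GPU(s) detected"


def unsupported_gpu_name(log: str) -> Optional[str]:
    """Return the GPU name from the 'not yet supported' warning, if present."""
    match = UNSUPPORTED_GPU.search(log)
    return match.group(1) if match else None


def container_rejected_gpu(log: str) -> bool:
    """True if the container reported that no supported GPU was found."""
    return NO_GPU_ERROR in log


if __name__ == "__main__":
    banner = (
        "WARNING: Detected NVIDIA RTX PRO 4000 Blackwell Generation Laptop GPU GPU, "
        "which is not yet supported in this version of the container\n"
        "ERROR: No supported GPU(s) detected to run this container\n"
    )
    if container_rejected_gpu(banner):
        print("Container does not support:", unsupported_gpu_name(banner))
```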
However, I managed to get better results using a larger model and a larger image resolution (512x512):
OK, the NVIDIA RTX PRO 4000 (Blackwell) is compatible with the TAO 6.x docker, not the TAO 5.5 docker.
Glad to know the result is better now. So, you are still running with the TAO 6.0 docker instead of the TAO 5.5 docker, right? Just to confirm, since your latest train yaml is in the TAO 6 format.