NVIDIA TAO 5.5 Release: New Foundation Models and Training Capabilities

Highlights from this release:

  • Explore new foundation and multi-modal models:
    • Grounding-DINO—Open vocabulary object detection with fine-tuning
    • Mask-GroundingDINO—Open vocabulary instance segmentation with fine-tuning
    • NV-CLIP—Foundation model for image and text embedding
    • BEVFusion—Sensor fusion model combining image and lidar data for 3D understanding with fine-tuning
    • SEGIC—In-context segmentation of any object based on visual prompting
    • FoundationPose—Six-DoF object pose estimation for novel objects
    • Mask2Former—State-of-the-art instance and panoptic segmentation model with fine-tuning
  • Automatically create labeled datasets for object detection and segmentation using text prompts.
  • Knowledge distillation—Create smaller, more efficient networks that stay accurate by distilling knowledge from larger networks.
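
For readers new to knowledge distillation, here is a minimal sketch of the general idea in plain Python. It assumes the classic logit-based formulation, where a small student network is trained to match the temperature-softened output distribution of a larger teacher. This is not TAO's implementation, and the function names, the temperature value, and the example logits are all hypothetical, chosen only for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Generic logit-based distillation loss (an assumption for this sketch,
    not the loss TAO uses).
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    return kl * temperature ** 2


# Hypothetical logits for a three-class problem.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # disagrees with the teacher

print("close student loss:", distillation_loss(close_student, teacher))
print("far student loss:  ", distillation_loss(far_student, teacher))
```

A student whose outputs track the teacher's gets a lower loss than one that disagrees, which is the signal a distillation training loop minimizes, usually alongside the ordinary task loss.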

Download NVIDIA TAO: NVIDIA/tao_tutorials on GitHub (quick start scripts and tutorial notebooks to get started with TAO Toolkit)

Learn more about NVIDIA TAO: TAO Toolkit | NVIDIA Developer

Get started page: Get Started with TAO Toolkit | NVIDIA Developer

Developer video tutorials: Playlist | TAO Video Tutorials | NVIDIA On-Demand

Source code can be found at the bottom of the linked page.