I’m planning to fine-tune Qwen/Qwen3-VL-30B-A3B-Instruct using NeMo AutoModel on a single-node 8xH100 setup with LoRA (PEFT).
I see from the VLM model coverage table that this model is supported with both FSDP2 and PEFT, with the reference config `qwen3_vl_moe_30b_te_deepep.yaml`.
However, I can’t find the actual contents of this YAML file in the docs or the GitHub repo examples. Could you share the reference config or point me to where it’s published?
Specifically I need to know:
- LoRA `target_modules` — I'm targeting `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj` and excluding the MoE router. Is this correct?
- Batch size recommendations for 8x H100 80 GB with this MoE model (128 experts, top-8)
- Does the `nvcr.io/nvidia/nemo-automodel:26.02` container include the MoE LoRA improvements from PR #1300 (merged 2026-02-26)?
- DeepEP configuration — is `ep_size: 8` the right setting for 8 GPUs?
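To make the question concrete, here is the fragment I'm currently guessing at. The key names and structure below are my own assumptions pieced together from the docs, not the contents of the actual reference config, so please correct anything that's off:

```yaml
# My current guess at the relevant sections — key names are assumptions,
# NOT taken from the real qwen3_vl_moe_30b_te_deepep.yaml.
peft:
  peft_scheme: lora
  lora:
    dim: 32            # rank 32, matching my setup
    alpha: 64          # assumed 2x rank; is there a recommended value?
    target_modules:
      - q_proj
      - k_proj
      - v_proj
      - o_proj
      - gate_proj
      - up_proj
      - down_proj
    # MoE router deliberately excluded from LoRA targets

distributed:
  strategy: fsdp2      # single node, 8 GPUs

moe:
  backend: deepep
  ep_size: 8           # assumed: one expert-parallel rank per GPU
```

If the published config uses a different schema (e.g. different section names or a separate DeepEP block), a pointer to the real file would clear all of this up at once.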
My setup:
- Hardware: 8x NVIDIA H100 80GB (single node, Thunder Compute)
- Model: Qwen/Qwen3-VL-30B-A3B-Instruct (NOT the FP8 variant)
- Data: 1M text Q&A pairs (instruction/output format, Hebrew + English)
- Method: LoRA, rank 32, FSDP2
- Container: `nvcr.io/nvidia/nemo-automodel:26.02`
Thank you!
Vitali Yudilevich
Founder, Allocator (allocator.live)