Dear Team,
I am raising a major concern regarding the quality and accuracy of the code and architectural knowledge provided by Nsight Copilot, which is supposedly built on a robust “CUDA Knowledge Model.”
I requested a minimal, self-contained CUTLASS example for an NVFP4 GEMM. The provided solution is not only deeply flawed but also contains a critical error regarding hardware compatibility that makes the example entirely unusable and dangerously misleading for any developer working with new architectures.
1. The Core Factual Error
Nsight Copilot’s Claim:
The generated documentation and code repeatedly assert that NVFP4 is supported on SM 90 (Hopper) and even suggest SM 89 (Ada).
- "Important: NVFP4 is only supported on GPUs with compute capability 9.0+ (Hopper)."
- Code snippet: arch::Sm90 and CUDA_ARCHITECTURES 90 are used.
The Undeniable Reality:
Based on all official NVIDIA documentation and hardware releases:
- The NVFP4 data type is only supported on GPUs with compute capability 10.0 (SM 100, Blackwell) or newer.
- SM 90 (Hopper) supports FP8, but NOT NVFP4.
- SM 89 (Ada) supports FP8, but NOT NVFP4.
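For reference, the architecture gate I expected the generated example to apply can be sketched in a few lines. This is only a sketch: ComputeCapability and supports_nvfp4 are hypothetical names of my own, not part of CUTLASS or the CUDA runtime, and the threshold simply restates the requirement above. In a real program the major/minor values would presumably come from the major and minor fields of cudaDeviceProp.

```cpp
// Sketch only: the threshold restates the requirement described in this
// report; it is not queried from any NVIDIA API.
struct ComputeCapability {
    int major;  // e.g. cudaDeviceProp::major
    int minor;  // e.g. cudaDeviceProp::minor
};

// Hypothetical helper: NVFP4 needs compute capability 10.0 (SM 100) or newer.
constexpr bool supports_nvfp4(ComputeCapability cc) {
    return cc.major >= 10;
}

// Compile-time checks for the architectures discussed in this report.
static_assert(!supports_nvfp4(ComputeCapability{8, 9}), "Ada (SM 89): no NVFP4");
static_assert(!supports_nvfp4(ComputeCapability{9, 0}), "Hopper (SM 90): no NVFP4");
static_assert(supports_nvfp4(ComputeCapability{10, 0}), "Blackwell (SM 100): NVFP4");
```

A generated example that ran a check like this before launching the GEMM would fail with a clear message on Hopper instead of an opaque "Unsupported architecture" error.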
2. The Unacceptable Implication for an “AI Programming Tool”
This is not a minor bug; this is a catastrophic failure of the tool’s core knowledge base.
- The CUDA Knowledge Model is fundamentally flawed: If an AI assistant designated to help with CUDA can be factually wrong about the architectural requirements for a major, recently introduced data type (NVFP4), its utility for cutting-edge GPU programming is zero.
- Waste of developer time: Developers trust these tools to provide accurate starting points. Instead, this example leads to immediate and frustrating compilation/runtime errors (e.g., "Unsupported architecture") and forces developers to debug the AI tool's output instead of their own code.
- Unacceptable for a professional tool: This is the equivalent of a hardware manual listing the wrong system requirements. The tool is actively giving out outdated or fabricated information.
3. Immediate Action Required
You must urgently update the underlying knowledge model for Nsight Copilot.
- Correct the NVFP4 requirement: The tool must accurately state that NVFP4 requires SM 100+ (Blackwell).
- Update the example: The generated example must use arch::Sm100 and recommend CUDA_ARCHITECTURES 100.
A tool that misinforms users about hardware support, especially for new architectures, is worse than useless—it actively hinders development. Please address this severe knowledge deficiency immediately.
