Hello. I am profiling the GPT-2 model (from Hugging Face) on a DGX-A100 system using Nsight Systems. In some parts of the profile results, neither computation (kernel execution) nor memory operations are performed (the purple range is due to CUDA stream synchronization; the cause of the yellow range is unknown). I want to know why nothing is done in the yellow case. My guess is that it's because the kernel execution time is too short.
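To make the idle ranges easier to discuss, here is a minimal sketch of how the gaps could be measured, assuming the start and end times of the GPU activities (kernels and memory copies) have been exported from the report into a list. The `idle_gaps` helper and the timestamps are hypothetical and made up for illustration; they are not taken from the actual profile.

```python
def idle_gaps(intervals, window_start, window_end):
    """Return the (start, end) ranges inside the window that no interval covers."""
    gaps = []
    cursor = window_start  # end of the GPU activity seen so far
    for start, end in sorted(intervals):
        if end <= cursor:
            continue  # fully inside time that is already covered
        if start >= window_end:
            break  # everything from here on lies outside the window
        if start > cursor:
            gaps.append((cursor, start))  # nothing ran between cursor and start
        cursor = max(cursor, end)
    if cursor < window_end:
        gaps.append((cursor, window_end))  # idle tail of the window
    return gaps


# Hypothetical GPU activity (kernels and memcpys), in microseconds.
activity = [(0, 40), (35, 60), (100, 120), (125, 126)]
gaps = idle_gaps(activity, 0, 150)
print(gaps)
```

Comparing the length of each gap with the length of the neighbouring kernels would show whether the kernels are short relative to the idle time around them, which is what my guess above assumes.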
Is there any chance that you could share the .nsys-rep file with us?