The Asus Ascent is power limited after an OOM crash or any other crash that involves terminating a CUDA workload. After this happens, nvidia-smi reports power usage of only 5 to 9 W, even after a reboot. The performance of the GB10 is restored only after the device is powered off and its power brick is unplugged for a while. In the limited state the temperature reported by nvidia-smi is 38 C. In the normal state nvidia-smi reports power usage up to ~80 W and temperature up to 55 C at 96% GB10 load. A gap this large is obviously something that can be detected.
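For anyone who wants to check for this state, here is a minimal sketch of how the readings above could be turned into a check. It polls nvidia-smi through its `--query-gpu` CSV interface and flags a sample where the GPU looks busy but draws almost no power. The function names, the 15 W ceiling and the 90% utilization floor are my own assumptions based on the numbers in this post, not anything documented by Nvidia. It also assumes that utilization still reads high while the GB10 is throttled, which I have not confirmed.

```python
import subprocess

# Fields requested from nvidia-smi; all three are standard --query-gpu names.
QUERY = "power.draw,temperature.gpu,utilization.gpu"


def parse_sample(line):
    """Parse one CSV line into (watts, temp_c, util_pct).

    A field nvidia-smi cannot report (e.g. "[N/A]") becomes None.
    """
    def to_float(field):
        try:
            return float(field.strip())
        except ValueError:
            return None

    fields = line.split(",")
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}: {line!r}")
    return tuple(to_float(f) for f in fields)


def looks_power_limited(watts, util_pct, max_watts=15.0, min_util=90.0):
    """Heuristic (assumed thresholds): busy GPU drawing very little power."""
    if watts is None or util_pct is None:
        return False  # not enough data to decide
    return util_pct >= min_util and watts <= max_watts


def read_gpu_sample():
    """Query the first GPU once via nvidia-smi and return the parsed sample."""
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_sample(out.splitlines()[0])
```

To use it, start a CUDA workload and then call `watts, temp_c, util = read_gpu_sample()` followed by `looks_power_limited(watts, util)`. A single sample can be misleading right after a workload starts, so it is probably better to poll a few times before drawing a conclusion.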
The software is fully updated. The firmware is up to date. The machine's power brick has been unplugged several times after the firmware updates.
This hardware promised a lot more. It delivers too little right now. It's slow and buggy. It's really not worth the money.
I've provided the diagnostic log bundles. One bundle is the result of a diagnostic run after a crash. The other one is the result of a diagnostic run after unplugging the power brick and powering the system again.
I haven't heard anything from anyone at Nvidia since providing the diagnostic files as instructed. It took about 1 h 40 min to obtain the diagnostics both in a bad state, with the GPU power limited, and in a good state after a full power cycle.
This GB10 doesn't end up in a power limited state as long as it doesn't crash due to an OOM condition. It doesn't crash during regular use (no OOM) either. I'll keep testing and RMA if nobody at Nvidia follows up. I can't believe how bad this hardware is. The GPU drivers are old. The hardware and the firmware appear to be buggy.
We have received the logs and confirmed they do not show an issue with the unit. However, I have reached out to engineering and will report back when I have more information.
Thank you. I had no idea what was going on after sending the requested logs. It's good to know that such an expensive device isn't about to become a paperweight due to some faulty components. I'll wait for a proper solution (firmware update, recommendation to RMA, etc.). It's unacceptable for such expensive hardware to be this bad. It's not a cheap $10 device bought off some random website from Asia. Nvidia doesn't look too good to me. The hardware performs poorly as it is, even when it's not throttled and doesn't run into bugs.
nvidia-smi currently reports the power at 4-14 W. There was no crash this time. Inference performance dropped. The GB10 is completely useless in this state.
Unplugging the power brick for 2 minutes after shutdown worked around the problem. This GB10 had been turned off for a while. It appears to have booted in this broken state.