Unit reboots when Whisper model is run on AGX Xavier Industrial

Hi,

We have an issue reported by our customers.
Whenever they run the Whisper model, the HDMI display goes off and the unit reboots automatically.

We are running JetPack 5.1.2 (L4T 35.4.1).

Please let us know what the root cause might be.

Thanks

Hi,
We are not sure what the Whisper model is. Please share more information about it. Also, have you tried it on a developer kit?

It seems to be the larger Whisper model.

As I am a bit new to running these models, could you please point me to any reference steps
for running this model?

I am installing JetPack 5.1.2, which includes CUDA and the other related library packages, on my NVIDIA dev kit.

Please tell us the further steps to be followed. Thanks.

Hi,
Are you using AGX Xavier Industrial or AGX Orin Industrial? It looks like Xavier is not supported per

Whisper - NVIDIA Jetson AI Lab

We are using AGX Xavier Industrial.

Are you sure it is not supported at all?
Is that the reason why the display goes off when we run the Whisper model?

Or is there any workaround we can follow so that we can run the Whisper model on Jetson AGX Xavier Industrial?

@dusty_nv Could you please provide your thoughts on this Whisper model issue on Jetson AGX Xavier Industrial?
As per the link
Whisper - NVIDIA Jetson AI Lab, AGX Xavier Industrial is not listed.

Note: We are running the Whisper large model outside the container.

I was able to run the Whisper large model successfully on the Jetson AGX Xavier dev kit, but only in the 30W power modes (MODE 30W ALL, MODE 30W 6 CORES, etc.).

In MAXN power mode, however, the unit restarts when we run the Whisper large model. What is the reason, and is there a workaround for this issue?
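For anyone reproducing this, here is a minimal sketch for recording which power mode is active before each run. It assumes that `nvpmodel -q` prints a line of the form `NV Power Mode: <name>`; the helper names `parse_power_mode` and `current_power_mode` are hypothetical, and this is not the attached test_gpu.txt.

```python
import subprocess
from typing import Optional


def parse_power_mode(nvpmodel_output: str) -> Optional[str]:
    """Return the mode name found in `nvpmodel -q` output, or None if absent."""
    for line in nvpmodel_output.splitlines():
        # Assumed line format: "NV Power Mode: MAXN"
        if line.strip().startswith("NV Power Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def current_power_mode() -> Optional[str]:
    """Query the active power mode on the device (may need root)."""
    result = subprocess.run(
        ["nvpmodel", "-q"], capture_output=True, text=True, check=True
    )
    return parse_power_mode(result.stdout)
```

Calling `current_power_mode()` right before loading the model makes it easy to tie each reboot to the mode that was active at the time.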

The Python program we are using is attached for your reference.
test_gpu.txt (854 Bytes)

Any idea how to optimize this program so that it works fine in MAXN power mode?
Since this issue also happens on the dev kit, please provide a solution.

A video showing the jtop details just before the unit reboots while running the Whisper model in MAXN mode is attached for reference.

Thanks.
Jetson_Agx_Xavier_Shutdown-MAXN_whisper_model.zip (11.9 MB)
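As an alternative to a screen recording, the readings just before the reboot could be kept in a text log. This is a rough sketch, assuming `tegrastats` is on the PATH and accepts `--interval <ms>`; the function names and the log path are hypothetical. Each line is flushed and fsynced so that the last samples are more likely to survive a sudden reset.

```python
import os
import subprocess
from typing import Iterable


def log_durably(lines: Iterable[str], path: str) -> int:
    """Append each line to `path`, forcing it to disk before reading the next."""
    count = 0
    with open(path, "a", encoding="utf-8") as log:
        for line in lines:
            log.write(line.rstrip("\n") + "\n")
            log.flush()
            os.fsync(log.fileno())  # push the sample to storage immediately
            count += 1
    return count


def log_tegrastats(path: str, interval_ms: int = 500) -> None:
    """Stream tegrastats output into a durable log until the process ends."""
    proc = subprocess.Popen(
        ["tegrastats", "--interval", str(interval_ms)],
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        log_durably(proc.stdout, path)
    finally:
        proc.terminate()
```

Running `log_tegrastats("/home/user/tegrastats.log")` in a second terminal while the Whisper script runs would leave the final readings in the file after the unit comes back up.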

@DaneLLL

Any updates on this? Please let us know the reason for this behaviour.

Hi,
Since Xavier is not listed as a supported platform, it may not work properly under certain conditions. We suggest using one of the platforms in the list.

Thanks for the information.

But we need to provide technical documentation and details to the customer, justifying why the unit resets in MAXN mode with the Whisper large model on AGX Xavier Industrial.

It would be great if we could get the actual reason for the unit restarting.

@carolyuu @dusty_nv

Could you please provide support and help with the issue in this thread? Thanks.

Someone please reply to the queries.
Thanks.

You could ask your customer to dump the UART serial console log.

I saw you post this suggestion to other users in other threads.
It would help with your own issue too.

If the system suddenly reboots, the serial console log might provide some hint about what is going on.

You are right. We had planned to keep debug console log analysis as the last option.
Before that, we provided some troubleshooting and workaround methods, and we are waiting for the customer to get back to us.

We have provided some workaround solutions for the unit reset issue in MAXN mode,
and we would like someone to comment on our observations.

  1. We don't have fans on the units; they are delivered with conduction-cooled plates.
  2. If we reduce the GPU frequency from the maximum to one or two steps below the maximum in the list of available GPU frequencies, the unit does not reset.
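Observation 2 can be sketched roughly as follows. This assumes the GPU exposes the standard Linux devfreq files `available_frequencies` and `max_freq`; the directory in `GPU_DEVFREQ` is an assumption for Xavier and should be checked on the unit, and the function names are hypothetical. Writing `max_freq` needs root.

```python
from pathlib import Path
from typing import List

# Assumed devfreq directory for the Xavier GPU; verify the path on your unit.
GPU_DEVFREQ = Path("/sys/devices/gpu.0/devfreq/17000000.gv11b")


def parse_frequencies(text: str) -> List[int]:
    """Parse a whitespace-separated frequency list (Hz) into ascending order."""
    return sorted(int(token) for token in text.split())


def capped_frequency(frequencies: List[int], steps_below_max: int) -> int:
    """Return the frequency `steps_below_max` entries under the maximum."""
    if not frequencies:
        raise ValueError("no frequencies available")
    ordered = sorted(set(frequencies))
    # Clamp at the lowest entry if asked to step past the start of the list.
    index = max(0, len(ordered) - 1 - steps_below_max)
    return ordered[index]


def apply_gpu_cap(steps_below_max: int = 1, devfreq: Path = GPU_DEVFREQ) -> int:
    """Write the reduced cap to max_freq and return the value written."""
    available = parse_frequencies((devfreq / "available_frequencies").read_text())
    cap = capped_frequency(available, steps_below_max)
    (devfreq / "max_freq").write_text(str(cap))
    return cap
```

For example, `apply_gpu_cap(1)` caps the GPU one step below its maximum and `apply_gpu_cap(2)` two steps below. Other tools that manage clocks may overwrite this setting, so it is worth reading `max_freq` back after the model starts.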

Hi,
Since we validated it on the Orin series:

Whisper - NVIDIA Jetson AI Lab

It is possible that it does not work properly on other platforms. Please kindly note this.

Thanks.

We wanted a more thorough technical explanation of why the unit reboot issue does not happen at a slightly lower GPU frequency.

Is there any better technical clarification for this Whisper model issue on AGX Xavier Industrial in MAXN mode?
