TX1 GPU test issue

Hi everyone,
On our TX1 platform (JetPack 3.2.1), 12 oceanFFT processes were started after boot-up. After 21 hours, we found that all of the oceanFFT processes had exited without any kernel message.
Could anyone give us some guidance on debugging this issue? Or is there something wrong with the TX1 modules at the hardware level?
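
Since the processes exit without leaving a kernel message, one way to narrow this down is to start them from a small supervisor that records how each one ends (normal exit code versus termination by a signal). The following is only a minimal sketch of that idea, not an NVIDIA tool: the function names `describe_exit` and `run_and_log` are hypothetical, and it assumes Python 3 is available on the board.

```python
import signal
import subprocess
import sys
import time


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description."""
    if returncode < 0:
        # subprocess reports a negative value when the child was
        # terminated by a signal (e.g. -9 for SIGKILL).
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown"
        return "killed by signal {} ({})".format(name, signum)
    return "exited with code {}".format(returncode)


def run_and_log(cmd, count, log=sys.stdout, poll_seconds=1.0):
    """Start `count` copies of `cmd` and log when and how each one ends.

    Returns a dict mapping instance index to its return code.
    """
    procs = {i: subprocess.Popen(cmd) for i in range(count)}
    results = {}
    while procs:
        for i, proc in list(procs.items()):
            rc = proc.poll()
            if rc is None:
                continue  # still running
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            print("{} instance {} (pid {}) {}".format(
                stamp, i, proc.pid, describe_exit(rc)), file=log, flush=True)
            results[i] = rc
            del procs[i]
        if procs:
            time.sleep(poll_seconds)
    return results
```

For the test described above, it could be called as `run_and_log(["./oceanFFT"], count=12)`, where the path to the oceanFFT binary is an assumption and depends on where the CUDA samples were built. A signal in the log points to something outside the process killing it, while a non-zero exit code points to the sample itself hitting an error.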

Hi liuke,

I need to clarify a few questions with you:

  1. Have you enabled performance mode? ($ sudo ./jetson_clocks.sh)
  2. How long do you intend the test to run?
  3. Is the issue only reproduced on JetPack-3.2.1? Can you reproduce it with JetPack-3.3?

Hi carolyuu,
Thanks for your reply. Here is our test information:
1. Performance mode is enabled (sudo ./jetson_clocks.sh).
2. Our test goal is 7x24 hours.
3. We have only tested it on JetPack 3.2.1 because our customer's work is based on it.
We also found that, when running 7 oceanFFT processes, some TX1 modules power down automatically while other TX1 modules do not.
Could there be something wrong with the quality of the TX1 modules?

Hi liuke,

I ran the oceanFFT sample for about 22 hours on JetPack-3.2.1/TX1, but I couldn't reproduce the process exit or shutdown issue.
Can you reproduce the process exit issue every time? Please list the serial numbers of the TX1 modules that reproduce the shutdown issue so that we can check them. Thanks!

Hi carolyuu,
Thanks for your reply. We now have 3 TX1 modules, and all of them reproduce the oceanFFT exit issue. Their serial numbers are:

module1: 04219180701790400303
module2: 04219180686770008205
module3: 042191806990808085ff

The TX1 modules that shut down automatically have been returned by our customer and are currently in the mail. We will continue testing over the next few days. Please bear with us; we will post the new test results later.