Is it possible to adjust GPU voltage on the TX2?

We want to increase the GPU voltage on the TX2. Is it possible?
Our VDD_IN is 12V.

Hi,

What is the purpose of doing this? We don’t support it.

Our application uses the GPU at almost 99%, and sometimes the TX2 hangs or simply reboots.

So we suspect the GPU power supply may be insufficient or unstable.

The following is the boot log from one of the hangs. It happened during a long-running GPU and CPU stress test: the system rebooted and then hung during boot.

[0000.167] C> I2C command failed
[0000.170] C> block index = (4) and rail_id = (1)
[0000.175] C> Addr: Reg = [0xe8:0x07]: 336166925
[0000.179] C> I2C command failed
[0000.182] C> block index = (5) and rail_id = (1)
[0000.187] C> Addr: Reg = [0xe8:0x07]: 336166925
[0009.534] I> Welcome to MB2(TBoot-BPMP)(version: 01.00.160913-t186-M-00.00-mobile-c4328dc3)
[0009.542] I> Default Heap @ [0xd486400 - 0xd488400]
[0009.547] I> DMA Heap @ [0x85200000 - 0x86200000]
[0009.552] I> bit @ 0xd480000
[0009.554] I> BR-BCT relocated to 0xd7020000
[0009.559] I> Boot-device: eMMC
[0009.562] I> sdmmc bdev is already initialized
[0009.567] I> pmic: reset reason (nverc)	: 0x0
[0009.571] I> Reading GPT from 512 for device 00000003
[0009.577] I> Reading GPT from 16776704 for device 00000003
[0009.585] I> Found 13 partitions in 00000003 device
[0009.589] I> Reading GPT from 512 for device 00010003
[0009.596] I> Found 29 partitions in 00010003 device
[0009.601] I> OEM carveouts init scrub in progress...
[0009.683] I> Mb2 SDRAM scrub successful
[0009.686] W> No valid slot number is found in scratch register
[0009.692] W> Return default slot: _a
[0009.695] I> A/B: bin_type (16) slot 0
[0009.699] I> Loading partition bpmp-fw at 0xd7800000
[0009.704] I> Reading two headers - addr:0xd7800000 blocks:1
[0009.709] I> Addr: 0xd7800000, start-block: 18915329, num_blocks: 1
[0009.724] I> Binary(16) of size 529040 is loaded @ 0xd7800000
[0009.730] W> No valid slot number is found in scratch register
[0009.736] W> Return default slot: _a
[0009.739] I> A/B: bin_type (17) slot 0
[0009.743] I> Loading partition bpmp-fw-dtb at 0xd79f0000
[0009.748] I> Reading two headers - addr:0xd79f0000 blocks:1
[0009.753] I> Addr: 0xd79f0000, start-block: 18917745, num_blocks: 1
[0009.762] I> Binary(17) of size 75472 is loaded @ 0xd79ed600
[0009.797] I> BPMP-FW load address = 0xd7800000
[0009.801] I> BPMP-FW DTB load address = 0x501ed600
[0009.806] I> Loading SCE-FW ...
[0009.809] W> No valid slot number is found in scratch register
[0009.815] W> Return default slot: _a
[0009.818] I> A/B: bin_type (12) slot 0
[0009.822] I> Loading partition sce-fw at 0xd7300000
[0009.826] I> Reading two headers - addr:0xd7300000 blocks:1
[0009.832] I> Addr: 0xd7300000, start-block: 18919745, num_blocks: 1
[0009.841] I> Binary(12) of size 76592 is loaded @ 0xd7300000
[0009.846] I> Init SCE
[0009.849] I> Copy BTCM section
[0009.851] W> No valid slot number is found in scratch register
[0009.857] W> Return default slot: _a
[0009.861] I> A/B: bin_type (13) slot 0
[0009.864] I> Loading partition cpu-bootloader at 0x96000000
[0009.870] I> Reading two headers - addr:0x96000000 blocks:1
[0009.875] I> Addr: 0x96000000, start-block: 18894849, num_blocks: 1
[0009.887] I> Binary(13) of size 282736 is loaded @ 0x96000000
[0009.892] W> No valid slot number is found in scratch register
[0009.898] W> Return default slot: _a
[0009.901] I> A/B: bin_type (20) slot 0
[0009.905] I> Loading partition bootloader-dtb at 0x8520f400
[0009.910] I> Reading two headers - addr:0x8520f400 blocks:1
[0009.916] I> Addr: 0x8520f400, start-block: 18896897, num_blocks: 1
[0009.926] I> Binary(20) of size 220272 is loaded @ 0x8520f400
[0009.932] I> MB2-params(VA) @ 0xd7000000
[0009.936] I> CPUBL-params(VA) @ 0xd7000000
[0009.940] I> CPUBL-params(PA) @ 0x237000000
[0009.944] I> CPU-BL loaded @ PA 0x96000000
[0009.948] I> Loading TOS ...
[0009.951] W> No valid slot number is found in scratch register
[0009.956] W> Return default slot: _a
[0009.960] I> A/B: bin_type (14) slot 0
[0009.963] I> Loading partition secure-os at 0x8530f600
[0009.969] I> Reading two headers - addr:0x8530f600 blocks:1
[0009.974] I> Addr: 0x8530f600, start-block: 18898945, num_blocks: 1
[0009.982] I> Binary(14) of size 62576 is loaded @ 0x8530f600
[0009.988] I> Copying Monitor (length: 0xf270) from 0x8530f800 to 0x40000000
[0009.995] I> Erasing Monitor @ 0x8530f800
[0010.000] I> Unhalting SCE
[0010.002] I> Primary Memory Start:80000000 Size:70000000
[0010.008] I> Extended Memory Start:f0110000 Size:145ef0000
[0010.014] I> Waypoint2-ACK: 0x52012714
[0010.018] I> MB2(TBoot-BPMP) done

NUnhandled Exception in EL3.
x30 =		0x0000000000000000
x0 =		0x0000000000000000
x1 =		0x0000000000000000
x2 =		0x0000000000000000
x3 =		0x0000000000000000
x4 =		0x0000000000000000
x5 =		0x0000000000000000
x6 =		0x0000000000000000
x7 =		0x0000000000000000
x8 =		0x0000000000000000
x9 =		0x0000000000000000
x10 =		0x0000000000000000
x11 =		0x0000000000000000
x12 =		0x0000000000000000
x13 =		0x0000000000000000
x14 =		0x0000000000000000
x15 =		0x0000000000000000
x16 =		0x0000000000000000
x17 =		0x0000000000000000
x18 =		0x0000000000000000
x19 =		0x0000000000000000
x20 =		0x0000000000000000
x21 =		0x0000000000000000
x22 =		0x0000000000000000
x23 =		0x0000000000000000
x24 =		0x0000000000000000
x25 =		0x0000000000000000
x26 =		0x0000000000000000
x27 =		0x0000000000000000
x28 =		0x0000000000000000
x29 =		0x0000000000000000
scr_el3 =		0x0000000000000000
sctlr_el3 =		0x0000000000000000
cptr_el3 =		0x0000000000000000
tcr_el3 =		0x0000000000000000
daif =		0x0000000000000000
mair_el3 =		0x0000000000000000
spsr_el3 =		0x0000000000000000
elr_el3 =		0x0000000000000000
ttbr0_el3 =		0x0000000000000000
esr_el3 =		0x0000000000000000
far_el3 =		0x0000000000000000
spsr_el1 =		0x0000000000000000
elr_el1 =		0x0000000000000000
spsr_abt =		0x0000000000000000
spsr_und =		0x0000000000000000
spsr_irq =		0x0000000000000000
spsr_fiq =		0x0000000000000000
sctlr_el1 =		0x0000000000000000
actlr_el1 =		0x0000000000000000
cpacr_el1 =		0x0000000000000000
csselr_el1 =		0x0000000000000000
sp_el1 =		0x0000000000000000
esr_el1 =		0x0000000000000000
ttbr0_el1 =		0x0000000000000000
ttbr1_el1 =		0x0000000000000000
mair_el1 =		0x0000000000000000
amair_el1 =		0x0000000000000000
tcr_el1 =		0x0000000000000000
tpidr_el1 =		0x0000000000000000
tpidr_el0 =		0x0000000000000000
tpidrro_el0 =		0x0000000000000000
dacr32_el2 =		0x0000000000000000
ifsr32_el2 =		0x0000000000000000
par_el1 =		0x0000000000000000
mpidr_el1 =		0x0000000000000000
afsr0_el1 =		0x0000000000000000
afsr1_el1 =		0x0000000000000000
contextidr_el1 =		0x0000000000000000
vbar_el1 =		0x0000000000000000
cntp_ctl_el0 =		0x0000000000000000
cntp_cval_el0 =		0x0000000000000000
cntv_ctl_el0 =		0x0000000000000000
cntv_cval_el0 =		0x0000000000000000
cntkctl_el1 =		0x0000000000000000
fpexc32_el2 =		0x0000000000000000
sp_el0 =		0x0000000000000000
isr_el1 =		0x0000000000000000
cpuectlr_el1 =		0x0000000000000000

The normal boot log is below:

[0000.152] C> I2C command failed
[0000.155] C> block index = (4) and rail_id = (1)
[0000.159] C> Addr: Reg = [0xe8:0x07]: 336166925
[0000.164] C> I2C command failed
[0000.167] C> block index = (5) and rail_id = (1)
[0000.172] C> Addr: Reg = [0xe8:0x07]: 336166925
[0009.519] I> Welcome to MB2(TBoot-BPMP)(version: 01.00.160913-t186-M-00.00-mobile-c4328dc3)
[0009.527] I> Default Heap @ [0xd486400 - 0xd488400]
[0009.532] I> DMA Heap @ [0x85200000 - 0x86200000]
[0009.536] I> bit @ 0xd480000
[0009.539] I> BR-BCT relocated to 0xd7020000
[0009.544] I> Boot-device: eMMC
[0009.547] I> sdmmc bdev is already initialized
[0009.552] I> pmic: reset reason (nverc)	: 0x44
[0009.556] I> Reading GPT from 512 for device 00000003
[0009.562] I> Reading GPT from 16776704 for device 00000003
[0009.569] I> Found 13 partitions in 00000003 device
[0009.574] I> Reading GPT from 512 for device 00010003
[0009.581] I> Found 29 partitions in 00010003 device
[0009.586] I> OEM carveouts init scrub in progress...
[0009.667] I> Mb2 SDRAM scrub successful
[0009.671] W> No valid slot number is found in scratch register
[0009.677] W> Return default slot: _a
[0009.680] I> A/B: bin_type (16) slot 0
[0009.684] I> Loading partition bpmp-fw at 0xd7800000
[0009.689] I> Reading two headers - addr:0xd7800000 blocks:1
[0009.694] I> Addr: 0xd7800000, start-block: 18915329, num_blocks: 1
[0009.709] I> Binary(16) of size 529040 is loaded @ 0xd7800000
[0009.715] W> No valid slot number is found in scratch register
[0009.721] W> Return default slot: _a
[0009.724] I> A/B: bin_type (17) slot 0
[0009.728] I> Loading partition bpmp-fw-dtb at 0xd79f0000
[0009.733] I> Reading two headers - addr:0xd79f0000 blocks:1
[0009.738] I> Addr: 0xd79f0000, start-block: 18917745, num_blocks: 1
[0009.747] I> Binary(17) of size 75472 is loaded @ 0xd79ed600
[0009.782] I> BPMP-FW load address = 0xd7800000
[0009.786] I> BPMP-FW DTB load address = 0x501ed600
[0009.791] I> Loading SCE-FW ...
[0009.794] W> No valid slot number is found in scratch register
[0009.800] W> Return default slot: _a
[0009.803] I> A/B: bin_type (12) slot 0
[0009.807] I> Loading partition sce-fw at 0xd7300000
[0009.811] I> Reading two headers - addr:0xd7300000 blocks:1
[0009.817] I> Addr: 0xd7300000, start-block: 18919745, num_blocks: 1
[0009.825] I> Binary(12) of size 76592 is loaded @ 0xd7300000
[0009.831] I> Init SCE
[0009.833] I> Copy BTCM section
[0009.836] W> No valid slot number is found in scratch register
[0009.842] W> Return default slot: _a
[0009.845] I> A/B: bin_type (13) slot 0
[0009.849] I> Loading partition cpu-bootloader at 0x96000000
[0009.855] I> Reading two headers - addr:0x96000000 blocks:1
[0009.860] I> Addr: 0x96000000, start-block: 18894849, num_blocks: 1
[0009.871] I> Binary(13) of size 282736 is loaded @ 0x96000000
[0009.877] W> No valid slot number is found in scratch register
[0009.883] W> Return default slot: _a
[0009.886] I> A/B: bin_type (20) slot 0
[0009.890] I> Loading partition bootloader-dtb at 0x8520f400
[0009.895] I> Reading two headers - addr:0x8520f400 blocks:1
[0009.901] I> Addr: 0x8520f400, start-block: 18896897, num_blocks: 1
[0009.911] I> Binary(20) of size 220272 is loaded @ 0x8520f400
[0009.917] I> MB2-params(VA) @ 0xd7000000
[0009.921] I> CPUBL-params(VA) @ 0xd7000000
[0009.925] I> CPUBL-params(PA) @ 0x237000000
[0009.929] I> CPU-BL loaded @ PA 0x96000000
[0009.933] I> Loading TOS ...
[0009.936] W> No valid slot number is found in scratch register
[0009.941] W> Return default slot: _a
[0009.945] I> A/B: bin_type (14) slot 0
[0009.948] I> Loading partition secure-os at 0x8530f600
[0009.953] I> Reading two headers - addr:0x8530f600 blocks:1
[0009.959] I> Addr: 0x8530f600, start-block: 18898945, num_blocks: 1
[0009.967] I> Binary(14) of size 62576 is loaded @ 0x8530f600
[0009.973] I> Copying Monitor (length: 0xf270) from 0x8530f800 to 0x40000000
[0009.980] I> Erasing Monitor @ 0x8530f800
[0009.985] I> Unhalting SCE
[0009.987] I> Primary Memory Start:80000000 Size:70000000
[0009.992] I> Extended Memory Start:f0110000 Size:145ef0000
[0009.999] I> Waypoint2-ACK: 0x52012714
[0010.003] I> MB2(TBoot-BPMP) done

NOTICE:  BL31: v1.2(release):e1e4477
NOTICE:  BL31: Built : 00:08:30, May 17 2018
NOTICE:  Trusty image missing.
ERROR:   Error initializing runtime service trusty_fast
[0010.208] I> Welcome to Cboot
[0010.211] I> Cboot Version: 00.00.2014.50-t186-0c600f85
[0010.216] I> CPU-BL Params @ 0x237000000
[0010.220] I>  0) Base:0x00000000 Size:0x00000000
[0010.224] I>  1) Base:0x237f00000 Size:0x00100000
[0010.229] I>  2) Base:0x237e00000 Size:0x00100000
[0010.233] I>  3) Base:0x237d00000 Size:0x00100000
[0010.238] I>  4) Base:0x237c00000 Size:0x00100000
[0010.242] I>  5) Base:0x237b00000 Size:0x00100000
[0010.247] I>  6) Base:0x237800000 Size:0x00200000
[0010.251] I>  7) Base:0x237400000 Size:0x00400000
[0010.256] I>  8) Base:0x237a00000 Size:0x00100000
[0010.260] I>  9) Base:0x237300000 Size:0x00100000
[0010.265] I> 10) Base:0x236800000 Size:0x00800000
[0010.269] I> 11) Base:0x30000000 Size:0x00040000
[0010.274] I> 12) Base:0xf0000000 Size:0x00100000
[0010.278] I> 13) Base:0x30040000 Size:0x00001000
[0010.282] I> 14) Base:0x30048000 Size:0x00001000
[0010.287] I> 15) Base:0x30049000 Size:0x00001000
[0010.291] I> 16) Base:0x3004a000 Size:0x00001000
[0010.296] I> 17) Base:0x3004b000 Size:0x00001000
[0010.300] I> 18) Base:0x3004c000 Size:0x00001000
[0010.305] I> 19) Base:0x3004d000 Size:0x00001000
[0010.309] I> 20) Base:0x3004e000 Size:0x00001000
[0010.313] I> 21) Base:0x3004f000 Size:0x00001000
[0010.318] I> 22) Base:0x00000000 Size:0x00000000
[0010.322] I> 23) Base:0xf0100000 Size:0x00010000
[0010.327] I> 24) Base:0x00000000 Size:0x00000000
[0010.331] I> 25) Base:0x00000000 Size:0x00000000
[0010.336] I> 26) Base:0x00000000 Size:0x00000000
[0010.340] I> 27) Base:0x00000000 Size:0x00000000
[0010.344] I> 28) Base:0x84400000 Size:0x00400000
[0010.349] I> 29) Base:0x30000000 Size:0x00010000
[0010.353] I> 30) Base:0x238000000 Size:0x08000000
[0010.358] I> 31) Base:0x00000000 Size:0x00000000
[0010.362] I> 32) Base:0x236000000 Size:0x00600000
[0010.367] I> 33) Base:0x80000000 Size:0x70000000
[0010.371] I> 34) Base:0xf0110000 Size:0x145ef0000
[0010.376] I> 35) Base:0x00000000 Size:0x00000000
[0010.380] I> 36) Base:0x00000000 Size:0x00000000
[0010.385] I> 37) Base:0x2372e0000 Size:0x00020000
[0010.389] I> 38) Base:0x84000000 Size:0x00400000
[0010.394] I> 39) Base:0x96000000 Size:0x02400000
[0010.398] I> 40) Base:0x85000000 Size:0x01200000
[0010.402] I> 41) Base:0x237000000 Size:0x00280000
[0010.407] I> 42) Base:0x00000000 Size:0x00000000
[0010.411] I> 43) Base:0x00000000 Size:0x00000000
[0010.416] GIC-SPI Target CPU: 4
[0010.419] Interrupts Init done
[0010.422] calling constructors
[0010.425] initializing heap
[0010.428] initializing threads
[0010.431] initializing timers
[0010.434] creating bootstrap completion thread
[0010.439] top of bootstrap2()
[0010.442] CPU: ARM Cortex A57
[0010.445] CPU: MIDR: 0x411FD073, MPIDR: 0x80000100
[0010.450] initializing platform
[0010.453] I> Boot-device: eMMC
[0010.457] I> Dram Scrub in progress
[0015.730] I> DRAM Scrub Successfull
[0015.734] I> sdmmc bdev is already initialized
[0015.738] I> Reading GPT from 512 for device 00000003
[0015.744] I> Reading GPT from 16776704 for device 00000003
[0015.750] I> Found 13 partitions in 00000003 device
[0015.755] I> Reading GPT from 512 for device 00010003
[0015.761] I> Found 29 partitions in 00010003 device
[0015.766] W> opt-in fuse is not set, skip fuse_burning
[0015.771] I> Bl_dtb @0x8520f400
[0015.774] I> gpio framework initialized
[0015.777] I> tegrabl_gpio_driver_register: register 'tegra_gpio_main_driver' driver
[0015.785] I> tegrabl_gpio_driver_register: register 'tegra_gpio_aon_driver' driver
[0015.792] I> tegrabl_tca9539_init: i2c bus: 0, slave addr: 0xee
[0015.799] E> i2c dev write failed
[0015.802] E> tca9539_device_init: failed to write polar reg
[0015.808] E> tegrabl_tca9539_init: failed to init device!
[0015.813] E> GPIO TCA9539 driver init failed
[0015.929] I> decompressor handler not found
[0015.936] I> fixed regulator driver initialized
[0015.967] I> register 'maxim' power off handle
[0015.973] I> virtual i2c enabled
[0015.976] I> registered 'maxim,max77620' pmic
[0015.980] I> tegrabl_gpio_driver_register: register 'max77620-gpio' driver
[0015.989] I> Find /i2c@c250000's alias i2c7
[0015.993] I> Reading eeprom i2c=7 address=0x50
[0016.024] I> Device at /i2c@c250000:0x50
[0016.027] I> Reading eeprom i2c=7 address=0x57
[0016.032] E> i2c dev read failed
[0016.035] E> eeprom: Failed to read I2C slave device
[0016.040] I> Eeprom read failed 0x1a89800d
[0016.044] I> Find /i2c@c240000's alias i2c1
[0016.048] I> Reading eeprom i2c=1 address=0x51
[0016.054] E> i2c dev read failed
[0016.057] E> eeprom: Retry to read I2C slave device.
[0016.062] E> i2c dev read failed
[0016.065] E> eeprom: Failed to read I2C slave device
[0016.070] I> Eeprom read failed 0x1a89800d
[0016.075] I> Find /i2c@3160000's alias i2c0
[0016.079] I> Reading eeprom i2c=0 address=0x50
[0016.083] E> i2c dev read failed
[0016.086] E> eeprom: Failed to read I2C slave device
[0016.091] I> Eeprom read failed 0x1a89800d
[0016.096] I> Find /i2c@3180000's alias i2c2
[0016.100] I> Reading eeprom i2c=2 address=0x54
[0016.104] I> Enabling gpio chip_id = 2, gpio pin = 9
[0016.109] C> GPIO driver for chip_id 0x2 could not be found
[0016.114] E> cam_eeprom_read: Can't get gpio driver
[0016.119] I> Eeprom read failed 0x2693400d
[0016.123] I> create_pm_ids: id: 3489-0000-300-K, len: 15
[0016.128] I> config: mem-type:00,power-config:00,misc-config:00,modem-config:00,touch-config:00,display-config:00,, len: 93
[0016.141] I> found one nvdisp nodes at offset = 76116
[0016.146] I> found one nvdisp nodes at offset = 76924
[0016.151] I> found one nvdisp nodes at offset = 77832
[0016.156] I> no valid display unit config found in dtb
[0016.161] W> display init failed
[0016.164] initializing target
[0016.167] calling apps_init()
[0016.170] starting app android_boot_app
[0016.174] I> Gpio keyboard init success
[0016.178] I> Kernel type = Normal
[0016.181] I> Loading kernel/boot.img from storage ...
[0016.186] W> No valid slot number is found in scratch register
[0016.192] W> Return default slot: _a
[0016.195] I> A/B: bin_type (0) slot 0
[0016.198] I> Loading partition kernel at 0xa8000000
[0016.959] I> tegrabl_auth_payload: partition kernel (bin_type 0)
[0016.966] W> No valid slot number is found in scratch register
[0016.972] W> Return default slot: _a
[0016.975] I> A/B: bin_type (1) slot 0
[0016.978] I> Loading partition kernel-dtb at 0x92000000
[0016.990] I> tegrabl_auth_payload: partition kernel-dtb (bin_type 1)
[0016.997] I> Kernel DTB @ 0x92000000
[0017.000] I> Checking boot.img header magic ... [0017.004] I> [OK]
[0017.006] I> Valid boot.img @ 0xa8000000
[0017.010] I> decompressor handler not found
[0017.014] I> Copying kernel image (474322 bytes) from 0xa8000800 to 0x80080000 ... [0017.021] I> Done
[0017.023] I> Move ramdisk (len: 0) from 0xa8074800 to 0x9d000000
[0017.030] I> Updated bpmp info to DTB
[0017.035] I> Ramdisk: Base: 0x9d000000; Size: 0x0
[0017.039] I> Updated initrd info to DTB
[0017.043] E> tegrabl_linuxboot_add_disp_param, du 0 failed to get display params
[0017.050] E> tegrabl_linuxboot_add_disp_param, du 0 failed to get display params
[0017.057] E> tegrabl_linuxboot_add_disp_param, du 0 failed to get display params
[0017.065] I> disabled_core_mask: 0xffffff0c
[0017.069] W> No valid slot number is found in scratch register
[0017.074] W> Return default slot: _a
[0017.078] I> Active slot suffix: 
[0017.081] I> add_boot_slot_suffix: slot_suffix = 
[0017.085] I> add_serialno: Serial Num = 0421918050327
[0017.090] I> Linux Cmdline: root=/dev/mmcblk0p1 rw rootwait console=ttyS0,115200n8 console=tty0 OS=l4t fbcon=map:0 net.ifnames=0 memtype=0 video=tegrafb no_console_suspend=1 earlycon=uart8250,mmio32,0x03100000 nvdumper_reserved=0x2372e0000 gpt tegraid=18.1.2.0.0 tegra_keep_boot_clocks maxcpus=6 boot.slot_suffix= boot.ratchetvalues=0.2.1 androidboot.serialno=0421918050327 bl_prof_dataptr=0x10000@0x237040000 sdhci_tegra.en_boot_part_access=1 
[0017.129] I> Updated bootarg info to DTB
[0017.133] E> "plugin-manager" doesn't exist, creating
[0017.138] E> "odm-data" doesn't exist, creating
[0017.144] I> eeprom_get_mac_addr: MAC (type: 0): 00:ff:ff:ff:ff:ff
[0017.150] I> eeprom_get_mac_addr: MAC (type: 1): 00:ff:ff:ff:ff:ff
[0017.156] I> eeprom_get_mac_addr: MAC (type: 2): 00:04:4b:a9:2d:b1
[0017.162] E> "ids" doesn't exist, creating
[0017.166] E> "connection" doesn't exist, creating
[0017.170] E> "configs" doesn't exist, creating
[0017.175] I> create_pm_ids: id: 3489-0000-300-K, len: 15
[0017.180] I> config: mem-type:00,power-config:00,misc-config:00,modem-config:00,touch-config:00,display-config:00,, len: 93
[0017.191] I> Adding plugin-manager/ids/3489-0000-300=/i2c@c250000:module@0x50
[0017.198] E> "i2c@c250000" doesn't exist, creating
[0017.202] E> "module@0x50" doesn't exist, creating
[0017.208] I> Adding plugin-manager/ids/3489-0000-300-K
[0017.215] I> Adding plugin-manager/configs/3489-mem-type 00
[0017.220] I> Adding plugin-manager/configs/3489-power-config 00
[0017.226] I> Adding plugin-manager/configs/3489-misc-config 00
[0017.232] I> Adding plugin-manager/configs/3489-modem-config 00
[0017.238] I> Adding plugin-manager/configs/3489-touch-config 00
[0017.244] I> Adding plugin-manager/configs/3489-display-config 00
[0017.250] E> "chip-id" doesn't exist, creating
[0017.254] I> Adding plugin-manager/chip-id/A02P
[0017.260] I> added [base:0x80000000, size:0x70000000] to /memory
[0017.265] I> added [base:0xf0200000, size:0x145e00000] to /memory
[0017.271] I> added [base:0x236600000, size:0x200000] to /memory
[0017.277] E> WARNING: Failed to pass NS DRAM ranges to TOS
[0017.282] I> Updated memory info to DTB
[0017.287] E> "reset" doesn't exist, creating
[0017.291] E> "pmc-reset-reason" doesn't exist, creating
[0017.297] E> "pmic-reset-reason" doesn't exist, creating
[0017.302] I> disabled_core_mask: 0xffffff0c
[0017.312] I> Add serial number as DT property
[0017.316] I> tegrabl_load_kernel_and_dtb: Done
[0017.321] E> tegrabl_display_clear: display is not initialized
[0017.326] W> Boot logo display failed...


U-Boot 2016.07-00004-gaa80f75 (Nov 15 2019 - 16:39:45 +0800)

TEGRA186
Model: NVIDIA P2771-0000-500
DRAM:  6.8 GiB
MC:   Tegra SD/MMC: 0, Tegra SD/MMC: 1
*** Warning - bad CRC, using default environment

In:    serial
Out:   serial
Err:   serial
Net:   eth0: ethernet@2490000
Hit A to stop autoboot:  0 
MMC: no card present
switch to partitions #0, OK
mmc0(part 0) is current device
Scanning mmc 0:1...
Found /boot/extlinux/extlinux.conf
Retrieving file: /boot/extlinux/extlinux.conf
236 bytes read in 90 ms (2 KiB/s)
p2771-0000 eMMC boot options
1:	primary kernel
1:	primary kernel

Hi,

For this kind of issue, my suggestion is to find a TX2 devkit and use the adapter from the TX2 devkit to run the stress test.

If you can hit this issue even on the devkit, then please share the following with us:

  1. How can the problem be reproduced? Which application are you using for the stress test?
  2. Is there any error log on the serial console before the error happens?

Also, how do you get your device working again after you see such a hang during reboot? With a cold boot?

  1. We use stress-ng for CPU stress testing and matrixMul.dat for GPU stress testing.
  2. Unfortunately, there is no error message on the serial console before the reboot.
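
Since the serial console shows nothing before the reboot, one option is to have the test harness itself record progress to disk. The following is only a minimal sketch, not the setup used in this thread: `run_logged` and the log file name are hypothetical, the commands run one after another rather than concurrently, and the placeholder workload must be replaced with the real stress commands, whose exact arguments are not given here.

```python
import os
import subprocess
import sys
import tempfile
import time

def run_logged(commands, log_path, rounds):
    """Run each command `rounds` times, syncing a log line to disk before and after."""
    with open(log_path, "a") as log:
        def note(message):
            log.write(f"{time.strftime('%Y-%m-%d %H:%M:%S')} {message}\n")
            log.flush()
            os.fsync(log.fileno())  # push the line to storage before continuing

        for i in range(rounds):
            for cmd in commands:
                note(f"round {i} start: {' '.join(cmd)}")
                result = subprocess.run(cmd)
                note(f"round {i} exit {result.returncode}: {' '.join(cmd)}")

if __name__ == "__main__":
    # Placeholder workload so the sketch runs anywhere. On the target, replace it
    # with the real CPU and GPU stress commands and pick a log path on
    # persistent storage instead of the temporary directory used here.
    placeholder = [sys.executable, "-c", "pass"]
    log_file = os.path.join(tempfile.gettempdir(), "stress_rounds.log")
    run_logged([placeholder], log_file, rounds=2)
```

After an unexpected reboot, the last line of the log shows which round and which command was running when the system went down.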

Yes, a cold boot fixes the problem.

Another weird issue: although the CPU hangs during reboot, the power consumption is still up to 10 W. It seems that even though the CPU is dead, the GPU stress test is still running.

We noticed the following.
For the abnormal reboot, the PMIC info is

[0009.567] I> pmic: reset reason (nverc) : 0x0

For the normal reboot, the PMIC info is

[0009.552] I> pmic: reset reason (nverc) : 0x44
What do the different PMIC values mean?
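
For comparing many captured boot logs, the reset-reason line can be pulled out automatically. This is a minimal sketch: the function names are made up for illustration, and it only lists which bits are set. What each bit means is defined by the PMIC documentation, which the sketch does not attempt to decode.

```python
import re

# Matches lines such as "[0009.552] I> pmic: reset reason (nverc) : 0x44"
RESET_RE = re.compile(r"pmic: reset reason \(nverc\)\s*:\s*(0x[0-9a-fA-F]+)")

def pmic_reset_reasons(log_text):
    """Return every PMIC reset-reason value found in a boot log, as integers."""
    return [int(m.group(1), 16) for m in RESET_RE.finditer(log_text)]

def set_bits(value):
    """List the bit positions set in value (bit 0 is the least significant)."""
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]

if __name__ == "__main__":
    # The two lines quoted above, one from the hang and one from the normal boot.
    hang_log = "[0009.567] I> pmic: reset reason (nverc) : 0x0"
    normal_log = "[0009.552] I> pmic: reset reason (nverc) : 0x44"
    for name, text in (("hang", hang_log), ("normal", normal_log)):
        for value in pmic_reset_reasons(text):
            print(f"{name}: 0x{value:02x}, bits set: {set_bits(value)}")
```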

Hi,

Are you using devkit to do this test or not?

Sorry, we haven’t run a similar test on the devkit yet.
We need to discuss this internally before starting the test on the devkit.

Do you have any suggestions for locating this issue in the meantime?

Hi,

Could you go to the node below after the next reboot and dump the PMIC reset reason?

/proc/device-tree/chosen/reset/pmic-reset-reason# xxd reason
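
If xxd is not available on the target, the same dump can be approximated in Python. This is a rough sketch under assumptions: the `reason` file name is inferred from the command above and may differ between L4T releases, `describe_property` is a hypothetical helper, and treating a 4-byte property as one big-endian 32-bit cell follows the usual device tree convention rather than anything confirmed in this thread. It needs Python 3.8 or later for `bytes.hex(" ")`.

```python
# Path inferred from the command above; it may differ between releases.
DEFAULT_PATH = "/proc/device-tree/chosen/reset/pmic-reset-reason/reason"

def describe_property(raw):
    """Return a hex dump of the raw bytes plus a best-effort interpretation."""
    lines = ["hex: " + raw.hex(" ")]
    if len(raw) == 4:
        # Assumption: a 4-byte property is a single big-endian 32-bit cell.
        lines.append("u32 (big-endian): 0x%08x" % int.from_bytes(raw, "big"))
    else:
        text = raw.rstrip(b"\x00")
        if text and all(32 <= b < 127 for b in text):
            lines.append("string: " + text.decode("ascii"))
    return "\n".join(lines)

if __name__ == "__main__":
    try:
        with open(DEFAULT_PATH, "rb") as f:
            print(describe_property(f.read()))
    except OSError as err:
        # The node only exists on a target whose bootloader created it.
        print(f"cannot read {DEFAULT_PATH}: {err}")
```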

Do you mean the next time the problem shows up?
But our reboot will hang, and we will need to cold boot again.
Is /proc/device-tree/chosen/reset/pmic-reset-reason still meaningful in that case?

Not sure. You could try it first.

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one. Thanks.

Hi bdehj,

Have you identified the cause and resolved the problem?
Can you share any results?

Do you mean the matrixMul sample in the CUDA samples?
Can you share the details of how you run the GPU stress test? Thanks a lot.