I am using the Jetson TX2 to communicate over Serial RapidIO (SRIO) with another SRIO device through a PCIe-to-SRIO bridge. I have a user-space test application that generates data and transmits it to the other device. It consists of a while(true) loop that sends test data indefinitely and prints a status result for each IO transaction to the terminal.
When I run the application, each loop iteration appears to take on the order of milliseconds (judging by the rate at which it prints to the terminal). But if I hit the space bar several times, or unplug and replug the HDMI cable, the application speeds up significantly, to about 100us per IO transaction.
I want consistent behavior, without the execution speed changing abruptly. I could then add a sleep to my code to slow the loop down to about 500us per transaction. The issue seems to be interrupt-related. Is it possible to control the speed so that it is consistent, for example by somehow generating an interrupt in my C++ code before the while loop executes? This issue aside, my testing shows the TX2 can consistently deliver around 100us per transaction.
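A minimal sketch of how the loop could be timed and paced once the speed is stable, assuming the driver call can be wrapped in a callable. send_test_data() is a hypothetical stand-in for the real driver API, and run_paced() is a name made up for this illustration. It measures each transaction with a monotonic clock rather than judging speed from terminal output, and it sleeps until absolute deadlines so that the 500us period does not drift when one transaction runs long.

```cpp
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for the real driver call; replace with the actual API.
inline bool send_test_data() { return true; }

// Runs `iterations` transactions, one per `period`, and returns the worst
// time observed for a single transaction, in microseconds.
template <typename Transaction>
std::int64_t run_paced(Transaction&& transaction, int iterations,
                       std::chrono::microseconds period) {
    using clock = std::chrono::steady_clock;
    std::int64_t worst_us = 0;
    auto deadline = clock::now();
    for (int i = 0; i < iterations; ++i) {
        const auto start = clock::now();
        transaction();
        const std::int64_t elapsed_us =
            std::chrono::duration_cast<std::chrono::microseconds>(
                clock::now() - start).count();
        if (elapsed_us > worst_us) worst_us = elapsed_us;
        // Advance an absolute deadline instead of sleeping for a fixed
        // duration, so a slow transaction does not shift every later one.
        deadline += period;
        std::this_thread::sleep_until(deadline);
    }
    return worst_us;
}
```

Called as run_paced(send_test_data, 1000, std::chrono::microseconds(500)), it returns the worst single-transaction time in microseconds. Sleep-based pacing depends on scheduler wake-up latency, so the actual period at this scale may run somewhat longer than requested.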
mrroboto,
Could you set the TX2 power model to maximum with nvpmodel and jetson_clocks, then recheck whether your IO performance is stable?
Hi WayneWWW, thanks for the quick response. I ran:
sudo nvpmodel -m 0
sudo ./jetson_clocks.sh
Still the issue persists.
Could you share which devices you are using?
I am using the IDT TSI721. It is a PCIe to SRIO bridge device. Do you suspect a problem with the device driver?
Hi,
- Could you share a rough block diagram of your setup?
- Please share the “lspci -vvv” output.
- Why do you suspect interrupts? Please capture “cat /proc/interrupts” before and after the issue.
- Is your application based on file system read/write, i.e. does it write to a file to send data?
- Could you profile your application and find out which operation takes the most time?
Manikanta
Hi Manikanta,
Thanks for your help. Here is the information you requested.
- Unfortunately I am not able to provide a block diagram, but the setup is as described: a Jetson TX2 with the PCIe to SRIO bridge device plugged into the PCIe slot, connected to another system via a QSFP cable. I do have a USB hub plugged in so I can use a mouse and keyboard, and HDMI is plugged in for video display. Ethernet is not being used.
- See logs pasted below.
- See logs pasted below. I suspect interrupts because hitting the keyboard or unplugging and replugging the HDMI cable is what led me to discover this issue. After I start my program and then trigger an interrupt by one of these means, the program speeds up significantly. If I then terminate the application and run it again soon after, it maintains that speedup. It is not predictable whether it will be slow or fast on a given execution.
- No, there is no filesystem read/write. Data is sent directly to the device driver via its API. I am printing my IO transactions to the console, however, which is how I noticed the speedup.
- I can't profile it at this time, but it is a very simple program, less than 50 LOC, with just a while loop. It is simply meant to test the driver, so no significant or important logic runs in the application until this testing passes.
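Short of a full profiler, one rough way to separate the cost of the driver call from the cost of printing is to time each transaction into memory and print only a summary at the end. The sketch below assumes the driver call can be passed in as a callable; collect_latencies() and summarize() are illustrative names, not part of any driver API.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

struct LatencySummary {
    std::int64_t min_us;
    std::int64_t median_us;
    std::int64_t max_us;
};

// Summarises per-transaction latencies given in microseconds.
// Takes the vector by value because it sorts its own copy.
inline LatencySummary summarize(std::vector<std::int64_t> samples_us) {
    if (samples_us.empty()) return {0, 0, 0};
    std::sort(samples_us.begin(), samples_us.end());
    return {samples_us.front(), samples_us[samples_us.size() / 2],
            samples_us.back()};
}

// Times `transaction` once per iteration without touching the console,
// so terminal output cannot influence the measurement.
template <typename Transaction>
std::vector<std::int64_t> collect_latencies(Transaction&& transaction,
                                            int iterations) {
    using clock = std::chrono::steady_clock;
    std::vector<std::int64_t> samples_us;
    if (iterations <= 0) return samples_us;
    // Reserve up front to avoid allocations inside the timed loop.
    samples_us.reserve(static_cast<std::size_t>(iterations));
    for (int i = 0; i < iterations; ++i) {
        const auto start = clock::now();
        transaction();
        samples_us.push_back(
            std::chrono::duration_cast<std::chrono::microseconds>(
                clock::now() - start).count());
    }
    return samples_us;
}
```

If the summary stays near 100us while the printing version of the loop is slow, the console output rather than the driver call would be the more likely source of the variation.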
lspci
00:01.0 PCI bridge: NVIDIA Corporation Device 10e5 (rev a1) (prog-if 00 [Normal decode])
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 388
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
I/O behind bridge: 0000f000-00000fff
Memory behind bridge: 50800000-52ffffff
Prefetchable memory behind bridge: 0000000058000000-0000000058ffffff
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
BridgeCtl: Parity- SERR- NoISA- VGA- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [40] Subsystem: NVIDIA Corporation Device 0000
Capabilities: [48] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable- Count=1/2 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [60] HyperTransport: MSI Mapping Enable- Fixed-
Mapping Address Base: 00000000fee00000
Capabilities: [80] Express (v2) Root Port (Slot+), MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0
ExtTag+ RBE+
DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x4, ASPM L0s L1, Exit Latency L0s <512ns, L1 <4us
ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive+ BWMgmt+ ABWMgmt-
SltCap: AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug- Surprise-
Slot #0, PowerLimit 0.000W; Interlock- NoCompl-
SltCtl: Enable: AttnBtn- PwrFlt- MRL- PresDet- CmdCplt- HPIrq- LinkChg-
Control: AttnInd Off, PwrInd On, Power- Interlock-
SltSta: Status: AttnBtn- PowerFlt- MRL- CmdCplt- PresDet+ Interlock-
Changed: MRL- PresDet+ LinkState+
RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible-
RootCap: CRSVisible-
RootSta: PME ReqID 0000, PMEStatus- PMEPending-
DevCap2: Completion Timeout: Range AB, TimeoutDis+, LTR+, OBFF Not Supported ARIFwd-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR+, OBFF Disabled ARIFwd-
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [100 v1] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
Kernel driver in use: pcieport
01:00.0 Bridge: Integrated Device Technology, Inc. [IDT] Device 80ab (rev 01)
Subsystem: StarBridge, Inc. Device 8011
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 388
Region 0: Memory at 50800000 (32-bit, non-prefetchable)
Region 1: Memory at 51000000 (32-bit, non-prefetchable)
Region 2: Memory at 58000000 (64-bit, prefetchable)
Region 4: Memory at 52000000 (64-bit, non-prefetchable)
Capabilities: [40] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 5GT/s, Width x4, ASPM not supported, Exit Latency L0s <4us, L1 <4us
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 5GT/s, Width x4, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, LTR-, OBFF Not Supported
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [c0] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [d0] MSI: Enable- Count=1/1 Maskable+ 64bit+
Address: 0000000000000000 Data: 0000
Masking: 00000000 Pending: 00000000
Capabilities: [f0] Subsystem: Device 0000:0000
Capabilities: [a0] MSI-X: Enable- Count=70 Masked-
Vector table: BAR=0 offset=0002c000
PBA: BAR=0 offset=0002a000
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
Capabilities: [180 v1] Device Serial Number 00-00-00-00-00-00-00-00
Kernel driver in use: sreb01_pci
Kernel modules: sreb01
/proc/interrupts before issue
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5
3: 0 0 0 0 0 0 GICv2 30 Edge arch_timer
6: 892440 0 0 0 0 0 GICv2 32 Level tegra186_timer0
7: 0 30282 0 0 0 0 GICv2 33 Level tegra186_timer1
8: 0 0 93265 0 0 0 GICv2 34 Level tegra186_timer2
9: 0 0 0 588359 0 0 GICv2 35 Level tegra186_timer3
10: 0 0 0 0 566408 0 GICv2 36 Level tegra186_timer4
11: 0 0 0 0 0 592706 GICv2 37 Level tegra186_timer5
12: 117820 0 0 0 0 0 GICv2 208 Level hsp
13: 0 0 0 0 0 0 GICv2 202 Level arm-smmu global fault
14: 0 0 0 0 0 0 GICv2 203 Level arm-smmu global fault
22: 25303 0 0 0 0 0 GICv2 97 Level 3460000.sdhci
23: 124747 0 0 0 0 0 GICv2 96 Level 3440000.sdhci
24: 0 0 0 0 0 0 GICv2 94 Level 3400000.sdhci
25: 0 0 0 0 0 0 GICv2 229 Level 3507000.ahci-sata
26: 68 0 0 0 0 0 GICv2 57 Level 3160000.i2c
27: 0 0 0 0 0 0 GICv2 58 Level c240000.i2c
28: 4 0 0 0 0 0 GICv2 59 Level 3180000.i2c
29: 2 0 0 0 0 0 GICv2 60 Level 3190000.i2c
30: 0 0 0 0 0 0 GICv2 62 Level 31b0000.i2c
31: 0 0 0 0 0 0 GICv2 63 Level 31c0000.i2c
32: 40 0 0 0 0 0 GICv2 64 Level c250000.i2c
33: 0 0 0 0 0 0 GICv2 65 Level 31e0000.i2c
34: 0 0 0 0 0 0 GICv2 68 Level 3210000.spi
35: 0 0 0 0 0 0 GICv2 69 Level c260000.spi
36: 0 0 0 0 0 0 GICv2 71 Level 3240000.spi
37: 78 0 0 0 0 0 GICv2 144 Level serial
42: 0 0 0 0 0 0 GICv2 226 Level ether_qos.common_irq
44: 0 0 0 0 0 0 GICv2 222 Level 2490000.ether_qos.rx0
45: 0 0 0 0 0 0 GICv2 218 Level 2490000.ether_qos.tx0
52: 0 0 0 0 0 0 GICv2 48 Level b000000.rtcpu
53: 179 0 0 0 0 0 GICv2 242 Level d230000.actmon
54: 0 0 0 0 0 0 PM 42 Level tegra_rtc
55: 0 0 0 0 0 0 GICv2 255 Level mc_status
57: 2 0 0 0 0 0 GICv2 196 Level 3538000.mailbox
59: 3744 0 0 0 0 0 PM 195 Level xhci-hcd:usb1
60: 0 0 0 0 0 0 PM 199 Level 3530000.xhci, xotg
61: 0 0 0 0 0 0 GICv2 198 Level 3550000.xudc
62: 41320 0 0 0 0 0 GICv2 297 Level host_syncpt
63: 0 0 0 0 0 0 GICv2 295 Level host_status
64: 0 0 0 0 0 0 GICv2 151 Level 150c0000.nvcsi
65: 0 0 0 0 0 0 GICv2 233 Level 15700000.vi
68: 0 0 0 0 0 0 GICv2 237 Level tegra-isp-isr
69: 1009520 0 0 0 0 0 GICv2 186 Level 15210000.nvdisplay
73: 5721 0 0 0 0 0 GICv2 102 Level gk20a_stall
74: 0 0 0 0 0 0 GICv2 103 Level gk20a_nonstall
76: 0 0 0 0 0 0 GICv2 315 Level 3ad0000.se_elp
77: 54 0 0 0 0 0 GICv2 173 Level b150000.tegra-hsp
78: 53 0 0 0 0 0 GICv2 174 Level b150000.tegra-hsp, b150000.tegra-hsp
81: 0 0 0 0 0 0 GICv2 165 Level c150000.tegra-hsp
92: 2 0 0 0 0 0 GICv2 107 Level gpcdma.0
93: 2 0 0 0 0 0 GICv2 108 Level gpcdma.1
94: 0 0 0 0 0 0 GICv2 109 Level gpcdma.2
95: 0 0 0 0 0 0 GICv2 110 Level gpcdma.3
96: 0 0 0 0 0 0 GICv2 111 Level gpcdma.4
97: 0 0 0 0 0 0 GICv2 112 Level gpcdma.5
98: 0 0 0 0 0 0 GICv2 113 Level gpcdma.6
99: 0 0 0 0 0 0 GICv2 114 Level gpcdma.7
100: 0 0 0 0 0 0 GICv2 115 Level gpcdma.8
101: 0 0 0 0 0 0 GICv2 116 Level gpcdma.9
102: 0 0 0 0 0 0 GICv2 117 Level gpcdma.10
103: 0 0 0 0 0 0 GICv2 118 Level gpcdma.11
104: 0 0 0 0 0 0 GICv2 119 Level gpcdma.12
105: 0 0 0 0 0 0 GICv2 120 Level gpcdma.13
106: 0 0 0 0 0 0 GICv2 121 Level gpcdma.14
107: 0 0 0 0 0 0 GICv2 122 Level gpcdma.15
108: 0 0 0 0 0 0 GICv2 123 Level gpcdma.16
109: 0 0 0 0 0 0 GICv2 124 Level gpcdma.17
110: 0 0 0 0 0 0 GICv2 125 Level gpcdma.18
111: 0 0 0 0 0 0 GICv2 126 Level gpcdma.19
112: 0 0 0 0 0 0 GICv2 127 Level gpcdma.20
113: 0 0 0 0 0 0 GICv2 128 Level gpcdma.21
114: 0 0 0 0 0 0 GICv2 129 Level gpcdma.22
115: 0 0 0 0 0 0 GICv2 130 Level gpcdma.23
116: 0 0 0 0 0 0 GICv2 131 Level gpcdma.24
117: 0 0 0 0 0 0 GICv2 132 Level gpcdma.25
118: 0 0 0 0 0 0 GICv2 133 Level gpcdma.26
119: 0 0 0 0 0 0 GICv2 134 Level gpcdma.27
120: 0 0 0 0 0 0 GICv2 135 Level gpcdma.28
121: 0 0 0 0 0 0 GICv2 136 Level gpcdma.29
122: 0 0 0 0 0 0 GICv2 137 Level gpcdma.30
123: 0 0 0 0 0 0 GICv2 138 Level gpcdma.31
232: 0 0 0 0 0 0 tegra-gpio 101 Level phy_interrupt
252: 0 0 0 0 0 0 tegra-gpio 121 Edge 15210000.nvdisplay
256: 0 0 0 0 0 0 tegra-gpio 125 Edge 3400000.sdhci cd
290: 0 0 0 0 0 0 tegra-gpio 159 Edge external-connection:extcon@1
340: 0 0 0 0 0 0 tegra-gpio-aon 16 Level tmp451
380: 0 0 0 0 0 0 tegra-gpio-aon 56 Edge Power
381: 0 0 0 0 0 0 tegra-gpio-aon 57 Edge Volume Up
382: 0 0 0 0 0 0 tegra-gpio-aon 58 Edge Volume Down
383: 16310 0 0 0 0 0 tegra-gpio-aon 59 Level bcmsdh_sdmmc
384: 1 0 0 0 0 0 tegra-gpio-aon 60 Edge bluetooth hostwake
388: 0 0 0 0 0 0 GICv2 104 Level PCIE, PCIe PME, aerdrv, sreb01_pci
389: 0 0 0 0 0 0 GICv2 105 Level Tegra PCIe MSI
390: 54 0 0 0 0 0 GIC 32 Level adma.0
391: 53 0 0 0 0 0 GIC 33 Level adma.1
392: 0 0 0 0 0 0 GIC 34 Level adma.2
393: 0 0 0 0 0 0 GIC 35 Level adma.3
394: 0 0 0 0 0 0 GIC 36 Level adma.4
395: 0 0 0 0 0 0 GIC 37 Level adma.5
396: 0 0 0 0 0 0 GIC 38 Level adma.6
397: 0 0 0 0 0 0 GIC 39 Level adma.7
398: 0 0 0 0 0 0 GIC 40 Level adma.8
399: 0 0 0 0 0 0 GIC 41 Level adma.9
400: 714 0 0 0 0 0 GICv2 193 Level snd_hda_tegra
411: 0 0 0 0 0 0 GIC 73 Edge hwmbox1_send_empty
412: 0 0 0 0 0 0 GIC 64 Edge hwmbox0_recv_full
413: 0 0 0 0 0 0 GIC 115 Edge adsp watchdog
414: 0 0 0 0 0 0 GIC 94 Edge adsp wfi
415: 0 0 0 0 0 0 GIC 89 Level AMC error int
424: 559 0 0 0 0 0 GICv2 39 Level 30c0000.watchdog
431: 0 0 0 0 0 0 PM 241 Edge max77620-top
435: 0 0 0 0 0 0 max77620-top 3 Edge max77620-gpio
436: 0 0 0 0 0 0 max77620-top 4 Edge max77686-rtc
440: 0 0 0 0 0 0 max77620-top 8 Edge max77620-thermal
441: 0 0 0 0 0 0 max77620-top 9 Edge max77620-thermal
442: 0 0 0 0 0 0 max77620-gpio 0 Edge external-connection:extcon@1
450: 0 0 0 0 0 0 max77686-rtc 1 Edge rtc-alarm1
IPI0: 233876 283060 743739 318065 290169 267828 Rescheduling interrupts
IPI1: 41 40 27 27 26 32 Function call interrupts
IPI2: 0 0 0 0 0 0 CPU stop interrupts
IPI3: 0 0 0 0 0 0 Timer broadcast interrupts
IPI4: 3929 853 1045 4578 3080 3141 IRQ work interrupts
Err: 0
/proc/interrupts after issue
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5
3: 0 0 0 0 0 0 GICv2 30 Edge arch_timer
6: 894464 0 0 0 0 0 GICv2 32 Level tegra186_timer0
7: 0 30491 0 0 0 0 GICv2 33 Level tegra186_timer1
8: 0 0 93360 0 0 0 GICv2 34 Level tegra186_timer2
9: 0 0 0 590149 0 0 GICv2 35 Level tegra186_timer3
10: 0 0 0 0 569274 0 GICv2 36 Level tegra186_timer4
11: 0 0 0 0 0 594452 GICv2 37 Level tegra186_timer5
12: 117913 0 0 0 0 0 GICv2 208 Level hsp
13: 0 0 0 0 0 0 GICv2 202 Level arm-smmu global fault
14: 0 0 0 0 0 0 GICv2 203 Level arm-smmu global fault
22: 25322 0 0 0 0 0 GICv2 97 Level 3460000.sdhci
23: 125069 0 0 0 0 0 GICv2 96 Level 3440000.sdhci
24: 0 0 0 0 0 0 GICv2 94 Level 3400000.sdhci
25: 0 0 0 0 0 0 GICv2 229 Level 3507000.ahci-sata
26: 68 0 0 0 0 0 GICv2 57 Level 3160000.i2c
27: 0 0 0 0 0 0 GICv2 58 Level c240000.i2c
28: 4 0 0 0 0 0 GICv2 59 Level 3180000.i2c
29: 2 0 0 0 0 0 GICv2 60 Level 3190000.i2c
30: 0 0 0 0 0 0 GICv2 62 Level 31b0000.i2c
31: 0 0 0 0 0 0 GICv2 63 Level 31c0000.i2c
32: 40 0 0 0 0 0 GICv2 64 Level c250000.i2c
33: 0 0 0 0 0 0 GICv2 65 Level 31e0000.i2c
34: 0 0 0 0 0 0 GICv2 68 Level 3210000.spi
35: 0 0 0 0 0 0 GICv2 69 Level c260000.spi
36: 0 0 0 0 0 0 GICv2 71 Level 3240000.spi
37: 78 0 0 0 0 0 GICv2 144 Level serial
42: 0 0 0 0 0 0 GICv2 226 Level ether_qos.common_irq
44: 0 0 0 0 0 0 GICv2 222 Level 2490000.ether_qos.rx0
45: 0 0 0 0 0 0 GICv2 218 Level 2490000.ether_qos.tx0
52: 0 0 0 0 0 0 GICv2 48 Level b000000.rtcpu
53: 187 0 0 0 0 0 GICv2 242 Level d230000.actmon
54: 0 0 0 0 0 0 PM 42 Level tegra_rtc
55: 0 0 0 0 0 0 GICv2 255 Level mc_status
57: 2 0 0 0 0 0 GICv2 196 Level 3538000.mailbox
59: 3941 0 0 0 0 0 PM 195 Level xhci-hcd:usb1
60: 0 0 0 0 0 0 PM 199 Level 3530000.xhci, xotg
61: 0 0 0 0 0 0 GICv2 198 Level 3550000.xudc
62: 42169 0 0 0 0 0 GICv2 297 Level host_syncpt
63: 0 0 0 0 0 0 GICv2 295 Level host_status
64: 0 0 0 0 0 0 GICv2 151 Level 150c0000.nvcsi
65: 0 0 0 0 0 0 GICv2 233 Level 15700000.vi
68: 0 0 0 0 0 0 GICv2 237 Level tegra-isp-isr
69: 1010450 0 0 0 0 0 GICv2 186 Level 15210000.nvdisplay
73: 6339 0 0 0 0 0 GICv2 102 Level gk20a_stall
74: 0 0 0 0 0 0 GICv2 103 Level gk20a_nonstall
76: 0 0 0 0 0 0 GICv2 315 Level 3ad0000.se_elp
77: 54 0 0 0 0 0 GICv2 173 Level b150000.tegra-hsp
78: 53 0 0 0 0 0 GICv2 174 Level b150000.tegra-hsp, b150000.tegra-hsp
81: 0 0 0 0 0 0 GICv2 165 Level c150000.tegra-hsp
92: 2 0 0 0 0 0 GICv2 107 Level gpcdma.0
93: 2 0 0 0 0 0 GICv2 108 Level gpcdma.1
94: 0 0 0 0 0 0 GICv2 109 Level gpcdma.2
95: 0 0 0 0 0 0 GICv2 110 Level gpcdma.3
96: 0 0 0 0 0 0 GICv2 111 Level gpcdma.4
97: 0 0 0 0 0 0 GICv2 112 Level gpcdma.5
98: 0 0 0 0 0 0 GICv2 113 Level gpcdma.6
99: 0 0 0 0 0 0 GICv2 114 Level gpcdma.7
100: 0 0 0 0 0 0 GICv2 115 Level gpcdma.8
101: 0 0 0 0 0 0 GICv2 116 Level gpcdma.9
102: 0 0 0 0 0 0 GICv2 117 Level gpcdma.10
103: 0 0 0 0 0 0 GICv2 118 Level gpcdma.11
104: 0 0 0 0 0 0 GICv2 119 Level gpcdma.12
105: 0 0 0 0 0 0 GICv2 120 Level gpcdma.13
106: 0 0 0 0 0 0 GICv2 121 Level gpcdma.14
107: 0 0 0 0 0 0 GICv2 122 Level gpcdma.15
108: 0 0 0 0 0 0 GICv2 123 Level gpcdma.16
109: 0 0 0 0 0 0 GICv2 124 Level gpcdma.17
110: 0 0 0 0 0 0 GICv2 125 Level gpcdma.18
111: 0 0 0 0 0 0 GICv2 126 Level gpcdma.19
112: 0 0 0 0 0 0 GICv2 127 Level gpcdma.20
113: 0 0 0 0 0 0 GICv2 128 Level gpcdma.21
114: 0 0 0 0 0 0 GICv2 129 Level gpcdma.22
115: 0 0 0 0 0 0 GICv2 130 Level gpcdma.23
116: 0 0 0 0 0 0 GICv2 131 Level gpcdma.24
117: 0 0 0 0 0 0 GICv2 132 Level gpcdma.25
118: 0 0 0 0 0 0 GICv2 133 Level gpcdma.26
119: 0 0 0 0 0 0 GICv2 134 Level gpcdma.27
120: 0 0 0 0 0 0 GICv2 135 Level gpcdma.28
121: 0 0 0 0 0 0 GICv2 136 Level gpcdma.29
122: 0 0 0 0 0 0 GICv2 137 Level gpcdma.30
123: 0 0 0 0 0 0 GICv2 138 Level gpcdma.31
232: 0 0 0 0 0 0 tegra-gpio 101 Level phy_interrupt
252: 0 0 0 0 0 0 tegra-gpio 121 Edge 15210000.nvdisplay
256: 0 0 0 0 0 0 tegra-gpio 125 Edge 3400000.sdhci cd
290: 0 0 0 0 0 0 tegra-gpio 159 Edge external-connection:extcon@1
340: 0 0 0 0 0 0 tegra-gpio-aon 16 Level tmp451
380: 0 0 0 0 0 0 tegra-gpio-aon 56 Edge Power
381: 0 0 0 0 0 0 tegra-gpio-aon 57 Edge Volume Up
382: 0 0 0 0 0 0 tegra-gpio-aon 58 Edge Volume Down
383: 16356 0 0 0 0 0 tegra-gpio-aon 59 Level bcmsdh_sdmmc
384: 1 0 0 0 0 0 tegra-gpio-aon 60 Edge bluetooth hostwake
388: 0 0 0 0 0 0 GICv2 104 Level PCIE, PCIe PME, aerdrv, sreb01_pci
389: 0 0 0 0 0 0 GICv2 105 Level Tegra PCIe MSI
390: 54 0 0 0 0 0 GIC 32 Level adma.0
391: 53 0 0 0 0 0 GIC 33 Level adma.1
392: 0 0 0 0 0 0 GIC 34 Level adma.2
393: 0 0 0 0 0 0 GIC 35 Level adma.3
394: 0 0 0 0 0 0 GIC 36 Level adma.4
395: 0 0 0 0 0 0 GIC 37 Level adma.5
396: 0 0 0 0 0 0 GIC 38 Level adma.6
397: 0 0 0 0 0 0 GIC 39 Level adma.7
398: 0 0 0 0 0 0 GIC 40 Level adma.8
399: 0 0 0 0 0 0 GIC 41 Level adma.9
400: 714 0 0 0 0 0 GICv2 193 Level snd_hda_tegra
411: 0 0 0 0 0 0 GIC 73 Edge hwmbox1_send_empty
412: 0 0 0 0 0 0 GIC 64 Edge hwmbox0_recv_full
413: 0 0 0 0 0 0 GIC 115 Edge adsp watchdog
414: 0 0 0 0 0 0 GIC 94 Edge adsp wfi
415: 0 0 0 0 0 0 GIC 89 Level AMC error int
424: 559 0 0 0 0 0 GICv2 39 Level 30c0000.watchdog
431: 0 0 0 0 0 0 PM 241 Edge max77620-top
435: 0 0 0 0 0 0 max77620-top 3 Edge max77620-gpio
436: 0 0 0 0 0 0 max77620-top 4 Edge max77686-rtc
440: 0 0 0 0 0 0 max77620-top 8 Edge max77620-thermal
441: 0 0 0 0 0 0 max77620-top 9 Edge max77620-thermal
442: 0 0 0 0 0 0 max77620-gpio 0 Edge external-connection:extcon@1
450: 0 0 0 0 0 0 max77686-rtc 1 Edge rtc-alarm1
IPI0: 234166 285072 744720 322479 294638 272602 Rescheduling interrupts
IPI1: 41 40 27 27 26 32 Function call interrupts
IPI2: 0 0 0 0 0 0 CPU stop interrupts
IPI3: 0 0 0 0 0 0 Timer broadcast interrupts
IPI4: 3951 947 1055 4588 3091 3151 IRQ work interrupts
Err: 0
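The two snapshots above are long enough that differences are easy to miss by eye. As a rough aid, the sketch below sums the per-CPU counts for each row and reports only the rows whose total increased. It assumes the layout shown above (a label ending in a colon, followed by one count per CPU, followed by a non-numeric controller name); the function names are illustrative.

```cpp
#include <cctype>
#include <map>
#include <sstream>
#include <string>

using IrqTotals = std::map<std::string, unsigned long long>;

inline bool all_digits(const std::string& s) {
    if (s.empty()) return false;
    for (unsigned char c : s) {
        if (!std::isdigit(c)) return false;
    }
    return true;
}

// Parses /proc/interrupts-style text into {row label -> sum of per-CPU counts}.
inline IrqTotals parse_interrupt_totals(const std::string& text) {
    IrqTotals totals;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string label;
        // Rows start with "NNN:" or "IPI0:"; the CPU header row does not.
        if (!(fields >> label) || label.back() != ':') continue;
        label.pop_back();
        unsigned long long sum = 0;
        std::string token;
        // Counts run until the first non-numeric token (the controller name).
        while (fields >> token && all_digits(token)) sum += std::stoull(token);
        totals[label] = sum;
    }
    return totals;
}

// Returns only the rows whose total grew between the two snapshots.
inline IrqTotals changed_irqs(const IrqTotals& before, const IrqTotals& after) {
    IrqTotals deltas;
    for (const auto& entry : after) {
        const auto it = before.find(entry.first);
        const unsigned long long old_count =
            (it == before.end()) ? 0 : it->second;
        if (entry.second > old_count) deltas[entry.first] = entry.second - old_count;
    }
    return deltas;
}
```

Applied to the snapshots above, rows such as 6 (tegra186_timer0) and 59 (xhci-hcd:usb1) would be reported, while 388 (PCIE, sreb01_pci) would not, since its count stays at 0 in both.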
Hi,
I don’t see any PCIe interrupts raised after your application is executed.
388: 0 0 0 0 0 0 GICv2 104 Level PCIE, PCIe PME, aerdrv, sreb01_pci
389: 0 0 0 0 0 0 GICv2 105 Level Tegra PCIe MSI
What does the sreb01_pci driver do when data is sent from the application? Is the sreb01_pci driver open source? Is it possible to share the driver?
I need more information on what the sreb01_pci driver does when the application sends data:
- Does it do a BAR write to the SRIO bridge device, or a DMA transfer? If it is a DMA transfer, I am wondering how the sreb01_pci driver knows about DMA completion without getting a PCIe interrupt.
- If it is a BAR write, we need to make sure that the CPU and memory run at a constant clock to avoid changes in execution speed. Simply set the CPU and EMC clocks to max by executing the “/home/ubuntu/jetson_clocks.sh” script. If you want to set a desired CPU frequency, please check Jetson/Performance - eLinux.org for more details.
- Execute the /home/ubuntu/tegrastats binary to check whether there is any change in the CPU and EMC clocks.
Please share “sudo lspci -vvv” output.
- Manikanta
Hi Manikanta,
Thanks for your quick response. I think you read my post before I edited it. I added the sudo lspci output shortly after I posted. Please have another look if you can. The driver is not open source and comes from a third party, although it is loosely based on the driver that ships with mainline Linux, which was written by the device OEM themselves: linux/tsi721.h at master · torvalds/linux · GitHub. To answer your questions:
For the IO operations themselves, DMA is not used. The driver does a BAR write, and I have run the jetson_clocks script as mentioned previously in this thread.
I have found something interesting with tegrastats: when the application is running slowly, the load seems to be distributed among the CPUs. However, when the application is running quickly, I can see 100% CPU utilization on CPU0. I'm not sure how triggering an interrupt could affect resource allocation like that. The logs below were captured in a single take, but I've annotated them for easier reading:
When the TX2 was idle. All CPUs have very low utilization:
RAM 1128/7847MB (lfb 1462x4MB) CPU [1%@2035,0%@2035,0%@2035,2%@2035,1%@2035,1%@2035] BCPU@32C MCPU@32C GPU@30C PLL@32C Tboard@27C Tdiode@28.25C PMIC@100C thermal@30.7C VDD_IN 4003/4342 VDD_CPU 457/670 VDD_GPU 228/248 VDD_SOC 1066/1066 VDD_WIFI 0/0 VDD_DDR 1205/1220
RAM 1128/7847MB (lfb 1462x4MB) CPU [1%@2035,0%@2035,0%@2035,1%@2035,1%@2035,1%@2035] BCPU@31.5C MCPU@31.5C GPU@30C PLL@31.5C Tboard@27C Tdiode@28.25C PMIC@100C thermal@30.7C VDD_IN 4003/4331 VDD_CPU 457/663 VDD_GPU 228/247 VDD_SOC 1066/1066 VDD_WIFI 0/0 VDD_DDR 1205/1219
RAM 1128/7847MB (lfb 1462x4MB) CPU [1%@2035,0%@2035,0%@2035,0%@2035,0%@2035,2%@2035] BCPU@31.5C MCPU@31.5C GPU@30C PLL@31.5C Tboard@27C Tdiode@28.25C PMIC@100C thermal@30.9C VDD_IN 3965/4320 VDD_CPU 380/654 VDD_GPU 228/247 VDD_SOC 1066/1066 VDD_WIFI 0/0 VDD_DDR 1205/1219
RAM 1128/7847MB (lfb 1462x4MB) CPU [2%@2035,0%@2035,0%@2035,2%@2035,2%@2035,1%@2035] BCPU@31.5C MCPU@31.5C GPU@30C PLL@31.5C Tboard@27C Tdiode@28.25C PMIC@100C thermal@30.9C VDD_IN 4003/4310 VDD_CPU 380/645 VDD_GPU 228/246 VDD_SOC 1066/1066 VDD_WIFI 0/0 VDD_DDR 1205/1218
Executed my application and experienced slow speeds. Some CPUs have load:
RAM 1128/7847MB (lfb 1462x4MB) CPU [1%@2035,0%@2035,65%@2035,3%@2035,2%@2035,0%@2035] BCPU@31.5C MCPU@31.5C GPU@30C PLL@31.5C Tboard@27C Tdiode@28.5C PMIC@100C thermal@30.9C VDD_IN 4459/4314 VDD_CPU 914/653 VDD_GPU 228/245 VDD_SOC 1066/1066 VDD_WIFI 0/0 VDD_DDR 1205/1218
RAM 1128/7847MB (lfb 1462x4MB) CPU [3%@2035,0%@2035,47%@2035,2%@2035,2%@2035,2%@2035] BCPU@31.5C MCPU@31.5C GPU@30C PLL@31.5C Tboard@27C Tdiode@28.5C PMIC@100C thermal@30.9C VDD_IN 4459/4319 VDD_CPU 838/659 VDD_GPU 228/245 VDD_SOC 1066/1066 VDD_WIFI 0/0 VDD_DDR 1205/1218
RAM 1128/7847MB (lfb 1462x4MB) CPU [1%@2035,0%@2035,49%@2035,1%@2035,2%@2035,2%@2035] BCPU@32C MCPU@32C GPU@30C PLL@32C Tboard@27C Tdiode@28.5C PMIC@100C thermal@30.9C VDD_IN 4420/4321 VDD_CPU 838/664 VDD_GPU 228/244 VDD_SOC 1066/1066 VDD_WIFI 0/0 VDD_DDR 1205/1217
RAM 1128/7847MB (lfb 1462x4MB) CPU [8%@2035,0%@2035,50%@2035,6%@2035,6%@2035,3%@2035] BCPU@31.5C MCPU@31.5C GPU@30C PLL@31.5C Tboard@27C Tdiode@28.5C PMIC@100C thermal@30.9C VDD_IN 4611/4329 VDD_CPU 990/672 VDD_GPU 228/244 VDD_SOC 1066/1066 VDD_WIFI 0/0 VDD_DDR 1224/1217
RAM 1128/7847MB (lfb 1462x4MB) CPU [2%@2035,0%@2035,48%@2035,1%@2035,2%@2035,2%@2035] BCPU@32C MCPU@32C GPU@30C PLL@32C Tboard@27C Tdiode@28.5C PMIC@100C thermal@31.2C VDD_IN 4459/4333 VDD_CPU 838/677 VDD_GPU 228/244 VDD_SOC 1066/1066 VDD_WIFI 0/0 VDD_DDR 1205/1217
RAM 1128/7847MB (lfb 1462x4MB) CPU [1%@2035,0%@2035,47%@2035,1%@2035,1%@2035,1%@2035] BCPU@31.5C MCPU@31.5C GPU@30C PLL@31.5C Tboard@27C Tdiode@28.5C PMIC@100C thermal@30.9C VDD_IN 4420/4335 VDD_CPU 838/681 VDD_GPU 228/243 VDD_SOC 1066/1066 VDD_WIFI 0/0 VDD_DDR 1205/1217
RAM 1128/7847MB (lfb 1462x4MB) CPU [27%@2035,0%@2035,25%@2035,1%@2035,5%@2035,11%@2035] BCPU@32C MCPU@32C GPU@30C PLL@32C Tboard@27C Tdiode@28.5C PMIC@100C thermal@31.2C VDD_IN 4725/4345 VDD_CPU 990/689 VDD_GPU 304/245 VDD_SOC 1066/1066 VDD_WIFI 0/0 VDD_DDR 1224/1217
Hit the space bar and observed the application speed up. CPU0 has 100% load:
RAM 1128/7847MB (lfb 1462x4MB) CPU [100%@2035,0%@2035,0%@2035,16%@2035,18%@2035,6%@2035] BCPU@32C MCPU@32C GPU@30C PLL@32C Tboard@27C Tdiode@28.75C PMIC@100C thermal@31.2C VDD_IN 5145/4364 VDD_CPU 1066/698 VDD_GPU 456/250 VDD_SOC 1066/1066 VDD_WIFI 0/0 VDD_DDR 1320/1219
RAM 1128/7847MB (lfb 1462x4MB) CPU [100%@2035,0%@2035,0%@2035,13%@2035,17%@2035,9%@2035] BCPU@32C MCPU@32C GPU@30.5C PLL@32C Tboard@27C Tdiode@28.5C PMIC@100C thermal@31.1C VDD_IN 4764/4374 VDD_CPU 914/703 VDD_GPU 304/251 VDD_SOC 1066/1066 VDD_WIFI 0/0 VDD_DDR 1262/1220
RAM 1128/7847MB (lfb 1462x4MB) CPU [100%@2035,0%@2035,0%@2035,1%@2035,3%@2035,3%@2035] BCPU@31.5C MCPU@31.5C GPU@30C PLL@31.5C Tboard@27C Tdiode@28.75C PMIC@100C thermal@31.2C VDD_IN 4878/4385 VDD_CPU 838/706 VDD_GPU 228/251 VDD_SOC 1066/1066 VDD_WIFI 325/8 VDD_DDR 1205/1220
It seems like context switching is what is causing the slowdown, and that it is necessary for this to run on CPU0. I have read that CPU0 is the one that handles IRQs. Is this the case here? Is there a script/tool provided by NVIDIA to dedicate a process to CPU0, or should I just use some standard means of doing so? Thanks!
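To track the per-core pattern described above over a longer run, the CPU field of each tegrastats line can be pulled out mechanically. The sketch below assumes the line format shown in the logs above ("CPU [N%@freq,...]"); parse_cpu_loads() is an illustrative name, and any entry without a leading percentage is recorded as -1.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Extracts the per-core load percentages from one tegrastats line.
// Returns an empty vector if the line has no "CPU [...]" field.
inline std::vector<int> parse_cpu_loads(const std::string& line) {
    std::vector<int> loads;
    const std::string marker = "CPU [";
    const std::size_t start = line.find(marker);
    if (start == std::string::npos) return loads;
    const std::size_t begin = start + marker.size();
    const std::size_t end = line.find(']', begin);
    if (end == std::string::npos) return loads;

    std::size_t pos = begin;
    while (pos < end) {
        std::size_t comma = line.find(',', pos);
        if (comma == std::string::npos || comma > end) comma = end;
        const std::string entry = line.substr(pos, comma - pos);  // e.g. "65%@2035"
        const std::size_t percent = entry.find('%');
        int load = -1;  // marks an entry that does not start with "N%"
        if (percent != std::string::npos && percent > 0) {
            try {
                load = std::stoi(entry.substr(0, percent));
            } catch (...) {
                load = -1;
            }
        }
        loads.push_back(load);
        pos = comma + 1;
    }
    return loads;
}
```

Feeding it the first line captured after the space bar was hit yields 100 for CPU0 and 0 for CPU2, matching the observation that the load moved from one core to another.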
Hi,
I have read that CPU0 is the one that handles IRQs. Is this the case here?
Timer interrupts are per-CPU; the rest are handled by CPU0. You can check this in the cat /proc/interrupts output.
Is there a script/tool provided by NVIDIA to dedicate a process to CPU0, or should I just use some standard means of doing so?
Use “taskset -p ” to set the affinity of a process.
You can also set CPU affinity from within the user-space application using sched_setaffinity(). Refer to CPU Affinity | Linux Journal
- Manikanta
Hi Manikanta,
I have set CPU affinity to CPU0 at the top of my application:
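#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // needed for the CPU_* macros; g++ usually predefines it
#endif
#include <sched.h>
#include <cstdio>

cpu_set_t mask;
CPU_ZERO(&mask);
CPU_SET(0, &mask);  // allow CPU0 only
int result = sched_setaffinity(0, sizeof(mask), &mask);  // pid 0 = calling thread
if (result != 0)
    perror("sched_setaffinity");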
This doesn't work. It actually keeps the application running consistently at the slower speed, regardless of whether I trigger an interrupt using the keyboard. Also, tegrastats shows 50% utilization for CPU0 and almost nothing for the others (1-2%). If I change the CPU affinity to CPU1, however, it reverts to the old behavior: slow until I trigger an interrupt.
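Since the pinned run behaved differently from what was expected, it may be worth confirming that the mask was really applied. The sketch below sets the affinity and then reads it back with sched_getaffinity(); pin_to_cpu() and first_allowed_cpu() are illustrative helper names. Note that passing 0 as the pid affects only the calling thread, so any threads that already exist at that point keep their own masks.

```cpp
#include <sched.h>  // sched_setaffinity, sched_getaffinity, CPU_* macros (glibc)

// Returns the lowest-numbered CPU the calling thread may run on, or -1.
inline int first_allowed_cpu() {
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &allowed)) return cpu;
    }
    return -1;
}

// Pins the calling thread to `cpu` and verifies the result.
// Returns true only if the kernel reports exactly that one CPU afterwards.
inline bool pin_to_cpu(int cpu) {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);
    if (sched_setaffinity(0, sizeof(mask), &mask) != 0) return false;

    cpu_set_t actual;
    CPU_ZERO(&actual);
    if (sched_getaffinity(0, sizeof(actual), &actual) != 0) return false;
    return CPU_ISSET(cpu, &actual) && CPU_COUNT(&actual) == 1;
}
```

Calling pin_to_cpu(0) or pin_to_cpu(1) at the top of main() and checking the return value would show whether the affinity change itself succeeded, independent of how the loop then performs.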
Hi,
If I change CPU affinity to CPU1 however, it reverts to the old behavior: slow until I trigger interrupt.
Does the application continue to run on CPU1 at 100%, or does it migrate to CPU0?
This doesn’t look like a PCIe issue. I will ask kernel team to look into this issue.
- Manikanta
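One way to answer the migration question from inside the application is to record which CPU each iteration starts on. This is only a sketch: cpu_histogram() is an illustrative name, the driver call is assumed to be wrapped in a callable, and sched_getcpu() is a glibc extension.

```cpp
#include <sched.h>  // sched_getcpu (glibc extension)

#include <map>

// Calls `transaction` once per iteration and counts how many iterations
// started on each CPU. A key of -1 would mean sched_getcpu() failed.
template <typename Transaction>
std::map<int, int> cpu_histogram(Transaction&& transaction, int iterations) {
    std::map<int, int> seen;
    for (int i = 0; i < iterations; ++i) {
        ++seen[sched_getcpu()];
        transaction();
    }
    return seen;
}
```

A histogram with a single key means the loop never moved; several keys mean the scheduler migrated it during the run. Comparing the histogram from a slow run with one from a fast run would show whether the speedup coincides with a move to CPU0.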
Hi,
Though you already ran jetson_clocks, could you compare the clock summary before and after the slowdown happens?
$ sudo su
$ cat /sys/kernel/debug/clk/clk_summary
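The clock summary is long, so comparing two saved copies by hand is error-prone. The sketch below is a format-agnostic comparison: it treats the first whitespace-separated token on each line as the clock name and reports names whose remaining columns differ between the snapshots. The exact column layout of clk_summary varies between kernels, so nothing here depends on it; the function names are illustrative.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Maps the first token of each line to the rest of that line,
// with runs of whitespace collapsed to single spaces.
inline std::map<std::string, std::string> index_by_name(const std::string& text) {
    std::map<std::string, std::string> rows;
    std::istringstream lines(text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string name, token, rest;
        if (!(fields >> name)) continue;  // skip blank lines
        while (fields >> token) {
            if (!rest.empty()) rest += ' ';
            rest += token;
        }
        rows[name] = rest;
    }
    return rows;
}

// Returns the names that are new or whose columns changed in `after_text`.
inline std::vector<std::string> changed_rows(const std::string& before_text,
                                             const std::string& after_text) {
    const auto before = index_by_name(before_text);
    const auto after = index_by_name(after_text);
    std::vector<std::string> changed;
    for (const auto& entry : after) {
        const auto it = before.find(entry.first);
        if (it == before.end() || it->second != entry.second) {
            changed.push_back(entry.first);
        }
    }
    return changed;
}
```

Redirecting the cat output to one file while the application is slow and to another while it is fast, then passing both files' contents to changed_rows(), would list the clocks whose rate or enable count differs between the two states.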