Bad mode in Synchronous Abort handler detected, code 0x86000004 -- IABT (current EL)

Hi,

I am porting a driver from Jetpack-l4t-3-1 to Jetpack-l4t-3-2, but on Jetpack-l4t-3-2 I get this error when I try to read some of the driver's v4l2 controls:

root@endor:~# v4l2-ctl --get-ctrl=exposure
[   15.434600] Bad mode in Synchronous Abort handler detected, code 0x86000006 -- IABT (current EL)
[   15.443370] Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP
[   15.449277] Modules linked in: bnep bluetooth bcmdhd
[   15.454280] CPU: 0 PID: 1199 Comm: v4l2-ctl Tainted: G        W       4.4.38-l4t-r28.2+g2657cd8 #21
[   15.463309] Hardware name: quill (DT)
[   15.466962] task: ffffffc1e3759900 ti: ffffffc07ad10000 task.ti: ffffffc07ad10000
[   15.474430] PC is at 0x0
[   15.476961] LR is at v4l2_g_ext_ctrls+0x244/0x2b8
[   15.481656] pc : [<0000000000000000>] lr : [<ffffffc00079d65c>] pstate: 40000045
[   15.489035] sp : ffffffc07ad13b20
[   15.492340] x29: ffffffc07a000000 x28: 0000000000000014 
[   15.497659] x27: 0000000000000000 x26: ffffffc07ad13b90 
[   15.502979] x25: 0000000000000018 x24: 0000000000000000 
[   15.508299] x23: ffffffc07ad13b90 x22: ffffffc07ad13d30 
[   15.513620] x21: ffffffc1ea7d1600 x20: 0000000000000000 
[   15.518940] x19: 0000000000000000 x18: ffffffc0814337da 
[   15.524259] x17: 0000007f85f450c0 x16: ffffffc0001dc060 
[   15.529580] x15: 0000000000000005 x14: 0000000000000006 
[   15.534900] x13: ffffffc0014337e2 x12: ffffffc001412000 
[   15.540221] x11: 000000000000047e x10: 0000000005f5e0ff 
[   15.545539] x9 : ffffffc07ad13700 x8 : 3020727265206e72 
[   15.550860] x7 : 75746552203a3031 x6 : ffffffc001433806 
[   15.556180] x5 : ffffffc1f6640ba8 x4 : 0000000000000001 
[   15.561499] x3 : 0000000000000007 x2 : 0000000000000006 
[   15.566820] x1 : ffffffc1ea7d1600 x0 : ffffffc1e37dcb00 
[   15.572138] 
[   15.573626] Process v4l2-ctl (pid: 1199, stack limit = 0xffffffc07ad10020)
[   15.580485] Call trace:
[   15.582926] [<          (null)>]           (null)
[   15.587772] ---[ end trace f9bbbfa6a58ac7d2 ]---

Looking for information about this problem, I have seen that it is related to a low-level configuration of the system.

A similar problem is described here for a TX1 with a PCIe driver, but I am using a TX2 and a CSI video capture driver: https://devtalk.nvidia.com/default/topic/996441/?comment=5095529

The same error is reported here, where some assembler code was changed: arm64: Handle el1 synchronous instruction aborts cleanly - Patchwork, but I think the cause of the error is different in my case.
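
For reference, the "code" value in the first line of the oops follows the ARMv8 ESR_ELx layout, so it can be split by hand. Below is a minimal userspace sketch; the struct and function names are hypothetical helpers, not kernel APIs, and only the fault status values seen in this kind of trace are named.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields of an AArch64 Exception Syndrome Register (ESR_ELx) value. */
struct esr_fields {
	unsigned int ec;   /* bits [31:26]: exception class */
	unsigned int il;   /* bit 25: 1 = 32-bit instruction length */
	unsigned int ifsc; /* bits [5:0]: fault status code for aborts */
};

/* Split the code printed by the oops into its ESR fields. */
void decode_esr(uint32_t esr, struct esr_fields *out)
{
	assert(out != NULL);
	out->ec = (esr >> 26) & 0x3f;
	out->il = (esr >> 25) & 0x1;
	out->ifsc = esr & 0x3f;
}

/* Name the exception class; only the one relevant here is spelled out. */
const char *esr_ec_name(unsigned int ec)
{
	return ec == 0x21 ? "instruction abort, current EL" : "other";
}

/* Name the translation fault codes; everything else is "other". */
const char *esr_ifsc_name(unsigned int ifsc)
{
	switch (ifsc) {
	case 0x04: return "translation fault, level 0";
	case 0x05: return "translation fault, level 1";
	case 0x06: return "translation fault, level 2";
	case 0x07: return "translation fault, level 3";
	default:   return "other";
	}
}
```

For 0x86000006 this gives EC 0x21 (instruction abort taken at the current exception level) and fault status 0x06 (translation fault, level 2). Together with "PC is at 0x0", that reads as the kernel trying to fetch an instruction from an unmapped address, which fits a branch through a bad or NULL pointer more than a timing issue.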

I have seen that the error is produced in the kernel, because the driver callback that computes the requested value returns properly, and the Link Register (LR) reported in the error points to v4l2_g_ext_ctrls.
The callback reads some registers from the camera sensor to compute the requested value, but I have seen that if I remove the I2C read call the error disappears.

void imx298_get_exposure(struct imx298 *priv, u32 * exp_value)
{
	unsigned int coarse_int_time;
	u8 reg_val = 0;

	/* This call is commented out to avoid the error: Bad mode in Synchronous Abort handler detected */
	/* imx298_read_reg(priv->s_data, IMX298_COARSE_TIME_ADDR_LSB, &reg_val); */
	coarse_int_time = reg_val;

Any idea on how to fix this error?

Thanks.

Could you break down the imx298_read_reg function to find which line causes it?

Hi ShaneCCC,

The exact line is

imx298_read_reg
    regmap_read 
        _regmap_read
            ret = map->reg_read(context, reg, val);  (regmap.c line 2235)

If this line is removed the error is not produced.

This line only affects 4 controls in the driver. Apparently the error is produced only when certain addresses of the sensor are read. I am wondering whether the sensor behaves differently for these addresses in a way that affects this kernel version, for example if the response time is slightly different for those addresses.

There are other controls that are read and also execute this line, but they don’t produce the error.
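
One thing that may be worth checking here, as an assumption only since the body of imx298_read_reg is not shown: regmap_read() stores a full unsigned int through its last argument. If a pointer to a u8 such as reg_val is passed straight through, the bytes next to it on the stack are overwritten, and the damage may only show up after the callback has returned. The userspace sketch below models the safer pattern of reading into a full-width temporary and then narrowing; fake_regmap_read and read_reg_u8 are hypothetical stand-ins, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for regmap_read(): like the kernel function,
 * it stores a full unsigned int through val, not a single byte.
 */
int fake_regmap_read(const uint8_t *regs, size_t nregs,
		     unsigned int reg, unsigned int *val)
{
	assert(regs != NULL && val != NULL);
	if (reg >= nregs)
		return -1;
	*val = regs[reg];
	return 0;
}

/* Read one 8-bit register via a full-width temporary, then narrow. */
int read_reg_u8(const uint8_t *regs, size_t nregs,
		unsigned int reg, uint8_t *val)
{
	unsigned int tmp = 0;
	int err;

	assert(val != NULL);
	err = fake_regmap_read(regs, nregs, reg, &tmp);
	if (err)
		return err; /* propagate the error instead of ignoring it */

	*val = (uint8_t)(tmp & 0xff);
	return 0;
}
```

With this shape only one byte is ever written to the caller's variable, and a failed read is reported to the caller rather than silently leaving reg_val at 0.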

If you need more information please let me know.

Thanks.

Just check the imx185 sensor driver. You don’t need to implement the get function in the sensor driver itself; you only need to implement the set function. Please refer to imx185 to implement your driver.

The IMX185 doesn’t implement volatile controls. In my case, maybe I can change some of the problematic controls so that the requested value is not computed. If the control is not volatile, v4l2 returns the defined default value, or the last value configured if the control has been set previously.
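
The difference between the two kinds of control can be sketched with a toy userspace model. This is not the real struct v4l2_ctrl or the V4L2 control framework; every name below is hypothetical, and the is_volatile field only stands in for V4L2_CTRL_FLAG_VOLATILE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model of a control; NOT the real struct v4l2_ctrl. */
struct toy_ctrl {
	int32_t def;                  /* default value */
	int32_t cur;                  /* last value set, starts at def */
	int is_volatile;              /* stands in for V4L2_CTRL_FLAG_VOLATILE */
	int (*read_hw)(int32_t *val); /* stands in for the g_volatile_ctrl op */
};

/* Hypothetical hardware read; a real driver would do the I2C access here. */
int toy_read_exposure(int32_t *val)
{
	assert(val != NULL);
	*val = 1234;
	return 0;
}

/* Remember the value written by the user. */
void toy_ctrl_set(struct toy_ctrl *c, int32_t val)
{
	assert(c != NULL);
	c->cur = val;
}

/*
 * Volatile controls ask the hardware on every get; all others return
 * the cached value (the default, or whatever was set last).
 */
int toy_ctrl_get(const struct toy_ctrl *c, int32_t *val)
{
	assert(c != NULL && val != NULL);
	if (c->is_volatile && c->read_hw != NULL)
		return c->read_hw(val);

	*val = c->cur;
	return 0;
}
```

In this model, dropping the volatile flag means a get never reaches the hardware read at all, which is why it would sidestep the crash without explaining it.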

But there are some controls that are read-only. For example, the frame rate control returns the estimated frame rate based on the sensor configuration (PLL, frame length, etc.). I would like to keep these features in the driver.
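
As an illustration of that kind of estimate, here is a minimal sketch assuming the usual relation for this type of sensor: frame rate = pixel clock / (line length x frame length). The function and parameter names are hypothetical and are not taken from the imx298 driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Estimate the frame rate in units of 1/1000 fps.
 * pixel_clock_hz: pixel clock derived from the PLL settings
 * line_length:    pixel clocks per line, including blanking
 * frame_length:   lines per frame, including blanking
 * Returns 0 on success, -1 if a length is zero.
 */
int estimate_fps_milli(uint64_t pixel_clock_hz, uint32_t line_length,
		       uint32_t frame_length, uint64_t *fps_milli)
{
	uint64_t clocks_per_frame;

	assert(fps_milli != NULL);
	if (line_length == 0 || frame_length == 0)
		return -1;

	/* Widen before multiplying so the product cannot wrap at 32 bits. */
	clocks_per_frame = (uint64_t)line_length * frame_length;
	*fps_milli = pixel_clock_hz * 1000 / clocks_per_frame;
	return 0;
}
```

For example, a 300 MHz pixel clock with 5000 clocks per line and 2000 lines per frame gives 30000, that is 30 fps. Integer arithmetic is used because floating point is normally avoided in kernel code.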

Since this driver was working correctly with Jetpack 3.1 (L4T r28.1), I was wondering what the cause of the error is, to see whether I could avoid changing the driver.

Is this problem a known issue for this kernel version?

Thanks.

I don’t know whether it’s a known issue for this kernel version.
It’s better to break down reg_read to root out the problem.

I get this sometimes too when shutting down a CSI-based encoder application.

Any ideas?

[ 5502.353024] Bad mode in Synchronous Abort handler detected, code 0x86000004 -- IABT (current EL)
[ 5502.353026] Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP
[ 5502.353034] Modules linked in: cti_4chrolan(O) bcmdhd pci_tegra bluedroid_pm
[ 5502.353038] CPU: 4 PID: 2456 Comm: kworker/4:1 Tainted: G           O    4.4.38-tegra #24
[ 5502.353039] Hardware name: quill (DT)
[ 5502.353047] Workqueue: events tegra_channel_status_worker
[ 5502.353049] task: ffffffc1e1840000 ti: ffffffc1cd6a0000 task.ti: ffffffc1cd6a0000
[ 5502.353050] PC is at 0x353235203a2274
[ 5502.353055] LR is at vi_notify_channel_reset+0x54/0x88
[ 5502.353057] pc : [<00353235203a2274>] lr : [<ffffffc0008db114>] pstate: 60000045
[ 5502.353057] sp : ffffffc1cd6a3cf0
[ 5502.353060] x29: ffffffc1cd6a3cf0 x28: 0000000000000000 
[ 5502.353062] x27: 0000000000000000 x26: ffffffc00136af78 
[ 5502.353063] x25: 0000000000000000 x24: 0000000000000000 
[ 5502.353065] x23: ffffffc1f5daa400 x22: 0000000000000000 
[ 5502.353067] x21: ffffffc1e5b0d018 x20: 0000000000000000 
[ 5502.353068] x19: ffffffc1e5b0d000 x18: 0000000000000005 
[ 5502.353070] x17: ffffffc000b16a60 x16: 0000000000000003 
[ 5502.353072] x15: 000000000004f47b x14: 0000000000001000 
[ 5502.353073] x13: 0000000000001000 x12: 0000000000000000 
[ 5502.353075] x11: 000000000047d00b x10: 00000000000008a0 
[ 5502.353077] x9 : ffffffc1cd6a3d20 x8 : ffffffc1e1840900 
[ 5502.353078] x7 : 0000000000000018 x6 : ffffffc0013f3510 
[ 5502.353080] x5 : ffffffc1f5da6200 x4 : 000000000009aa3b 
[ 5502.353081] x3 : 0000000000000000 x2 : 2c353235203a2274 
[ 5502.353083] x1 : 0000000000000000 x0 : 226c656e6e616863 
[ 5502.353083] 
[ 5502.353085] Process kworker/4:1 (pid: 2456, stack limit = 0xffffffc1cd6a0020)
[ 5502.353086] Call trace:
[ 5502.353087] [<00353235203a2274>] 0x353235203a2274
[ 5502.353090] [<ffffffc000771068>] tegra_channel_handle_error+0x5c/0xa0
[ 5502.353093] [<ffffffc0007710bc>] tegra_channel_status_worker+0x10/0x18
[ 5502.353097] [<ffffffc0000bbc24>] process_one_work+0x154/0x434
[ 5502.353099] [<ffffffc0000bc038>] worker_thread+0x134/0x40c
[ 5502.353101] [<ffffffc0000c18c8>] kthread+0xe0/0xf4
[ 5502.353105] [<ffffffc000084f90>] ret_from_fork+0x10/0x40
[ 5502.354454] ---[ end trace 47172f6abdefaece ]---
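
The PC in this trace does not look like a kernel address. Read as little-endian bytes, the PC, x2 and x0 values are printable ASCII, which might mean that a function pointer, or the structure holding it, was overwritten with string data. A small userspace sketch to decode such values (the helper name is hypothetical):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Interpret a 64-bit register value as 8 little-endian bytes and write
 * them to out as text; non-printable bytes become '.'.
 * out must have room for 9 chars (8 plus the terminating NUL).
 */
void reg_to_ascii(uint64_t reg, char out[9])
{
	int i;

	assert(out != NULL);
	for (i = 0; i < 8; i++) {
		unsigned char b = (unsigned char)(reg >> (8 * i));

		out[i] = (b >= 0x20 && b <= 0x7e) ? (char)b : '.';
	}
	out[8] = '\0';
}
```

With the values from the trace, x0 (0x226c656e6e616863) decodes to `channel"` and x2 (0x2c353235203a2274) decodes to `t": 525,`. The PC is the same as x2 with the top byte cleared, so the branch target appears to have been loaded from text rather than from a valid pointer.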

Hello Cobrien

I am researching the cause of this type of error. I use 2 versions of kernel 4.4.38:

  1. L4T 28.2.1. Source code downloaded from Jetson Download Center | NVIDIA Developer, built with the RidgeRun instructions (Compiling Tegra Source Code | Jetson Tegra X1 and X2 | RidgeRun)
  2. Kernel 4.4.38 with some small modifications, built with Yocto

Using the first kernel version, the kernel has never generated this error. I have therefore compared both kernel versions, but I haven’t found a difference between them that produces the error.
I have seen that Nvidia provides a specific toolchain for each version of L4T. For example, for 28.2 the toolchain is http://developer.nvidia.com/embedded/dlc/l4t-gcc-toolchain-64-bit-28-2
In my case Yocto is using a more recent toolchain version than the one recommended by Nvidia. I am checking whether this could be the cause of the error.

In your case

What kernel version are you using?
What toolchain version have you used to compile the kernel?

Thanks.

I am using the NVIDIA supplied kernel for L4T R28.1.0 and the NVIDIA compiler.

[    0.000000] Linux version 4.4.38-tegra (cary@x) (gcc version 4.8.5 (GCC) ) #25 SMP PREEMPT Mon Jan 21 15:59:26 EST 2019

We have a kernel module to control some hardware, and there’s a custom camera driver compiled in.