Debugging kernel NULL pointer dereference fault in tegra-vi4 and tc358743 (Jetson TX2 with JetPack 3...

Hello,

I’m using a Jetson TX2 with JetPack 3.2.1 (tegra-l4t-r28.2.1).

I’m having trouble debugging a kernel NULL pointer dereference fault when using the tc358743 driver. After the fault, the TX2 locks up until a watchdog reset. The error I get at boot is:

[    4.398957] tegra-vi4 15700000.vi: initialized
[    4.400068] tegra-vi4 15700000.vi: subdev 150c0000.nvcsi-0 bound
[    4.400072] tegra-vi4 15700000.vi: subdev tc358743 2-000f bound
[    4.400417] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[    4.400419] pgd = ffffffc001525000
[    4.400423] [00000000] *pgd=000000026cd7f003, *pud=000000026cd7f003, *pmd=000000026cd80003, *pte=00e8000003881707
[    4.400426] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[    4.400428] Modules linked in:
[    4.400431] CPU: 5 PID: 54 Comm: kworker/u12:1 Not tainted 4.4.38+ #5
[    4.400433] Hardware name: quill (DT)
[    4.400439] Workqueue: events_unbound async_run_entry_fn
[    4.400440] task: ffffffc1eb499900 ti: ffffffc1eb518000 task.ti: ffffffc1eb518000
[    4.400445] PC is at tegra_channel_init_subdevices+0x464/0x660
[    4.400446] LR is at tegra_channel_init_subdevices+0x414/0x660
[    4.400448] pc : [<ffffffc0007d7e3c>] lr : [<ffffffc0007d7dec>] pstate: 20000045
[    4.400448] sp : ffffffc1eb51b9a0
[    4.400451] x29: ffffffc1eb51b9a0 x28: ffffffc07b2003e0 
[    4.400452] x27: ffffffc1eabe21a0 x26: ffffffc1eabe2070 
[    4.400454] x25: ffffffc07b09ac00 x24: ffffffc000c8b000 
[    4.400456] x23: ffffffc07b202000 x22: ffffffc07b2004f8 
[    4.400457] x21: ffffffc07b00f800 x20: ffffffc07b201000 
[    4.400459] x19: ffffffc07b200018 x18: 0000000000020000 
[    4.400461] x17: 0000000000001a1d x16: 0000000000001a1d 
[    4.400462] x15: 0000000000001490 x14: 0000000000000028 
[    4.400464] x13: ffffff0000000000 x12: 0000000000000090 
[    4.400465] x11: 0000000000000000 x10: ffffffc1eabe2038 
[    4.400467] x9 : fefefefefefefeff x8 : 0000000000000000 
[    4.400469] x7 : ffffffffffffffff x6 : ffffffc07b2010d0 
[    4.400470] x5 : 0000000000000000 x4 : 0000000000000000 
[    4.400472] x3 : 0000000000000001 x2 : ffffffc07b2010d0 
[    4.400473] x1 : 0000000000000000 x0 : ffffffc1e59e8000 
[    4.400473] 
[    4.400475] Process kworker/u12:1 (pid: 54, stack limit = 0xffffffc1eb518020)
[    4.400476] Call trace:
[    4.400478] [<ffffffc0007d7e3c>] tegra_channel_init_subdevices+0x464/0x660
[    4.400480] [<ffffffc0007d8d80>] tegra_vi_graph_notify_complete+0x220/0x660
[    4.400483] [<ffffffc0007c8510>] v4l2_async_test_notify+0xf0/0x100
[    4.400485] [<ffffffc0007c8644>] v4l2_async_notifier_register+0x124/0x190
[    4.400487] [<ffffffc0007d9878>] tegra_vi_graph_init+0x1c8/0x298
[    4.400489] [<ffffffc0007d59d0>] tegra_vi_media_controller_init+0x190/0x200
[    4.400492] [<ffffffc000955258>] tegra_vi4_probe+0x210/0x2c8
[    4.400496] [<ffffffc000598d78>] platform_drv_probe+0x50/0xb8
[    4.400499] [<ffffffc000596874>] driver_probe_device+0xcc/0x428
[    4.400500] [<ffffffc000596c6c>] __driver_attach+0x9c/0xa0
[    4.400503] [<ffffffc000594768>] bus_for_each_dev+0x60/0xa0
[    4.400504] [<ffffffc000596168>] driver_attach+0x20/0x28
[    4.400506] [<ffffffc000594c64>] driver_attach_async+0x14/0x58
[    4.400508] [<ffffffc0000c5000>] async_run_entry_fn+0x40/0x168
[    4.400509] [<ffffffc0000bc218>] process_one_work+0x138/0x4c0
[    4.400511] [<ffffffc0000bc6c4>] worker_thread+0x124/0x498
[    4.400513] [<ffffffc0000c240c>] kthread+0xdc/0xf0
[    4.400516] [<ffffffc000084f90>] ret_from_fork+0x10/0x40
[    4.400619] ---[ end trace d8f373c7ca73e141 ]---

My dtsi is as follows:

#define CAM0_RST_L TEGRA_MAIN_GPIO(R, 5)

/ {
    host1x {
        /* vi_base: vi */
        vi@15700000 {
            status = "okay";
            num-channels = <1>;

            ports {
                #address-cells = <1>;
                #size-cells = <0>;
                status = "okay";

                vi_port0: port@0 {
                    status = "okay";
                    reg = <0>;

                    tc358743_vi_in0: endpoint {
                        status = "okay";
                        csi-port = <0>; /* CSI-A */
                        bus-width = <4>; /* Use CSI-A */
                        remote-endpoint = <&tc358743_csi_out0>;
                    };
                };
            };
        };

        nvcsi@150c0000 {
            num-channels = <1>;
            #address-cells = <1>;
            #size-cells = <0>;

            channel@0 {
                status = "okay";
                reg = <0>;
                ports {
                    #address-cells = <1>;
                    #size-cells = <0>;
                    port@0 {
                        status = "okay";
                        reg = <0>;
                        tc358743_csi_in0: endpoint@0 {
                            status = "okay";
                            csi-port = <0>;
                            bus-width = <4>;
                            remote-endpoint = <&tc358743_out0>;
                        };
                    };
                    port@1 {
                        status = "okay";
                        reg = <1>;
                        tc358743_csi_out0: endpoint@1 {
                            status = "okay";
                            remote-endpoint = <&tc358743_vi_in0>;
                        };
                    };
                };
            };
        };
    };

    /* does this map to i2c bus 2? */
    i2c@3180000 {
        status = "okay";

        /* The following is based on the example discussed here : https://devtalk.nvidia.com/default/topic/1011640/tc358743-on-tx2 */
        tc358743@0f {
            status = "okay";
            compatible = "toshiba,tc358743";
            reg = <0x0f>; /* shifted by 2 */

            /* clocks copied from tegra186-quill-camera-li-mipi-adpt-a00.dtsi */
            clocks = <&tegra_car TEGRA186_CLK_EXTPERIPH1>;
            clock-names = "extperiph1";

            refclk_hz = <27000000>;

            reset-gpios = <&tegra_main_gpio CAM0_RST_L GPIO_ACTIVE_HIGH>;

            /* Physical dimensions of sensor */
            physical_w = "4.713";
            physical_h = "3.494";

            /* Sensor Model */
            sensor_model = "tc358743";

            ports {
                #address-cells = <1>;
                #size-cells = <0>;

                port@0 {
                    status = "okay";
                    reg = <0>;
                    tc358743_out0: endpoint {
                        status = "okay";
                        csi-port = <0>;
                        data-lanes = <1 2 3 4>;
                        clock-lanes = <0>;
                        clock-noncontinuous;
                        bus-width = <4>;
                        link-frequencies = /bits/ 64 <297000000>;
                        remote-endpoint = <&tc358743_csi_in0>;
                    };
                };
            };
        };
    };
};


/ {

    tcp: tegra-camera-platform {
        status = "okay";
        compatible = "nvidia, tegra-camera-platform";

        /* from tegra186-camera-li-mipi-adpt-a00.dtsi */
        num_csi_lanes = <4>;
        max_lane_speed = <1500000>;
        min_bits_per_pixel = <10>;
        vi_peak_byte_per_pixel = <2>;
        vi_bw_margin_pct = <25>;
        isp_peak_byte_per_pixel = <5>;
        isp_bw_margin_pct = <25>;

        modules {
            module0 {
                status = "okay";
                badge = "tc358743_top_i2c1";
                position = "top";
                orientation = "3";
                drivernode0 {
                    status = "okay";
                    pcl_id = "v4l2_sensor";
                    devname = "tc358743 30-000f";
                    /* Declare the device-tree hierarchy to driver instance */
                    proc-device-tree = "/proc/device-tree/i2c@3180000/tc358743@0f";
                };
            };
        };
    };
};

Do you have any advice as to how I can go about debugging this fault?

Thanks,

Why is the bus 30? Which app were you running when you got the kernel dump?

devname = "tc358743 30-000f";

Bus 30 was in the example. I have tried both setting devname to “tc358743 30-000f” and “tc358743 02-000f”, without any change in behavior.
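
For reference, the boot log above prints the bound subdevice as "tc358743 2-000f", which follows the usual Linux I2C naming of bus number, a dash, and the address as four hex digits. The sketch below only reproduces that formatting so the devname string can be compared against the log; the helper name is hypothetical, and whether devname must match this string for the camera platform to work is an assumption, not something confirmed in this thread.

```python
def i2c_subdev_name(driver: str, bus: int, addr: int) -> str:
    """Format '<driver> <bus>-<addr>' the way the boot log prints it.

    The bus number is plain decimal (no zero padding) and the address is
    zero-padded to four lowercase hex digits.
    """
    return f"{driver} {bus}-{addr:04x}"


if __name__ == "__main__":
    # Values taken from the boot log line "subdev tc358743 2-000f bound".
    print(i2c_subdev_name("tc358743", 2, 0x0F))  # tc358743 2-000f
```

Note that neither "tc358743 30-000f" nor "tc358743 02-000f" is identical to the name in the log, since the bus number there is printed without a leading zero.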

This error is most often caused by an incorrect device tree. Please check it.

I’ve carefully checked the device tree and tried many permutations of the DT configuration without any progress. Would you please look at the device tree that I included above and let me know what is wrong with it?
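
One check that can be automated is whether every remote-endpoint link points back at the endpoint that references it. The sketch below is a rough regex-based pass over DTS text, not a real device tree parser; it assumes each endpoint node has a label and contains no nested braces, and the function names are hypothetical. Read by eye, the four endpoints in the tree posted above do appear to be reciprocal, so this would only rule out one class of mistake.

```python
import re

# "label: endpoint { ... }" or "label: endpoint@N { ... }", body up to the first "}".
ENDPOINT = re.compile(r"(\w+)\s*:\s*endpoint(?:@\w+)?\s*\{(.*?)\}", re.DOTALL)
REMOTE = re.compile(r"remote-endpoint\s*=\s*<&(\w+)>")


def endpoint_links(dts: str) -> dict:
    """Map each labelled endpoint to the label its remote-endpoint references."""
    links = {}
    for label, body in ENDPOINT.findall(dts):
        match = REMOTE.search(body)
        links[label] = match.group(1) if match else None
    return links


def broken_links(dts: str) -> list:
    """Return (endpoint, target) pairs whose target does not point back."""
    links = endpoint_links(dts)
    return [(a, b) for a, b in links.items() if links.get(b) != a]


if __name__ == "__main__":
    # Trimmed fragment using two of the labels from the posted tree.
    sample = """
    tc358743_vi_in0: endpoint {
        csi-port = <0>; /* CSI-A */
        remote-endpoint = <&tc358743_csi_out0>;
    };
    tc358743_csi_out0: endpoint@1 {
        remote-endpoint = <&tc358743_vi_in0>;
    };
    """
    print(broken_links(sample))
```

An empty list means every link found is matched by a link in the opposite direction.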

Have you validated that the device tree changes you made are actually installed? Take a look in “/proc/device-tree/”. I mention this because the procedure for changing the device tree has changed with each recent release.
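
A minimal sketch of that check, run on the TX2 itself: it looks for the sensor node under /proc/device-tree and reads a couple of its properties. The node path is the one given in the proc-device-tree property above; the function names are hypothetical, and the script assumes each property holds a single NUL-terminated string.

```python
from pathlib import Path


def read_dt_string(node: Path, prop: str):
    """Return a string property with trailing NUL bytes stripped, or None if absent."""
    path = node / prop
    if not path.is_file():
        return None
    return path.read_bytes().rstrip(b"\x00").decode("ascii", errors="replace")


def check_node(root: str, rel_node: str, props=("compatible", "status")):
    """Return {property: value} for a node, or None if the node does not exist."""
    node = Path(root) / rel_node
    if not node.is_dir():
        return None
    return {prop: read_dt_string(node, prop) for prop in props}


if __name__ == "__main__":
    # Prints None if the running kernel was booted without this node.
    print(check_node("/proc/device-tree", "i2c@3180000/tc358743@0f"))
```

If this prints None, or the properties differ from the dtsi, the board is probably still booting an older device tree blob.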

The kernel OOPS is occurring in:

drivers/media/platform/tegra/camera/vi/channel.c:
static int tegra_channel_sensorprops_setup(struct tegra_channel *chan)
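
For anyone who wants to confirm the faulting source line themselves, the symbol and offset from the "PC is at" line can be handed to gdb. The sketch below just extracts them and builds the gdb command; it assumes a vmlinux with debug info that matches the running kernel, and the function name is hypothetical.

```python
import re

# Matches "symbol+0xoffset/0xsize" as printed in the oops and call trace.
OOPS_SYMBOL = re.compile(
    r"(?P<sym>[A-Za-z_][\w.]*)\+(?P<off>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)


def gdb_list_command(oops_line: str):
    """Build a gdb 'list' command for the symbol+offset in an oops line, or None."""
    match = OOPS_SYMBOL.search(oops_line)
    if match is None:
        return None
    return f"list *({match.group('sym')}+{match.group('off')})"


if __name__ == "__main__":
    line = "[    4.400445] PC is at tegra_channel_init_subdevices+0x464/0x660"
    print(gdb_list_command(line))
```

The printed command is meant to be entered at the prompt of a gdb session opened on that vmlinux. The oops names tegra_channel_init_subdevices while the source points at tegra_channel_sensorprops_setup; one possible explanation, which is an assumption here, is that the compiler inlined the static function into its caller.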

I think the problem is that the tc358743 driver doesn’t appear to implement modes. This driver seems so different from the NVIDIA examples that I’m wondering if the most reasonable solution is to write a new tc358743 driver based on the imx185 driver. Any suggestions? From what I’ve read on the forums, people have been able to get some version of the tc358743 driver working, but the specifics are very unclear. I’ve tried all of the different versions I could find, but they all appear to be missing the mode functionality.

Modes are discussed in NVIDIA Tegra Linux Driver Package Development Guide 28.2 Release, in the Sensor Driver Programming Guide Section.

I would suggest using the tc358840 as a reference for your case. Check tegra186-camera-imx274.dtsi for the tc358840.

@Visionear
Have you resolved the problem?
Is there any further information you can share?

Thanks a lot!

Dear all,

I’m facing the same problem. My device tree (dtbs) worked well with tegra-l4t-r28.1 and tegra-l4t-r28.2.0 (nearly one year ago), but it fails with tegra-l4t-r28.2.1:

[    9.964455] tegra-vi4 15700000.vi: initialized
[    9.965667] tegra-vi4 15700000.vi: subdev 150c0000.nvcsi-0 bound
[    9.965671] tegra-vi4 15700000.vi: subdev ov10635 2-0030 bound
[    9.965732] Unable to handle kernel NULL pointer dereference at virtual address 00000000
[    9.965734] pgd = ffffffc0014be000
[    9.965738] [00000000] *pgd=000000026cdd6003, *pud=000000026cdd6003, *pmd=000000026cdd7003, *pte=00e8000003881707
[    9.965741] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[    9.965744] Modules linked in:
[    9.965747] CPU: 3 PID: 151 Comm: kworker/u12:5 Not tainted 4.4.38+ #1
[    9.965748] Hardware name: quill (DT)
[    9.965754] Workqueue: events_unbound async_run_entry_fn
[    9.965756] task: ffffffc1e6020000 ti: ffffffc1e601c000 task.ti: ffffffc1e601c000
[    9.965761] PC is at camera_common_g_fmt+0x34/0xa8
[    9.965765] LR is at ov10635_get_fmt+0x10/0x18
[    9.965766] pc : [<ffffffc0007af478>] lr : [<ffffffc000781df8>] pstate: 80000045

Any suggestions would be appreciated.

Thanks and Best Regards,
Vu Nguyen

Have you compared tegra186-camera-imx274.dtsi with your DT?

Hi SaneCCC,

Thanks for your reply. After adding the UYVY format to camera_common.c, the system booted properly, but I’m now facing another issue, so I created a new thread to discuss it.

Could you please refer to:

https://devtalk.nvidia.com/default/topic/1044963/jetson-tx2/can-not-render-camera-preview-with-l4t-r28-2-1/

Thanks and Best Regards,
Vu Nguyen