Enabling IOMMU for the GPU on Jetson Xavier NX Dev Kit

Hi everyone!

Can I enable the IOMMU for the GPU on the Jetson Xavier NX Dev Kit?

I added an “iommus” property to the GPU node in the device tree (tegra194-soc-base.dtsi), as shown below.
(According to the binding docs in the mainline kernel and nvidia,gv11b.txt in the Jetson-specific kernel, the “iommus” property is optional and may be added.)

tegra_gv11b: gv11b {
	compatible = "nvidia,gv11b";
	#cooling-cells = <2>;
	reg = <0x0 0x17000000 0x0 0x1000000
	       0x0 0x18000000 0x0 0x1000000
	       0x0 0x03b41000 0x0 0x00001000>;
	interrupts = <0 70 0x04
	              0 71 0x04>;
	dma-noncontig;
	interrupt-names = "stall", "nonstall";
	nvidia,host1x = <&host1x>;
	access-vpr-phys;
	power-domains = <&bpmp TEGRA194_POWER_DOMAIN_GPU>;
	clocks = <&bpmp_clks TEGRA194_CLK_GPCCLK>,
		<&bpmp_clks TEGRA194_CLK_GPU_PWR>,
		<&bpmp_clks TEGRA194_CLK_FUSE>;
	clock-names = "gpu", "pwr", "fuse";
	resets = <&bpmp_resets TEGRA194_RESET_GPU>;
	dma-coherent;
	status = "disabled";
	iommus = <&smmu TEGRA194_SID_GPU>;	/* <-- added */
};

The GPU device appears in /sys/kernel/iommu_groups/, but the NVGPU driver fails to boot the device, with the error messages below.

[   93.431824] nvgpu: 17000000.gv11b     nvgpu_timeout_expired_msg_cpu:112  [ERR]  Timeout detected @ 00000000ed17fdb5 
[   93.432203] nvgpu: 17000000.gv11b           nvgpu_pmu_wait_fw_ready:156  [ERR]  PMU is not ready yet
[   93.432494] nvgpu: 17000000.gv11b               lsfm_int_wpr_region:64   [ERR]  PMU not ready to process requests
[   93.432778] nvgpu: 17000000.gv11b nvgpu_pmu_lsfm_bootstrap_ls_falcon:106  [ERR]  LSF init WPR region failed
[   93.433044] nvgpu: 17000000.gv11b nvgpu_pmu_lsfm_bootstrap_ls_falcon:127  [ERR]  LSF Load failed
[   93.433265] nvgpu: 17000000.gv11b nvgpu_gr_falcon_load_secure_ctxsw_ucode:707  [ERR]  Unable to recover GR falcon
[   93.433525] nvgpu: 17000000.gv11b        nvgpu_gr_falcon_init_ctxsw:152  [ERR]  fail
[   93.433725] nvgpu: 17000000.gv11b            nvgpu_finalize_poweron:951  [ERR]  Failed initialization for: g->ops.gr.gr_init_support
[   93.439122] nvgpu: 17000000.gv11b                 gk20a_power_write:127  [ERR]  power_node_write failed at busy
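For reference, the IOMMU group membership mentioned above can be checked with a short script. This is only a minimal sketch, assuming the standard sysfs layout in which each group directory under /sys/kernel/iommu_groups/ contains a devices/ subdirectory. The function names `list_iommu_groups` and `find_device` are hypothetical helpers, not part of any NVIDIA tool, and the device name is taken from the log above.

```python
import os

def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Return {group_name: sorted device names} for each IOMMU group under root."""
    groups = {}
    if not os.path.isdir(root):
        # No IOMMU groups exposed (IOMMU disabled or path missing).
        return groups
    for group in sorted(os.listdir(root)):
        dev_dir = os.path.join(root, group, "devices")
        if os.path.isdir(dev_dir):
            groups[group] = sorted(os.listdir(dev_dir))
    return groups

def find_device(groups, name):
    """Return the group that contains the named device, or None if absent."""
    for group, devices in groups.items():
        if name in devices:
            return group
    return None

if __name__ == "__main__":
    # "17000000.gv11b" is the GPU platform device name seen in the nvgpu log.
    group = find_device(list_iommu_groups(), "17000000.gv11b")
    print("GPU IOMMU group:", group)
```

Note that the device showing up in a group only means the kernel attached it to the SMMU; it says nothing about whether the GPU can actually operate behind it.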

Thanks!

The error messages from L4T 35.1 (5.10.104-tegra) seem to be more informative:

[   32.295489] nvgpu: 17000000.gv11b                 tpc_pg_mask_store:1067 [INFO]  no value change, same mask already set
[   32.394891] mc-err: (255) csr_nvl1r: EMEM address decode error
[   32.396133] mc-err:   status = 0x200000b8; addr = 0x       fff7c0800; hi_adr_reg=0xf
[   32.397297] mc-err:   secure: no, access-type: read
[   32.398424] mc-err: unknown mcerr fault, int_status=0x00000000, ch_int_status=0x00000000, hubc_int_status=0x00000000 sbs_int_status=0x00000000, hub_int_status=0x00000000

[   35.404898] nvgpu: 17000000.gv11b     nvgpu_timeout_expired_msg_cpu:94   [ERR]  Timeout detected @ nvgpu_pmu_wait_fw_ack_status+0xbc/0x130 [nvgpu] 
[   35.406438] nvgpu: 17000000.gv11b           nvgpu_pmu_wait_fw_ready:167  [ERR]  PMU is not ready yet
[   35.408292] nvgpu: 17000000.gv11b               lsfm_int_wpr_region:65   [ERR]  PMU not ready to process requests
[   35.409661] nvgpu: 17000000.gv11b nvgpu_pmu_lsfm_bootstrap_ls_falcon:107  [ERR]  LSF init WPR region failed
[   35.411071] nvgpu: 17000000.gv11b nvgpu_pmu_lsfm_bootstrap_ls_falcon:128  [ERR]  LSF Load failed
[   35.412923] nvgpu: 17000000.gv11b nvgpu_gr_falcon_load_secure_ctxsw_ucode:727  [ERR]  Unable to boot GPCCS
[   35.414330] nvgpu: 17000000.gv11b        nvgpu_gr_falcon_init_ctxsw:159  [ERR]  fail
[   35.415996] nvgpu: 17000000.gv11b           nvgpu_report_err_to_sdl:66   [ERR]  Failed to report an error: hw_unit_id = 0x2, err_id=0x6, ss_err_id = 0x262
[   35.417571] nvgpu: 17000000.gv11b      gr_init_ctxsw_falcon_support:833  [ERR]  FECS context switch init error
[   35.419028] nvgpu: 17000000.gv11b            nvgpu_finalize_poweron:1010 [ERR]  Failed initialization for: g->ops.gr.gr_init_support

[   38.816488] nvgpu: 17000000.gv11b                 gk20a_power_write:127  [ERR]  power_node_write failed at busy

Sorry for the late response. Our team will investigate and provide suggestions soon. Thanks.

Hi,

The GPU hardware is not tied to the IOMMU, so the GPU is expected not to work if the IOMMU is enabled for it. The “iommus” property should not be added to the GPU node. nvidia,gv11b.txt needs to be updated.
