Bug report: `nvv4l2h264enc` fails in Docker on Jetson Orin NX (JetPack 6.2.2); root cause is missing `kmod` / `lsmod`

Summary: Hardware H.264 encoding via GStreamer `nvv4l2h264enc` works on the Jetson host but fails inside NVIDIA DeepStream L4T containers on JetPack 6.2.2. Investigation shows `libnvtvmr.so` shells out to `lsmod | grep nvgpu` during encoder initialization. When `lsmod` is absent (no `kmod` package in the image), the check fails and NVENC context creation aborts with a generic error.

Workaround: `apt-get install -y kmod` (or add `kmod` to the container image).


Product / scope

| Area | Details |
| --- | --- |
| Platform | Jetson Orin NX (Engineering Reference Developer Kit) |
| BSP | JetPack 6.2.2 (L4T R36.5.0, kernel 5.15.185-tegra, OOT kernel modules) |
| Container image | nvcr.io/nvidia/deepstream-l4t:7.1-samples-multiarch |
| Runtime | Docker with --runtime=nvidia --privileged |
| Component | GStreamer nvv4l2h264enc / NVMM (libnvtvmr.so) |

Symptom

Host: The following pipeline completes successfully:

```
gst-launch-1.0 videotestsrc ! nvvidconv ! 'video/x-raw(memory:NVMM), framerate=5/1' \
! nvv4l2h264enc ! fakesink
```

Expected log excerpt includes NvMMLiteOpen, ===== NvVideo: NVENC =====, and normal preroll.

Container (same command, same board, privileged NVIDIA runtime): Pipeline fails before NVENC initializes:

```
Opening in BLOCKING MODE
ENC_CTX(0xffff88008460) Error in initializing nvenc context
ERROR: from element /GstPipeline:pipeline0/nvv4l2h264enc:nvv4l2h264enc0:
Could not get/set settings from/on resource.
Device is in streaming mode
```

There is no NvMMLiteOpen / NvVideo: NVENC sequence; initialization stops earlier in the stack.


What was ruled out

The following were verified not to explain the failure:

  1. Device nodes — Present in-container with host-matching major:minor, including /dev/nvmap, /dev/host1x-fence, /dev/dri/renderD128, /dev/nvgpu/igpu0/*, /dev/v4l2-nvenc.

  2. NVIDIA user libraries — Bind-mounted from host via nvidia-container-runtime; checksums matched for libnvtvmr.so, libnvrm_host1x.so, libtegrav4l2.so, libv4l2_nvvideocodec.so, libnvmmlite_video.so, etc.

  3. DRM / host1x — Direct test opening /dev/dri/renderD128 and issuing DRM_IOCTL_TEGRA_CHANNEL_OPEN for NVENC class (0x21) succeeded on both host and container (context=1, version=35, capabilities=1).

  4. Permissions / cgroups — With --privileged, seccomp, AppArmor, and cgroup device filtering are not blocking access; NVIDIA_VISIBLE_DEVICES=all did not change behavior.

  5. sysfs / SoC identity — Relevant sysfs paths readable; SoC reports as Tegra234 (Orin), soc_id=35.


Investigation

IOCTL comparison (host vs container)

strace -f -e trace=ioctl showed a large gap in DRM activity during encoder bring-up:

| Metric | Host | Container |
| --- | --- | --- |
| Total ioctl calls | ~2016 | ~1644 |
| DRM_IOCTL calls | ~236 | ~36 |
| DRM_IOCTL_TEGRA_CHANNEL_OPEN | Called | Never called |

On the host, the encoder-related path opens a Tegra DRM channel to NVENC. In the container, the stack queries DRM_IOCTL_VERSION on render nodes but does not proceed to channel open; initialization aborts earlier.
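The comparison above can be reproduced with a short script. This is a sketch: it assumes strace is available, and the trace-file path, num-buffers value, and `ioctl_count` helper are illustrative, not part of the original investigation:

```shell
trace=/tmp/enc.trace   # illustrative path

# Count occurrences of a pattern in an strace log file.
ioctl_count() { grep -c -- "$1" "$2" || true; }

if command -v strace >/dev/null && command -v gst-launch-1.0 >/dev/null; then
    # Trace all ioctl calls made by the pipeline and its children (-f).
    strace -f -e trace=ioctl -o "$trace" \
        gst-launch-1.0 videotestsrc num-buffers=10 ! nvvidconv \
        ! 'video/x-raw(memory:NVMM), framerate=5/1' ! nvv4l2h264enc ! fakesink \
        || true   # the pipeline itself fails in the broken container

    ioctl_count 'ioctl('                       "$trace"   # total ioctl calls
    ioctl_count 'DRM_IOCTL'                    "$trace"   # DRM subset
    ioctl_count 'DRM_IOCTL_TEGRA_CHANNEL_OPEN' "$trace"   # NVENC channel open
fi
```

Running this on the host and in the container and diffing the three counts reproduces the table above.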

Thread-level behavior (strace -f)

The thread responsible for encoder initialization was traced in both environments. On the host it proceeds from NvMMLiteOpen into DRM channel setup. In the container it spawns child processes via clone() + execve() and then fails immediately.

Child processes observed in-container:

  1. lsmod — exit code 127 (command not found).

  2. grep nvgpu — read an empty pipe, non-zero exit.

Inferred call chain

The NVMM / video resource path appears to invoke a shell pipeline equivalent to:

```
lsmod | grep nvgpu
```

to confirm the nvgpu kernel module is loaded before NVENC context creation. When lsmod is missing (package kmod not installed in the image), the pipeline fails and the library treats the GPU stack as unavailable.

Approximate dependency chain (observed via behavior and tracing):


```
libnvv4l2.so
    → libv4l2_nvvideocodec.so
        → libnvtvmr.so (Tegra Video Resource Manager)
            → shell-out: lsmod | grep nvgpu
```

The failure mode is absence of the lsmod userspace binary, not missing kernel module or inaccessible devices (the module is loaded on the host kernel shared by the container).
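The failing shell-out can be reproduced in isolation. This sketch only assumes grep is present; the throwaway PATH mimics an image that lacks the kmod package:

```shell
# Build a PATH that contains only grep, so lsmod cannot be found,
# mimicking the container image without kmod installed.
tmp=$(mktemp -d)
ln -s "$(command -v grep)" "$tmp/grep"

status=0
PATH="$tmp" /bin/sh -c 'lsmod | grep nvgpu' || status=$?

# The lsmod child exits 127 (command not found), matching the strace
# observation; grep then reads an empty pipe and exits 1, so the
# pipeline's overall status is grep's.
echo "pipeline status: $status"
```

The overall status of 1 is what the library apparently interprets as "nvgpu not loaded", even though the module is loaded on the shared host kernel.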


Root cause

The container image does not install the kmod package, so /sbin/lsmod (or equivalent) is not present. NVIDIA’s closed-source libnvtvmr.so relies on that utility during encoder initialization instead of reading /proc/modules or another in-process API. A failed shell-out leads to aborted NVENC init and the generic GStreamer error above.

Why it shows up on JetPack 6.x containers: On older JetPack 5.x–era base images, transitive dependencies often pulled in kmod. The DeepStream L4T 7.1 image for JetPack 6.x uses a leaner base where kmod may be absent. The Jetson host filesystem always includes kmod from the Ubuntu/L4T rootfs, so the issue is container-specific.

Observability: stderr from the failed lsmod invocation may be easy to miss (e.g. mixed with plugin-scanner noise or suppressed). The runtime error (Error in initializing nvenc context) does not mention missing lsmod.
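Since the telltale line is easy to miss, a quick filter over the pipeline's combined output helps. This is a sketch: the `enc_hints` helper name and the filter strings are mine, chosen from the error messages quoted in this report:

```shell
# Keep only lines hinting at the shell-out failure or the NVENC error.
enc_hints() { grep -E 'lsmod|not found|nvenc'; }

if command -v gst-launch-1.0 >/dev/null; then
    gst-launch-1.0 videotestsrc num-buffers=1 ! nvvidconv \
        ! 'video/x-raw(memory:NVMM)' ! nvv4l2h264enc ! fakesink 2>&1 \
        | enc_hints || true
fi
```

In the broken container this surfaces both `sh: 1: lsmod: not found` and the `Error in initializing nvenc context` line together, which makes the connection between the two much more visible.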


Resolution / workaround

Install kmod inside the container (or bake it into the image):

```
apt-get update && apt-get install -y kmod
```

Dockerfile example:

```
RUN apt-get update && apt-get install -y kmod && rm -rf /var/lib/apt/lists/*
```

After installation, the same GStreamer pipeline exhibits the same successful initialization sequence as on the host (NvMMLiteOpen, NvVideo: NVENC, preroll, etc.).


Recommendations (for NVIDIA and integrators)

  1. Documentation: Call out kmod (or guaranteed presence of lsmod) as a dependency for Jetson multimedia / NVENC in container images, or document the lsmod | grep nvgpu check in libnvtvmr.so if that is intentional.

  2. Images: Consider adding kmod as an explicit dependency in official L4T / DeepStream container bases where NVMM/NVENC is supported.

  3. Robustness: Prefer checking /proc/modules or an ioctl/sysfs-based probe instead of shelling to lsmod, to avoid fragile container minimal images and silent failures.

  4. Diagnostics: If gst-plugin-scanner or similar logs show lsmod: not found, treat it as a likely hard failure for later NVENC/NVMM use in that image.
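The robustness recommendation can be sketched in shell terms: consult /proc/modules directly instead of shelling out to lsmod. The `module_loaded` name is illustrative, and the optional second argument exists only so the check can be exercised against a fixture file:

```shell
# Check whether a kernel module is loaded by reading /proc/modules,
# avoiding any dependency on the lsmod userspace binary.
module_loaded() {
    # /proc/modules lines look like: "nvgpu 1994752 10 - Live 0x... (O)"
    grep -q "^${1} " "${2:-/proc/modules}"
}

if module_loaded nvgpu; then
    echo "nvgpu loaded"
else
    echo "nvgpu not loaded"
fi
```

The same information is what lsmod itself prints (it is a thin formatter over /proc/modules), so an in-process read of that file removes the dependency without changing the semantics of the check.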


Reproduction checklist (for verification)

  • Jetson Orin NX, JetPack 6.2.2 (L4T R36.5.0).

  • Pull nvcr.io/nvidia/deepstream-l4t:7.1-samples-multiarch.

  • Run container: --runtime=nvidia --privileged (plus usual device/GPU env if required by your setup).

  • Confirm that `which lsmod` returns nothing (or that `lsmod` exits with 127) before the fix.

  • Run the gst-launch-1.0 pipeline above; expect NVENC init failure.

  • apt-get install -y kmod; rerun pipeline; expect success.

I do not know who is accountable for such a fragile implementation, but it is not what people usually expect from system software. You should not ship a system library that requires the lsmod utility in order to work; consider re-implementing the check via /proc/modules. It is also an open question how this passed your quality gates.

Thank you for your feedback.

  1. We only recommend running DS-7.1 on JP-6.1, as stated in the documentation. We have found some compatibility issues between JP-6.2.1/6.2.2 and DS-7.1.

  2. This issue is unrelated to whether lsmod is installed in the container, and it is not related to libnvtvmr.so, as libnvtvmr.so does not call lsmod | grep nvgpu.

In fact, I specifically upgraded JP-6.2.2 to verify this issue, and it only requires adding one environment variable to work.

```
export AARCH64_IGPU=1
docker run -it --rm --runtime nvidia -e AARCH64_IGPU=$AARCH64_IGPU nvcr.io/nvidia/deepstream-l4t:7.1-samples-multiarch
```

Hello. lsmod is invoked. This is obvious, and it is invoked only when the NVENC element is in the pipeline:

```
gst-launch-1.0 videotestsrc ! nvvidconv ! 'video/x-raw(memory:NVMM), framerate=5/1' \
! nvv4l2h264enc ! fakesink
```

The problem is specific to the nvv4l2h264enc element.

```
docker run --runtime nvidia --gpus all -it --rm nvcr.io/nvidia/deepstream-l4t:7.1-samples-multiarch

==========
== CUDA ==
==========

CUDA Version 12.6.11

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

root@6059186c5e01:/opt/nvidia/deepstream/deepstream-7.1# gst-launch-1.0 videotestsrc ! nvvidconv ! 'video/x-raw(memory:NVMM), framerate=5/1' ! nvv4l2h264enc ! fakesink

(gst-plugin-scanner:45): GStreamer-WARNING **: 11:50:00.337: Failed to load plugin '/usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstmplex.so': libmjpegutils-2.1.so.0: cannot open shared object file: No such file or directory

(gst-plugin-scanner:45): GStreamer-WARNING **: 11:50:00.429: Failed to load plugin '/usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstsndfile.so': libFLAC.so.8: cannot open shared object file: No such file or directory

(gst-plugin-scanner:45): GStreamer-WARNING **: 11:50:00.471: Failed to load plugin '/usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstopenmpt.so': libmpg123.so.0: cannot open shared object file: No such file or directory

(gst-plugin-scanner:45): GStreamer-WARNING **: 11:50:00.478: Failed to load plugin '/usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstmpg123.so': libmpg123.so.0: cannot open shared object file: No such file or directory

(gst-plugin-scanner:45): GStreamer-WARNING **: 11:50:00.484: Failed to load plugin '/usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstlame.so': libmp3lame.so.0: cannot open shared object file: No such file or directory

(gst-plugin-scanner:45): GStreamer-WARNING **: 11:50:00.513: Failed to load plugin '/usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstpulseaudio.so': libFLAC.so.8: cannot open shared object file: No such file or directory

(gst-plugin-scanner:45): GStreamer-WARNING **: 11:50:00.529: Failed to load plugin '/usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstfluidsynthmidi.so': libFLAC.so.8: cannot open shared object file: No such file or directory

(gst-plugin-scanner:45): GStreamer-WARNING **: 11:50:00.549: Failed to load plugin '/usr/lib/aarch64-linux-gnu/gstreamer-1.0/libgstchromaprint.so': libavcodec.so.58: cannot open shared object file: No such file or directory
sh: 1: lsmod: not found
(Argus) Error FileOperationFailed: Connecting to nvargus-daemon failed: No such file or directory (in src/rpc/socket/client/SocketClientDispatch.cpp, function openSocketConnection(), line 205)
(Argus) Error FileOperationFailed: Cannot create camera provider (in src/rpc/socket/client/SocketClientDispatch.cpp, function createCameraProvider(), line 107)

(gst-plugin-scanner:45): GStreamer-WARNING **: 11:50:00.690: Failed to load plugin '/usr/lib/aarch64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_inferserver.so': libtritonserver.so: cannot open shared object file: No such file or directory

(gst-plugin-scanner:45): GStreamer-WARNING **: 11:50:00.706: Failed to load plugin '/usr/lib/aarch64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_udp.so': librivermax.so.0: cannot open shared object file: No such file or directory
sh: 1: lsmod: not found
Setting pipeline to PAUSED ...
Opening in BLOCKING MODE 
Pipeline is PREROLLING ...
ENC_CTX(0xffff8c0084a0) Error in initializing nvenc context 
Redistribute latency...
Redistribute latency...
ERROR: from element /GstPipeline:pipeline0/nvv4l2h264enc:nvv4l2h264enc0: Could not get/set settings from/on resource.
Additional debug info:
/dvs/git/dirty/git-master_linux/3rdparty/gst/gst-v4l2/gst-v4l2/gstv4l2object.c(3579): gst_v4l2_object_set_format_full (): /GstPipeline:pipeline0/nvv4l2h264enc:nvv4l2h264enc0:
Device is in streaming mode
ERROR: pipeline doesn't want to preroll.
Setting pipeline to NULL ...
Freeing pipeline ...
```

Even with your suggested environment variable, it still invokes lsmod:

```
root@6a1f730f1a19:/opt/nvidia/deepstream/deepstream-7.1# gst-launch-1.0 videotestsrc ! nvvidconv ! 'video/x-raw(memory:NVMM), framerate=5/1' ! nvv4l2h264enc ! fakesink
sh: 1: lsmod: not found
Setting pipeline to PAUSED ...
Opening in BLOCKING MODE 
Pipeline is PREROLLING ...
Redistribute latency...
NvMMLiteOpen : Block : BlockType = 4 
===== NvVideo: NVENC =====
NvMMLiteBlockCreate : Block : BlockType = 4 
H264: Profile = 66 Level = 0 
NVMEDIA: Need to set EMC bandwidth : 21000 
NvVideo: bBlitMode is set to TRUE 
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
Redistribute latency...
New clock: GstSystemClock
^Chandling interrupt.
Interrupt: Stopping pipeline ...
Execution ended after 0:00:03.530673715
Setting pipeline to NULL ...
Freeing pipeline ...
```

Adding lsmod to the container fixes the problem even without the environment variable.

This is a warning that will appear in almost all JetPack releases.

This is simply a compatibility issue between JP-6.2 and DS-7.1; other issues are also known. If stable operation is desired without such workarounds, only JP-6.1 is recommended.

The root cause of the nvv4l2h264enc error is that dGPU was incorrectly detected in JP-6.2 on Jetson.


Nevertheless, with lsmod installed everything works successfully, including NVENC, and the failure traces back to NVIDIA system libraries.