NVMAP_IOC_WRITE failed: Interrupted system call

Hello!
We use a Jetson TX2 to capture and preprocess 6 cameras and send the frames over the network.
Now I am trying to run JPEG or H264 encoding and ran into a problem:
when I run 6 cameras, I get messages in the console:
“NVMAP_IOC_WRITE failed: Interrupted system call”
When these messages appear, the image comes out broken (in JPEG, part of the data at the end is lost; in H264, the image has artifacts until the next I-frame).
When I run a single camera, everything is OK.
It looks like something goes wrong under high load.
I attached an htop screenshot with 6 application instances (one per camera) running.
Each instance captures frames via V4L, does CUDA processing, sends the image over the network, converts colors to YUV420, encodes to H264 or JPEG, and sends the encoded image over the network.
I found the place in the kernel sources where the error occurs (rw_handle in nvmap_ioctl.c), but I don't know how to fix it.

Hi kko-smol,
Do you use r28.2.1?
Do you run 6 Bayer sensors? Or YUV sensors? Or USB cameras?
Do you use MMAPIs or gstreamer pipelines?

r28.2 (as far as I can see, 28.2.1 and 28.2 have the same kernel).
YUV sensors connected via CSI.
I use your wrappers from the tegra_multimedia_api samples for MMAPI.
Capture (USERPTR) into a buffer from cudaMallocManaged for sharing with CUDA.
JPEG encode: convert from userptr (capture buffer) to an MMAPed buffer, then JPEG encodeFromFd.
H264 encode: convert from userptr (capture buffer) to userptr (of a buffer from NvBufferCreateEx) and encode from userptr (NvBufferCreateEx) to an MMAPed buffer.

Hi kko-smol,
Please share how you connect the 6 cameras to the CSI ports (A/B/C/D/E/F).
Are you able to run 6 cameras simultaneously via ‘v4l2-ctl’ commands?
Also, can you check whether you can run 6 cameras via the 12_camera_v4l2_cuda sample? For using the HW encoders, we suggest allocating NvBuffers.
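A quick way to stream-test all six sensors headlessly with v4l2-ctl is sketched below; the device numbers, resolution, and pixel format are assumptions based on the log later in this thread, so adjust them to the actual setup:

```shell
# Stream 300 frames from each of /dev/video0../dev/video5 in parallel
# without rendering; v4l2-ctl reports the achieved fps per device.
for i in 0 1 2 3 4 5; do
  v4l2-ctl -d /dev/video$i \
    --set-fmt-video=width=1280,height=1080,pixelformat=UYVY \
    --stream-mmap --stream-count=300 &
done
wait
```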

The same camera model is connected to each port via 2 CSI lanes. All ports are used.

Yes, we are able to run all 6 cameras simultaneously. When we capture images and send them over the network, everything works fine.
When I added CUDA processing, it worked too.

The 12_camera_v4l2_cuda sample cannot run: we have no display connected and work over SSH:

root@jetson:~/tegra_multimedia_api/samples/12_camera_v4l2_cuda# ./camera_v4l2_cuda -d /dev/video0 -s 1280x1080 -f UYVY -c -v
INFO: camera_initialize(): (line:249) Camera ouput format: (1280 x 1080)  stride: 2560, imagesize: 2764800
No protocol specified
[ERROR] (NvEglRenderer.cpp:97) <renderer0> Error in opening display
[ERROR] (NvEglRenderer.cpp:152) <renderer0> Got ERROR closing display
ERROR: display_initialize(): (line:261) Failed to create EGL renderer
ERROR: init_components(): (line:286) Failed to initialize display
ERROR: main(): (line:530) Failed to initialize v4l2 components
nvbuf_utils: dmabuf_fd 0 mapped entry NOT found
nvbuf_utils: Can not get HW buffer from FD... Exiting...
App run failed

I did some research on the problem.
I found that the frame is copied inside the driver when I use USERPTR, and sometimes copy_from_user cannot copy the data and returns non-zero.

I changed the memory type of the converter output_plane to MMAP and tried copying the frame in my application.
As in the previous situation, I sometimes get the error when the CPU is heavily loaded. But now the application fails with SIGSEGV at “memcpy () at …/sysdeps/aarch64/memcpy.S:157” when it copies a frame from the camera buffer to the MMAP-allocated NvBuffer from NvVideoConverter. Something strange happens with the buffer or the MMU subsystem: 6 frames earlier the copy succeeded, but now a copy from the same address to the same address fails.

Hi kko-smol,
Here is a sample demonstrating V4L2 camera -> NvBuffer(fd) -> VIC -> NvVideoEncoder
https://devtalk.nvidia.com/default/topic/1031967/jetson-tx2/tegra_multimedia_api-dq-buffer-from-encoder-output_plane-can-not-completed/post/5251268/#5251268

If you don’t need NVEglRenderer, please remove it from the sample.

Yes, this sample works when I run 6 cameras. But this sample does not use CUDA processing and does not send data over the network, i.e. it does not create the full system load.

Now I have rewritten my app based on this sample:
capture to dmabuf, and create an EGLImage and CUeglFrame for each capture buffer. The CUDA result buffer is allocated with cudaMallocManaged.

Now pipeline:

cam->dmabuf->cuda(cuEglFrame) -|->(cuda result) -> network
                               |->(source, captured dmabuf)->NvBufferTransform->NvVideoEncoder->Network
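For context, mapping a captured dmabuf into CUDA this way goes through EGL interop. The following is a sketch of that mapping under the assumptions that `egl_display` is an initialized EGLDisplay and `fd` is the capture dmabuf; error handling is trimmed and the kernel launch is elided:

```cpp
// Map a V4L2 dmabuf fd into CUDA as a CUeglFrame (sketch, not full code).
EGLImageKHR image = NvEGLImageFromFd(egl_display, fd);

CUgraphicsResource resource;
cuGraphicsEGLRegisterImage(&resource, image,
                           CU_GRAPHICS_MAP_RESOURCE_FLAGS_NONE);

CUeglFrame frame;
cuGraphicsResourceGetMappedEglFrame(&frame, resource, 0, 0);

// ... launch CUDA kernels on the mapped frame planes ...
cuCtxSynchronize();

cuGraphicsUnregisterResource(resource);
NvDestroyEGLImage(egl_display, image);
```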

If I run one camera, it works fine (i.e. no errors during the test).
When I run 6 cameras, after some seconds (10-20) of normal operation, I see errors in some instances. Now from CUDA:

cuCtxSynchronize Error: driver shutting down

cuCtxSynchronize is called both before and after the CUDA kernel launch.
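To narrow down which synchronize call fails, it can help to check and print every driver-API result; a small sketch, where the `check` helper is my own illustration (not from the MMAPI samples):

```cpp
// Report failures of CUDA driver-API calls around a kernel launch (sketch).
static void check(CUresult res, const char *what)
{
    if (res != CUDA_SUCCESS) {
        const char *msg = nullptr;
        cuGetErrorString(res, &msg);
        fprintf(stderr, "%s failed: %s\n", what, msg ? msg : "unknown error");
    }
}

check(cuCtxSynchronize(), "pre-launch cuCtxSynchronize");
// ... cuLaunchKernel(...) ...
check(cuCtxSynchronize(), "post-launch cuCtxSynchronize");
```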

If I run 4 cameras, they work for 5-10 minutes and then one of the instances gets the same error.

Maybe this is important: we use a 10Gbit PCI-E card and send ~1Gbit/s per camera. This creates significant kernel-time load. Are the Jetson drivers stable under these conditions?

Hi kko,
We have CUDA post-processing in 12_camera_v4l2_cuda:

static bool
cuda_postprocess(context_t *ctx, int fd)
{
    if (ctx->enable_cuda)
    {
        // Create EGLImage from dmabuf fd
        ctx->egl_image = NvEGLImageFromFd(ctx->egl_display, fd);
        if (ctx->egl_image == NULL)
            ERROR_RETURN("Failed to map dmabuf fd (0x%X) to EGLImage", fd);

        // Running algo process with EGLImage via GPU multi cores
        HandleEGLImage(&ctx->egl_image);

        // Destroy EGLImage
        NvDestroyEGLImage(ctx->egl_display, ctx->egl_image);
        ctx->egl_image = NULL;
    }

    return true;
}

So are you able to reproduce the issue if you run six cameras in six 12_camera_v4l2_cuda processes? Launch one camera per process and run 6 processes simultaneously.