JetPack 4.6: Maximum stack size limited to 1024 KB, causing stack overflow in gcc

Hello,

On JetPack 4.6, gcc crashed with stack-overflow segfaults while compiling our code. After a lot of debugging we found that /etc/security/limits.conf is no longer the default file from Ubuntu 18.04. In this file you changed the stack size to 1024 KB, while the default is 8192 KB.
Why was this changed? And if we change it back to the default, will that cause issues for us?
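For anyone hitting the same problem, here is a minimal sketch for checking which stack limits are in effect and which limits.conf entries set them. The helper name `stack_entries` is made up for this illustration, and the default path is the one mentioned above.

```shell
#!/bin/bash
# Print every active (non-comment) "stack" entry from a limits.conf-style file.
# The fields in such a file are: <domain> <type> <item> <value>
# stack_entries is a hypothetical helper name, not an existing tool.
stack_entries() {
    local file="${1:-/etc/security/limits.conf}"
    # Stay quiet if the file is missing or unreadable.
    [ -r "$file" ] || return 0
    awk '!/^[[:space:]]*#/ && $3 == "stack" { print $1, $2, $4 }' "$file"
}

# Soft and hard stack limits of the current shell, in KB (or "unlimited").
echo "soft: $(ulimit -S -s)  hard: $(ulimit -H -s)"

# Entries that set those limits at login.
stack_entries
```

Because the entries also set a hard limit, an unprivileged shell cannot raise its soft limit above that value with `ulimit -s`. limits.conf is read by pam_limits when a session starts, so an edit to the file only shows up after logging in again.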

@jan.verschaeren
The stack size was changed to 1 MB for performance reasons:

    l4t: release: set stacksize to 1 MBytes

    Our driver should declare less stacksize requirement to reserve memory
    usage if the app is calling mlockall() for the consideration of
    performance.

    The current stacksize is 8M

    Bug 3282372

    
diff --git a/rfs/etc/systemd/nvfb.sh b/rfs/etc/systemd/nvfb.sh
index 7d5dacb..69c27b2 100755
--- a/rfs/etc/systemd/nvfb.sh
+++ b/rfs/etc/systemd/nvfb.sh
@@ -1,7 +1,7 @@
 #!/bin/bash

 #
-# Copyright (c) 2016-2020, NVIDIA CORPORATION.  All rights reserved.
+# Copyright (c) 2016-2021, NVIDIA CORPORATION.  All rights reserved.
 #
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions
@@ -47,6 +47,16 @@ if [ -e "/lib/systemd/system/ssh.service" ]; then
        done
 fi

+# Set default stacksize to 1 MBytes
+{
+    echo "nvidia hard stack 1024"
+    echo "nvidia soft stack 1024"
+    echo "ubuntu hard stack 1024"
+    echo "ubuntu soft stack 1024"
+    echo "root hard stack 1024"
+    echo "root soft stack 1024"
+} >> "/etc/security/limits.conf"
+

Does some NVIDIA-provided program or library use ‘mlockall’?

It looks like it’s a kernel API.

https://man7.org/linux/man-pages/man2/mlock.2.html

I know ‘mlockall’ is a kernel API (i.e. a system call). What I am asking is: why did NVIDIA change the ‘ulimit -s’ value? Does it make some NVIDIA-provided binary (which one?) work better, or reduce its impact on the rest of the system?

Some MM and NVGPU libraries are affected by this patch.

Could those libraries (which ones, precisely?) set the limit themselves inside their own code, instead of applying such a general “fix”? That “fix” is like using a hammer to kill a fly.
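To illustrate what a narrower scope could look like, here is a rough sketch that lowers the stack limit for a single command only, instead of for every login session. The wrapper name `run_with_stack` is hypothetical, and this is not something NVIDIA ships. It only shows that the limit can be applied per process.

```shell
#!/bin/bash
# Run one command with a reduced soft stack limit (in KB).
# run_with_stack is a hypothetical wrapper name used for illustration.
run_with_stack() {
    local kb="$1"
    shift
    # The subshell keeps the lowered limit away from the calling shell;
    # exec replaces the subshell with the target command.
    ( ulimit -S -s "$kb" && exec "$@" )
}

# Example: the child shell reports the reduced limit,
# while the calling shell keeps its own value.
run_with_stack 1024 sh -c 'ulimit -s'
ulimit -S -s
```

Only the soft limit is touched here, so the wrapped process could still raise it again up to the hard limit if it needed to.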

The libraries below are possibly affected.

- libtegrav4l2.so
- libv4l2_nvvideocodec.so
- libnvmmlite*.so
- libnvmm*.so
- libnvparser.so
