Runtime Scheduling error and pthread_create - Nvidia SDK Components

Dear community,

I have a Jetson Orin Nano, running with JetPack 5.1.2.

Every time I run a personal application, I get the following error:
pthread_create returned 1 in file "<file directory>/<name_of_the_program>.cpp"
ETL: nullptr pointer dereference: nullpointer ETL ScopedPtr

This error only happens when running my app if the NVIDIA SDK Components are included in the system (installed when flashing my Jetson). Whenever I flash my Jetson only with the Jetson Linux OS, excluding the SDK Components, I can run my app without getting the pthread_create error.

The only way to solve the issue when the NVIDIA SDK Components are installed is by running the following command:
sudo sysctl -w kernel.sched_rt_runtime_us=-1
Default value is 950000.

This workaround is not optimal for two reasons:

  1. It deactivates real-time throttling, so a faulty real-time task could take 100 percent of the CPU and block all other tasks.
  2. The change is not permanent, so I need to set this value again every time I reboot the system before I can run my app.

Do you know how I can avoid this issue?

Do you know if there is a way to remove the Nvidia SDK components without needing to reflash the system again?

Best regards,

pabimor

Hi pabimor,

Are you using the devkit or custom board for Orin Nano?

You could add that command to the script which would be run automatically during boot up.

Could you try the following command to remove SDK components on your board?

$ sudo apt purge nvidia-jetpack

Dear @KevinFFF ,

Thank you for your quick reply. Here are some answers to your questions and suggestions, plus some additional information about the tests that I am performing:

I am using the dev kit: NVIDIA Orin Nano Developer Kit, with Linux version 5.10.120-tegra.

True, I think this will have to be the final solution if there is no other workaround, despite the drawback mentioned above: the risk of not being able to stop a “faulty” program that runs into an infinite loop.

Thank you for the hint about running the apt purge command. I tried it out, followed by sudo apt autoremove to clean up all the unused libraries from the Nvidia SDK Components (I attach the apt and system logs). Nevertheless, I still have the same issue when running my app, even after a system reboot.

I found out that, despite running my app with sudo, the system is not granting it sudo privileges (my app is not able to create a thread). I did a test by compiling and running the C program test_pthread_arm64 attached to this post, which lets me test whether a pthread can be created under sudo. It should return error 1 if you run it as a standard user, and it should run successfully if you run it with sudo, but in my case it returns error 1 either way.

Another test that I wanted to perform is setting the kernel.unprivileged_bpf_disabled to 0 by running the following command:

sudo sysctl -w kernel.unprivileged_bpf_disabled=0

The idea behind this is that, after changing that entry, my app might be able to create the threads, but whenever I run that command I get the following message:

sysctl: setting key "kernel.unprivileged_bpf_disabled": Operation not permitted

As a summary, there seems to be an issue with my app’s capability to create threads running it as sudo… and I am a bit clueless about where this can come from…

test_pthread_arm64.txt (1.5 KB)
logs.zip (1.3 MB)

If your issue is about the permission, have you tried to modify the owner/group with chown and also give 777 to your application with chmod?

Or could you provide the detailed steps for me to verify on the devkit locally?

Hi @KevinFFF ,

Excuse me for the late reply. I tried setting the owner/group to root and gave 777 to the application with chmod, but I still get the same error…

Below is a description of the steps I followed:

  1. I flashed the Jetson Orin Nano Dev Kit with JetPack 5.1.2 using the Nvidia SDK Manager on the host PC, with the Jetson Orin Nano Dev Kit powered up in “Recovery mode”.
  2. After flashing the dev kit, I removed the jumper from the REC and GND pins, rebooted the system and ran sudo apt update and sudo apt upgrade on the dev kit.
  3. Using the Ethernet interface to connect the dev kit and the host PC again, I proceeded to install the SDK components and the runtime components, again with the Nvidia SDK Manager.
  4. I placed the file test_pthread_arm64 in the Downloads folder and ran sudo chmod 777 on the file.
  5. I compiled the file using gcc (I removed the .txt extension first and then ran the following command)
gcc -c test_pthread_arm64
  6. I ran the C program with sudo ./test_pthread_arm64. The result should be a message saying that the program was able to successfully create a pthread if you run it with sudo privileges. The program should return “error 1” if it is run by a user without sudo privileges.
  7. Nevertheless, the result was “pthread_create error 1” in both cases…
  8. I then tried to chown the test_pthread_arm64 program to root:root and ran it again with sudo, but I still got the same message.

Finally, to prove that those components are causing trouble for the pthread_create function, I reflashed the dev kit with only the Jetson Linux OS v35.4.1, excluding all the runtime and SDK components from the installation. After flashing the dev kit, I repeated the same procedure as stated above. This time the program returned a success message when running it with sudo…

I hope this information is useful for you. If you need anything else, please just ask me.

Best regards,
pabimor

test_pthread_arm64.txt (1.5 KB)

In addition, I get the “main run successfully” message when I run test_pthread_arm64 with sudo if I have previously run the following command:

sudo sysctl -w kernel.sched_rt_runtime_us=-1

This is exactly the same behaviour that I get with my app.

Best regards,
pabimor

Running sudo apt upgrade is not supported on Jetson devices with L4T.
If you want to update a Jetson device, please refer to the following instructions:
Updating from the NVIDIA APT Server

Since you could run it after reflashing the board, could you skip this step (running apt upgrade in step 2) and verify again?

Hi @KevinFFF ,

I tried but I got the same result…

Nevertheless, after reflashing I could run the program, but only because I installed just the Jetson Linux OS without the Nvidia SDK and runtime components. Once I install those components again, the program is not able to run anymore…

Could you share how you build the source file?
(on jetson or host?)

I ran it on the Jetson but got the following error.

nvidia@Jetson:~$ sudo ./test_pthread_arm64 
./test_pthread_arm64: 5: Syntax error: "(" unexpected

Hi @KevinFFF ,

Sorry that I did not explain that before when I attached the file. Basically the code is this:

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

static void *_Thread(void *arg)
{
    (void)arg;
    printf("Thread running!\n");
    return NULL;
}

int main(void)
{
    int retVal;
    pthread_attr_t attr;
    struct sched_param schedParam;
    pthread_t thread;

    retVal = pthread_attr_init(&attr);
    if (retVal)
    {
        fprintf(stderr, "pthread_attr_init error %d\n", retVal);
        exit(1);
    }

    retVal = pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
    if (retVal)
    {
        fprintf(stderr, "pthread_attr_setinheritsched error %d\n", retVal);
        exit(1);
    }

    retVal = pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
    if (retVal)
    {
        fprintf(stderr, "pthread_attr_setschedpolicy error %d\n", retVal);
        exit(1);
    }

    schedParam.sched_priority = 1;
    retVal = pthread_attr_setschedparam(&attr, &schedParam);
    if (retVal)
    {
        fprintf(stderr, "pthread_attr_setschedparam error %d\n", retVal);
        exit(1);
    }

    retVal = pthread_create(&thread,
                            &attr,
                            _Thread,
                            NULL);
    if (retVal)
    {
        fprintf(stderr, "pthread_create error %d\n", retVal);
        exit(1);
    }

    retVal = pthread_join(thread, NULL);
    if (retVal)
    {
        fprintf(stderr, "pthread_join error %d\n", retVal);
        exit(1);
    }

    printf("main run successfully\n");
    return 0;
}

You can just copy paste it in any file editor and save it as a C source file. Then I just compiled it with:

gcc -c test_pthread_arm64.c

Then you can run it as you did:
sudo ./test_pthread_arm64
or just:
./test_pthread_arm64

Do you run this command on host or on Orin Nano?

I could not run this on Orin Nano devkit.

Hi @KevinFFF ,

Excuse me for the delay in my reply again. I struggled to compile the test_pthread_arm64 program on my Orin Nano. Actually, the right command to compile the program on your Orin Nano dev kit is:

gcc -o test_pthread_arm64 test_pthread_arm64.c -pthread

After -o, the first argument is the name of the executable file you want to output, and the second argument is the name of the C source file. You need to include -pthread so that the program is compiled and linked with pthread support. Otherwise, the pthread functions will not be resolved.

You do not need to compile it on the host machine; you can compile it directly on the Orin Nano dev kit.

Best regards,
pabimor

Do you still hit the pthread_create error after you compile it on the Jetson Orin Nano devkit?

Yes, same result as before, unless I reflash the dev kit without the SDK and runtime components and install only the Jetson Linux OS…

I could reproduce the same issue on the devkit.

It seems the issue comes from the scheduling attributes in attr.

Could you help verify whether the following modification works for your case?

- retVal = pthread_create(&thread, &attr, _Thread, NULL);
+ retVal = pthread_create(&thread, NULL, _Thread, NULL);

Hi @KevinFFF ,

Thanks for the hint. Removing attr from the pthread_create call solves the issue. The problem is that in my app, I need to set some attributes on my process… Do you have any idea why adding attributes to the process is generating that conflict with the Nvidia runtime and SDK components?

May I know what your use case is? What do you want to set in the attributes?

I’m not clear about the exact package affecting this issue.

But we don’t suggest running the above sysctl command (setting kernel.sched_rt_runtime_us to -1), since a task might then occupy all of your CPU resources.

Hi @KevinFFF ,

Sure, the point is that my customer does not want my application to be run on their PC by root user, so I have to create a different user, which will “own” the process from my application. The solution is to provide my application with some capabilities, which are given through this attribute. If I cannot set any attribute, I cannot set any capability to my process, and therefore I cannot run my app with a normal user. Hope the explanation is clear enough.

Would it be possible for you to find out which packages are affecting the pthread creation? I have absolutely no clue, but if I had to point out any of them, I would say maybe CUDA is involved.

Right, I agree with you. That’s why I would appreciate some feedback from NVIDIA about that topic, so we can find a better solution to avoid the pthread_create issue. For the moment, setting this parameter value to -1 is the only workaround to run my app on the Jetson Orin Nano…

Could you use a different user/group to work around the permission issue?
The issue seems to come from user space, so you should be able to find a solution for your use case.

There are too many packages included in the SDK components.
Maybe you could try to install or remove them one by one to investigate this issue further.

This is kind of “out there”, but if you want more information, perhaps you could put an assert statement on the return of the offending pthread_create, and then run with strace. The strace log tells you what the system calls are, and if the trace stops right at the point of a failed thread creation, then the log you will be interested in will have its output within the last few hundred lines or so. Possibly it won’t tell you anything, but it is a simple way to find out what the system calls think.

You would probably use “strace -v -s 128” (the “-s 128” increases the max string length to 128; you want to avoid strings that are cut too short since it makes reading the logs more difficult; in fact you might end up changing it to something like 1024 and only saving the last 100 lines of the log).

Note: strace has a “-u user” option so you can use sudo on strace while the end user running the program is someone else.
