[STM ERROR] Thread pools exited with a timeout error

Software Version
DRIVE OS 6.0.6

Target Operating System
[*] Linux

SDK Manager Version
1.9.2.10884

Host Machine Version
native Ubuntu Linux 20.04 Host installed with DRIVE OS DOCKER Containers

Description:
I am testing the cpu_simple demo in stm/src, running 16 runnables instead of 2. As a result, a timeout error occurs.

Code:

int main(int argc, const char** argv)
{
    (void)argc;
    (void)argv;

    stmClientInit("client"); // Needs to be called before registration


    stmRegisterCpuRunnable(test1, "test1", NULL);
    stmRegisterCpuRunnable(test2, "test2", NULL);
    stmRegisterCpuRunnable(test3, "test3", NULL);
    stmRegisterCpuRunnable(test4, "test4", NULL);
    stmRegisterCpuRunnable(test5, "test5", NULL);
    stmRegisterCpuRunnable(test6, "test6", NULL);
    stmRegisterCpuRunnable(test7, "test7", NULL);
    stmRegisterCpuRunnable(test8, "test8", NULL);
    stmRegisterCpuRunnable(test9, "test9", NULL);
    stmRegisterCpuRunnable(test10, "test10", NULL);
    stmRegisterCpuRunnable(test11, "test11", NULL);
    stmRegisterCpuRunnable(test12, "test12", NULL);
    stmRegisterCpuRunnable(test13, "test13", NULL);
    stmRegisterCpuRunnable(test14, "test14", NULL);
    stmRegisterCpuRunnable(test15, "test15", NULL);
    stmRegisterCpuRunnable(test16, "test16", NULL);

Error:

[STM ERROR]:[av/stm/runtime/src/core/synchronization.c][stmCondSemTimedWaitForValue] [122]: CondSem Wait Timed Out. err=4
[STM ERROR]:[av/stm/runtime/src/client/sync.c][stmFenceWait] [69]: Could not wait on stmOsFence f2 in iteration: 1 stmError=4
[STM ERROR]:[av/stm/runtime/src/client/commands/wof.c][opExecuteWof] [28]: [Thread pool1] [Time 1675252631493921152] Fence timeout: f2. Iteration: 1
[STM] Thread pool1 exiting, thread exit count 1, time : 1675252631493929056
[STM ERROR]:[av/stm/runtime/src/core/synchronization.c][stmCondSemTimedWaitForValue] [122]: CondSem Wait Timed Out. err=4
[STM ERROR]:[av/stm/runtime/src/client/sync.c][stmFenceWait] [69]: Could not wait on stmOsFence f2 in iteration: 1 stmError=4
[STM ERROR]:[av/stm/runtime/src/client/commands/wof.c][opExecuteWof] [28]: [Thread pool0] [Time 1675252631493978304] Fence timeout: f2. Iteration: 1
[STM] Thread pool0 exiting, thread exit count 2, time : 1675252631493985088
[STM][ERROR] pthread_join() failed; errno: 2 (No such file or directory)
[STM][ERROR] pthread_join() failed; errno: 2 (No such file or directory)

Question:
1. How can I increase the number of thread pools?
2. What is the maximum number of thread pools that a client can support?
3. What is the best way to resolve this error?

Dear @lizhaoyi,
I am checking internally on this.
Meanwhile, could you check at how many instances you start noticing the timeout issue?

[STM] Initializing sync objects for schedule cpu_test.stm
[STM] STM clients starting scheduling loop.
[STM] Waiting for specified number of epochs to complete or any STM client to terminate.
[STM] Starting thread pool0
[STM] Starting thread pool1
[STM] PID 4819, new state STM_STATE_READY, time : 1674983745216529472
[STM] PID 4819, Schedule Id 10, new state STM_STATE_RUNNING, time : 1674983745217557216
Inside test 3: 4824
Inside test 5: 4824
Inside test 7: 4824
Inside test 9: 4824
Inside test 1: 4822
Inside test 4: 4822
Inside test 6: 4822
Inside test 8: 4822
Inside test 11: 4822
Inside test 9: 4822
Inside test 10: 4824
Inside test 9: 4824
Inside test 2: 4822
[STM ERROR]:[av/stm/runtime/src/core/synchronization.c][stmCondSemTimedWaitForValue] [122]: CondSem Wait Timed Out. err=4
[STM ERROR]:[av/stm/runtime/src/client/sync.c][stmFenceWait] [69]: Could not wait on stmOsFence f2 in iteration: 1 stmError=4
[STM ERROR]:[av/stm/runtime/src/client/commands/wof.c][opExecuteWof] [28]: [Thread pool1] [Time 1674983750217940928] Fence timeout: f2. Iteration: 1
[STM] Thread pool1 exiting, thread exit count 1, time : 1674983750217947840
[STM ERROR]:[av/stm/runtime/src/core/synchronization.c][stmCondSemTimedWaitForValue] [122]: CondSem Wait Timed Out. err=4
[STM ERROR]:[av/stm/runtime/src/client/sync.c][stmFenceWait] [69]: Could not wait on stmOsFence f2 in iteration: 1 stmError=4
[STM ERROR]:[av/stm/runtime/src/client/commands/wof.c][opExecuteWof] [28]: [Thread pool0] [Time 1674983750217990784] Fence timeout: f2. Iteration: 1
[STM] Thread pool0 exiting, thread exit count 2, time : 1674983750217996992
[STM][ERROR] pthread_join() failed; errno: 2 (No such file or directory)
[STM][ERROR] pthread_join() failed; errno: 2 (No such file or directory)
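
A quick way to read the log above is to measure how long the schedule ran before the first fence timeout. The sketch below assumes the `time` values STM prints are nanosecond timestamps (inferred from their magnitude, not confirmed in this thread); `elapsed_ms` is a hypothetical helper, not part of the STM API.

```c
#include <stdint.h>

/*
 * Elapsed time between two log timestamps, in whole milliseconds.
 * Assumption: both timestamps are in nanoseconds.
 */
int64_t elapsed_ms(int64_t start_ns, int64_t end_ns)
{
    return (end_ns - start_ns) / 1000000LL;
}
```

Calling `elapsed_ms(1674983745217557216LL, 1674983750217940928LL)` with the STM_STATE_RUNNING timestamp and the Thread pool1 fence-timeout timestamp from this log gives 5000, so the fence wait gave up about 5 seconds after the schedule started running.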

I checked. There are 11 instances before the timeout. @SivaRamaKrishnaNV

yaml

Runnables:
  - test1:
      WCET: 10ms
      StartTime: 0
      Dependencies:
      Resources:
        - CPU
  - test2:
      WCET: 10ms
      StartTime: 0
      Dependencies:
        - client.test1
      Resources:
        - CPU
  - test3:
      WCET: 10ms
      StartTime: 0
      Dependencies:
      Resources:
        - CPU
  - test4:
      WCET: 10ms
      StartTime: 0
      Dependencies:
      Resources:
        - CPU
  - test5:
      WCET: 10ms
      StartTime: 0
      Dependencies:
      Resources:
        - CPU
  - test6:
      WCET: 10ms
      StartTime: 0
      Dependencies:
      Resources:
        - CPU
  - test7:
      WCET: 10ms
      StartTime: 0
      Dependencies:
      Resources:
        - CPU
  - test8:
      WCET: 10ms
      StartTime: 0
      Dependencies:
      Resources:
        - CPU
  - test9:
      WCET: 10ms
      StartTime: 0
      Dependencies:
      Resources:
        - CPU
  - test10:
      WCET: 10ms
      StartTime: 0
      Dependencies:
      Resources:
        - CPU
  - test11:
      WCET: 10ms
      StartTime: 0
      Dependencies:
      Resources:
        - CPU
  - test12:
      WCET: 10ms
      StartTime: 0
      Dependencies:
      Resources:
        - CPU
  - test13:
      WCET: 10ms
      StartTime: 0
      Dependencies:
      Resources:
        - CPU
  - test14:
      WCET: 10ms
      StartTime: 0
      Dependencies:
      Resources:
        - CPU
  - test15:
      WCET: 10ms
      StartTime: 0
      Dependencies:
      Resources:
        - CPU
  - test16:
      WCET: 10ms
      StartTime: 0
      Dependencies:
      Resources:
        - CPU

Dear @lizhaoyi,
Thanks for checking the number of instances. Can you attach the input graph you used (.yaml file) or the schedule file (.stm file), along with the logs (stm_master.log and framesync_*.txt files)?

Dear @SivaRamaKrishnaNV
I can only attach the .stm file. No logs were generated.
cpu_test.stm (9.2 KB)

Dear @SivaRamaKrishnaNV ,
I’ve fixed this issue. It was caused by hyperepoch0.Period being set too large.
Could you please answer my three questions below:

1. How can I increase the number of thread pools?
2. What is the maximum number of thread pools that a client can support?
3. What is the maximum valid value of the period for a hyperepoch?

Thanks!

Dear @lizhaoyi,

1. How can I increase the number of thread pools?

STM creates a thread pool for every CPU core defined in the yaml file. So, you can define more CPU cores in the global resources to increase the number of thread pools. For example, in the yaml file you can define:
cpu_simple:
    Identifier: 10
    Resources:
        CPU:
            - CPU0
            - CPU1
            - CPU2
            - CPU3

2. What is the maximum number of thread pools that a client can support?

The number of thread pools is equal to the number of CPU cores. So if a user defines more CPU cores, STM will create more thread pools. STM doesn’t set a limit on the CPU cores that can be defined in the yaml.
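
To get a feel for what adding CPU cores buys, here is a rough back-of-the-envelope sketch. It assumes all runnables have the same WCET and are independent, and it ignores dependencies and scheduler overhead, so it is only a lower bound and not how the STM compiler actually computes a schedule. `min_busy_time_ms` is a hypothetical helper, not an STM function.

```c
#include <stdint.h>

/*
 * Lower bound on the time needed to run `runnables` equal-length jobs on
 * `cores` thread pools: ceil(runnables / cores) * wcet_ms.
 * Returns -1 for invalid input.
 */
int64_t min_busy_time_ms(int runnables, int cores, int64_t wcet_ms)
{
    if (runnables < 0 || cores <= 0 || wcet_ms < 0) {
        return -1;
    }
    /* Integer ceiling division: number of "rounds" each core must run. */
    int64_t rounds = ((int64_t)runnables + cores - 1) / cores;
    return rounds * wcet_ms;
}
```

For the 16 runnables with WCET 10ms in this thread, the bound is 80 ms with two thread pools and 40 ms with the four CPUs in the example above.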

3. What is the maximum valid value of the period for a hyperepoch?

The default fence timeout value for STM is 5s. So, the hyperepoch period should be less than 5s.
If the user wants to set the period to something greater than 5s, the fence timeout value should be changed accordingly by using stm_master’s -t flag.
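
The rule above can be turned into a small sanity check to run before launching a schedule. This is a minimal sketch: the helper names are hypothetical, and the accepted unit suffixes (ns, us, ms, s) are an assumption for illustration, not a description of what the STM YAML parser accepts.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Default STM fence timeout quoted in this thread: 5 s, in nanoseconds. */
#define DEFAULT_FENCE_TIMEOUT_NS 5000000000LL

/*
 * Convert a duration string such as "10ms" or "5s" to nanoseconds.
 * Returns -1 if the value is malformed or has an unknown unit.
 */
int64_t duration_to_ns(const char *text)
{
    char *end = NULL;
    long long value = strtoll(text, &end, 10);
    if (end == text || value < 0) {
        return -1; /* no digits, or a negative duration */
    }
    if (strcmp(end, "ns") == 0) return value;
    if (strcmp(end, "us") == 0) return value * 1000LL;
    if (strcmp(end, "ms") == 0) return value * 1000000LL;
    if (strcmp(end, "s") == 0)  return value * 1000000000LL;
    return -1;
}

/* True if the period parses and is strictly shorter than the fence timeout. */
bool period_fits_fence_timeout(const char *period, int64_t timeout_ns)
{
    int64_t period_ns = duration_to_ns(period);
    return period_ns >= 0 && period_ns < timeout_ns;
}
```

With the default timeout, `period_fits_fence_timeout("100ms", DEFAULT_FENCE_TIMEOUT_NS)` is true, while a period of "5s" or "10s" fails the check and would need a larger timeout passed to stm_master.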


Dear @SivaRamaKrishnaNV ,

I see. Thank you very much for your answer.
