[BUG] cgf helloworld sample nvsci+interprocess communication hangs with exception error

@SivaRamaKrishnaNV Thanks for your reply!

Yes, it’s ok for you to change the case from A to E.

Which DRIVE OS version is your DRIVE Orin platform running? In the previous topic I saw that you used DO6090, which ships dw516, rather than the developer version DO6081+dw514: [BUG] sample_cgf_dwchannel from dw5.14 failed with stuck in inter-process-socket communication (nvidia.com)

I will double-check Case B and Case C, and check the log you posted.

Glad to see the issue reproduced. When will it be resolved, and in which release?

Thanks.

DRIVE 6.0.8.1

When will the Case E issue be resolved? In which release?

Thanks.

There is no documentation on how to configure the nvsci communication.

That’s why I am asking in the forum.

Could you get the sample running based on the code base I shared with you?
All I need is a correct, runnable example.

Thanks.

Please take responsibility for providing a fixed cgf 1:N nvsci communication helloworld example.
@SivaRamaKrishnaNV @VickNV

https://forums.developer.nvidia.com/t/re-bug-cgf-helloworld-sample-nvsci-interprocess-communication-hangs-with-exception-error/296231?u=lizhensheng

That’s the private message @SivaRamaKrishnaNV sent me.
@VickNV Do you think that is an appropriate response?

Hi,
Today, I received the feedback below on the case E configuration from the core team.
Just two comments from a quick look:
The error message “ChannelNvSciStreamParams: limiter maxPackets index out of range” might be caused by the connection parameter “limits”: -1. I don’t know why that was set in the first place, maybe try it without that key or set to a different (non-negative) value.
The values for “srcEndpoint” and “destEndpoint” look weird. Each should only contain the name of one endpoint - not two separated by a colon.
Can you check the suggestion and update?
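
The two comments above can be turned into a quick sanity check before launching the sample. This is only a minimal sketch: `check_connections` is a hypothetical helper, not part of DriveWorks, and the "bad" graphlet fragment is made up to illustrate the two reported problems (it is not the original case E config). It also assumes that `limits` may appear either in the connection-level `params` or in a per-destination entry.

```python
import json

def check_connections(graphlet: dict) -> list:
    """Return human-readable problems found in a graphlet's connections.

    Flags the two things mentioned in the feedback:
      * a negative "limits" value
      * a "srcEndpoint"/"destEndpoint" holding more than one name (contains ':')
    """
    problems = []
    for conn in graphlet.get("connections", []):
        src = conn.get("src", "<unknown>")
        # Assumption: the keys may sit in the connection-level "params"
        # or in a per-destination entry, so both places are inspected.
        scopes = [(f"{src} params", conn.get("params", {}))]
        scopes += [(f"{src} -> {dest}", opts)
                   for dest, opts in conn.get("dests", {}).items()]
        for label, opts in scopes:
            limits = opts.get("limits")
            if isinstance(limits, (int, float)) and limits < 0:
                problems.append(f"{label}: negative limits ({limits})")
            for key in ("srcEndpoint", "destEndpoint"):
                value = opts.get(key)
                if isinstance(value, str) and ":" in value:
                    problems.append(f"{label}: {key} names several endpoints ({value})")
    return problems

# Hypothetical fragment reproducing both reported problems (illustration only).
bad_graphlet = json.loads("""
{
    "connections": [
        {
            "src": "helloWorldNode.VALUE_1",
            "dests": {
                "sumNode.VALUE_1": {
                    "srcEndpoint": "nvscistream_4:nvscistream_6",
                    "destEndpoint": "nvscistream_5"
                }
            },
            "params": { "type": "nvsci", "limits": -1 }
        }
    ]
}
""")

problems = check_connections(bad_graphlet)
for problem in problems:
    print(problem)
```

To check a real file, load it with `json.load(open(path))` and pass the result to `check_connections`; an empty list means neither of the two patterns was found, not that the config is otherwise valid.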

Hi @lizhensheng ,
Did you try the above suggestion from the engineering team? Please share any update.
It looks to me like case E is not supported. I need to give it a try before coming to a conclusion. I plan to try this week and update the thread.

There is no documentation on how to configure the nvsci communication.
If you want me to try, please show me a sample that tells me how to edit the json.
The json schema gives no information (just a simple string object), so I really don’t know how I would go about editing the json.

It looks to me like case E is not supported.

Good point. The engineering team needs to respond to confirm that.

I need to give it a try before coming to a conclusion. I plan to try this week and update the thread.

Appreciated, thanks.

Hi @lizhensheng ,
I tried the config changes below and they seem to work. Could you confirm?

nvidia@tegra-ubuntu:~/samples3$ sudo ./bin/cgf_custom_nodes/example/runHelloworld.sh
[sudo] password for nvidia:
20240708_183413
20240708_183413
DW_TOP_PATH=/home/nvidia/samples3
RR_TOP_PATH=/usr/local/driveworks/bin
CGF_SYS_PATH=
RR_LOG_PATH=/home/nvidia/samples3/LogFolder/cgf_custom_nodes/Helloworld
RR_RUN_CFG_PATH=/home/nvidia/samples3/RunFolder/cgf_custom_nodes/Helloworld
DATA_PATH=/home/nvidia/samples3/data/cgf_custom_nodes
Current DISPLAY is :0.0
fs.mqueue.msg_max = 4096
|--> Tuning network stack
LD_LIBRARY_PATH: /home/nvidia/samples3/../../xlab/sysroot/lib:/usr/local/driveworks/bin/lib::/home/nvidia/samples3/lib
total 2920
drwxr-xr-x. 3 root root    4096 Aug 25  2023 .
drwxr-xr-x. 7 root root    4096 Aug 25  2023 ..
-r-xr-xr-x. 1 root root   53320 Jan  1  2000 CGFSyncControlClientLite
-r-xr-xr-x. 1 root root   42600 Jan  1  2000 CGFSyncServer
-r-xr-xr-x. 1 root root 1687256 Jan  1  2000 LoaderLite
drwxr-xr-x. 2 root root    4096 Aug 25  2023 SSM
-r-xr-xr-x. 1 root root   14728 Jan  1  2000 ScheduleManager
lrwxrwxrwx. 1 root root      19 Jan  1  2000 get_ssm_state -> ./SSM/get_ssm_state
-r-xr-xr-x. 1 root root   39872 Jan  1  2000 launcher
-r-xr-xr-x. 1 root root   27904 Jan  1  2000 libFaultHandler.so
-r-xr-xr-x. 1 root root   31096 Jan  1  2000 libRRSEHDetails.so
-r-xr-xr-x. 1 root root    8400 Jan  1  2000 libconnection_helper.so
-r-xr-xr-x. 1 root root  507384 Jan  1  2000 libschedule_manager.so
-r-xr-xr-x. 1 root root  131496 Jan  1  2000 libsensor_sync_server_shared.so
-r-xr-xr-x. 1 root root   53568 Jan  1  2000 sensor_sync_server
-r-xr-xr-x. 1 root root   14336 Jan  1  2000 stm_managed_client
-r-xr-xr-x. 1 root root  336552 Jan  1  2000 stm_master
lrwxrwxrwx. 1 root root      16 Jan  1  2000 vanillassm -> ./SSM/vanillassm
Running command: /usr/local/driveworks/bin/launcher --binPath=/usr/local/driveworks/bin --spec=/home/nvidia/samples3/graphs/cgf_custom_nodes/apps/example/appHelloworld/DWCGFHelloworld.app.json --logPath=/home/nvidia/samples3/LogFolder/cgf_custom_nodes/Helloworld --path=/usr/local/driveworks/bin --datapath=/home/nvidia/samples3/data/cgf_custom_nodes --dwdatapath=/home/nvidia/samples3/data --schedule=/home/nvidia/samples3/bin/cgf_custom_nodes/DWCGFHelloworld__standardSchedule.stm --start_timestamp=0 --mapPath=maps/sample/sanjose_loop --loglevel=DW_LOG_VERBOSE --fullscreen=1 --winSizeW=1280 --winSizeH=800 --virtual=0 --disableStmControlLogger=1 --gdb_debug=0 --app_parameter= --useLCM=0 --memTraceEnabled=1 --stmControlTracing=0 --traceChannelMask=0xFFFFFFFF --traceFilePath=/home/nvidia/samples3/LogFolder/cgf_custom_nodes/Helloworld --traceLevel=0 > /home/nvidia/samples3/LogFolder/cgf_custom_nodes/Helloworld/launcher.log 2>&1
Check if reset NetworkStack needed
Restore LD_LIBRARY_PATH to
=======================================================================
launcher exit status: 0

Graphlet.Json

root@6.0.8.1-0006-build-linux-sdk:/home/nvidia/GAC/nv_driveworks/driveworks-5.14/samples/src/cgf_nodes/graphs/apps/example/appHelloworld# cat DWCGFHelloworld.graphlet.json
{
    "name": "DWCGFHelloworld",
    "inputPorts": {},
    "outputPorts": {},
    "parameters": {
        "paraName": { "type": "std::string", "default": "helloworld_name" }
    },
    "subcomponents": {
        "helloWorldNode": {
            "componentType": "../../../nodes/example/helloworld/HelloWorldNode.node.json",
            "parameters": {
                "name": "$paraName"
            }
        },
        "multipleNode": {
            "componentType": "../../../nodes/example/helloworld/MultipleNode.node.json"
        },
        "sumNode": {
            "componentType": "../../../nodes/example/helloworld/SumNode.node.json"
        }
    },
    "connections": [
        {
            "src": "helloWorldNode.VALUE_0",
            "dests": {
                "sumNode.VALUE_0": {
                    "mailbox": true,
                    "reuse": true
                },
                "multipleNode.VALUE_0": {
                    "mailbox": true,
                    "reuse": true
                }
            }
        },
        {
            "src": "helloWorldNode.VALUE_1",
            "dests": {
                "sumNode.VALUE_1": {
                    "mailbox": true,
                    "srcEndpoint": "nvscistream_4",
                    "destEndpoint": "nvscistream_5",
                    "reuse": true
                },
                "multipleNode.VALUE_1": {
                    "mailbox": true,
                    "srcEndpoint": "nvscistream_6",
                    "destEndpoint": "nvscistream_7",
                    "reuse": true
                }
            },
            "params": {
                "type": "nvsci"
            }
        }
    ]
}
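
To make the 1:N layout in the graphlet above easier to read, here is a small sketch that lists which endpoint pair each destination of an nvsci connection uses. `nvsci_endpoint_pairs` is a hypothetical helper, and the embedded JSON is just the second connection copied from the graphlet above. In this config each destination has its own `srcEndpoint`/`destEndpoint` pair with a single name each; whether that is a general requirement is not confirmed here.

```python
import json

# The nvsci connection copied from DWCGFHelloworld.graphlet.json above.
GRAPHLET_FRAGMENT = """
{
    "connections": [
        {
            "src": "helloWorldNode.VALUE_1",
            "dests": {
                "sumNode.VALUE_1": {
                    "mailbox": true,
                    "srcEndpoint": "nvscistream_4",
                    "destEndpoint": "nvscistream_5",
                    "reuse": true
                },
                "multipleNode.VALUE_1": {
                    "mailbox": true,
                    "srcEndpoint": "nvscistream_6",
                    "destEndpoint": "nvscistream_7",
                    "reuse": true
                }
            },
            "params": { "type": "nvsci" }
        }
    ]
}
"""

def nvsci_endpoint_pairs(graphlet: dict) -> dict:
    """Map (src port, dest port) -> (srcEndpoint, destEndpoint) for nvsci connections."""
    pairs = {}
    for conn in graphlet.get("connections", []):
        # Only connections explicitly marked as nvsci are of interest here.
        if conn.get("params", {}).get("type") != "nvsci":
            continue
        for dest, opts in conn.get("dests", {}).items():
            pairs[(conn["src"], dest)] = (opts.get("srcEndpoint"),
                                          opts.get("destEndpoint"))
    return pairs

pairs = nvsci_endpoint_pairs(json.loads(GRAPHLET_FRAGMENT))
for (src, dest), (src_ep, dest_ep) in pairs.items():
    print(f"{src} -> {dest}: {src_ep} -> {dest_ep}")
```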

Dear @lizhensheng,
Did you get a chance to check? Please share your observations after testing.

I will give it a test in a week. Thanks.
