Creating a custom hello world application and integration - DriveWorks CGF

Please provide the following info (tick the boxes after creating this topic):
Software Version
[o] DRIVE OS 6.0.6
DRIVE OS 6.0.5
DRIVE OS 6.0.4 (rev. 1)
DRIVE OS 6.0.4 SDK
other

Target Operating System
[o] Linux
QNX
other

Hardware Platform
DRIVE AGX Orin Developer Kit (940-63710-0010-300)
DRIVE AGX Orin Developer Kit (940-63710-0010-200)
DRIVE AGX Orin Developer Kit (940-63710-0010-100)
DRIVE AGX Orin Developer Kit (940-63710-0010-D00)
DRIVE AGX Orin Developer Kit (940-63710-0010-C00)
[o] DRIVE AGX Orin Developer Kit (not sure of its number)
other

SDK Manager Version
1.9.2.10884
[o] other

Host Machine Version
native Ubuntu Linux 20.04 Host installed with SDK Manager
[o] native Ubuntu Linux 20.04 Host installed with DRIVE OS Docker Containers
native Ubuntu Linux 18.04 Host installed with DRIVE OS Docker Containers
other

Hello,
I am trying to follow this documentation: https://developer.nvidia.com/docs/drive/drive-os/6.0.6/public/driveworks-nvcgf/index.html to set up a custom C++ node that logs a simple string using dw::core::Logger in its PROCESS pass. However, the documentation for integrating it with the STM and Schedule Manager seems to be outdated: unlike what the documentation describes, the system description has moved from *.system.json to *.app.json, and the generated default YAML files are no longer *.yaml but *__standardSchedule.yaml. I tried running a custom hello world application, but I am getting errors from the Schedule Manager/STM, with limited information in the docs to debug them. I am hoping to get some help with this.

I am uploading the log files and source code:
JackIONode.hpp (3.4 KB)
JackIONode.cpp (2.3 KB)
JackIO.stm (4.2 KB)
JackIO.graphlet.json (582 Bytes)
JackIO.app.json (2.5 KB)

JackIO__standardSchedule.yaml (1.3 KB)
JackIONodeImpl.hpp (2.4 KB)
JackIONodeImpl.cpp (2.6 KB)
jackio.node.json (752 Bytes)

LogFolder.zip (581.9 KB)

Thank you for reaching out and providing the relevant files and information. I’ll check with our team about the outdated documentation issue.

To assist you better, could you please provide more details about the specific errors you encountered? Any additional information or error messages you can share will be helpful in understanding the issue and providing appropriate guidance for debugging.

Please take a look at the DriveWorks 5.12 documentation, which is compatible with DRIVE OS 6.0.7 (although it's not a DevZone release). I'd like you to check whether the documentation issues you identified have been resolved in that version.

The stm_master.log shows this error:

[STM][ERROR] Failed to receive mqueue message; errno: 110 (Connection timed out)
[STM ERROR]:[av/stm/runtime/src/master/main.c][main] [689]: Could not receive message from CGF-ScheduleManager. This may be caused by CGF-ScheduleManager having crashed before its call to 'stmScheduleManagerInit()' - please check its health.
av/stm/runtime/src/master/main.c:690 assertion failure, errno=110 (Connection timed out)
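As a side note on the error above: errno 110 on Linux is ETIMEDOUT, so the STM master is timing out while waiting on its POSIX message queue (consistent with the Schedule Manager never completing its handshake), rather than receiving a malformed message. A quick, Linux-specific way to decode the errno:

```python
import errno
import os

# Decode the errno reported by stm_master. On Linux, 110 is ETIMEDOUT,
# matching the "Connection timed out" text in the log above.
code = 110
print(errno.errorcode[code], "->", os.strerror(code))
# ETIMEDOUT -> Connection timed out
```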

The schedule manager shows this and gets stuck waiting for clients to connect…

<15>1 2023-06-19T23:13:27.456838Z - schedule_manager 17256 - - [0us][VERBOSE][tid:1][SocketClientServer.cpp:235][NO_TAG] SocketServer(@port:40100): accepted 127.0.0.1:46973
<15>1 2023-06-19T23:13:27.456877Z - schedule_manager 17256 - - [0us][VERBOSE][tid:1][ChannelSocket.hpp:613][ChannelSocketProducer] Connection for port 40100 accepted; status=DW_SUCCESS(0)
<15>1 2023-06-19T23:13:27.456912Z - schedule_manager 17256 - - [0us][VERBOSE][tid:1][ChannelSocket.hpp:658][ChannelSocketProducer] Connection for port 40100 sent metadata
<13>1 2023-06-19T23:13:27.462672Z - schedule_manager 17256 - - [0us][DEBUG][tid:0][ScheduleManager.cpp:123][ScheduleManager] waiting for the clients to connect..
<13>1 2023-06-19T23:13:27.472761Z - schedule_manager 17256 - - [0us][DEBUG][tid:0][ScheduleManager.cpp:123][ScheduleManager] waiting for the clients to connect..
<13>1 2023-06-19T23:13:27.482950Z - schedule_manager 17256 - - [0us][DEBUG][tid:0][ScheduleManager.cpp:123][ScheduleManager] waiting for the clients to connect..

And looking at the node process (jack_io_master):

<14>1 2023-06-19T23:13:27.474577Z - jackio_master 17255 - - [1687216407474573us][INFO][tid:scheduleManagerReceiver][ScheduleManagerReceiver.cpp:169][Receiver] [Receiver] connected
<13>1 2023-06-19T23:13:27.474632Z - jackio_master 17255 - - [1687216407474632us][DEBUG][tid:scheduleManagerReceiver][ScheduleManagerReceiver.cpp:87][Receiver] [Sender] semaphore name: /cgf_schedulemanager_semaphore
<12>1 2023-06-19T23:13:44.158081Z - jackio_master 17255 - - [1687216424158074us][WARN][tid:38][HealthService.cpp:222][EPL_Interface] Connection with a SEH x86 Client timed out after 15000 ms. Please check if a SEH x86 Client is launched.
<13>1 2023-06-19T23:13:44.158146Z - jackio_master 17255 - - [1687216424158146us][DEBUG][tid:38][ChannelConnectorImpl.cpp:103][ChannelConnector] ChannelConnector: thread 140717380853760 stopping producer and consumer connect threads
<12>1 2023-06-19T23:13:44.180478Z - jackio_master 17255 - - [1687216424180475us][WARN][tid:parameterServiceLifeCycle][ParameterServerImpl.cpp:335][DynamicParameterServer] Connection with a Dynamic parameter Client timed out after 15000 ms. Please check if a Dynamic Parameter Client is launched.
<13>1 2023-06-19T23:13:44.180489Z - jackio_master 17255 - - [1687216424180489us][DEBUG][tid:parameterServiceLifeCycle][ChannelConnectorImpl.cpp:103][ChannelConnector] ChannelConnector: thread 140716407775232 stopping producer and consumer connect threads
<12>1 2023-06-19T23:13:44.341393Z - jackio_master 17255 - - [1687216424341386us][WARN][tid:38][TopExecutor.hpp:3775][TopExecutor] Failed to connect x86 SEH communication with error DW_TIME_OUT. Please ignore this error message if SEH x86 is not launched with RoadRunner.
<12>1 2023-06-19T23:13:44.477957Z - jackio_master 17255 - - [1687216424477949us][WARN][tid:parameterServiceLifeCycle][TopExecutor.hpp:3749][TopExecutor] Failed to start Dynamic Parameter Service with error DW_TIME_OUT. Please ignore this error message if a Dynamic Parameter Client is not launched with RoadRunner.
<11>1 2023-06-19T23:13:54.649296Z - jackio_master 17255 - - [1687216434649289us][ERROR][tid:rr2_main][TopExecutor.hpp:539][TopExecutor] Caught signal 15 sent by pid 17253
<13>1 2023-06-19T23:13:54.649373Z - jackio_master 17255 - - [1687216434649373us][DEBUG][tid:rr2_main][TopExecutor.hpp:563][TopExecutor] Caught user interruption signal
<13>1 2023-06-19T23:13:54.649390Z - jackio_master 17255 - - [1687216434649390us][DEBUG][tid:rr2_main][TopExecutor.hpp:2407][TopExecutor] TopExecutor: main thread informs stm server to exit schedule
<13>1 2023-06-19T23:13:54.724322Z - jackio_master 17255 - - [1687216434724312us][DEBUG][tid:scheduleManagerReceiver][ScheduleManagerReceiver.cpp:206][Receiver] [Receiver] done exiting!

Meanwhile, I will look into the 5.12 documentation. I am currently running the DRIVE OS 6.0.6 docker container.

Dear @stefan65,
As Vick mentioned, could you check the following: Compute Graph Framework SDK Reference: Custom Node and Integration.

Thanks for the response,

Yes, that is the tutorial I followed to integrate my custom node into an app.

Could you give some pointers on what the error logs for jack_io_master mean, or on the possible root causes for the schedule manager waiting for a client to connect, even though the first line of the jack_io_master logs shows that the schedule manager is connected to this client? Thanks.

Also, looking at the 5.12 documentation, the command to run the sample application is sudo /usr/local/driveworks/bin/run_cgf_demo.sh, but in DRIVE OS 6.0.6 this script does not exist. The script provided in DRIVE OS 6.0.6 for the demo pipeline does not work inside the docker container. nvidia-smi shows the GPU and CUDA, and the container has access to the display. I haven't made a custom Dockerfile; I'm directly using the container from NGC.

Dear @stefan65,
We have not tested the CGF demo in docker. Could you check the steps in the attached presentation as a reference, try running the sample on the target, and share complete logs in case of issues?
CGF-presentation.pdf (1.2 MB)


The PDF seems helpful. Can you provide the HelloWorld.app.json used in the tutorial, or the source code in general, so that I can run it? Most of it is covered in the PDF except the creation of HelloWorld.app.json. Thanks.

There is an open-source project, ZhenshengLee/nv_driveworks_demo: Nvidia Driveworks Demo with CGF, ROS2 and Docker. (github.com), that has some more example CGF apps.

Star it if it helps. Thanks.


Is CGF over dual DRIVE Orins supported by DRIVE SDK 6.0.6?
If so, could you provide a guide for running the hello world app on dual DRIVE Orins?

@VickNV Thanks.

Dear @stefan65,
The PDF has steps to integrate a custom node. You can follow the same steps (as the DW 5.10 documentation has some doc issues) and give it a try. Let us know if you notice any issue.

Dear @lizhensheng,
The DevZone DRIVE OS + DW release is for the DRIVE Orin devkit only.

Dual DRIVE Orins means two DRIVE Orin devkits chained with PCIe.
Can CGF run over two DRIVE Orin devkits with inter-machine communication using DRIVE SDK 6.0.6?

@SivaRamaKrishnaNV

@SivaRamaKrishnaNV @lizhensheng

I am curious about the usage of ROS 2 in the repository you sent. Do we really need ROS 2 as middleware to run a complete workflow for a robotic system? Do we have tools from NVIDIA that do what ROS 2 does?
For example, launching (what roslaunch does) and running a complete state machine or behavior tree of a robot.

@stefan65

It provides the possibility of interoperation between dwcgf and pub/sub middleware.

The answer from NVIDIA can be no, but it really depends on your project needs.

Simple answer: no.

Simple answer: launching: no, running a state machine: no.

P.S. There is a comparison table between CGF and ROS 2 in NVIDIA's official documentation.

Dear @lizhensheng,
No


@SivaRamaKrishnaNV @lizhensheng
Thanks, the examples in the open-source repository were helpful. They solved the issue.


This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.