How to establish NvSciC2cPcie communication using buffers allocated to GPU

kizaki · May 18, 2023, 9:03am

Hello NVIDIA experts,

I’m trying to establish NvSciC2cPcie communication and stream GPU-allocated buffers between a producer and a consumer on different SoCs.
In this case, reconciliation of the GPU-allocated buffers seems to fail.
Is this use case feasible in DRIVE OS 6.0.6?
If it is, could you tell me how to implement it?

Here is what I tried.

I inserted the Linux kernel module and hot-plugged PCIe, following Chip to Chip Communication.
I ran the sample application (multi-process CUDA/CUDA stream with one consumer on another SoC), following NvSciStream Performance Test Application.

On chip s0:
```
$ ./nvscistream_event_sample -P 0 nvscic2c_pcie_s0_c5_1 -Q 0 f
```
On chip s1:
```
$ ./nvscistream_event_sample -C 0 nvscic2c_pcie_s0_c6_1 -F 0 3
```
Then an abort occurs on chip s0.
The cause is that NvSciBufAttrListReconcile() at line 154 in drive-linux/samples/nvsci/nvscistream/event/block_pool.c fails with an NvSciCommonPanic.
I used the sample application as shipped, without modification.

VickNV · May 19, 2023, 2:32pm

Please check whether the topic "PCIe Hot-Plug not working" has anything to do with this. Thanks.

kizaki · May 22, 2023, 8:40am

Dear @VickNV ,

I confirmed that PCIe Hot-Plug works when streaming CPU-allocated buffers between different SoCs.
I checked the topic "PCIe Hot-Plug not working", but it doesn’t seem to be related.
What seems to make the difference is whether the buffer is allocated by the CPU or by the GPU.

I verified this as follows.
On chip s0:

./test_nvscistream_perf -P 0 nvscic2c_pcie_s0_c5_1 -l -b 12.5 -f 10

On chip s1:

./test_nvscistream_perf -C 0 nvscic2c_pcie_s0_c6_1 -l -b 12.5 -f

Thanks.

VickNV · May 22, 2023, 7:47pm

Please clarify which application you’re using: nvscistream_event_sample or test_nvscistream_perf? Please share the commands you executed along with their full output. Thanks.

kizaki · May 23, 2023, 10:54am

Dear @VickNV ,

There was an error in my post above. I am sorry for the confusion.

I would like to get nvscistream_event_sample working, as described in NvSciStream Sample Application.
The commands I executed, with their full output, are as follows.

On chip s0:

$ ./nvscistream_event_sample -P 0 nvscic2c_pcie_s0_c5_1 -Q 0 f
Aborted (core dumped)
$ gdb nvscistream_event_sample core.nvscistream_eve.2775 
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "aarch64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from nvscistream_event_sample...
(No debugging symbols found in nvscistream_event_sample)

warning: core file may not match specified executable file.
[New LWP 2775]
[New LWP 2776]
[New LWP 2777]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
Core was generated by `./nvscistream_event_sample -P 0 nvscic2c_pcie_s0_c5_1 -Q 0 f'.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0xffff94bcc900 (LWP 2775))]
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x0000ffff94e5caac in __GI_abort () at abort.c:79
#2  0x0000ffff96baa768 in NvSciCommonPanic () from /usr/lib/libnvscicommon.so.1
#3  0x0000ffff96c5e420 in ?? () from /usr/lib/libnvscibuf.so.1
#4  0x0000ffff96c41580 in ?? () from /usr/lib/libnvscibuf.so.1
#5  0x0000ffff96c42010 in ?? () from /usr/lib/libnvscibuf.so.1
#6  0x0000ffff96c4ad88 in ?? () from /usr/lib/libnvscibuf.so.1
#7  0x0000ffff96c4bb04 in NvSciBufAttrListReconcile () from /usr/lib/libnvscibuf.so.1
#8  0x000000000040825c in handlePool ()
#9  0x0000000000404438 in eventServiceLoop ()
#10 0x0000000000403e58 in main ()

On chip s1:

$ ./nvscistream_event_sample -C 0 nvscic2c_pcie_s0_c6_1 -F 0 3

Regarding the topic "PCIe Hot-Plug not working" that you pointed to, I confirmed that Hot-Plug is working because test_nvscistream_perf (described in NvSciStream Performance Test Application) runs successfully.

Thanks.

VickNV · May 23, 2023, 11:17pm

I’ll check this with our team. While we investigate, could you please confirm whether you are using the cable that resolved the issue in the "PCIe Hot-Plug not working" topic previously created by your colleague?

kizaki · May 24, 2023, 2:26am

Dear @VickNV ,
I checked with my colleague who previously created the "PCIe Hot-Plug not working" topic, and we are indeed using the cable that resolved it.

VickNV · May 24, 2023, 2:45pm

We suspect the abort is caused by both sides having the same SoC ID.

Have you followed that topic to specify different SoC IDs for the two devkits? If not, I would suggest you work with @shibata-a to set up the environment first. Thanks.

kizaki · May 31, 2023, 2:08am

Dear @VickNV ,
We tried specifying different SoC IDs for the two devkits again.
However, we got the same result as above.

VickNV · May 31, 2023, 2:29am

To obtain the SoC IDs of both devkits, please execute the following command on each of them and share the results. Thanks.

root@tegra-ubuntu:/home/nvidia# xxd -b /proc/device-tree/soc_id
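
When comparing the two devkits, a single hex string can be easier to diff than the bit dump above. The following is a minimal sketch, assuming only that the property file exists at the path used in the command above. The helper name `soc_id_hex` is hypothetical, and since the byte layout of the property is not described in this thread, the bytes are printed raw rather than interpreted.

```shell
#!/bin/sh
# soc_id_hex FILE
# Hypothetical helper (not part of DRIVE OS): prints every byte of FILE
# as lowercase hex on a single line.
#   -An   drop the offset column
#   -v    do not fold repeated lines into "*"
#   -tx1  print one byte per field
soc_id_hex() {
    od -An -v -tx1 "$1" | tr -d ' \t\n'
    echo
}

# Run on each devkit and compare the two printed strings.
# The path is the one used in the xxd command above; the check keeps
# the script harmless on machines where the file does not exist.
if [ -r /proc/device-tree/soc_id ]; then
    soc_id_hex /proc/device-tree/soc_id
fi
```

If the two strings are identical, both devkits report the same SoC ID, which is the condition suspected earlier in this thread.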

VickNV · June 8, 2023, 3:59pm

Hi @kizaki, any update on this? Thanks.

kayccc · June 27, 2023, 1:12am

There has been no update from you for a while, so we assume this is no longer an issue.
Hence we are closing this topic. If you need further support, please open a new one.
Thanks
