Implementation of Precise Time Synchronization between two Xaviers over WLAN

My need is to split a video stream, process the parts on a cluster, and then merge the results back together. That is the use case I am looking for a design for. Precise timestamps are what I intend to use for cutting and merging in that context, so that no incoherence appears after the parts are reassembled.

Use case:

[i]"The problem is
The processors can not handle 4k 25fps aruco
They can handle probably 5 FPS
So we need to have each processor handle 5 frames

We don’t
Want to port it to a Gpu
That is the point
We want to process it on CPU’s"[/i]

Update:
The idea is, e.g., to split a 4K 25fps stream into 5 streams of 5fps each; they will go to separate CPUs, where each will be processed as a 5fps stream. Then the results will be assembled somehow.
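To make the split/merge idea concrete, here is a minimal sketch in Python (the names `split_round_robin` and `merge` are my own, not from any NVIDIA API): frames of a 25fps stream are tagged with a capture index and timestamp, dealt round-robin to 5 worker sub-streams of 5fps each, and the processed results are reassembled in capture order.

```python
from dataclasses import dataclass

N_WORKERS = 5  # 25 fps split into 5 sub-streams of 5 fps each


@dataclass
class Frame:
    index: int        # capture sequence number
    timestamp: float  # capture time in seconds (e.g. from a PTP-synchronized clock)
    data: object      # pixel payload (placeholder here)


def split_round_robin(frames, n=N_WORKERS):
    """Deal frames to n sub-streams; worker k gets frames k, k+n, k+2n, ..."""
    streams = [[] for _ in range(n)]
    for f in frames:
        streams[f.index % n].append(f)
    return streams


def merge(results):
    """Reassemble per-worker results into one stream, ordered by capture index."""
    merged = [f for stream in results for f in stream]
    merged.sort(key=lambda f: f.index)  # sorting by timestamp would work equally well
    return merged


if __name__ == "__main__":
    src = [Frame(i, i / 25.0, None) for i in range(25)]  # one second of 25 fps
    streams = split_round_robin(src)
    print([len(s) for s in streams])                     # each worker sees 5 frames
    out = merge(streams)
    print([f.index for f in out] == list(range(25)))     # original order restored
```

The per-frame index/timestamp is exactly the "precise timestamp" from the use case above: it is what lets the merge step restore the original order regardless of which worker finishes first.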
Update:
Found some sample

/nvsample_cudaprocess$ make
/usr/local/cuda/bin/nvcc -ccbin g++   --shared  -Xcompiler -fPIC  -Xlinker --dynamic-linker=/lib/ld-linux-aarch64.so.1  -gencode arch=compute_30,code=sm_30 -gencode arch=compute_32,code=sm_32 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_53,code=sm_53 -gencode arch=compute_62,code=sm_62 -gencode arch=compute_72,code=sm_72 -gencode arch=compute_50,code=compute_50 -gencode arch=compute_72,code=compute_72 -o libnvsample_cudaprocess.so nvsample_cudaprocess.o -L/usr/lib/aarch64-linux-gnu -lEGL -lGLESv2 -L/usr/lib/aarch64-linux-gnu/tegra -lcuda -lrt
/usr/bin/ld: nvsample_cudaprocess.o: Relocations in generic ELF (EM: 62)
/usr/bin/ld: nvsample_cudaprocess.o: Relocations in generic ELF (EM: 62)
/usr/bin/ld: nvsample_cudaprocess.o: Relocations in generic ELF (EM: 62)
/usr/bin/ld: nvsample_cudaprocess.o: Relocations in generic ELF (EM: 62)
/usr/bin/ld: nvsample_cudaprocess.o: Relocations in generic ELF (EM: 62)
nvsample_cudaprocess.o: error adding symbols: File in wrong format
collect2: error: ld returned 1 exit status
Makefile:138: recipe for target 'libnvsample_cudaprocess.so' failed
make: *** [libnvsample_cudaprocess.so] Error 1
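The `Relocations in generic ELF (EM: 62)` message is the aarch64 linker refusing an x86-64 object: ELF machine type 62 is `EM_X86_64`, while aarch64 is 183, so `nvsample_cudaprocess.o` was apparently compiled for the host rather than for the Xavier. A quick way to verify this (equivalent to running `file` on the object) is to read the `e_machine` field of the ELF header, sketched here in Python:

```python
import struct

# ELF e_machine values relevant here (from the ELF specification)
EM_X86_64 = 62
EM_AARCH64 = 183


def elf_machine(header: bytes) -> int:
    """Return the e_machine field of an ELF header.

    e_machine is a 16-bit little-endian field at byte offset 18
    (for the little-endian ELF files used on both x86-64 and aarch64).
    """
    assert header[:4] == b"\x7fELF", "not an ELF file"
    (machine,) = struct.unpack_from("<H", header, 18)
    return machine


def check(path: str) -> None:
    with open(path, "rb") as f:
        m = elf_machine(f.read(20))
    if m == EM_X86_64:
        print(f"{path}: x86-64 object -- compiled for the host, not the Xavier")
    elif m == EM_AARCH64:
        print(f"{path}: aarch64 object -- fine for linking on the Xavier")
    else:
        print(f"{path}: e_machine = {m}")
```

If the object turns out to be x86-64, the fix is to rebuild it natively on the Xavier (or with a proper aarch64 cross toolchain) rather than re-running the link step.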

References:
https://github.com/fehlfarbe/python-aruco/issues/33

Hi Andrei,
[s]Not sure this is the root cause of your failure, but for Xavier you would try to set CUDA arch to 70.

-gencode arch=compute_70,code=sm_70[/s]

Are you facing this error when compiling natively or when cross-compiling?

Thank you for pointing that out!

Sorry for the confusion… Xavier is indeed CUDA arch 72.
Probably not related, but I'd advise removing the old archs (< 50) anyway.

It was an attempt to look into the CUDA feature that provides a hook into GStreamer.
An example from the public sources was taken and tried; executing `make` worked on the host PC.

public_sources$ ls
dtc-1.4.0.tar.bz2                    kernel_src.tbz2
dtc-1.4.0.tar.bz2.sha1sum            kernel_src.tbz2.sha1sum
FreeRTOSV8.1.2_src.tbz2              libgstnvvideosinks_src.tbz2
FreeRTOSV8.1.2_src.tbz2.sha1sum      libgstnvvideosinks_src.tbz2.sha1sum
gstegl_src.tbz2                      nvgstapps_src.tbz2
gstegl_src.tbz2.sha1sum              nvgstapps_src.tbz2.sha1sum
gstjpeg_src.tbz2                     nvsample_cudaprocess
gstjpeg_src.tbz2.sha1sum             nvsample_cudaprocess_src.tbz2
gst-nvvideo4linux2_src.tbz2          nvsample_cudaprocess_src.tbz2.sha1sum
gst-nvvideo4linux2_src.tbz2.sha1sum  public_sources_sha.txt
gstomx1_src.tbz2                     u-boot_src.tbz2
gstomx1_src.tbz2.sha1sum             u-boot_src.tbz2.sha1sum
hardware                             v4l2_libs_src.tbz2
kernel                               v4l2_libs_src.tbz2.sha1sum

The idea was to execute the pipeline below on the Xavier:

gst-launch-1.0 nvcamerasrc fpsRange="30 30" ! 'video/x-raw(memory:NVMM), width=(int)3840, height=(int)2160,
format=(string)I420, framerate=(fraction)30/1' ! nvtee ! nvivafilter
cuda-process=true customer-lib-name="libnvsample_cudaprocess.so" !
'video/x-raw(memory:NVMM), format=(string)NV12' ! nvoverlaysink -e

The cause is that I had taken an obsolete source: https://developer.download.nvidia.com/embedded/L4T/r24_Release_v2.1/Docs/Accelerated_GStreamer_User_Guide_Release_24.2.1.pdf

Thank you for pointing that out!

Hm, probably that is not the cause after all, as the public sources release itself was not obsolete.
The obsolete documentation should not matter either, since the newest release of the documentation contains the same pipeline as the previous one:

gst-launch-1.0 nvarguscamerasrc ! \
'video/x-raw(memory:NVMM), width=(int)3840, height=(int)2160, \
format=(string)NV12, framerate=(fraction)30/1' ! \
nvivafilter cuda-process=true \
customer-lib-name="libnvsample_cudaprocess.so" ! \
'video/x-raw(memory:NVMM), format=(string)NV12' ! nvoverlaysink -e

Last time I tried nvivafilter on Xavier R31.1 it was not working, but I'd expect it to have been fixed since then.
It should at least build natively. You may read this topic for some details.

Probably the feature below could be used to synchronize cameras connected to the same device, provided the cameras have hardware support for it:

v4l2-ctl -c frame_sync=1 -d /dev/videoX
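Where hardware `frame_sync` is not available, a common software fallback is to pair frames from the two cameras by nearest capture timestamp, assuming both devices' clocks are already synchronized (e.g. via PTP as discussed in the threads below). A minimal sketch of such pairing; the function name and the tolerance value are my own illustrative assumptions, not part of any NVIDIA or V4L2 API:

```python
def pair_frames(ts_a, ts_b, tolerance):
    """Greedily pair sorted capture timestamps from two cameras.

    A pair (i, j) is emitted when |ts_a[i] - ts_b[j]| <= tolerance;
    frames with no partner inside the tolerance window are dropped.
    Both timestamp lists must be sorted in ascending order.
    """
    pairs, i, j = [], 0, 0
    while i < len(ts_a) and j < len(ts_b):
        dt = ts_a[i] - ts_b[j]
        if abs(dt) <= tolerance:
            pairs.append((i, j))
            i += 1
            j += 1
        elif dt > 0:
            j += 1  # camera B frame is too old, skip it
        else:
            i += 1  # camera A frame is too old, skip it
    return pairs


if __name__ == "__main__":
    # Two 25 fps cameras, ~10 ms offset; tolerance of half a frame period.
    a = [0.00, 0.04, 0.08]
    b = [0.01, 0.05, 0.13]
    print(pair_frames(a, b, tolerance=0.02))  # third frames are too far apart
```

A sensible tolerance is around half the frame period (20 ms at 25 fps), so a frame can never be paired with more than one candidate.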

Reference threads:

https://devtalk.nvidia.com/default/topic/1038513/jetson-tx2/jetson-tx2-ptp-capture-synchronization/post/5299551/#5299551
https://devtalk.nvidia.com/default/topic/1007933/jetson-tx1/camera-frame-synchronization/
https://devtalk.nvidia.com/default/topic/1039183/jetson-tx2/argus-syncing-multiple-capture-sessions/
https://devtalk.nvidia.com/default/topic/1014443/jetson-tx2/recording-from-multiple-cameras-simultaneously/post/5170306/#5170306
https://devtalk.nvidia.com/default/topic/1054993/jetson-nano/pps-on-jetson-nano/post/5348673/#5348673