Hi,
I'm building a GStreamer-based application on a Jetson Nano, where we use nvivafilter to overlay real-time data on the video feed. I'm a newbie in CUDA and GPU programming, so I guess there are plenty of opportunities to make the current solution more performant. So my question is: how can I profile my code?
This is our pipeline:
gst-launch-1.0 -e \
nvarguscamerasrc sensor-id="$SENSOR_ID" sensor-mode=0 gainrange="1 16" ispdigitalgainrange="1 1" name="${APP_NAME}pipeline_overrides${SENSOR_ID}" \
! "video/x-raw(memory:NVMM), width=(int)${capture_width}, height=(int)${capture_height}, format=(string)NV12, framerate=(fraction)${capture_framerate}/1" \
! nvvidconv ! nvivafilter cuda-process=true customer-lib-name="$customer_lib" ! 'video/x-raw(memory:NVMM), format=(string)NV12' \
! nvvidconv ! nvv4l2vp8enc bitrate="${capture_bitrate}" control-rate=1 ! rtpvp8pay mtu=1400 \
! udpsink auto-multicast=true clients="${udp_clients}"
I would like to get insights into the nvivafilter element in particular.
As a naive first step, I checked the output of top and jtop for CPU, GPU, and memory usage, but it's not that accurate.
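For slightly finer-grained sampling than jtop shows on screen, something like the sketch below could log the GPU load with tegrastats while the pipeline runs. The gpu_load helper is a made-up name, and the sketch assumes the "GR3D_FREQ <n>%" field that tegrastats prints on the Nano. It still only reports whole-GPU utilization, not per-kernel timings.

```shell
#!/bin/sh
# Hypothetical helper: print the GPU load percentage from each tegrastats
# line read on stdin. Assumes a "GR3D_FREQ <n>%" field in the output.
gpu_load() {
    sed -n 's/.*GR3D_FREQ \([0-9][0-9]*\)%.*/\1/p'
}

# Take 50 samples at 100 ms intervals while the pipeline is running.
# Skipped when tegrastats is not installed (for example, off the Jetson).
if command -v tegrastats >/dev/null 2>&1; then
    tegrastats --interval 100 | head -n 50 | gpu_load
fi
```

Each output line is one sample, so the values can be redirected to a file and averaged or plotted afterwards.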
I also tried remote profiling (connecting from macOS) with Visual Profiler, Nsight Systems, and Nsight Compute, but I was not able to connect to the Jetson.
Visual Profiler gives me the following error on startup:
!ENTRY org.eclipse.osgi 4 0 2021-05-24 16:33:53.658
!MESSAGE Application error
!STACK 1
java.lang.RuntimeException: Application “com.nvidia.viper.application.application” could not be found in the registry. The applications available are: org.eclipse.ant.core.antRunner, org.eclipse.birt.report.engine.ReportExecutor, org.eclipse.e4.ui.workbench.swt.E4Application, org.eclipse.e4.ui.workbench.swt.GenTopic, org.eclipse.equinox.app.error, org.eclipse.equinox.p2.director, org.eclipse.equinox.p2.garbagecollector.application, org.eclipse.equinox.p2.publisher.InstallPublisher, org.eclipse.equinox.p2.publisher.EclipseGenerator, org.eclipse.equinox.p2.publisher.ProductPublisher, org.eclipse.equinox.p2.publisher.FeaturesAndBundlesPublisher, org.eclipse.equinox.p2.reconciler.application, org.eclipse.equinox.p2.repository.repo2runnable, org.eclipse.equinox.p2.repository.metadataverifier, org.eclipse.equinox.p2.artifact.repository.mirrorApplication, org.eclipse.equinox.p2.metadata.repository.mirrorApplication, org.eclipse.equinox.p2.updatesite.UpdateSitePublisher, org.eclipse.equinox.p2.publisher.UpdateSitePublisher, org.eclipse.equinox.p2.publisher.CategoryPublisher, org.eclipse.help.base.infocenterApplication, org.eclipse.help.base.helpApplication, org.eclipse.help.base.indexTool, org.eclipse.ui.ide.workbench.
at org.eclipse.equinox.internal.app.EclipseAppContainer.startDefaultApp(EclipseAppContainer.java:248)
at org.eclipse.equinox.internal.app.MainApplicationLauncher.run(MainApplicationLauncher.java:29)
at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:134)
at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:104)
at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:380)
at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:235)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base/java.lang.reflect.Method.invoke(Method.java:566)
at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:648)
at org.eclipse.equinox.launcher.Main.basicRun(Main.java:603)
at org.eclipse.equinox.launcher.Main.run(Main.java:1465)
Nsight Systems gives:
DirectoryNotFoundError (150) {
OriginalExceptionClass: N5boost16exception_detail10clone_implIN11QuadDCommon26DirectoryNotFoundExceptionEEE
OriginalFile: /Users/devtools/buildAgent/work/20a3cfcd1c25021d/QuadD/Host/Analysis/PosixDeviceValidator.cpp
OriginalLine: 26
OriginalFunction: bool QuadDAnalysis::PosixDeviceValidator::CheckHostSupport(const QuadDAnalysis::DevicePtr &)
Filename: /Applications/NVIDIA Nsight Systems.app/Contents/target-linux-armv8
ErrorText: Deploy directory does not exist
}
Missing directory with target binaries:
/Applications/NVIDIA Nsight Systems.app/Contents/target-linux-armv8
NVIDIA Nsight Systems 2021.2.1.58-642947b OSX
- falcon@falcondev.devices:
[Error] Target is not supported.
This version of Nsight Systems does not support profiling on the selected target.
Missing directory with target binaries:
/Applications/NVIDIA Nsight Systems.app/Contents/target-linux-armv8
And finally, Nsight Compute is able to connect to the remote target, but it gets stuck with this message:
Trying to connect to process…
Searching for attachable processes on falcondev.devices:49152-49215…
But maybe I'm on the wrong track: I'm not even sure any of these tools would work for me, since my CUDA code is compiled into a shared library that nvivafilter calls from a GStreamer pipeline.
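For reference, profiling locally on the Jetson instead of remotely might look roughly like the sketch below, which wraps the whole gst-launch-1.0 process in nvprof so that kernels launched from the shared library should show up as well. This is only a sketch under assumptions: profile_cmd is a hypothetical helper name, the nvprof path is the usual CUDA toolkit location and may differ, and the output file name is arbitrary.

```shell
#!/bin/sh
# Assumed location of nvprof; override by setting NVPROF in the environment.
NVPROF="${NVPROF:-/usr/local/cuda/bin/nvprof}"

# Hypothetical wrapper: run a command under nvprof and save a profile file.
# Usage: profile_cmd <output-file> <command> [args...]
profile_cmd() {
    out_file="$1"; shift
    if [ ! -x "$NVPROF" ]; then
        echo "nvprof not found at $NVPROF" >&2
        return 127
    fi
    # --export-profile writes a file that can be imported into
    # Visual Profiler later on another machine.
    "$NVPROF" --export-profile "$out_file" "$@"
}

# On the Jetson, pass the full pipeline from above after the file name:
# profile_cmd nvivafilter.nvvp gst-launch-1.0 -e nvarguscamerasrc ...
```

Because the profile is written to a file on the device, no remote connection from the host is needed while the pipeline runs.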
Do you have any recommendations on the best way to profile the application?
Thanks!
Best,
Peter