Deepstream-audio like jetson_inference (direct input-file without stream)

Information applicable to my setup.

• Hardware Platform (Jetson / GPU) Xavier NX
• DeepStream Version deepstream-5.1
• JetPack Version (valid for Jetson only) 4.6.2
• TensorRT Version 8.2.1-1+cuda10.2
• Issue Type( questions, new requirements, bugs) questions or new requirements


I would like to try nvinferaudio for classifying patterns in wav files. I tested the sample_apps/deepstream-audio/ application successfully, but I found it difficult to send files directly without creating a pipeline source0 with a fixed uri=…/…/…/…/…/samples/streams/sonyc_mixed_audio.wav type=6

I couldn’t understand how to configure the -i option (--input-file, Set the input file) to do something similar to imageNet (jetson_inference), where it is possible to send the file with Classify()

Is there any NVidia tool/documentation to just classify audio using Python without SpeechRecognition?

DeepStream is based on GStreamer, how much do you know about GStreamer?

Do you mean you want to use your own model instead of our sample model “sonyc_audio_classify.onnx”?

If you use an audio classifier with deepstream-audio, it will do the classification only. The deepstream-audio app is a sample app that shows how to integrate audio models for audio inferencing.

It depends on your model. Can you describe the input layers and output layers and the postprocessing algorithm of your model?

Thanks for the reply,

I will rephrase the question.

Is it possible to load the wav file as a function parameter, inside the main program?

In the example, the wav file entry is in the configuration file, and therefore it is not possible to send individual samples. I would like to send isolated files: audio_one.wav, then audio_two.wav, then audio_three.wav…

See, there it is possible to send isolated files inside the program (IMG_ONE, IMG_TWO), for example:

result_one = net.Classify(IMG_ONE, 224, 224)
result_two = net.Classify(IMG_TWO, 224, 224)

With deepstream-audio -h I see --input-file (Set the input file), but I cannot figure out how to use it, or how to pass all configuration parameters through the function.

The deepstream-audio sample app does not support it. You can write a simple app to implement it.

The “-i” option is not supported. You can refer to the source code under /opt/nvidia/deepstream/deepstream/sources/apps/sample_apps/deepstream-audio

Please make sure you are familiar with GStreamer before you start with DeepStream. It is then very easy to write a simple app that constructs the pipeline you need, and you can pass the wav file uri as your app’s command-line argument.
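The suggestion above can be sketched in Python. This is only a sketch under assumptions, not NVIDIA-provided code: the element chain (uridecodebin → audioconvert → audioresample → nvstreammux → nvinferaudio) is my reading of the deepstream-audio sample pipeline, and the inference config file name and wav paths are placeholders.

```python
import os

def build_pipeline_desc(wav_path, infer_config="config_infer_audio.txt"):
    """Return a gst-launch-style pipeline description for one wav file.

    Building the description per file means each wav can be classified in
    its own run, instead of hard-coding source0 uri in the config file.
    Element names here are assumed from the deepstream-audio sample.
    """
    uri = "file://" + os.path.abspath(wav_path)
    return (
        f"uridecodebin uri={uri} ! audioconvert ! audioresample ! mux.sink_0 "
        f"nvstreammux name=mux batch-size=1 ! "
        f"nvinferaudio config-file-path={infer_config} ! fakesink"
    )

# One pipeline per file, run sequentially. With PyGObject installed you would
# pass each description to Gst.parse_launch(), set the pipeline to PLAYING,
# wait for EOS on the bus, then continue with the next file.
for wav in ["audio_one.wav", "audio_two.wav", "audio_three.wav"]:
    print(build_pipeline_desc(wav))
```

This keeps the file name as an ordinary program argument, which is the pattern the moderator describes: the app, not the config file, decides which wav to feed in.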

Thanks Fiona,
I am studying to try this.

The last question,
Is jetson-voice (dusty-nv/jetson-voice on GitHub) supported by Nvidia?

If yes, is it possible to generate a CMakeLists to install it on Xavier NX?

There has been no update from you for a while, so we are assuming this is no longer an issue and are closing this topic. If you need further support, please open a new one. Thanks

No. We do not support personal projects in Nvidia forum.

You can write CMake files by yourself.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.