Setting up a GStreamer pipeline and DeepStream to transform audio into a spectrogram image

Hi, I am looking for a way to build a pipeline that takes an audio file as input and generates a spectrogram from it. I think I may be able to use gst-nvinferaudio, but I have not grasped how to do it yet. The whole thing should be done in Python.
The Nano already has GStreamer and DeepStream 6 installed, but creating the pipeline is the problem.
Any help would be useful at this stage.
Later, the system should be able to recognize sounds and put them into categories.
Many thanks in advance

We have the reference sample:


By default it recognizes sounds in an audio stream. Transforming the audio into a spectrogram is not supported out of the box. The DeepStream SDK is based on GStreamer, so you can check whether there is a third-party plugin that can do the transformation, and integrate it with the DeepStream SDK.
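One candidate for such a third-party plugin is the `spectrascope` element from GStreamer's audiovisualizers set (gst-plugins-bad), which renders incoming audio as spectrum video frames. Below is a minimal sketch that composes a gst-launch-style pipeline description for it; the element availability on the Nano and the file names are assumptions on my part, so verify with `gst-inspect-1.0 spectrascope` before relying on this:

```python
# Sketch: compose a gst-launch-style pipeline description that decodes an
# audio file and renders spectrum frames to numbered PNG images.
# Assumptions: spectrascope and pngenc are installed (check with
# `gst-inspect-1.0`), and input.wav is a placeholder file name.
elements = [
    "filesrc location=input.wav",   # hypothetical input file
    "decodebin",                    # decode any supported audio format
    "audioconvert",                 # normalize the audio format
    "spectrascope",                 # audio -> spectrum video frames
    "videoconvert",
    "pngenc",                       # encode each frame as PNG
    "multifilesink location=spec-%05d.png",
]
pipeline = " ! ".join(elements)
print(pipeline)
```

The resulting string can be run with `gst-launch-1.0 <pipeline>` for a quick test, or passed to `Gst.parse_launch()` from a Python (PyGObject) application.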

Hi, thanks for the information.
I guess this application is using nvinferaudio?
But it is a C application. I would be highly interested in a Python application, but there is none in the deepstream_python_apps/apps at master · NVIDIA-AI-IOT/deepstream_python_apps · GitHub repository.

This topic is really not covered much in the documentation and the samples.

Would it be a good approach to take the sound file, convert it to spectrogram images, and then run inference to compare them with known spectrograms? Are there models for such a solution, or how can I train such a model?
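One way to prototype the conversion step outside GStreamer is to compute the spectrogram yourself before worrying about inference. The sketch below uses only the standard library with a naive DFT to show the idea; for real audio files you would read samples with a proper decoder and use numpy/librosa for a fast FFT, which are my assumptions, not part of any DeepStream sample:

```python
import cmath
import math

def spectrogram(samples, win=64, hop=32):
    """Magnitude spectrogram via a naive DFT (stdlib only, for illustration).
    Returns a list of frames, each holding win//2 + 1 magnitudes."""
    frames = []
    for start in range(0, len(samples) - win + 1, hop):
        seg = samples[start:start + win]
        # Hann window reduces spectral leakage between neighboring bins
        windowed = [s * 0.5 * (1 - math.cos(2 * math.pi * n / (win - 1)))
                    for n, s in enumerate(seg)]
        mags = []
        for k in range(win // 2 + 1):  # keep only non-negative frequencies
            acc = sum(windowed[n] * cmath.exp(-2j * math.pi * k * n / win)
                      for n in range(win))
            mags.append(abs(acc))
        frames.append(mags)
    return frames

# Quick check: a 1 kHz sine at 8 kHz sample rate should concentrate its
# energy in bin k = 1000 * 64 / 8000 = 8 of each frame.
sr, f = 8000, 1000
sig = [math.sin(2 * math.pi * f * n / sr) for n in range(512)]
spec = spectrogram(sig)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

Each frame of magnitudes can then be scaled to 0–255 and stacked into an image; classifying those images with a standard image model (e.g. via TAO/transfer learning) is a common pattern for audio classification.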

Best regards

I also checked the source code of the deepstream-audio app, but I only found references to infer_audio in comments.

An alternative solution to my problem could be to train a model to recognize the sounds and then use this audio model to classify the sounds in my files.
But it is hard to find any documentation on nvinferaudio.
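For what it's worth, the deepstream-audio C sample builds roughly the pipeline sketched below, where `nvinferaudio` performs its spectrogram transform internally before running the classifier. The element ordering follows that sample approximately, but the URI, the config file name, and the exact properties are hypothetical placeholders; check them against the sample source and `gst-inspect-1.0 nvinferaudio` on your DeepStream 6 install:

```python
# Sketch of the alternative classification approach: decode audio, batch it,
# and let nvinferaudio run a trained audio classifier on it.
# Assumptions: element order mirrors the deepstream-audio C sample; the URI
# and audio_config.txt are placeholders, not verified values.
elements = [
    "uridecodebin uri=file:///path/to/sound.wav",      # hypothetical input
    "audioconvert",
    "audioresample",
    "mux.sink_0 nvstreammux name=mux",                 # batches buffers for inference
    "nvinferaudio config-file-path=audio_config.txt",  # hypothetical config file
    "fakesink",
]
pipeline = " ! ".join(elements)
print(pipeline)
```

From Python, the same description could be handed to `Gst.parse_launch()`, with classification results read from the metadata attached to the buffers, as the C sample does.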