Nvinferaudio will not create audio transform

Please provide complete information as applicable to your setup.

• Hardware Platform NVIDIA Corporation GA102GL [RTX A6000] (rev a1)
• DeepStream Version 6.2
• TensorRT Version 8.6.1
• NVIDIA GPU Driver Version (valid for GPU only) 525.116.04
• Issue Type (questions, new requirements, bugs)
The issue occurs when I try to run the following GStreamer pipeline:

USE_NEW_NVSTREAMMUX=yes GST_DEBUG=*:3 gst-launch-1.0 filesrc location=/opt/nvidia/deepstream/deepstream-6.2/samples/streams/sonyc_mixed_audio.wav ! audioconvert ! audioresample ! "audio/x-raw, rate=(int)44100" ! queue ! m.sink_0 nvstreammux name=m batch-size=1 ! queue ! nvinferaudio audio-framesize=44100 audio-hopsize=11025 batch-size=1 config-file-path= /opt/nvidia/deepstream/deepstream-6.2/sources/SONYCAudioClassifier/config_infer_audio_sonyc.txt ! fakesink sync=true

I get the following error:
max_fps_dur 8.33333e+06 min_fps_dur 2e+08
Setting pipeline to PAUSED …
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See “Lazy Loading” section of CUDA documentation CUDA C++ Programming Guide
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See “Lazy Loading” section of CUDA documentation CUDA C++ Programming Guide
INFO: …/nvdsinfer/nvdsinfer_model_builder.cpp:610 [FullDims Engine Info]: layers num: 2
0 INPUT kFLOAT input.1 1x635x128 min: 1x1x635x128 opt: 2x1x635x128 Max: 2x1x635x128
1 OUTPUT kFLOAT output1 31 min: 0 opt: 0 Max: 0

0:00:04.783780313 72216 0x564f35b5b6d0 WARN nvinferaudio gstnvinferaudio.cpp:282:gst_nvinferaudio_start: error: Failed to create audio transform
ERROR: from element /GstPipeline:pipeline0/GstNvInferAudio:nvinferaudio0: Failed to create audio transform
Additional debug info:
gstnvinferaudio.cpp(282): gst_nvinferaudio_start (): /GstPipeline:pipeline0/GstNvInferAudio:nvinferaudio0
ERROR: pipeline doesn’t want to preroll.
Pipeline is PREROLLING …
0:00:04.784482636 72216 0x564f35b5c300 WARN basesrc gstbasesrc.c:3132:gst_base_src_loop: error: Internal data stream error.
0:00:04.784498936 72216 0x564f35b5c300 WARN basesrc gstbasesrc.c:3132:gst_base_src_loop: error: streaming stopped, reason not-negotiated (-4)
ERROR: from element /GstPipeline:pipeline0/GstFileSrc:filesrc0: Internal data stream error.
Additional debug info:
…/libs/gst/base/gstbasesrc.c(3132): gst_base_src_loop (): /GstPipeline:pipeline0/GstFileSrc:filesrc0:
streaming stopped, reason not-negotiated (-4)
ERROR: pipeline doesn’t want to preroll.
Setting pipeline to NULL …

(gst-launch-1.0:72216): GStreamer-CRITICAL **: 10:54:33.297: gst_object_unref: assertion ‘object != NULL’ failed
Freeing pipeline …

I am confused, as I am using only sources that are provided by the SDK. It doesn't seem to be a pipeline error either; it's the nvinferaudio element that is not working properly.

Please refer to the DeepStream audio sample at /opt/nvidia/deepstream/deepstream-6.2/sources/apps/sample_apps/deepstream-audio/deepstream_audio_main.c
If it still doesn't work, could you share nvinferaudio's configuration file?

After looking at the code I have adjusted my pipeline to be:

USE_NEW_NVSTREAMMUX=yes GST_DEBUG=*:3 gst-launch-1.0 filesrc location=/opt/nvidia/deepstream/deepstream-6.2/samples/streams/sonyc_mixed_audio.wav ! audioconvert ! audioresample ! "audio/x-raw, rate=(int)44100" ! queue ! m.sink_0 nvstreammux name=m batch-size=2 ! queue ! nvinferaudio audio-framesize=44100 audio-hopsize=11025 batch-size=1 config-file-path= /opt/nvidia/deepstream/deepstream-6.2/sources/apps/sample_apps/deepstream-audio/configs/config_infer_audio_sonyc.txt ! fakesink sync=true filesrc location=/opt/nvidia/deepstream/deepstream-6.2/samples/streams/sonyc_mixed_audio.wav ! audioconvert ! audioresample ! "audio/x-raw, rate=(int)44100" ! queue ! m.sink_1

However, I am still running into the same problem as before.

Below is the config file used:

[property]
gpu-id=0
net-scale-factor=1
onnx-file=…/…/…/…/…/samples/models/SONYC_Audio_Classifier/sonyc_audio_classify.onnx
model-engine-file=…/…/…/…/…/samples/models/SONYC_Audio_Classifier/sonyc_audio_classify.onnx_b2_gpu0_fp32.engine
labelfile-path=…/…/…/…/…/samples/models/SONYC_Audio_Classifier/audio_labels.txt
batch-size=2

# 0=FP32, 1=INT8, 2=FP16 mode

network-mode=0
num-detected-classes=31
gie-unique-id=1
output-blob-names=output1
network-type=1

[class-attrs-all]
threshold=0.4

The error I get is below:

max_fps_dur 8.33333e+06 min_fps_dur 2e+08
Setting pipeline to PAUSED …
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See “Lazy Loading” section of CUDA documentation CUDA C++ Programming Guide
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See “Lazy Loading” section of CUDA documentation CUDA C++ Programming Guide
INFO: …/nvdsinfer/nvdsinfer_model_builder.cpp:610 [FullDims Engine Info]: layers num: 2
0 INPUT kFLOAT input.1 1x635x128 min: 1x1x635x128 opt: 2x1x635x128 Max: 2x1x635x128
1 OUTPUT kFLOAT output1 31 min: 0 opt: 0 Max: 0

0:00:04.760803577 215096 0x55728d6b8530 WARN nvinferaudio gstnvinferaudio.cpp:282:gst_nvinferaudio_start: error: Failed to create audio transform
ERROR: from element /GstPipeline:pipeline0/GstNvInferAudio:nvinferaudio0: Failed to create audio transform
Additional debug info:
gstnvinferaudio.cpp(282): gst_nvinferaudio_start (): /GstPipeline:pipeline0/GstNvInferAudio:nvinferaudio0
ERROR: pipeline doesn’t want to preroll.
Pipeline is PREROLLING …
0:00:04.761707381 215096 0x55728d118360 WARN basesrc gstbasesrc.c:3132:gst_base_src_loop: error: Internal data stream error.
0:00:04.761724263 215096 0x55728d118360 WARN basesrc gstbasesrc.c:3132:gst_base_src_loop: error: streaming stopped, reason not-negotiated (-4)
0:00:04.761742427 215096 0x55728d118400 WARN basesrc gstbasesrc.c:3132:gst_base_src_loop: error: Internal data stream error.
0:00:04.761761162 215096 0x55728d118400 WARN basesrc gstbasesrc.c:3132:gst_base_src_loop: error: streaming stopped, reason not-negotiated (-4)
ERROR: from element /GstPipeline:pipeline0/GstFileSrc:filesrc0: Internal data stream error.
Additional debug info:
…/libs/gst/base/gstbasesrc.c(3132): gst_base_src_loop (): /GstPipeline:pipeline0/GstFileSrc:filesrc0:
streaming stopped, reason not-negotiated (-4)
ERROR: pipeline doesn’t want to preroll.
ERROR: from element /GstPipeline:pipeline0/GstFileSrc:filesrc1: Internal data stream error.
Additional debug info:
…/libs/gst/base/gstbasesrc.c(3132): gst_base_src_loop (): /GstPipeline:pipeline0/GstFileSrc:filesrc1:
streaming stopped, reason not-negotiated (-4)
Setting pipeline to NULL …
ERROR: pipeline doesn’t want to preroll.

(gst-launch-1.0:215096): GStreamer-CRITICAL **: 13:15:12.299: gst_object_unref: assertion ‘object != NULL’ failed
Freeing pipeline …

This is because audio-transform is not set in the configuration file. Please refer to the README in deepstream-audio and the sample configuration in /opt/nvidia/deepstream/deepstream-6.2/sources/apps/sample_apps/deepstream-audio/configs/ds_audio_sonyc_test_config.txt

Alright, I have adjusted it, and the new pipeline is this:

USE_NEW_NVSTREAMMUX=yes GST_DEBUG=*:3 gst-launch-1.0 filesrc location=/opt/nvidia/deepstream/deepstream-6.2/samples/streams/sonyc_mixed_audio.wav ! audioconvert ! audioresample ! "audio/x-raw, rate=(int)44100" ! queue ! m.sink_0 nvstreammux name=m batch-size=1 ! queue ! nvinferaudio audio-framesize=44100 audio-hopsize=11025 batch-size=1 config-file-path= /opt/nvidia/deepstream/deepstream-6.2/sources/SONYCAudioClassifier/config_infer_audio_sonyc.txt audio-transform=melsdb,fft_length=2560,hop_size=692,dsp_window=hann,num_mels=128,sample_rate=44100,p2db_ref=1.0,p2db_min_power=0.0,p2db_top_db=80.0 ! fakesink sync=true

But now I am running into another issue:
max_fps_dur 8.33333e+06 min_fps_dur 2e+08
Setting pipeline to PAUSED …
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See “Lazy Loading” section of CUDA documentation CUDA C++ Programming Guide
WARNING: [TRT]: CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See “Lazy Loading” section of CUDA documentation CUDA C++ Programming Guide
INFO: …/nvdsinfer/nvdsinfer_model_builder.cpp:610 [FullDims Engine Info]: layers num: 2
0 INPUT kFLOAT input.1 1x635x128 min: 1x1x635x128 opt: 2x1x635x128 Max: 2x1x635x128
1 OUTPUT kFLOAT output1 31 min: 0 opt: 0 Max: 0

unsupported param type

** (gst-launch-1.0:250147): ERROR **: 09:32:15.694: Failed to transform parameter

Trace/breakpoint trap (core dumped)

Is this because the audio-transform parameters do not exactly match the audio being fed in, or is it something entirely different?

Furthermore, when I run gst-inspect-1.0 nvinferaudio, the Element Properties section shows this for the property:
audio-transform : Transform name and parameters
flags: readable, writable
Boxed pointer of type "GstStructure"
How would I express a boxed pointer of type "GstStructure" on the command line?

In the audio-transform setting, please add "(float)" before each floating-point number. For example, use p2db_ref=(float)1.0 instead of p2db_ref=1.0.

USE_NEW_NVSTREAMMUX=yes GST_DEBUG=*:3 gst-launch-1.0 filesrc location=/opt/nvidia/deepstream/deepstream-6.2/samples/streams/sonyc_mixed_audio.wav ! wavparse ! audioconvert ! audioresample ! "audio/x-raw, rate=(int)44100" ! queue ! m.sink_0 nvstreammux name=m batch-size=1 ! queue ! nvinferaudio audio-framesize=44100 audio-hopsize=11025 batch-size=1 config-file-path= /opt/nvidia/deepstream/deepstream-6.2/sources/SONYCAudioClassifier/config_infer_audio_sonyc.txt "audio-transform=melsdb,fft_length=2560,hop_size=692,dsp_window=hann,num_mels=128,sample_rate=44100,p2db_ref=(float)1.0,p2db_min_power=(float)0.0,p2db_top_db=(float)80.0" ! fakesink sync=true

After some more testing, I found that it works if you put the entire audio-transform property in quotation marks, like this: "audio-transform=melsdb,fft_length=2560,hop_size=692,dsp_window=hann,num_mels=128,sample_rate=44100,p2db_ref=(float)1.0,p2db_min_power=(float)0.0,p2db_top_db=(float)80.0"
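For anyone hitting the same two errors, here is a minimal sketch of how the audio-transform value can be assembled in a shell script. It only builds and prints the property string; it does not launch the pipeline. The field names and values are copied from this thread. The variable names AUDIO_TRANSFORM and PROPERTY are made up for illustration. My understanding, which I have not verified against the plugin source, is that GStreamer's structure parser guesses a type for untyped values and reads 1.0 as a double, while the plugin appears to accept only float for these three fields, hence the (float) casts. The quoting is most likely needed because an unquoted parenthesis is a special character to the shell.

```shell
#!/bin/sh
# A serialized GstStructure is a structure name ("melsdb") followed by
# comma-separated field=value pairs. Values below are taken from the thread.
AUDIO_TRANSFORM="melsdb,fft_length=2560,hop_size=692,dsp_window=hann,num_mels=128,sample_rate=44100"

# Assumption: the three p2db_* fields must be typed explicitly as float,
# as advised above; untyped 1.0 led to "unsupported param type".
AUDIO_TRANSFORM="${AUDIO_TRANSFORM},p2db_ref=(float)1.0,p2db_min_power=(float)0.0,p2db_top_db=(float)80.0"

# Keep the whole property in one quoted word so the shell passes the
# parentheses through to gst-launch-1.0 unchanged.
PROPERTY="audio-transform=${AUDIO_TRANSFORM}"

# Print the argument that would be passed to the nvinferaudio element.
printf '%s\n' "$PROPERTY"
```

In a real command line, "$PROPERTY" (with the double quotes) would take the place of the quoted audio-transform=... argument after nvinferaudio.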

However, when I run this pipeline, it seems to process only the first 3 seconds of the entire 10-minute file.

Does the deepstream-audio sample run fine without modification? Can you compare your pipeline with the sample to narrow down this issue?

When I run the sample program, it seems to work fine. I tried to recreate the sample's pipeline as closely as I could, but had no luck.

Let's focus on this topic: Nvstreammux receives a random EOS
It seems that nvstreammux changes the timestamp. We are investigating the issue and will be back when there is any progress.
