Hardware Platform: Jetson Nano
DeepStream: Version 6.0
JetPack Version: 4.6
TensorRT Version: 8.0.1
Issue Type: questions
We have several questions about using the nvinferaudio plugin for inference.
We have a trained TensorFlow CNN model for voice classification and want to run inference with TensorRT on a Jetson Nano. Our model takes mel spectrograms as input, which we create from WAV files with the Python librosa library.
That is why we would like to use the nvinferaudio plugin, which also creates mel spectrograms as its first step.
However, we want to compare the spectrograms from the plugin with the spectrograms from librosa, so that if inference gives unexpected results we can be sure the spectrograms are not the cause. Is there any way to visualize the spectrogram of a frame as it is created by the audio transform in the nvinferaudio plugin?
We currently have a pipeline that works both from the terminal and in Python; it is quite similar to the pipeline from the deepstream-audio model.
my_pipeline_2.pdf (34.9 KB)
Our only idea so far is that the spectrogram might be accessible from metadata, perhaps from NvDsInferTensorMeta. Is that actually possible?
I’ve tried to access NvDsInferTensorMeta as described here: deepstream_python_apps/custom_parser_guide.md at master · NVIDIA-AI-IOT/deepstream_python_apps · GitHub.
I set output-tensor-meta=1 on the plugin
and attached the metadata probe to the src pad of nvinferaudio, obtained with nvinferaudio.get_static_pad("src"),
but l_user = frame_meta.frame_user_meta_list is always None for every frame (frame_meta itself is not None).
So is it really possible to visualize the spectrogram of a frame from any metadata? If yes, which type of metadata holds it?
If it is not possible, could you please suggest how we could compare librosa spectrograms with yours from the nvinferaudio plugin? Perhaps you could share the part of the audio-transform code where the spectrograms are created, so that we could use it just for the comparison? Or do you have any other suggestions?
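To make the goal concrete: once both spectrograms are available as arrays, the comparison we have in mind could look roughly like the sketch below. It is only an illustration under our own assumptions: both inputs are power mel spectrograms with the same shape (n_mels, n_frames), and compare_spectrograms is a hypothetical helper name of ours, not something from DeepStream or librosa.

```python
import numpy as np


def compare_spectrograms(ref, test, eps=1e-10):
    """Compare two power mel spectrograms of identical shape (n_mels, n_frames).

    Hypothetical helper: `ref` would come from librosa, `test` from the plugin.
    """
    ref = np.asarray(ref, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    if ref.shape != test.shape:
        raise ValueError(f"shape mismatch: {ref.shape} vs {test.shape}")

    # Compare on a dB scale; clip at eps to avoid log10(0).
    ref_db = 10.0 * np.log10(np.maximum(ref, eps))
    test_db = 10.0 * np.log10(np.maximum(test, eps))
    diff = test_db - ref_db

    # Pearson correlation ignores a constant gain offset between the two.
    corr = np.corrcoef(ref_db.ravel(), test_db.ravel())[0, 1]

    return {
        "max_abs_diff_db": float(np.max(np.abs(diff))),
        "mean_offset_db": float(np.mean(diff)),
        "correlation": float(corr),
    }
```

A correlation close to 1 together with a roughly constant mean offset would suggest the two spectrograms differ only in scaling, while a low correlation would point to different STFT or mel filterbank settings.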
Thanks in advance!