Please provide complete information as applicable to your setup.
• Hardware Platform (Jetson / GPU) dGPU
• DeepStream Version 6.1.1
• JetPack Version (valid for Jetson only) None
• TensorRT Version 8.4
• NVIDIA GPU Driver Version (valid for GPU only) 515.65.01
• Issue Type( questions, new requirements, bugs) question
• How to reproduce the issue ? (This is for bugs. Including which sample app is using, the configuration files content, the command line used and other details for reproducing) None
• Requirement details( This is for new requirement. Including the module name-for which plugin or for which sample application, the function description) None
I want to use nvinferaudio with my model, but I found that audio-transform lacks some features (setting fmin/fmax of the mel filter bank, pre-emphasis).
Therefore, I am trying to feed raw audio into the model and let the model do the rest (STFT, conv, classify).
I assumed nvinferaudio supports raw audio input because the documentation says it supports “Encoder Decoder RNN Architecture”, and as far as I know, RNNs usually take raw audio rather than a mel spectrogram.
My questions are:
- How do I write the config to feed raw audio data into the model?
- What is the input shape in that case (e.g., BATCH x LEN x 1)?
- What is the input value range (e.g., <16bit-raw-value>/net-scale-factor)?
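To make the shape and value-range questions concrete, here is a minimal Python sketch of the kind of preprocessing I have in mind. It is only an illustration: the function names, the 1/32768 scale, and the BATCH x LEN x 1 layout are my own assumptions, not something taken from the nvinferaudio documentation, and whether the plugin does anything like this is exactly what I am asking.

```python
import struct

# Hypothetical scale (my assumption): maps the int16 range into roughly [-1.0, 1.0).
ASSUMED_SCALE = 1.0 / 32768.0


def pcm16_to_float(raw_bytes, scale=ASSUMED_SCALE):
    """Decode little-endian 16-bit PCM bytes and multiply each sample by `scale`."""
    count = len(raw_bytes) // 2  # two bytes per sample; a trailing odd byte is ignored
    samples = struct.unpack("<%dh" % count, raw_bytes[: count * 2])
    return [s * scale for s in samples]


def as_batch_len_1(samples):
    """Wrap one clip as BATCH x LEN x 1 nested lists, with BATCH = 1."""
    return [[[s] for s in samples]]


if __name__ == "__main__":
    raw = struct.pack("<3h", 0, 16384, -32768)  # three example samples
    clip = pcm16_to_float(raw)
    print(clip)
    print(as_batch_len_1(clip))
```

If the plugin instead passes unscaled 16-bit values, or uses a different layout such as BATCH x 1 x LEN, I would adjust the first layers of the model accordingly.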
Another question:
- How can I use parse-classifier-func-name? Is the function interface the same as in nvinfer?