Hi Dusty,
I get an error when I run asr.py with the matchboxnet model
in the dustynv/jetson-voice:r32.6.1 container.
This is the command:
$ python3 examples/asr.py --model matchboxnet --wav data/audio/commands.wav
And this is the error I get:
[2021-09-04 01:45:55] audio.py:82 - loading audio 'data/audio/commands.wav'
Traceback (most recent call last):
  File "examples/asr.py", line 34, in <module>
    results = asr(samples)
  File "/jetson-voice/jetson_voice/models/asr/asr_engine.py", line 166, in __call__
    length=torch.as_tensor(self.buffer.size, dtype=torch.int64).unsqueeze(dim=0)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/nemo_toolkit-1.0.0rc1-py3.6.egg/nemo/core/classes/common.py", line 770, in __call__
    outputs = wrapped(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/nemo_toolkit-1.0.0rc1-py3.6.egg/nemo/collections/asr/modules/audio_preprocessing.py", line 80, in forward
    processed_signal, processed_length = self.get_features(input_signal, length)
  File "/usr/local/lib/python3.6/dist-packages/nemo_toolkit-1.0.0rc1-py3.6.egg/nemo/collections/asr/modules/audio_preprocessing.py", line 389, in get_features
    features = self.featurizer(input_signal)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torchaudio-0.9.0a0+33b2469-py3.6-linux-aarch64.egg/torchaudio/transforms.py", line 583, in forward
    mel_specgram = self.MelSpectrogram(waveform)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torchaudio-0.9.0a0+33b2469-py3.6-linux-aarch64.egg/torchaudio/transforms.py", line 520, in forward
    specgram = self.spectrogram(waveform)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torchaudio-0.9.0a0+33b2469-py3.6-linux-aarch64.egg/torchaudio/transforms.py", line 122, in forward
    self.return_complex,
  File "/usr/local/lib/python3.6/dist-packages/torchaudio-0.9.0a0+33b2469-py3.6-linux-aarch64.egg/torchaudio/functional/functional.py", line 118, in spectrogram
    spec_f = spec_f.reshape(shape[:-1] + spec_f.shape[-2:])
RuntimeError: shape '[1, 154, 2]' is invalid for input of size 79156
[TensorRT] INFO: [MemUsageChange] Init cuBLAS/cuBLASLt: CPU +0, GPU +0, now: CPU 1534, GPU 3511 (MiB)
Could you please give me some advice on how to fix this? Thanks.
I can run it without this error on JetPack 4.5.
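
In case it helps, here is a quick arithmetic check on the two numbers in the RuntimeError. This is only a sketch of my own reading, not a confirmed diagnosis: the interpretation of 257 as a frequency-bin count (n_fft // 2 + 1 with n_fft = 512) and of the trailing 2 as a real/imaginary pair are assumptions, and the variable names are made up for illustration.

```python
# Numbers copied from the RuntimeError in the traceback.
target_shape = (1, 154, 2)
input_size = 79156

# Number of elements the requested shape can hold.
target_size = 1
for dim in target_shape:
    target_size *= dim

# How many times larger is the actual tensor than the requested shape?
ratio, remainder = divmod(input_size, target_size)

# Assumption: if ratio is a frequency-bin count (n_fft // 2 + 1),
# this would be the matching FFT size.
n_fft_guess = 2 * (ratio - 1)

print(target_size, ratio, remainder, n_fft_guess)  # 308 257 0 512
```

The tensor is exactly 257 times larger than the shape requested by the reshape, so it looks as if one whole dimension is missing from the target shape. If the assumptions above hold, that might point to the torch and torchaudio builds in this container disagreeing about the STFT output layout, but I have not verified that.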