#Title# CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle)
Greetings, everyone.
I installed the PyTorch wheel files from this link provided by the moderator:
PyTorch for Jetson
I am using a Jetson AGX Orin 64GB with JetPack 6, so I have:
pytorch-wpe 0.0.1
torch 2.3.0
torch-complex 0.4.4
torchaudio 2.3.0+952ea74
torchvision 0.18.0a0+6043bc2
Running apt-cache show nvidia-jetpack, I get:
apt-cache show nvidia-jetpack
Package: nvidia-jetpack
Source: nvidia-jetpack (6.2)
Version: 6.2+b77
Architecture: arm64
Maintainer: NVIDIA Corporation
Installed-Size: 194
Depends: nvidia-jetpack-runtime (= 6.2+b77), nvidia-jetpack-dev (= 6.2+b77)
Homepage: Jetson - Embedded AI Computing Platform | NVIDIA Developer
Priority: standard
Section: metapackages
Filename: pool/main/n/nvidia-jetpack/nvidia-jetpack_6.2+b77_arm64.deb
Size: 29298
SHA256: 70553d4b5a802057f9436677ef8ce255db386fd3b5d24ff2c0a8ec0e485c59cd
SHA1: 9deab64d12eef0e788471e05856c84bf2a0cf6e6
MD5sum: 4db65dc36434fe1f84176843384aee23
Description: NVIDIA Jetpack Meta Package
Description-md5: ad1462289bdbc54909ae109d1d32c0a8
Package: nvidia-jetpack
Source: nvidia-jetpack (6.1)
Version: 6.1+b123
Architecture: arm64
Maintainer: NVIDIA Corporation
Installed-Size: 194
Depends: nvidia-jetpack-runtime (= 6.1+b123), nvidia-jetpack-dev (= 6.1+b123)
Homepage: Jetson - Embedded AI Computing Platform | NVIDIA Developer
Priority: standard
Section: metapackages
Filename: pool/main/n/nvidia-jetpack/nvidia-jetpack_6.1+b123_arm64.deb
Size: 29312
SHA256: b6475a6108aeabc5b16af7c102162b7c46c36361239fef6293535d05ee2c2929
SHA1: f0984a6272c8f3a70ae14cb2ca6716b8c1a09543
MD5sum: a167745e1d88a8d7597454c8003fa9a4
Description: NVIDIA Jetpack Meta Package
Description-md5: ad1462289bdbc54909ae109d1d32c0a8
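For reference, here is a minimal sanity check, independent of FunASR, to confirm that the wheel was built with CUDA support and that a cuBLAS handle can be created at all; a small matmul is enough to make PyTorch call cublasCreate() internally, so this should reproduce the failure in isolation if cuBLAS itself is the problem (just a diagnostic sketch):

import torch

# Confirm the wheel was built with CUDA support and can see the Orin GPU.
print(torch.__version__)              # e.g. 2.3.0
print(torch.cuda.is_available())      # expected: True on JetPack 6
print(torch.cuda.get_device_name(0))  # expected: the Orin integrated GPU

# A small matmul forces PyTorch to create a cuBLAS handle internally,
# so this line fails in isolation if cublasCreate() cannot allocate.
x = torch.randn(64, 64, device="cuda")
y = x @ x
torch.cuda.synchronize()
print("cuBLAS handle created and matmul succeeded")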
I am trying the FunASR project, and my code is as follows:
from funasr import AutoModel
import soundfile
import os

chunk_size = [0, 10, 5]  # [0, 10, 5] = 600 ms, [0, 8, 4] = 480 ms
encoder_chunk_look_back = 4  # number of chunks to look back for encoder self-attention
decoder_chunk_look_back = 1  # number of encoder chunks to look back for decoder cross-attention

model = AutoModel(model="paraformer-zh-streaming")

wav_file = os.path.join(model.model_path, "example/asr_example.wav")
speech, sample_rate = soundfile.read(wav_file)
chunk_stride = chunk_size[1] * 960  # 600 ms at 16 kHz

cache = {}
total_chunk_num = int((len(speech) - 1) / chunk_stride + 1)
for i in range(total_chunk_num):
    speech_chunk = speech[i * chunk_stride:(i + 1) * chunk_stride]
    is_final = i == total_chunk_num - 1
    res = model.generate(
        input=speech_chunk,
        cache=cache,
        is_final=is_final,
        chunk_size=chunk_size,
        encoder_chunk_look_back=encoder_chunk_look_back,
        decoder_chunk_look_back=decoder_chunk_look_back,
    )
    print(res)
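As an isolation step (not part of the original script), the FunASR documentation describes a device argument on AutoModel; pinning inference to the CPU should rule out the model code and leave only the CUDA/cuBLAS setup as the suspect (a sketch, assuming the device argument behaves as documented):

from funasr import AutoModel

# Isolation step: per the FunASR docs, device selects where inference runs.
# If CPU inference succeeds, the failure lies in the CUDA/cuBLAS setup,
# not in the model or the chunking loop.
model_cpu = AutoModel(model="paraformer-zh-streaming", device="cpu")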
You can find the source code and the project at modelscope/FunASR on GitHub: a fundamental end-to-end speech recognition toolkit and open-source SOTA pretrained models, supporting speech recognition, voice activity detection, text post-processing, etc.
Running the code, I get:
python funasr_speech_recognition_streaming.py
funasr version: 1.2.4.
Check update of funasr, and it would cost few times. You may disable it by set disable_update=True
in AutoModel
You are using the latest version of funasr-1.2.4
Downloading Model to directory: /home/nvidia/.cache/modelscope/hub/models/iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online
2025-02-26 23:38:36,008 - modelscope - WARNING - Using branch: master as version is unstable, use with caution
0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/nvidia/projects/playground/funasr_speech_recognition_streaming.py", line 21, in <module>
res = model.generate(input=speech_chunk, cache=cache, is_final=is_final, chunk_size=chunk_size, encoder_chunk_look_back=encoder_chunk_look_back, decoder_chunk_look_back=decoder_chunk_look_back)
File "/home/nvidia/anaconda3/envs/funasr/lib/python3.10/site-packages/funasr/auto/auto_model.py", line 303, in generate
return self.inference(input, input_len=input_len, **cfg)
File "/home/nvidia/anaconda3/envs/funasr/lib/python3.10/site-packages/funasr/auto/auto_model.py", line 345, in inference
res = model.inference(**batch, **kwargs)
File "/home/nvidia/anaconda3/envs/funasr/lib/python3.10/site-packages/funasr/models/paraformer_streaming/model.py", line 629, in inference
tokens_i = self.generate_chunk(
File "/home/nvidia/anaconda3/envs/funasr/lib/python3.10/site-packages/funasr/models/paraformer_streaming/model.py", line 482, in generate_chunk
encoder_out, encoder_out_lens = self.encode_chunk(
File "/home/nvidia/anaconda3/envs/funasr/lib/python3.10/site-packages/funasr/models/paraformer_streaming/model.py", line 175, in encode_chunk
encoder_out, encoder_out_lens, _ = self.encoder.forward_chunk(
File "/home/nvidia/anaconda3/envs/funasr/lib/python3.10/site-packages/funasr/models/scama/encoder.py", line 480, in forward_chunk
encoder_outs = encoder_layer.forward_chunk(
File "/home/nvidia/anaconda3/envs/funasr/lib/python3.10/site-packages/funasr/models/scama/encoder.py", line 172, in forward_chunk
x, cache = self.self_attn.forward_chunk(x, cache, chunk_size, look_back)
File "/home/nvidia/anaconda3/envs/funasr/lib/python3.10/site-packages/funasr/models/sanm/attention.py", line 327, in forward_chunk
q_h, k_h, v_h, v = self.forward_qkv(x)
File "/home/nvidia/anaconda3/envs/funasr/lib/python3.10/site-packages/funasr/models/sanm/attention.py", line 240, in forward_qkv
q_k_v = self.linear_q_k_v(x)
File "/home/nvidia/anaconda3/envs/funasr/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/nvidia/anaconda3/envs/funasr/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
File "/home/nvidia/anaconda3/envs/funasr/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 116, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling cublasCreate(handle)
After spending some time searching the internet for this issue, I found nothing useful. Could anyone please point out what I am missing?
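In case it helps with the diagnosis: CUBLAS_STATUS_ALLOC_FAILED from cublasCreate() is typically an out-of-memory condition, and on the Orin the GPU shares unified memory with the CPU. A small snippet like the following (using the standard torch.cuda.mem_get_info() API) would show whether device memory is already exhausted right before the failing call:

import torch

# CUBLAS_STATUS_ALLOC_FAILED at cublasCreate() time usually means the
# allocation for the handle's workspace failed, i.e. device memory is
# exhausted. mem_get_info() returns (free, total) in bytes.
free_b, total_b = torch.cuda.mem_get_info()
print(f"free: {free_b / 1024**3:.2f} GiB / total: {total_b / 1024**3:.2f} GiB")
print(torch.cuda.memory_summary())  # allocator-level breakdown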
Thank you very much for your help and kindness.