Streaming audio data format question

lifeng.ferdin · May 31, 2024, 9:46am

I converted an int16 sound file to float32 bytes in Python, but when I stream the 4-byte float32 data to the Audio2Face tool, it does not play as expected.
My question is: what exactly is the stream data format? The only description I found is a comment in the code: “audio_data: bytes, containing audio data for the whole track, where each sample is encoded as 4 bytes (float32)”.
I binary-compared my sound files: the sample values are equal and in the same order, and the only difference is the data type, int16 in one and float32 in the other. I also tried normalizing the float32 data, but that did not help.
The attached zip contains two files: sound_tts_sf.wav (int16), which plays well, and a float32 version, which never plays correctly.
Can anyone give me any suggestions?

samples.zip (25.6 KB)

gammatrix5 · May 31, 2024, 2:09pm

Hi

I have successfully streamed audio to A2F. You need to use the gRPC API.
There are two approaches: chunk by chunk, or long-running streaming.
Either way works. The audio data supplied to the stream must be mono, in AV_SAMPLE_FMT_FLTP format (FFmpeg's planar 32-bit float sample format), and the sample rate must be specified in the gRPC request.
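
For reference, here is a minimal sketch of turning int16 PCM into float32 bytes using only the standard library. It assumes little-endian samples and that the player expects values scaled to the range [-1.0, 1.0); neither is confirmed in this thread, and the function name `int16_to_float32_bytes` is only illustrative.

```python
import struct


def int16_to_float32_bytes(pcm16: bytes) -> bytes:
    """Convert little-endian int16 PCM to little-endian float32 bytes.

    Assumption: the receiver wants samples scaled to [-1.0, 1.0), so each
    int16 value is divided by 32768 instead of being cast as-is.
    """
    if len(pcm16) % 2 != 0:
        raise ValueError("int16 PCM length must be a multiple of 2 bytes")
    count = len(pcm16) // 2
    # "<" forces little-endian regardless of the host byte order.
    samples = struct.unpack(f"<{count}h", pcm16)
    return struct.pack(f"<{count}f", *(s / 32768.0 for s in samples))
```

A plain type cast without the division would keep values such as 16384.0, which is far outside [-1.0, 1.0]. If that range is what the player expects, samples like that would most likely be clipped and sound like noise.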

syntax = "proto3";

package nvidia.audio2face;

service Audio2Face {
    rpc PushAudio(PushAudioRequest) returns (PushAudioResponse) {}
    rpc PushAudioStream(stream PushAudioStreamRequest) returns (PushAudioStreamResponse) {}
}

message PushAudioRequest {
    string instance_name = 1;
    int32 samplerate = 2;
    bytes audio_data = 3;
    bool block_until_playback_is_finished = 4;
}

message PushAudioResponse {
    bool success = 1;
    string message = 2;
}

message PushAudioStreamRequest {
    oneof streaming_request {
        PushAudioRequestStart start_marker = 1;
        bytes audio_data = 2;
    }
}

message PushAudioRequestStart {
    string instance_name = 1;
    int32 samplerate = 2;
    bool block_until_playback_is_finished = 3;
}

message PushAudioStreamResponse {
    bool success = 1;
    string message = 2;
}
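
For the chunk-by-chunk approach, the float32 byte string has to be split on sample boundaries (4 bytes each), otherwise a sample gets cut in half. A rough sketch of that splitting is below; `iter_audio_chunks` and the 0.1 second default chunk length are hypothetical choices, not taken from the A2F sample client.

```python
from typing import Iterator

BYTES_PER_SAMPLE = 4  # one mono float32 sample


def iter_audio_chunks(
    audio_data: bytes, samplerate: int, chunk_seconds: float = 0.1
) -> Iterator[bytes]:
    """Yield consecutive chunks of float32 audio, aligned to whole samples."""
    if len(audio_data) % BYTES_PER_SAMPLE != 0:
        raise ValueError("float32 audio length must be a multiple of 4 bytes")
    samples_per_chunk = max(1, int(samplerate * chunk_seconds))
    step = samples_per_chunk * BYTES_PER_SAMPLE
    for offset in range(0, len(audio_data), step):
        # The final chunk may be shorter than the others.
        yield audio_data[offset:offset + step]
```

With stubs generated from the proto above, the first PushAudioStreamRequest would carry start_marker (instance name and sample rate) and each later request would carry one of these chunks in audio_data.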

lifeng.ferdin · June 3, 2024, 3:23am

I use the test_client.py demo code with the sample rate set to 24000 (24 kHz in my case), and the sound always plays back distorted and noisy.
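
One thing that may be worth ruling out is a mismatch between the file's real parameters and what is sent in the request. The sketch below reads the header of a WAV file with Python's standard `wave` module; `describe_wav` is a hypothetical helper name. Note that `wave` reads integer PCM files, so it applies to the int16 file rather than a float32 one.

```python
import wave


def describe_wav(source) -> dict:
    """Return basic header fields of a PCM WAV file (path or file object)."""
    with wave.open(source, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "samplerate": wav.getframerate(),
            "frames": wav.getnframes(),
        }
```

If the channel count is not 1 or the sample rate is not the value passed in the request, the audio would need to be downmixed or the request adjusted before streaming.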
