Custom Audio Capture and Rendering

2024-02-27

Introduction

Custom Audio Capture

Custom audio capture is recommended in the following situations:

You need to take audio from an existing audio stream, an audio file, or a custom capture system and hand it to the SDK for transmission.
You need to apply your own audio effect processing to a PCM input source and then hand the processed audio to the SDK for transmission.

Custom Audio Rendering

If you have your own rendering requirements, for example applying special processing to the raw PCM data before it is played, use the SDK's custom audio rendering feature.

Warning

Audio capture and rendering can be combined in three ways:

Internal capture, internal rendering
Custom capture, custom rendering
Custom capture, internal rendering

Choose the capture and rendering combination that suits your business scenario.

Download Sample Source Code

Please refer to Download Sample Source Code to obtain the source code.

For related source code, please check the files in the "/ZegoExpressExample/Examples/AdvancedAudioProcessing/CustomAudioCaptureAndRendering" directory.

Usage Steps

The following figure shows the API call sequence:

1 Initialize the SDK

Please refer to "Create the engine" in Quick Start - Implementation Process.

2 Enable custom audio capture and rendering

Warning

enableCustomAudioIO takes effect only if it is called before startPublishingStream, startPlayingStream, startPreview, createMediaPlayer, createAudioEffectPlayer, and createRealTimeSequentialDataManager.

Create a ZegoCustomAudioConfig object, set sourceType = ZEGO_AUDIO_SOURCE_TYPE_CUSTOM, and then call enableCustomAudioIO to enable the custom audio IO feature.

Call example:

// Set audio source to custom capture and rendering
ZegoCustomAudioConfig audioConfig;
audioConfig.sourceType = ZEGO_AUDIO_SOURCE_TYPE_CUSTOM;
engine->enableCustomAudioIO(true, &audioConfig);

3 Log in to a room, then publish and play streams

Please refer to "Login room", "Publish stream", and "Play stream" in Quick Start - Implementation Process.

4 Capture audio data

Pass the captured audio data to the engine through sendCustomAudioCaptureAACData or sendCustomAudioCapturePCMData.
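Before handing PCM data to the engine, it helps to know how many bytes one frame occupies. The following is a minimal sketch, assuming 16-bit signed, channel-interleaved PCM; the helper names pcmFrameBytes and makeToneFrame are hypothetical and not part of the SDK, and the sine tone merely stands in for a real capture source.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of bytes in one frame of PCM audio.
// Assumption: 16-bit signed samples, interleaved by channel.
std::size_t pcmFrameBytes(int sampleRateHz, int channels, int durationMs) {
    assert(sampleRateHz > 0 && channels > 0 && durationMs > 0);
    const std::size_t samplesPerChannel =
        static_cast<std::size_t>(sampleRateHz) * durationMs / 1000;
    return samplesPerChannel * channels * sizeof(std::int16_t);
}

// Hypothetical stand-in for a capture source: fills one frame with a sine tone,
// writing the same sample to every channel.
std::vector<std::int16_t> makeToneFrame(int sampleRateHz, int channels,
                                        int durationMs, double toneHz) {
    assert(sampleRateHz > 0 && channels > 0 && durationMs > 0);
    const std::size_t samplesPerChannel =
        static_cast<std::size_t>(sampleRateHz) * durationMs / 1000;
    std::vector<std::int16_t> frame(samplesPerChannel * channels);
    const double twoPi = 6.283185307179586;
    for (std::size_t i = 0; i < samplesPerChannel; ++i) {
        const double t = static_cast<double>(i) / sampleRateHz;
        const auto sample =
            static_cast<std::int16_t>(8000.0 * std::sin(twoPi * toneHz * t));
        for (int c = 0; c < channels; ++c) {
            frame[i * channels + c] = sample;
        }
    }
    return frame;
}
```

In an application, the frame's bytes and its byte length would then be passed to sendCustomAudioCapturePCMData together with frame parameters that describe the same sample rate and channel count; check the API reference for the exact signature.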

5 Render audio data

Call fetchCustomAudioRenderPCMData to get audio data from the engine, and then play that data through a rendering device.

FAQ

When should the custom audio capture and rendering interfaces be called?

- enableCustomAudioIO: Call it before the engine starts, that is, before starting preview, publishing streams, or playing streams.
- sendCustomAudioCaptureAACData / sendCustomAudioCapturePCMData: Call them after starting preview and publishing streams. Data sent before that point is discarded by the SDK.
- fetchCustomAudioRenderPCMData: Call it after starting to play a stream. Data fetched before that point is invalid silent data.

How often should the custom audio capture and rendering interfaces be called?

The best approach is to follow the clock of the physical audio device: call sendCustomAudioCaptureAACData or sendCustomAudioCapturePCMData when the physical capture device delivers data, and call fetchCustomAudioRenderPCMData when the physical rendering device needs data. If your scenario has no physical device to drive the calls, call these interfaces every 10 to 20 ms.
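When no physical device drives the calls, a timer loop can provide the 10 to 20 ms cadence. The following is a minimal sketch under that assumption; runPacedLoop and its onTick callback are hypothetical names, and the callback is where an application would place its call to the send or fetch interface. The loop sleeps until absolute deadlines rather than for a fixed duration, so per-iteration jitter does not accumulate into drift.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Invokes onTick once per interval, `ticks` times in total.
// Deadlines are absolute (start + k * interval), so a late wake-up in one
// iteration does not push back every later iteration.
// Returns the number of times onTick was invoked.
int runPacedLoop(int ticks, std::chrono::milliseconds interval,
                 const std::function<void(int)>& onTick) {
    assert(ticks >= 0);
    auto next = std::chrono::steady_clock::now();
    int calls = 0;
    for (int i = 0; i < ticks; ++i) {
        onTick(i);  // e.g. send one captured frame or fetch one frame to render
        ++calls;
        next += interval;
        std::this_thread::sleep_until(next);
    }
    return calls;
}
```

A real application would typically run such a loop on a dedicated thread and stop it when publishing or playing ends.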

When fetchCustomAudioRenderPCMData is called and the SDK holds less data than "dataLength", how does the SDK handle it?

Provided that "param" is filled in correctly, the SDK returns the data it has and fills the remaining part of "dataLength" with silent data.
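The padding behavior can be pictured with a small model. The following sketch is only an illustration of the idea, not the SDK's implementation: fetchWithSilencePadding and its buffered vector are hypothetical, and silence is represented as zero bytes, which assumes signed PCM samples.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative model only: copies up to dataLength bytes of buffered audio
// into `out`, fills the remainder of `out` with zero bytes (silence), and
// removes the copied bytes from `buffered`.
// Returns how many bytes came from real audio.
std::size_t fetchWithSilencePadding(std::vector<unsigned char>& buffered,
                                    unsigned char* out,
                                    std::size_t dataLength) {
    assert(out != nullptr);
    const std::size_t available = std::min(buffered.size(), dataLength);
    const auto end = buffered.begin() + static_cast<std::ptrdiff_t>(available);
    std::copy(buffered.begin(), end, out);
    std::fill(out + available, out + dataLength, static_cast<unsigned char>(0));
    buffered.erase(buffered.begin(), end);
    return available;
}
```

Because the caller always receives a full "dataLength" bytes, a rendering device can keep playing without gaps in its buffer even when the stream momentarily has too little data.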
