Custom Audio Capture and Rendering
Feature Overview
Custom audio capture
Custom audio capture is recommended in the following scenarios:
- Developers need to feed audio from existing audio streams, audio files, or a custom capture system to the SDK for transmission.
- Developers need to apply their own audio-effect processing to a PCM input source and hand the processed audio to the SDK for transmission.
Custom audio rendering
When developers have their own rendering requirements, such as special processing of the raw PCM data before playback, it is recommended to use the SDK's custom audio rendering feature.
Audio capture and rendering can be combined in three ways:
- Internal capture, internal rendering
- Custom capture, custom rendering
- Custom capture, internal rendering
Please choose the appropriate audio capture and rendering methods according to your business scenarios.
Example Source Code Download
Please refer to Download Example Source Code to get the source code.
For related source code, please check the files in the "/ZegoExpressExample/AdvancedAudioProcessing/src/main/java/im/zego/advancedaudioprocessing/customaudiocaptureandrendering" directory.
Prerequisites
Before implementing custom audio capture and rendering, please ensure:
- You have created a project in the ZEGOCLOUD Console and applied for a valid AppID and AppSign. For details, please refer to Console - Project Information.
- You have integrated the ZEGO Express SDK in your project and implemented basic audio and video streaming functionality. For details, please refer to Quick Start - Integration and Quick Start - Implementation.
Usage Steps
The following figure shows the API call sequence:
1 Initialize SDK
Please refer to "Create Engine" in Quick Start - Implementation.
2 Enable custom audio capture and rendering
enableCustomAudioIO takes effect only when called before startPublishingStream, startPlayingStream, startPreview, createMediaPlayer, createAudioEffectPlayer, and createRealTimeSequentialDataManager.
Create a ZegoCustomAudioConfig, set its sourceType to ZegoAudioSourceType.CUSTOM, and then pass it to the enableCustomAudioIO interface to enable custom audio IO.
// Set the audio source to custom capture and rendering
ZegoCustomAudioConfig config = new ZegoCustomAudioConfig();
config.sourceType = ZegoAudioSourceType.CUSTOM;
engine.enableCustomAudioIO(true, config);
3 Log in to the room, then publish/play streams
Please refer to "Login Room", "Publish Stream", and "Play Stream" in Quick Start - Implementation.
4 Capture audio data
Open the audio capture device, and pass the captured audio data to the engine through sendCustomAudioCaptureAACData or sendCustomAudioCapturePCMData.
The ByteBuffer passed to sendCustomAudioCaptureAACData or sendCustomAudioCapturePCMData must be a direct buffer, i.e. created with ByteBuffer.allocateDirect (buffers from ByteBuffer.allocate are not direct); otherwise the data cannot be used.
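A minimal sketch of preparing a capture frame under assumed parameters (16 kHz, mono, 16-bit PCM, 20 ms frames; in a real app the buffer would be filled from AudioRecord or another capture source). The `engine` and `param` names, and the format values, are assumptions for illustration; check the SDK reference for the exact sendCustomAudioCapturePCMData signature:

```java
import java.nio.ByteBuffer;

public class CaptureBufferDemo {
    // Assumed capture format: 16 kHz, mono, 16-bit PCM, 20 ms per frame.
    static final int SAMPLE_RATE = 16000;
    static final int CHANNELS = 1;
    static final int BYTES_PER_SAMPLE = 2;
    static final int FRAME_MS = 20;

    // Bytes in one 20 ms frame: 16000 * 20 / 1000 * 1 * 2 = 640.
    static int frameBytes() {
        return SAMPLE_RATE * FRAME_MS / 1000 * CHANNELS * BYTES_PER_SAMPLE;
    }

    public static void main(String[] args) {
        int dataLength = frameBytes();
        // The SDK requires a direct buffer; ByteBuffer.allocate(...) would not work.
        ByteBuffer buffer = ByteBuffer.allocateDirect(dataLength);
        System.out.println(buffer.isDirect() + " " + dataLength);
        // After filling `buffer` with captured PCM (e.g. from AudioRecord.read),
        // hand it to the engine; `param` describes the sample rate / channel layout:
        // engine.sendCustomAudioCapturePCMData(buffer, dataLength, param);
    }
}
```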
5 Render audio data
Use fetchCustomAudioRenderPCMData to get audio data from the engine, and after getting the audio data, play it through the rendering device.
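The render side can be sketched the same way: size a direct buffer for one frame, fetch into it, then write it to the playback device. The format (48 kHz, stereo, 16-bit PCM, 10 ms frames) and the `engine`, `param`, and `audioTrack` names are assumptions for illustration:

```java
import java.nio.ByteBuffer;

public class RenderFetchDemo {
    // Bytes per fetched frame for a given PCM format.
    static int frameBytes(int sampleRate, int channels, int bytesPerSample, int frameMs) {
        return sampleRate * frameMs / 1000 * channels * bytesPerSample;
    }

    public static void main(String[] args) {
        // Assumed render format: 48 kHz, stereo, 16-bit PCM, fetched every 10 ms.
        int dataLength = frameBytes(48000, 2, 2, 10);
        // Like the capture side, the fetch buffer must be direct.
        ByteBuffer buffer = ByteBuffer.allocateDirect(dataLength);
        System.out.println(dataLength);
        // In a real app, drive this from the playback device (e.g. AudioTrack), then:
        // engine.fetchCustomAudioRenderPCMData(buffer, dataLength, param);
        // audioTrack.write(buffer, dataLength, AudioTrack.WRITE_BLOCKING);
    }
}
```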
FAQ
- When should the custom audio capture and rendering related interfaces be called?
- enableCustomAudioIO: Should be called before the engine starts, that is, before starting preview, publishing and playing streams.
- sendCustomAudioCaptureAACData/sendCustomAudioCapturePCMData: Should be called after starting preview and publishing stream. If called before starting preview and publishing stream, the SDK will directly discard the received data.
- fetchCustomAudioRenderPCMData: Should be called after starting playing stream. Data obtained before starting playing stream is invalid mute data.
- How often should the custom audio capture and rendering interfaces be called?
The optimal approach is to drive the calls from the physical audio device's clock: when the capture device produces data, call sendCustomAudioCaptureAACData or sendCustomAudioCapturePCMData; when the rendering device needs data, call fetchCustomAudioRenderPCMData.
If there is no specific physical device to drive in the actual scenario, it is recommended to call the above interfaces every 10 ms to 20 ms.
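When no device clock is available, the 10–20 ms cadence above can be driven by a timer. A minimal, runnable sketch (the actual send/fetch calls are left as comments, since they require a live engine):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class AudioPumpDemo {
    // Fire a task every `periodMs` milliseconds; returns true once
    // `ticksWanted` ticks have fired (within a 1-second safety timeout).
    static boolean runPump(int ticksWanted, long periodMs) throws Exception {
        ScheduledExecutorService pump = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch ticks = new CountDownLatch(ticksWanted);
        // Each tick is where a real app would call
        // sendCustomAudioCapturePCMData / fetchCustomAudioRenderPCMData.
        pump.scheduleAtFixedRate(ticks::countDown, 0, periodMs, TimeUnit.MILLISECONDS);
        boolean ok = ticks.await(1, TimeUnit.SECONDS);
        pump.shutdownNow();
        return ok;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runPump(3, 20));
    }
}
```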
- When calling fetchCustomAudioRenderPCMData, how does the SDK handle the case where its internal data is less than "dataLength"?
As long as "param" is filled in correctly, the SDK returns whatever data it has and pads the remaining portion of "dataLength" with mute (silent) data.
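This padding behavior can be modeled as follows. This is a hypothetical illustration of the observable result, not the SDK's actual internals; `fetch` is a made-up helper name:

```java
import java.util.Arrays;

public class SilencePaddingDemo {
    // Hypothetical model of the SDK's behavior: copy the bytes that are
    // available, leave the rest of the requested length as zeros (PCM silence).
    static byte[] fetch(byte[] available, int dataLength) {
        byte[] out = new byte[dataLength]; // Java zero-fills new arrays
        System.arraycopy(available, 0, out, 0, Math.min(available.length, dataLength));
        return out;
    }

    public static void main(String[] args) {
        byte[] have = {1, 2, 3, 4};        // only 4 bytes available
        byte[] frame = fetch(have, 8);     // but 8 bytes requested
        System.out.println(Arrays.toString(frame)); // [1, 2, 3, 4, 0, 0, 0, 0]
    }
}
```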
- On an Android device with an external microphone using custom audio capture and rendering, how should the Express SDK capture audio if the user switches to Bluetooth headphones midway?
The Express SDK does not automatically switch to internal capture, so developers need to handle this in business logic by stopping the external capture. The mobile SDK selects the capture device based on the system's current audio route; if the route is Bluetooth, it captures from the Bluetooth device.
