Custom Audio Processing

2024-02-27

Feature Overview

Custom audio processing is typically used to remove interference from speech. Because the SDK already applies echo cancellation, noise suppression, and other processing to the captured raw audio data, developers usually do not need to process it again.

To implement special features such as a voice changer or voice beautification, by processing audio data after it is captured or before remote audio data is rendered, follow this document.

Notes

Custom audio processing operates on audio data that has already passed through 3A processing (AEC, acoustic echo cancellation; AGC, automatic gain control; ANS, noise suppression):

To process raw data, first call the enableAEC, enableAGC, and enableANS interfaces to disable 3A processing. If audio effects such as voice changer, reverb, or stereo are enabled (they are disabled by default), disable them as well before obtaining raw audio data.
To obtain and process both the raw data and the 3A-processed audio data, refer to Custom Audio Capture and Rendering.

Prerequisites

Before custom audio processing, please ensure:

You have created a project in the ZEGOCLOUD Console and applied for a valid AppID and AppSign. For details, please refer to Console - Project Information.
You have integrated the ZEGO Express SDK in your project and implemented basic audio and video streaming functionality. For details, please refer to Quick Start - Integration and Quick Start - Implementation.

Usage Steps

1 Create SDK engine

Call the createEngine interface to create an SDK engine instance. For details, please refer to "Create Engine" in Quick Start - Implementation.

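ZegoEngineProfile profile = new ZegoEngineProfile();
// AppID and AppSign obtained from the ZEGOCLOUD Console
profile.appID = appID;
profile.appSign = appSign;
profile.scenario = ZegoScenario.DEFAULT;
profile.application = getApplication();
// The second argument is the event handler; null is passed here
ZegoExpressEngine.createEngine(profile, null);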

2 Set audio custom processing handler and implement callback methods

Call the setCustomAudioProcessHandler interface to set the custom audio processing handler, and implement its two callbacks: onProcessCapturedAudioData, which delivers locally captured PCM audio frames, and onProcessRemoteAudioData, which delivers PCM audio frames of the remote streams being played. Process the data directly inside each callback to modify the audio that is published or played.

ZegoExpressEngine.getEngine().setCustomAudioProcessHandler(new IZegoCustomAudioProcessHandler() {
    @Override
    public void onProcessCapturedAudioData(ByteBuffer data, int dataLength, ZegoAudioFrameParam param, double timestamp) {
        // Process the locally captured PCM audio frame in data here
    }

    @Override
    public void onProcessRemoteAudioData(ByteBuffer data, int dataLength, ZegoAudioFrameParam param, String streamID, double timestamp) {
        // Process the PCM audio frame of the remote stream identified by streamID here
    }
});
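
As an illustration of what a callback body might do, the sketch below applies a fixed gain to a PCM buffer in place. PcmGain is a hypothetical helper, not part of the SDK. It assumes the frames are 16-bit signed little-endian PCM and that only the first dataLength bytes are valid; neither is stated in this document, so verify both against the SDK API reference before relying on it.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper for illustration only; not part of the ZEGO SDK. */
final class PcmGain {
    private PcmGain() {}

    /**
     * Scales 16-bit signed PCM samples in place and clamps the result
     * so loud samples saturate instead of wrapping around.
     * Assumption: samples are little-endian and occupy the first dataLength bytes.
     */
    static void apply(ByteBuffer data, int dataLength, float gain) {
        // Use a duplicate so the caller's position, limit, and byte order stay unchanged.
        ByteBuffer view = data.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        view.clear(); // position = 0, limit = capacity; the content is not erased
        // Only process whole 2-byte samples that fit inside the buffer.
        int end = Math.min(dataLength, view.capacity()) & ~1;
        for (int i = 0; i < end; i += 2) {
            int scaled = Math.round(view.getShort(i) * gain);
            int clamped = Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, scaled));
            view.putShort(i, (short) clamped);
        }
    }
}
```

Inside onProcessCapturedAudioData, a call such as PcmGain.apply(data, dataLength, 0.5f) would then halve the amplitude of the captured frame.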

3 Custom audio processing

Before you start publishing a stream or start the local preview, call the enableCustomAudioCaptureProcessing interface to enable custom processing of locally captured audio. Once enabled, the onProcessCapturedAudioData callback delivers locally captured audio frames, which you can modify.
Before you start playing a stream, call the enableCustomAudioRemoteProcessing interface to enable custom processing of remote stream audio. Once enabled, the onProcessRemoteAudioData callback delivers audio frames of the remote stream, which you can modify.

// Format of the audio frames delivered to the processing callbacks
ZegoCustomAudioProcessConfig zegoCustomAudioProcessConfig = new ZegoCustomAudioProcessConfig();
zegoCustomAudioProcessConfig.channel = ZegoAudioChannel.MONO;
zegoCustomAudioProcessConfig.sampleRate = ZegoAudioSampleRate.ZEGO_AUDIO_SAMPLE_RATE_16K;
zegoCustomAudioProcessConfig.samples = 0;
// Enable processing of locally captured audio (call before publishing or previewing)
ZegoExpressEngine.getEngine().enableCustomAudioCaptureProcessing(true, zegoCustomAudioProcessConfig);
// Enable processing of remote stream audio (call before playing a stream)
ZegoExpressEngine.getEngine().enableCustomAudioRemoteProcessing(true, zegoCustomAudioProcessConfig);
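
When sizing buffers for a processing routine, it can help to relate the configured sample rate and channel count to the dataLength seen in the callbacks. The sketch below shows that arithmetic. PcmFrameMath is a hypothetical helper, not part of the SDK. It assumes 16-bit PCM and a frame sample count that is counted per channel; neither assumption is stated in this document.

```java
/** Hypothetical helper for illustration only; not part of the ZEGO SDK. */
final class PcmFrameMath {
    /** Assumption: 16-bit PCM, so each sample occupies 2 bytes. */
    static final int BYTES_PER_SAMPLE = 2;

    private PcmFrameMath() {}

    /** Bytes in one frame that holds samplesPerChannel samples for each channel. */
    static int frameBytes(int samplesPerChannel, int channels) {
        return samplesPerChannel * channels * BYTES_PER_SAMPLE;
    }

    /** Duration in milliseconds of a buffer containing dataLength bytes. */
    static double frameMillis(int dataLength, int sampleRateHz, int channels) {
        int samplesPerChannel = dataLength / (channels * BYTES_PER_SAMPLE);
        return 1000.0 * samplesPerChannel / sampleRateHz;
    }
}
```

Under these assumptions, a 10 ms mono frame at 16 kHz holds 160 samples, so PcmFrameMath.frameBytes(160, 1) gives 320 bytes.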
