Scenario-based AI Noise Reduction

2024-07-27

Scenario-based AI noise reduction identifies the current audio scenario in real time and adjusts the AI noise reduction strategy accordingly, to deliver the best balance of noise reduction and sound quality. It currently supports two common scenarios:

Call scenarios: all sounds other than the human voice are identified as noise and removed. In addition to eliminating steady-state noise (for details, please refer to Audio 3A Processing), it effectively eliminates non-steady-state noise while keeping the human voice at high fidelity. Typical noises include mouse clicks, keyboard typing, tapping, air conditioning, kitchen dishes, noisy restaurants, wind, coughing, blowing, and other non-voice sounds, as well as voice reverberation in small rooms.
Music scenarios: music is detected on the microphone input in real time. In sound card, impromptu singing, or near-field music scenarios, the noise reduction level is adjusted automatically to preserve high-fidelity music quality.

Warning

Before using the AI noise reduction feature, please contact ZEGO Technical Support to obtain a specially packaged SDK.
Starting from version 3.0.0, ZEGO Express SDK supports intelligent identification of music scenarios. In music scenarios, AI noise reduction automatically lowers the noise reduction level to improve sound quality. To use this feature, please contact ZEGO Technical Support for a specially packaged SDK and the required configuration.

Functional Advantages

Can eliminate 80% of noise.
Low latency.
Low memory usage, roughly on par with traditional noise reduction.
Low CPU usage.
Music scenario recognition accuracy reaches 99%.

Usage Scenarios

This feature is suitable for 1v1 or multi-person audio/video call scenarios such as voice rooms, meetings, voice gaming, etc., as well as live streaming or online KTV scenarios with sound cards, impromptu singing, or near-field music.

Warning

Music scenario identification requires turning on the music detection switch. Please contact ZEGO Technical Support to configure and enable the music detection feature.

Removable Noises

Developers can use this feature to eliminate the following noises:

Scenario	Typical Noises
Meeting Room	Keyboard typing, table tapping
Office	Keyboard typing, nearby colleagues talking
Vehicle	Whistle sounds, whooshing of passing cars, in-car music, rain and wiper sounds
Internet Cafe	Keyboard typing, voices of nearby people
Coffee Shop	Chair dragging, nearby people talking, sharp collision sounds

Effect Demonstration

Office

Original audio includes: mouse clicking sounds, keyboard sounds, clapping sounds, friction sounds, office noise, air conditioning sounds, etc.

After AI noise reduction:

Public Place

Original audio includes: rain sounds, tram sounds, cooking sounds, car whooshing sounds, etc.

After AI noise reduction:

Music Scenario

Original audio:

Conventional AI noise reduction: removes the noise but severely degrades the music.

After scenario-based AI noise reduction: removes the noise while preserving music fidelity.

Prerequisites

Before implementing AI noise reduction functionality, please ensure:

You have created a project in the ZEGO Console and applied for a valid AppID and AppSign. For details, please refer to Console - Project Information.
You have integrated the ZEGO Express SDK in your project and implemented basic audio/video streaming functions. For details, please refer to Quick Start - Integration and Quick Start - Implementation.

Usage Steps

Developers can complete the related settings for AI noise reduction according to the following steps:

Contact ZEGO Technical Support to configure and enable the music detection feature. If it is already enabled, skip this step.
Initialize the SDK and log in to the room. For the specific process, please refer to "Create Engine" and "Login Room" in the Implement Video Call documentation.
Call the enableANS interface to enable noise suppression, which makes the human voice clearer.
After enabling noise suppression, call the setANSMode interface and set the ANS mode to "ZegoANSMode.AI" to turn on AI noise reduction.

// Enable ANS (noise suppression)
ZegoExpressEngine.instance.enableANS(true);
// Switch the ANS mode to AI noise reduction.
// Note: after the ANS mode is set to ZegoANSMode.AI, ZEGO Express SDK forcibly disables transient noise suppression (enableTransientANS).
ZegoExpressEngine.instance.setANSMode(ZegoANSMode.AI);
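
The usage steps can also be sketched end to end. The following is a minimal sketch, not an official sample: it assumes the Flutter SDK (the zego_express_engine package) and a build in which AI noise reduction has already been enabled by ZEGO Technical Support. The helper name startCallWithAiNoiseReduction and its parameters are hypothetical placeholders; the userID is reused as the user name purely for brevity.

```dart
import 'package:zego_express_engine/zego_express_engine.dart';

/// Hypothetical helper (not part of the SDK) that runs the usage steps in order.
Future<void> startCallWithAiNoiseReduction({
  required int appID, // AppID from the ZEGO Console
  required String appSign, // AppSign from the ZEGO Console
  required String roomID, // placeholder room identifier
  required String userID, // placeholder user identifier
}) async {
  // Create the engine and log in to the room.
  await ZegoExpressEngine.createEngineWithProfile(
    ZegoEngineProfile(appID, ZegoScenario.Default, appSign: appSign),
  );
  await ZegoExpressEngine.instance.loginRoom(roomID, ZegoUser(userID, userID));

  // Enable noise suppression first.
  await ZegoExpressEngine.instance.enableANS(true);

  // Then switch the ANS mode to AI. As noted above, the SDK forcibly
  // disables transient noise suppression in this mode.
  await ZegoExpressEngine.instance.setANSMode(ZegoANSMode.AI);
}
```

Awaiting each call keeps the order explicit: noise suppression is enabled before the mode is changed, matching the steps listed above.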
