Scenario-Based AI Noise Reduction
Scenario-based AI noise reduction identifies the current audio scenario automatically in real time and adjusts the AI noise reduction strategy accordingly, to deliver the best noise reduction and audio quality. Two common noise reduction scenarios are currently supported:
- Call scenarios: all sounds other than the human voice are treated as noise and eliminated. In addition to steady-state noise, non-steady-state noise is effectively eliminated while the human voice is kept at high fidelity. Typical noises include mouse, keyboard, tapping, air conditioning, kitchen dishes, noisy restaurants, environmental wind, coughing, blowing, and other non-voice sounds, as well as human voice reverberation in small rooms.
- Music scenarios: the noise reduction effect is adjusted automatically to preserve music audio quality. Music is detected on the microphone input in real time. In sound card, singing accompaniment, or near-field music scenarios, the noise reduction level is adjusted automatically to ensure high-fidelity music audio quality.
- Before using the AI noise reduction feature, contact ZEGO technical support for special packaging.
- Starting from version 3.0.0, ZEGO Express SDK supports intelligent recognition of music scenarios. In music scenarios, AI noise reduction automatically lowers the noise reduction level to improve audio quality. To use this feature, contact ZEGO technical support for special packaging and configuration.
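As a rough illustration of the scenario-based idea described above, the sketch below maps a music-detection result to a noise reduction strength. This is not ZEGO's actual algorithm, which runs inside the SDK and is not documented here. All names (`Scenario`, `classifyScenario`, `pickSuppressionStrength`) and all numeric values are hypothetical.

```cpp
#include <cassert>

// Hypothetical scenario labels; the real SDK detects scenarios internally.
enum class Scenario { Call, Music };

// Hypothetical classifier step: turn a music probability in [0, 1] into a
// scenario label. The 0.5 threshold is an illustrative assumption.
inline Scenario classifyScenario(double musicProbability, double threshold = 0.5) {
    assert(musicProbability >= 0.0 && musicProbability <= 1.0);
    return musicProbability >= threshold ? Scenario::Music : Scenario::Call;
}

// Illustrative strengths only: suppress aggressively in calls, and gently
// when music is detected so that the music itself is not damaged.
inline double pickSuppressionStrength(Scenario scenario) {
    return scenario == Scenario::Music ? 0.2 : 1.0;
}
```

In an application none of this is needed: the SDK performs the detection and adjustment itself once the feature is enabled. The sketch only makes the "detect, then adapt" idea concrete.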
Functional Advantages
- Can eliminate 80% of noise.
- Low latency.
- Low memory usage, roughly the same as traditional noise reduction.
- Low CPU usage.
- Music scenario recognition accuracy rate reaches 99%.
Usage Scenarios
This feature is suitable for voice rooms, conferences, gaming voice chat, and other one-on-one or multi-person audio/video call scenarios, as well as live streaming or online KTV scenarios that involve sound cards, singing accompaniment, or near-field music.
Music scenario recognition requires the music detection switch to be turned on.
Eliminable Noise
Developers can use this feature to eliminate the following noise:
| Scenario | Some Typical Noises |
|---|---|
| Meeting Room | |
| Office | |
| Vehicle | |
| Internet Cafe | |
| Coffee Shop | |
Effect Demonstration
Office
The original audio includes: mouse click sounds, keyboard sounds, clapping sounds, friction sounds, office noise, air conditioning sounds, etc.
After AI noise reduction:
Public Place
The original audio includes: rain sounds, tram sounds, cooking sounds, car whooshing sounds, etc.
After AI noise reduction:
Music Scenario
Original audio:
Conventional AI noise reduction: Eliminates noise, but causes significant damage to music.
After scenario-based AI noise reduction: Eliminates noise, music quality fidelity is preserved.
Prerequisites
Before implementing scenario-based AI noise reduction functionality, please ensure:
- You have created a project in the ZEGOCLOUD Console and applied for a valid AppID and AppSign. For details, please refer to Console - Project Information.
- You have integrated ZEGO Express SDK in the project and implemented basic audio/video publishing and playing functions. For details, please refer to Quick Start - Integration and Quick Start - Implementation Flow.
Usage Steps
Developers can complete the AI noise reduction settings with the following steps:
- Contact ZEGO technical support to configure and enable the music detection feature. If it is already enabled, skip this step.
- Initialize the SDK and log in to the room. For the specific process, refer to "Create Engine" and "Login Room" in the video call implementation documentation.
- Call the enableANS interface to enable noise suppression. With this feature enabled, the human voice sounds clearer.
- After enabling noise suppression, call the setANSMode interface to set the ANS mode and enable the AI noise reduction feature. The following shows some AI noise reduction modes. For more modes, refer to ZegoANSMode.
| AI Noise Reduction Mode | Applicable Scenarios |
|---|---|
| ZEGO_ANS_MODE_AI | Lightweight mode. Still delivers good noise reduction with extremely low power consumption and package size increment. Suitable for indoor noise environments and relatively comfortable domestic regions. |
| ZEGO_ANS_MODE_AI_BALANCED | Balanced mode. Comprehensively eliminates noise while keeping the human voice lossless, at the cost of slightly higher power consumption. Suitable for complex call environments, such as outdoor busy markets and transportation, as well as regions with serious noise interference. |
| ZEGO_ANS_MODE_AI_LOW_LATENCY | Low latency mode. Still maintains clean noise reduction and high-fidelity human voice audio quality at 10 ms latency. Suitable for game voice, game chat, real-time singing, and other latency-sensitive scenarios. |
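One possible way to choose among the modes listed above is a small selection helper, sketched below. `DemoANSMode`, `CallProfile`, and `chooseMode` are hypothetical names used only for illustration. In a real project the values come from the SDK's ZegoANSMode enum, and the priority given to latency over noise level is an assumption, not a ZEGO recommendation.

```cpp
#include <cassert>

// Local stand-in for the three AI modes; it mirrors the mode names only and
// is NOT the SDK's ZegoANSMode enum.
enum DemoANSMode {
    DEMO_ANS_MODE_AI,
    DEMO_ANS_MODE_AI_BALANCED,
    DEMO_ANS_MODE_AI_LOW_LATENCY
};

// Hypothetical description of the call environment.
struct CallProfile {
    bool latencySensitive;  // e.g. game voice or real-time singing
    bool heavyNoise;        // e.g. outdoor busy market, transportation
};

// Assumed priority: latency sensitivity first, then noise level, and
// otherwise the lightweight mode.
constexpr DemoANSMode chooseMode(CallProfile profile) {
    return profile.latencySensitive
               ? DEMO_ANS_MODE_AI_LOW_LATENCY
               : (profile.heavyNoise ? DEMO_ANS_MODE_AI_BALANCED : DEMO_ANS_MODE_AI);
}

// Quick self-check of the mapping.
inline void selfCheck() {
    assert(chooseMode(CallProfile{true, true}) == DEMO_ANS_MODE_AI_LOW_LATENCY);
    assert(chooseMode(CallProfile{false, false}) == DEMO_ANS_MODE_AI);
}
```

The selected value would then be translated to the corresponding ZegoANSMode constant and passed to setANSMode.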
// Enable ANS
engine->enableANS(true);
// Set the AI noise reduction mode as needed.
// Note: After the ANS mode is set to ZEGO_ANS_MODE, ZEGO Express SDK forcibly
// disables transient noise suppression [enableTransientANS].
engine->setANSMode(ZEGO_ANS_MODE_AI);
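
The two calls above depend on their order: noise suppression is enabled first, then the mode is set. A minimal sketch of wrapping them in one helper follows. `enableAINoiseReduction` and `FakeEngine` are hypothetical names introduced here. The helper is a template, so it compiles against any engine type that exposes `enableANS` and `setANSMode`, and `FakeEngine` is only a recording stand-in that lets the call order be checked without the SDK.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: enable ANS first, then select the mode.
template <typename Engine, typename Mode>
void enableAINoiseReduction(Engine* engine, Mode mode) {
    assert(engine != nullptr);
    engine->enableANS(true);   // Step 1: turn noise suppression on.
    engine->setANSMode(mode);  // Step 2: switch to the desired ANS mode.
}

// Recording stand-in for the real engine; it only logs the calls it receives.
struct FakeEngine {
    std::vector<std::string> calls;

    void enableANS(bool on) {
        calls.push_back(on ? "enableANS(true)" : "enableANS(false)");
    }
    void setANSMode(int mode) {
        calls.push_back("setANSMode(" + std::to_string(mode) + ")");
    }
};
```

With the real SDK, the same helper could be called as `enableAINoiseReduction(engine, ZEGO_ANS_MODE_AI)`, assuming `engine` is the pointer obtained when creating the engine.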