AI Voice Changer

2024-01-02

Introduction

Like the "Conan Voice Changer Bow Tie", the AI Voice Changer reproduces the target character's timbre and rhythm in real-time calls while preserving the user's speech rate, emotion, and tone. Users can switch timbres at will, and the ultra-low latency suits social voice chat, live streaming, game voice, and similar scenarios.

Warning

The "AI Voice Changer" feature is a paid feature. To apply for a trial or ask about pricing, contact ZEGOCLOUD sales.
This feature is supported starting from version 3.10.0. The SDK currently available on the official website does not include it. If you need it, contact ZEGOCLOUD Technical Support for a special build.
This feature currently cannot be used together with "Custom Audio Processing".

Feature Advantages

Ultra-high sound quality, ultra-low latency.
Flexible and realistic: reproduces the target character's timbre and rhythm while preserving the user's speech rate, emotion, and tone.
A large selection of timbres to choose from, with support for custom timbres.

Effect Demonstration

Audio samples (not reproduced here) compare the original voice with the output after AI Voice Changer for each target timbre:

Young Male
Adult Male
Young Female
Adult Female

Applicable Scenarios

This feature can transform a user's timbre in the following real-time scenarios:

Social voice chat
Game voice
Audio and video live streaming
Virtual human

Prerequisites

Before implementing the AI Voice Changer feature, ensure that:

A project has been created in the ZEGOCLOUD Console, and a valid AppID and AppSign have been obtained. For details, refer to Console - Project Information.
ZEGO Express SDK has been integrated into the project, and basic audio and video streaming has been implemented. For details, refer to Quick Start - Integration and Quick Start - Implementation Process.

Usage Steps

Follow these steps to set up the AI Voice Changer:

1 Enable Permission

Confirm that you have contacted ZEGOCLOUD Technical Support for a special build and that the AI Voice Changer permission has been enabled.

2 Initialize and Log In to Room

For the initialization and room login process, refer to "Create Engine" and "Login Room" in the Implementing Video Call document.

3 Initialize AI Voice Changer Engine Instance

Call the createAIVoiceChanger interface to create an AI Voice Changer engine instance.

Currently, only one instance can exist at a time. Calling createAIVoiceChanger again before the existing instance has been destroyed with destroyAIVoiceChanger returns NULL.
```
// Create AI Voice Changer engine instance
aiVoiceChanger = engine->createAIVoiceChanger();
```
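
Because a second create call returns NULL, it is worth checking the return value before using the instance. The following is a minimal, self-contained sketch of that check. `StubEngine` and `StubVoiceChanger` are hypothetical stand-ins that only imitate the rule described above; they are not the SDK's real classes, and `createOrReport` is an illustrative helper name.

```cpp
#include <cstdio>

// Hypothetical stand-in for the SDK's voice changer type (illustration only).
struct StubVoiceChanger {};

// Hypothetical stand-in for the engine. It imitates the documented rule that
// only one AI Voice Changer instance may exist at a time.
class StubEngine {
public:
    StubVoiceChanger* createAIVoiceChanger() {
        if (inUse_) {
            return nullptr;  // a second create before destroy fails
        }
        inUse_ = true;
        return &instance_;
    }

    void destroyAIVoiceChanger(StubVoiceChanger* changer) {
        if (changer == &instance_) {
            inUse_ = false;
        }
    }

private:
    StubVoiceChanger instance_;
    bool inUse_ = false;
};

// Creates an instance and logs a message when creation fails, so a null
// pointer is never used silently.
StubVoiceChanger* createOrReport(StubEngine& engine) {
    StubVoiceChanger* changer = engine.createAIVoiceChanger();
    if (changer == nullptr) {
        std::fprintf(stderr, "AI Voice Changer already exists; destroy it first\n");
    }
    return changer;
}
```

In application code the same null check would follow the real createAIVoiceChanger call.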

Call the IZegoAIVoiceChanger.setEventHandler interface to set AI Voice Changer engine event callbacks.

```
// Set AI Voice Changer engine event callbacks
aiVoiceChanger->setEventHandler(this);
```


Call the IZegoAIVoiceChanger.initEngine interface to initialize the AI Voice Changer engine instance.

Warning
The IZegoAIVoiceChanger.initEngine interface takes effect only if it is called before the startPublishingStream interface.
```
// Initialize AI Voice Changer engine
aiVoiceChanger->initEngine();
```
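
Since the ordering requirement is easy to violate once publishing is triggered from several places, an application may want to track it explicitly. The sketch below is one possible app-side guard, not part of the SDK: `PublishOrderGuard` and its method names are hypothetical, and the real initEngine and startPublishingStream calls would sit next to the marked lines.

```cpp
// Hypothetical app-side guard (not part of the SDK) that records whether
// initEngine has been called before publishing is allowed to start.
class PublishOrderGuard {
public:
    // Call right after aiVoiceChanger->initEngine().
    void markEngineInitialized() { initialized_ = true; }

    // Call before startPublishingStream. Returns false if initEngine has not
    // been recorded yet; otherwise records that publishing started and
    // returns true.
    bool tryStartPublishing() {
        if (!initialized_) {
            return false;
        }
        publishing_ = true;
        return true;
    }

    bool isPublishing() const { return publishing_; }

private:
    bool initialized_ = false;
    bool publishing_ = false;
};
```

When `tryStartPublishing()` returns false, the application can initialize the AI Voice Changer engine first instead of publishing a stream without the effect.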

4 Update AI Voice Changer Engine Model

Call the IZegoAIVoiceChanger.update interface to update the AI Voice Changer engine model. The model file is relatively large, so the first update takes a long time.

```
// Update AI Voice Changer engine model
aiVoiceChanger->update();
```


5 Get Timbre List

Call the IZegoAIVoiceChanger.getSpeakerList interface to get the list of available timbres.

The list of available timbres will be returned through the IZegoAIVoiceChangerEventHandler.onGetSpeakerList callback interface.

```
// Get list of available timbres
aiVoiceChanger->getSpeakerList();
```

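Because the list arrives asynchronously through a callback, the application typically needs somewhere to keep it until the user picks a timbre. The following sketch shows one way to cache the result. The callback's exact signature is not given in this document, so `SpeakerEntry`, `SpeakerListCache`, and the convention that error code 0 means success are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Assumed shape of one timbre entry; the real SDK struct may differ.
struct SpeakerEntry {
    int id;
    std::string name;
};

// Hypothetical app-side cache, meant to be filled from the application's
// onGetSpeakerList handler.
class SpeakerListCache {
public:
    // Forward the callback's data here. Assumption: errorCode 0 means success.
    void onSpeakerList(int errorCode, const std::vector<SpeakerEntry>& speakers) {
        if (errorCode != 0) {
            return;  // keep the previous list on failure
        }
        names_.clear();
        for (const SpeakerEntry& s : speakers) {
            names_[s.id] = s.name;
        }
        loaded_ = true;
    }

    bool loaded() const { return loaded_; }
    std::size_t size() const { return names_.size(); }

    // Returns the timbre name, or an empty string if the ID is unknown.
    std::string nameOf(int id) const {
        auto it = names_.find(id);
        return it == names_.end() ? std::string() : it->second;
    }

private:
    std::map<int, std::string> names_;
    bool loaded_ = false;
};
```

The UI can then wait for `loaded()` to become true before offering timbres to the user.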

6 Set Target Timbre

Call the IZegoAIVoiceChanger.setSpeaker interface to set the timbre. The available timbre IDs come from the list obtained in step 5, Get Timbre List.

Setting the timbre ID to 0 indicates using the original voice.

```
// Set timbre
int speakerID = 0; // Timbre ID
aiVoiceChanger->setSpeaker(speakerID);
```

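Since ID 0 selects the original voice, it makes a natural fallback when a requested timbre is not in the list from step 5. The helper below is a small sketch of that idea. `resolveSpeakerID` and `kOriginalVoiceID` are hypothetical names, and falling back to the original voice is a design choice of this example rather than SDK behavior.

```cpp
#include <algorithm>
#include <vector>

// ID 0 selects the original voice, as stated in this document.
constexpr int kOriginalVoiceID = 0;

// Hypothetical helper: returns `requested` if it appears in the list obtained
// in step 5, otherwise falls back to the original voice.
int resolveSpeakerID(int requested, const std::vector<int>& availableIDs) {
    if (requested == kOriginalVoiceID) {
        return kOriginalVoiceID;
    }
    const bool known =
        std::find(availableIDs.begin(), availableIDs.end(), requested) != availableIDs.end();
    return known ? requested : kOriginalVoiceID;
}
```

The result can be passed straight to setSpeaker, so a stale or mistyped ID leaves the user's own voice unchanged.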

7 Destroy AI Voice Changer Engine Instance

When you are done with the feature, call the destroyAIVoiceChanger interface to destroy the AI Voice Changer engine instance and release the resources it holds, such as the microphone.

```
// Destroy AI Voice Changer engine instance
engine->destroyAIVoiceChanger(aiVoiceChanger);
```

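Because only one instance may exist at a time, a forgotten destroy call blocks every later create. One way to make the cleanup automatic is a small RAII wrapper, sketched below. `ScopedVoiceChanger` is a hypothetical helper, and `DemoEngine` and `DemoChanger` are stand-ins used only so the example compiles on its own. The wrapper assumes nothing about the engine beyond a `destroyAIVoiceChanger` method that accepts the instance pointer.

```cpp
// Hypothetical RAII wrapper: destroys the voice changer when it goes out of scope.
template <typename Engine, typename Changer>
class ScopedVoiceChanger {
public:
    ScopedVoiceChanger(Engine* engine, Changer* changer)
        : engine_(engine), changer_(changer) {}

    ~ScopedVoiceChanger() { reset(); }

    // Non-copyable, so the instance is never destroyed twice.
    ScopedVoiceChanger(const ScopedVoiceChanger&) = delete;
    ScopedVoiceChanger& operator=(const ScopedVoiceChanger&) = delete;

    Changer* get() const { return changer_; }

    // Destroys the instance early; safe to call more than once.
    void reset() {
        if (engine_ != nullptr && changer_ != nullptr) {
            engine_->destroyAIVoiceChanger(changer_);
        }
        changer_ = nullptr;
    }

private:
    Engine* engine_;
    Changer* changer_;
};

// Stand-ins for demonstration only; they are not SDK types.
struct DemoChanger {};

struct DemoEngine {
    int destroyCalls = 0;
    void destroyAIVoiceChanger(DemoChanger*) { ++destroyCalls; }
};
```

With the real SDK types substituted for the stand-ins, the instance is released when the wrapper leaves scope, including on early returns.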

Notes

Before deploying the application, copy the cacert.pem file shipped with the SDK into the same directory as the application's main executable. Otherwise, the AI Voice Changer feature will not work properly.