Clone Voice for AI Agent

During real-time voice interaction with an AI agent, you can switch the agent's voice to any voice you choose, such as a user's own voice. A recording of just a few seconds of the target speaker is enough to replicate their timbre, speaking style, accent, and acoustic environment.

Voice cloning is a value-added capability. For pricing details, please contact ZEGOCLOUD business staff.

Note
Currently, ZEGOCLOUD supports voice cloning and text-to-speech capabilities from service providers including BytePlus, MiniMax, and Alibaba Cloud.

Prerequisites

  • You have integrated the ZEGOCLOUD AI Agent service as shown in the Quick Start.
  • You have contacted ZEGOCLOUD technical support to select a service provider, activate the TTS (text-to-speech and voice cloning) service, and obtain the relevant sub-account or API authentication information.

Steps

1. Clone the voice according to the service provider's instructions

MiniMax

  1. Contact ZEGOCLOUD technical support to obtain the sub-account, group_id, and api_key.
  2. Clone the voice using either of the following methods:
    • Method 1: Follow the MiniMax Voice Cloning documentation to complete voice cloning.
    • Method 2: Complete voice cloning on the MiniMax API Debug Console page.
  3. After cloning is complete, keep the voice_id safe.

BytePlus

  1. Contact ZEGOCLOUD technical support to purchase the voice cloning service and obtain a voice ID.
  2. Use the BytePlus appid and token provided by ZEGOCLOUD technical support to call Voice Cloning API-2.0 to complete voice cloning.
  3. After cloning is complete, keep the speaker_id and the voice cloning cluster safe.

2. Use the cloned voice in voice chats

When registering an AI agent or creating an AI agent instance, set the Params field in the TTS structure. This field is passed through unchanged to the third-party TTS interface and carries the voice information:

  • MiniMax: Fill in voice_id
  • BytePlus: Fill in speaker_id

// MiniMax: fill in voice_id to use the cloned voice
"TTS": {
    "Vendor": "MiniMax",
    "Params": {
        "app": {
            "group_id": "your_group_id",
            "api_key":  "your_api_key"
        },
        "model": "speech-02-turbo-preview",
        "voice_setting": {
            "voice_id": "clone_voice_id"
        }
    }
}
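
If you assemble this structure in code rather than by hand, a small helper can keep the credentials and the cloned voice_id in one place. The following is a minimal Python sketch, not an official SDK call: the function name build_minimax_tts is hypothetical, the field layout simply mirrors the MiniMax example above, and the default model name is copied from that example. How the resulting structure is embedded in the register-agent or create-instance request is not covered here.

```python
import json


def build_minimax_tts(group_id: str, api_key: str, voice_id: str,
                      model: str = "speech-02-turbo-preview") -> dict:
    """Build the TTS structure for a cloned MiniMax voice.

    Hypothetical helper: the layout mirrors the MiniMax example in this
    document and is not taken from an official SDK.
    """
    # Reject empty values early so a missing credential or voice_id is
    # caught before the structure is sent anywhere.
    for name, value in (("group_id", group_id),
                        ("api_key", api_key),
                        ("voice_id", voice_id)):
        if not value:
            raise ValueError(f"{name} must not be empty")

    return {
        "Vendor": "MiniMax",
        # Params is passed through to the third-party TTS interface.
        "Params": {
            "app": {"group_id": group_id, "api_key": api_key},
            "model": model,
            "voice_setting": {"voice_id": voice_id},
        },
    }


if __name__ == "__main__":
    tts = build_minimax_tts("your_group_id", "your_api_key", "clone_voice_id")
    # Print the fragment in the same shape as the example above.
    print(json.dumps({"TTS": tts}, indent=4))
```

Replace the placeholder strings with the group_id and api_key obtained from ZEGOCLOUD technical support and the voice_id saved after cloning.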
