
Clone Voice for AI Agent

During a real-time voice interaction, you can switch the AI agent's voice to any voice you choose, such as the user's own. A recording of just a few seconds of the target speaker is enough to replicate their timbre, speaking style, accent, and acoustic environment.

Voice cloning is a value-added capability. For pricing details, contact the ZEGOCLOUD business team.

Note
Currently, ZEGOCLOUD supports voice cloning and text-to-speech capabilities from service providers including BytePlus, MiniMax, and Alibaba Cloud.

Prerequisites

  • You have integrated the ZEGOCLOUD AI Agent service as shown in the Quick Start.
  • You have contacted ZEGOCLOUD technical support to select a service provider, activate that provider's TTS (text-to-speech / speech synthesis) and voice cloning service, and obtain the required sub-account or API authentication information.

Steps

1. Clone the voice according to the service provider's instructions

2. Use the cloned voice in voice chats

When registering an AI agent or creating an AI agent instance, set the Params field of the TTS structure. This field is passed through to the third-party TTS interface, so it must carry the voice information for your provider:

  • MiniMax: set voice_id to the ID of the cloned voice
  • BytePlus: set speaker_id to the ID of the cloned voice
// MiniMax: set voice_id to use the cloned voice
"TTS": {
    "Vendor": "MiniMax",
    "Params": {
        "app": {
            "group_id": "your_group_id",
            "api_key":  "your_api_key"
        },
        "model": "speech-02-turbo",
        "voice_setting": {
            "voice_id": "clone_voice_id"
        }
    }
}
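
If the request body is assembled in code, the MiniMax structure above can be built with a small helper. The following is a minimal sketch, not part of any ZEGOCLOUD SDK: the function name `build_minimax_tts` is hypothetical, and the field names and default model are copied from the example above. It covers MiniMax only, because the placement of `speaker_id` inside `Params` for BytePlus is not shown here.

```python
import json


def build_minimax_tts(group_id: str, api_key: str, voice_id: str,
                      model: str = "speech-02-turbo") -> dict:
    """Build the TTS structure for MiniMax with a cloned voice.

    Hypothetical helper: it only mirrors the JSON layout shown above.
    """
    # An empty voice_id cannot select a cloned voice, so reject it early.
    if not voice_id:
        raise ValueError("voice_id must not be empty")
    return {
        "Vendor": "MiniMax",
        "Params": {
            # Credentials obtained from the service provider.
            "app": {"group_id": group_id, "api_key": api_key},
            "model": model,
            # The cloned voice is selected here.
            "voice_setting": {"voice_id": voice_id},
        },
    }


if __name__ == "__main__":
    tts = build_minimax_tts("your_group_id", "your_api_key", "clone_voice_id")
    # Print the fragment as it would appear inside the request body.
    print(json.dumps({"TTS": tts}, indent=4))
```

The returned dictionary is meant to be placed under the `TTS` key of the request body used when registering an AI agent or creating an AI agent instance.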
