
Quick Start Voice Call

This document explains how to call the AI Agent backend APIs to quickly set up voice interaction with an AI Agent.

Prerequisites

  • You have created a project in the ZEGOCLOUD Console and obtained a valid AppID and AppSign.
  • You have contacted ZEGOCLOUD Technical Support to enable the AI Agent related services and obtain the LLM and TTS configuration information.
Note
During the test period (within 2 weeks after the AI Agent service is enabled), you can set the LLM and TTS authentication parameters to "zego_test" to use the related services. For details, see Agent Parameter Description.

Example Code

The following is the example code for the business backend that integrates the real-time interactive AI Agent API. You can refer to the example code to implement your own business logic.

The following is the client example code. You can refer to it to implement your own business logic.

Overall Business Process

  1. Business backend: run the business backend example code and deploy the business backend.
    • Integrate the real-time interactive AI Agent API to manage the AI Agent.
  2. Client: refer to the Android Quick Start, iOS Quick Start, or Web Quick Start document to run the client example code.
    • Create and manage the AI Agent through the business backend.
    • Integrate ZEGO Express SDK to complete real-time communication.

After completing the above two steps, you can achieve real-time interaction between the AI agent and real users by joining the room.

Core Capabilities Implementation

1

Register AI Agent

The Register AI Agent interface sets the basic configuration of the AI Agent, including its name and the LLM, TTS, ASR, and other related settings. After registration, the AI Agent serves as a template from which you can create multiple instances that interact with different real users.

An AI Agent is usually fairly stable: once its parameters (personality and image) are set, they rarely change. It is therefore recommended to register the AI Agent at an appropriate point in your business process. A registered AI Agent is not automatically destroyed or recycled. After creating an AI Agent instance, you can interact with the AI Agent via voice.

Note
An AI Agent (identified by its ID) can only be registered once. Registering the same ID again returns error code 410001008.
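
Because a repeated registration fails with a known error code, a backend that registers the agent at startup may want to treat that code as "already done". The following is a minimal sketch of that idea, not the official approach. It assumes the response carries a numeric `Code` field where `0` means success, and that error 410001008 arrives in that field rather than as a thrown exception. `ensureAgentRegistered` and the `register` callback are hypothetical names.

```typescript
// Assumed response envelope: a numeric Code where 0 means success.
interface ApiResponse {
  Code: number;
  Message?: string;
}

// Error code returned when the same AgentId is registered again (see the note above).
const AGENT_ALREADY_REGISTERED = 410001008;

// Calls the supplied register function and treats "already registered" as a
// non-fatal outcome. `register` is a hypothetical wrapper around the
// RegisterAgent request.
async function ensureAgentRegistered(
  register: () => Promise<ApiResponse>
): Promise<"created" | "exists"> {
  const res = await register();
  if (res.Code === 0) return "created";
  if (res.Code === AGENT_ALREADY_REGISTERED) return "exists";
  throw new Error(`RegisterAgent failed: ${res.Code} ${res.Message ?? ""}`);
}
```

With this wrapper, restarting the backend does not fail just because the agent was registered by an earlier run. Any other non-zero code still surfaces as an error.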

The following is an example of calling the Register AI Agent interface:

Server(NodeJS)
// Please replace the authentication parameters such as ApiKey, appid, token of LLM and TTS in the following example with your actual authentication parameters.
async registerAgent(agentId: string, agentName: string) {
    // Request interface: https://aigc-aiagent-api.zegotech.cn?Action=RegisterAgent
    const action = 'RegisterAgent';
    const body = {
        AgentId: agentId,
        Name: agentName,
        LLM: {
            Url: "https://ark.cn-beijing.volces.com/api/v3/chat/completions",
            ApiKey: "zego_test",
            Model: "doubao-lite-32k-240828",
            SystemPrompt: "You are an AI Agent, please answer the user's question."
        },
        TTS: {
            Vendor: "ByteDance",
            Params: {
                "app": {
                    "appid": "zego_test",
                    "token": "zego_test",
                    "cluster": "volcano_tts"
                },
                "audio": {
                    "voice_type": "zh_female_wanwanxiaohe_moon_bigtts"
                }
            }
        }
    };
    // sendRequest method encapsulates the request URL and common parameters. For details, see: https://zegocloud.com/docs/aiagent-server/api-reference/accessing-server-apis
    return this.sendRequest<any>(action, body);
}
Note
  • Please ensure that all parameters of LLM are filled in correctly according to the official documentation of the LLM service provider, otherwise you may not be able to see the text content of the AI Agent's answer and cannot hear the AI Agent's output voice.
  • Please ensure that all parameters of TTS are filled in correctly according to the official documentation of the TTS service provider, otherwise you may be able to see the text content of the AI Agent's answer but cannot hear the AI Agent's output voice.
  • If the AI Agent cannot output text content or voice, please first check if the LLM and TTS parameter configurations are completely correct, or refer to Get AI Agent Status - Listen for Service-Side Exception Events to determine the specific problem.
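
The notes above amount to a small decision table: which configuration block to check first depends on whether text and voice are produced. The sketch below only restates that table in code, for example for a test harness. The function name and return values are illustrative and not part of the API.

```typescript
type Suspect = "none" | "LLM" | "TTS";

// Maps the observed symptom to the configuration block to check first,
// following the troubleshooting notes above.
function suspectConfig(hasText: boolean, hasVoice: boolean): Suspect {
  if (!hasText) return "LLM"; // no answer text: check the LLM parameters first
  if (!hasVoice) return "TTS"; // text is shown but nothing is heard: check TTS
  return "none"; // both text and voice are present
}
```

This narrows down where to look only. The service-side exception events remain the way to identify the specific problem.
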
2

Create AI Agent Instance

Use the registered AI Agent as a template to create multiple AI Agent instances that join different rooms and interact with different users in real time. After it is created, the AI Agent instance automatically logs in to the room and pushes its stream. At the same time, it pulls the real user's stream.

Once the AI Agent instance is created successfully, the real user can interact with the AI Agent in real time by listening for the stream change event on the client and pulling the stream.

Note
On the client, after successfully entering the room, call this interface immediately so that the AI Agent instance joins the room and starts pushing and pulling streams.
Note
By default, each account can have at most 10 AI Agent instances. If the limit is exceeded, the creation of an AI Agent instance will fail. If you need to adjust this limit, please contact ZEGOCLOUD Technical Support.

The following is an example of calling the Create AI Agent Instance interface:

Server(NodeJS)
async createAgentInstance(agentId: string, userId: string, rtcInfo: RtcInfo, messages?: any[]) {
    // Request interface: https://aigc-aiagent-api.zegotech.cn?Action=CreateAgentInstance
    // const rtcInfo = {
    //   RoomId: room_id,
    //   AgentStreamId: agent_stream_id,
    //   AgentUserId: agent_user_id,
    //   UserStreamId: user_stream_id,
    // };
    const action = 'CreateAgentInstance';
    const body = {
        AgentId: agentId,
        UserId: userId, // The ID of the real user that interacts with this AI Agent instance
        RTC: rtcInfo,
        MessageHistory: {
            SyncMode: 1, // Change to 0 to use history messages from ZIM
            Messages: messages && messages.length > 0 ? messages : [],
            WindowSize: 10
        }
    };
    // sendRequest method encapsulates the request URL and common parameters. For details, see: https://zegocloud.com/docs/aiagent-server/api-reference/accessing-server-apis
    const result = await this.sendRequest<any>(action, body);
    console.log("create agent instance result", result);
    // In the client, you need to save the returned AgentInstanceId, which is used for subsequent deletion of the AI Agent instance.
    return result.AgentInstanceId;
}

After completing this step, you can create an AI Agent instance. Once the client is integrated, you can interact with the AI Agent instance via voice.
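
The `rtcInfo` argument in the example above carries four IDs that the backend and the client must agree on. One way to keep them consistent is to derive them all from the room ID and user ID, as sketched below. The field names come from the commented `rtcInfo` object in the example. The naming scheme and the `buildRtcInfo` helper are assumptions for illustration, and any length or character restrictions on these IDs should be checked against the API reference.

```typescript
// Field names taken from the commented rtcInfo object in the example above.
interface RtcInfo {
  RoomId: string;
  AgentStreamId: string;
  AgentUserId: string;
  UserStreamId: string;
}

// Hypothetical naming scheme: derive every ID from the room and user IDs so
// that the backend and the client can compute the same values independently.
function buildRtcInfo(roomId: string, userId: string): RtcInfo {
  return {
    RoomId: roomId,
    AgentStreamId: `${roomId}_${userId}_agent`,
    AgentUserId: `agent_${userId}`,
    UserStreamId: `${roomId}_${userId}_main`,
  };
}
```

The client then publishes its stream under the same `UserStreamId` and plays the stream identified by `AgentStreamId`.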

3

Integrate Client SDK

Please refer to the following documents to complete the client integration development:

  • Android Quick Start
  • iOS Quick Start
  • Web Quick Start

Congratulations! 🎉 After completing this step, you have successfully integrated the client SDK and can interact with the AI Agent instance via real-time voice. You can ask the AI Agent any question, and it will answer!

4

Delete AI Agent Instance

After the Delete AI Agent Instance interface is called, the AI Agent instance automatically exits the room and stops pushing its stream. Once the real user also stops pushing the stream and exits the room on the client, the interactive session is complete.

The following is an example of calling the Delete AI Agent Instance interface:

Server(NodeJS)
async deleteAgentInstance(agentInstanceId: string) {
    // Request interface: https://aigc-aiagent-api.zegotech.cn?Action=DeleteAgentInstance
    const action = 'DeleteAgentInstance';
    const body = {
        AgentInstanceId: agentInstanceId
    };
    // sendRequest method encapsulates the request URL and common parameters. For details, see: https://zegocloud.com/docs/aiagent-server/api-reference/accessing-server-apis
    return this.sendRequest(action, body);
}

This completes the core process for implementing real-time interaction with the AI Agent.
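
Since the number of AI Agent instances per account is limited by default, an instance that is never deleted presumably keeps counting toward that limit. Where creation and deletion happen in the same flow (for example, in a test script), a `try`/`finally` wrapper ensures the delete call runs even if the session fails. This is a sketch only. `withAgentInstance`, `create`, and `remove` are hypothetical names standing in for calls such as `createAgentInstance` and `deleteAgentInstance` above.

```typescript
// Runs `session` between create and delete, deleting the instance even if
// the session throws. `create` and `remove` are hypothetical wrappers around
// the CreateAgentInstance and DeleteAgentInstance requests.
async function withAgentInstance<T>(
  create: () => Promise<string>,
  remove: (agentInstanceId: string) => Promise<unknown>,
  session: (agentInstanceId: string) => Promise<T>
): Promise<T> {
  const agentInstanceId = await create();
  try {
    return await session(agentInstanceId);
  } finally {
    // Always release the instance, whether the session succeeded or failed.
    await remove(agentInstanceId);
  }
}
```

In a production backend, creation and deletion are usually triggered by separate client requests. In that case the same guarantee has to come from storing the `AgentInstanceId` and deleting the instance when the user leaves or the session times out.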

Listen for Exception Callback

Note
The LLM and TTS configurations involve many complex parameters, so configuration errors during testing can easily cause problems such as the AI Agent not answering or not speaking. We strongly recommend that you listen for exception callbacks during testing and use the callback information to troubleshoot problems quickly.
