Quick Start Voice Call
This document explains how to call the AI Agent backend APIs to quickly set up voice interaction with an AI Agent.
Prerequisites
- You have created a project in the ZEGOCLOUD Console and obtained a valid AppID and AppSign.
- You have contacted ZEGOCLOUD Technical Support to enable the AI Agent related services and obtained the LLM and TTS configuration information.
Example Code
The following example code for the business backend integrates the real-time interactive AI Agent API. You can refer to it when implementing your own business logic.
It covers the basic capabilities of obtaining a ZEGO Token, registering an AI Agent, creating AI Agent instances, and deleting AI Agent instances.
The following client sample code is also available. You can refer to it when implementing your own business logic.
It covers the basic capabilities of logging in, publishing streams, playing streams, and exiting rooms.
Overall Business Process
- Business backend: run the business backend example code and deploy the business backend.
  - Integrate the real-time interactive AI Agent API to manage the AI Agent.
- Client: refer to the Android Quick Start, iOS Quick Start, or Web Quick Start document to run the client example code.
  - Create and manage the AI Agent through the business backend.
  - Integrate ZEGO Express SDK to complete real-time communication.
After completing these two steps, real users can join the room and interact with the AI Agent in real time.
Core Capabilities Implementation
Register AI Agent
Registering an AI Agent sets its basic configuration, including the AI Agent name and the LLM, TTS, ASR, and other related settings. After registration, the AI Agent serves as a template from which you can create multiple instances that interact with multiple real users.
An AI Agent is usually relatively fixed: once its parameters (such as personality and image) are set, they rarely change. It is therefore recommended to register the AI Agent at an appropriate point in your business process. A registered AI Agent is not automatically destroyed or recycled. After you create an AI Agent instance, you can interact with the AI Agent via voice.
The following is an example of calling the Register AI Agent interface:
// Please replace the authentication parameters such as ApiKey, appid, token of LLM and TTS in the following example with your actual authentication parameters.
async registerAgent(agentId: string, agentName: string) {
  // Request interface: https://aigc-aiagent-api.zegotech.cn?Action=RegisterAgent
  const action = 'RegisterAgent';
  const body = {
    AgentId: agentId,
    Name: agentName,
    LLM: {
      Url: "https://ark.cn-beijing.volces.com/api/v3/chat/completions",
      ApiKey: "zego_test",
      Model: "doubao-lite-32k-240828",
      SystemPrompt: "You are an AI Agent, please answer the user's question."
    },
    TTS: {
      Vendor: "ByteDance",
      Params: {
        "app": {
          "appid": "zego_test",
          "token": "zego_test",
          "cluster": "volcano_tts"
        },
        "audio": {
          "voice_type": "zh_female_wanwanxiaohe_moon_bigtts"
        }
      }
    }
  };
  // sendRequest method encapsulates the request URL and common parameters. For details, see: https://zegocloud.com/docs/aiagent-server/api-reference/accessing-server-apis
  return this.sendRequest<any>(action, body);
}
- Make sure all LLM parameters are filled in correctly according to the official documentation of your LLM service provider. Otherwise, you may neither see the text of the AI Agent's answer nor hear its voice output.
- Make sure all TTS parameters are filled in correctly according to the official documentation of your TTS service provider. Otherwise, you may see the text of the AI Agent's answer but not hear its voice output.
- If the AI Agent fails to output text or voice, first check that the LLM and TTS parameters are fully correct, or refer to Get AI Agent Status - Listen for Service-Side Exception Events to identify the specific problem.
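Because a leftover placeholder or empty credential is an easy mistake to make, a small pre-flight check on the request body can help before calling RegisterAgent. The following is a minimal sketch, not part of the AI Agent API: `findConfigIssues`, `ConfigIssue`, and `PLACEHOLDER_VALUES` are hypothetical names, and the check only catches empty strings and the `zego_test` placeholder used in the example above. It cannot tell whether a real-looking value is actually valid for your provider.

```typescript
// Hypothetical helper; none of these names are part of the AI Agent API.
type ConfigIssue = { path: string; reason: string };

// Placeholder strings used in the example above. Extend this list as needed.
const PLACEHOLDER_VALUES = ["zego_test"];

// Recursively walks a request body and reports empty or placeholder strings.
function findConfigIssues(value: unknown, path: string = ""): ConfigIssue[] {
  if (typeof value === "string") {
    if (value.trim() === "") return [{ path, reason: "empty string" }];
    if (PLACEHOLDER_VALUES.indexOf(value) !== -1) {
      return [{ path, reason: "placeholder value" }];
    }
    return [];
  }
  if (value !== null && typeof value === "object") {
    const record = value as Record<string, unknown>;
    const issues: ConfigIssue[] = [];
    for (const key of Object.keys(record)) {
      const childPath = path ? `${path}.${key}` : key;
      issues.push(...findConfigIssues(record[key], childPath));
    }
    return issues;
  }
  // Numbers, booleans, null, and undefined are not checked.
  return [];
}

// Example: one placeholder and one empty credential are reported.
const sampleIssues = findConfigIssues({
  LLM: { ApiKey: "zego_test", Model: "doubao-lite-32k-240828" },
  TTS: { Params: { app: { appid: "", token: "my-real-token" } } },
});
// sampleIssues lists LLM.ApiKey (placeholder value) and TTS.Params.app.appid (empty string).
```

One way to use it is to refuse to send the RegisterAgent request while the returned list is non-empty, and log the reported paths instead.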
Create AI Agent Instance
Use the registered AI Agent as a template to create multiple AI Agent instances, each of which can join a different room and interact with a different user in real time. After an AI Agent instance is created, it automatically logs in to the room and publishes its stream. At the same time, it plays the real user's stream.
Once the AI Agent instance has been created successfully, the real user can interact with the AI Agent in real time by listening for the stream change event on the client and playing the AI Agent's stream.
The following is an example of calling the Create AI Agent Instance interface:
async createAgentInstance(agentId: string, userId: string, rtcInfo: RtcInfo, messages?: any[]) {
  // Request interface: https://aigc-aiagent-api.zegotech.cn?Action=CreateAgentInstance
  // const rtcInfo = {
  //   RoomId: room_id,
  //   AgentStreamId: agent_stream_id,
  //   AgentUserId: agent_user_id,
  //   UserStreamId: user_stream_id,
  // };
  const action = 'CreateAgentInstance';
  const body = {
    AgentId: agentId,
    UserId: userId, // The ID of the real user that interacts with this AI Agent instance
    RTC: rtcInfo,
    MessageHistory: {
      SyncMode: 1, // Change to 0 to use history messages from ZIM
      Messages: messages && messages.length > 0 ? messages : [],
      WindowSize: 10
    }
  };
  // sendRequest method encapsulates the request URL and common parameters. For details, see: https://zegocloud.com/docs/aiagent-server/api-reference/accessing-server-apis
  const result = await this.sendRequest<any>(action, body);
  console.log("create agent instance result", result);
  // In the client, you need to save the returned AgentInstanceId, which is used for subsequent deletion of the AI Agent instance.
  return result.AgentInstanceId;
}
After completing this step, you can create an AI Agent instance. Once the client is integrated, you can interact with the AI Agent instance via voice.
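The example above expects an `rtcInfo` object and returns an `AgentInstanceId` that must be kept for later deletion. The following is a minimal sketch of one way to prepare both. The `RtcInfo` fields mirror the commented example above, while `buildRtcInfo`, `InstanceRegistry`, and the ID naming scheme are hypothetical and chosen only for illustration. Check the ID length and character rules in the official documentation before adopting any scheme.

```typescript
// Field names follow the commented rtcInfo example above.
interface RtcInfo {
  RoomId: string;
  AgentStreamId: string;
  AgentUserId: string;
  UserStreamId: string;
}

// Hypothetical naming scheme: one room per session, all IDs derived from the user ID.
function buildRtcInfo(userId: string, sessionId: string): RtcInfo {
  const roomId = `room_${userId}_${sessionId}`;
  return {
    RoomId: roomId,
    AgentStreamId: `${roomId}_agent_stream`,
    AgentUserId: `agent_${userId}`,
    UserStreamId: `${roomId}_user_stream`,
  };
}

// Hypothetical in-memory store that remembers which AgentInstanceId belongs
// to which user, so the instance can be deleted when the session ends.
class InstanceRegistry {
  private byUser = new Map<string, string>();

  remember(userId: string, agentInstanceId: string): void {
    this.byUser.set(userId, agentInstanceId);
  }

  // Returns the stored instance ID (if any) and forgets it.
  take(userId: string): string | undefined {
    const agentInstanceId = this.byUser.get(userId);
    this.byUser.delete(userId);
    return agentInstanceId;
  }
}
```

An in-memory map is lost when the process restarts, so a deployed backend would likely keep this mapping in persistent storage instead.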
Integrate Client SDK
To complete the client integration, refer to the Quick Start document for your platform (for example, the Android Quick Start, iOS Quick Start, or Web Quick Start).
Congratulations! 🎉 After completing this step, you have successfully integrated the client SDK and can talk to the AI Agent instance by voice in real time. You can ask the AI Agent any question, and it will answer.
Delete AI Agent Instance
After an AI Agent instance is deleted, it automatically exits the room and stops publishing its stream. Once the real user stops publishing and exits the room on the client, the interactive session is complete.
The following is an example of calling the Delete AI Agent Instance interface:
async deleteAgentInstance(agentInstanceId: string) {
  // Request interface: https://aigc-aiagent-api.zegotech.cn?Action=DeleteAgentInstance
  const action = 'DeleteAgentInstance';
  const body = {
    AgentInstanceId: agentInstanceId
  };
  // sendRequest method encapsulates the request URL and common parameters. For details, see: https://zegocloud.com/docs/aiagent-server/api-reference/accessing-server-apis
  return this.sendRequest(action, body);
}
This is the complete core process for you to implement real-time interaction with the AI Agent.
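The create and delete calls form a pair, so it can help to tie them together so that an instance is deleted even when the session ends with an error. The following is a minimal sketch under stated assumptions: `AgentBackend` is a hypothetical interface over the `createAgentInstance` and `deleteAgentInstance` methods shown above, and `runSession` and its `conversation` callback are illustrative names, not part of any SDK.

```typescript
// Field names follow the commented rtcInfo example earlier in this document.
interface RtcInfo {
  RoomId: string;
  AgentStreamId: string;
  AgentUserId: string;
  UserStreamId: string;
}

// Hypothetical interface over the backend methods shown in this document.
interface AgentBackend {
  createAgentInstance(agentId: string, userId: string, rtcInfo: RtcInfo): Promise<string>;
  deleteAgentInstance(agentInstanceId: string): Promise<unknown>;
}

// Creates an instance, waits for the conversation to finish, and always
// deletes the instance afterwards, even if the conversation throws.
async function runSession(
  backend: AgentBackend,
  agentId: string,
  userId: string,
  rtcInfo: RtcInfo,
  conversation: (agentInstanceId: string) => Promise<void>
): Promise<void> {
  const agentInstanceId = await backend.createAgentInstance(agentId, userId, rtcInfo);
  try {
    await conversation(agentInstanceId);
  } finally {
    await backend.deleteAgentInstance(agentInstanceId);
  }
}
```

In a real backend the conversation usually spans several HTTP requests from the client, so the same pairing is more often enforced by storing the `AgentInstanceId` and deleting it when the client reports that the user has left.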
Listen for Exception Callback
For details, see the guide to listening for exception callbacks. When a callback arrives with Event set to Exception, use Data.Code and Data.Message to locate the problem quickly.
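As a rough illustration of handling such a callback on the business backend, the sketch below turns an exception event into a log-friendly message. Only `Event`, `Data.Code`, and `Data.Message` come from the description above. The rest of the payload shape, the `AgentCallback` type, the `describeException` function, and the sample values are assumptions, so check the callback guide for the actual fields.

```typescript
// Assumed payload shape: only Event, Data.Code, and Data.Message are taken
// from the description above. Other fields may exist in the real callback.
interface AgentCallback {
  Event: string;
  Data?: { Code?: number | string; Message?: string };
}

// Returns a readable message for exception events, or null for other events.
function describeException(payload: AgentCallback): string | null {
  if (payload.Event !== "Exception") return null;
  const code = payload.Data?.Code ?? "unknown";
  const message = payload.Data?.Message ?? "no message provided";
  return `AI Agent exception ${code}: ${message}`;
}

// Made-up sample values, for illustration only.
const sampleException = describeException({
  Event: "Exception",
  Data: { Code: 1234, Message: "sample message" },
});
```

Returning `null` for non-exception events lets the caller route other event types elsewhere without extra checks.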