Get AI Agent Status and Latency Data
During real-time voice calls with AI Agents, you might need the AI agent instance's current status, or notifications when that status changes, so that your business logic can respond promptly and remain stable. You can get this information by actively calling server APIs or by listening for the corresponding server callbacks.
The information includes the following types:
- Server exception events: including AI Agent service errors, Real-Time Communication (RTC) related errors, Large Language Model (LLM) related errors, Text-to-Speech (TTS) related errors (such as TTS concurrency limit exceeded), etc.
- AI agent instance status:
  - Status that can be queried via server API: idle, listening, thinking, speaking, etc.
  - Status that can be monitored via server callbacks: agent instance creation success, interruption, and deletion success events.
- AI agent average latency data:
  - Large Language Model (LLM) related latency.
  - Text-to-Speech (TTS) related latency.
  - AI Agent server total latency.
  - Client & server latency. Can be obtained through ZEGO Express SDK.
Listen for Server Exception Events
When an exception occurs on the server, the AI Agent backend sends an exception event notification (Event is Exception) to the configured callback address. Here is a sample of the callback content:
{
    "AppId": 123456789,
    "Event": "Exception",
    "Nonce": "abcdd22113",
    "Timestamp": 1741221508000,
    "Signature": "XXXXXXX",
    "Sequence": 1921825797275873300,
    "RoomId": "test_room",
    "AgentUserId": "test_agent",
    "AgentInstanceId": "1912124734317838336",
    "Data": {
        "Code": 2203,
        "Message": "The API key in the request is missing or invalid"
    }
}
For more detailed information, please refer to the Receiving Callback and Exception Codes documentation.
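As a rough illustration of consuming this callback on the business side, the following Python sketch parses a callback body and extracts the exception details. The field names come from the sample above; the names `AgentException` and `parse_exception_event` are hypothetical helpers introduced here, and signature verification (covered in the Receiving Callback documentation) is deliberately not shown.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentException:
    """Exception details pulled from an Exception callback (hypothetical helper type)."""
    agent_instance_id: str
    room_id: str
    code: int
    message: str


def parse_exception_event(body: str) -> Optional[AgentException]:
    """Return the exception details if the body is an Exception event, else None.

    Assumption: the callback body is a JSON object shaped like the sample above.
    Signature verification is not performed in this sketch.
    """
    payload = json.loads(body)
    if payload.get("Event") != "Exception":
        return None
    data = payload.get("Data") or {}
    return AgentException(
        agent_instance_id=payload.get("AgentInstanceId", ""),
        room_id=payload.get("RoomId", ""),
        code=data.get("Code", 0),
        message=data.get("Message", ""),
    )


# The sample callback content from this section, used as test input.
SAMPLE_EXCEPTION_BODY = """
{
    "AppId": 123456789,
    "Event": "Exception",
    "Nonce": "abcdd22113",
    "Timestamp": 1741221508000,
    "Signature": "XXXXXXX",
    "Sequence": 1921825797275873300,
    "RoomId": "test_room",
    "AgentUserId": "test_agent",
    "AgentInstanceId": "1912124734317838336",
    "Data": {
        "Code": 2203,
        "Message": "The API key in the request is missing or invalid"
    }
}
"""

if __name__ == "__main__":
    event = parse_exception_event(SAMPLE_EXCEPTION_BODY)
    if event is not None:
        print(f"Agent {event.agent_instance_id} in room {event.room_id}: "
              f"code {event.code}, {event.message}")
```

In a real service this function would typically be called from the HTTP handler registered at the callback address, with the returned code looked up against the Exception Codes documentation to decide how to react.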
Get Agent Instance Status
Via Server API
Call the Query Agent Instance Status API (QueryAgentInstanceStatus) and pass in the corresponding AgentInstanceId. The server returns the current status of the AI agent instance (such as idle, listening, thinking, or speaking).
The AgentInstanceId field is included in the successful response when you create an agent instance (CreateAgentInstance).
Listen for Agent-Related Events
Get Agent Latency Data
When an agent instance is successfully deleted, the AgentInstanceDeleted event will be triggered, which includes average latency data for conversations with the agent instance.
{
    "AppId": 1234567,
    "AgentInstanceId": "1912124734317838336",
    "AgentUserId": "agent_user_1",
    "RoomId": "room_1",
    "Sequence": 1234567890,
    "Data": {
        "Code": 0,
        "DeletedTimestamp": 1745502345138,
        "LatencyData": {
            "LLMTTFT": 613,
            "LLMTPS": 11.493,
            "TTSAudioFirstFrameTime": 783,
            "TotalCost": 1693
        }
    },
    "Event": "AgentInstanceDeleted",
    "Nonce": "7450395512627324902",
    "Signature": "fd9c1ce54e85bd92f48b0a805e82a52b0c0c6445",
    "Timestamp": 1745502313000
}
The latency data (average values) are defined as follows:

Parameter | Type | Description |
---|---|---|
LLMTTFT | Int | LLM first token average latency (milliseconds). The time from requesting the Large Language Model to the Large Language Model returning the first non-empty token. |
LLMTPS | Float64 | LLM average output speed (tokens/second). The average number of tokens output per second by the Large Language Model. |
TTSAudioFirstFrameTime | Int | TTS audio first frame average latency (milliseconds). The time from the first non-empty LLM token to the return of the first non-silent TTS frame (including request establishment time). |
TotalCost | Int | AI Agent server average total latency (milliseconds). |
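
To show how these fields might be read in practice, here is a minimal Python sketch that extracts LatencyData from an AgentInstanceDeleted callback and flags metrics that exceed a limit. The field names come from the sample and table above; the function names `extract_latency` and `slow_metrics` and the threshold values are assumptions made for illustration, not part of the AI Agent API.

```python
import json
from typing import Dict, List, Optional

# Latency fields documented in the table above.
LATENCY_FIELDS = ("LLMTTFT", "LLMTPS", "TTSAudioFirstFrameTime", "TotalCost")


def extract_latency(body: str) -> Optional[Dict[str, Optional[float]]]:
    """Return the average latency data from an AgentInstanceDeleted callback.

    Returns None for any other event. A field that is absent from the
    payload maps to None rather than raising an error.
    """
    payload = json.loads(body)
    if payload.get("Event") != "AgentInstanceDeleted":
        return None
    latency = (payload.get("Data") or {}).get("LatencyData") or {}
    return {name: latency.get(name) for name in LATENCY_FIELDS}


def slow_metrics(latency: Dict[str, Optional[float]],
                 limits_ms: Dict[str, float]) -> List[str]:
    """List the millisecond metrics whose average exceeds its limit.

    The limits are hypothetical, application-chosen thresholds.
    """
    return [
        name
        for name, limit in limits_ms.items()
        if latency.get(name) is not None and latency[name] > limit
    ]


# The sample AgentInstanceDeleted callback from this section.
SAMPLE_DELETED_BODY = """
{
    "AppId": 1234567,
    "AgentInstanceId": "1912124734317838336",
    "AgentUserId": "agent_user_1",
    "RoomId": "room_1",
    "Sequence": 1234567890,
    "Data": {
        "Code": 0,
        "DeletedTimestamp": 1745502345138,
        "LatencyData": {
            "LLMTTFT": 613,
            "LLMTPS": 11.493,
            "TTSAudioFirstFrameTime": 783,
            "TotalCost": 1693
        }
    },
    "Event": "AgentInstanceDeleted",
    "Nonce": "7450395512627324902",
    "Signature": "fd9c1ce54e85bd92f48b0a805e82a52b0c0c6445",
    "Timestamp": 1745502313000
}
"""

if __name__ == "__main__":
    latency = extract_latency(SAMPLE_DELETED_BODY)
    if latency is not None:
        # Example thresholds only; choose values that suit your application.
        print(slow_metrics(latency, {"LLMTTFT": 1000, "TotalCost": 1500}))
```

Because the values are averages over the whole conversation, a check like this is better suited to post-call monitoring and alerting than to reacting during a call.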