
Get AI Agent Status and Latency Data


During real-time voice calls with AI Agents, you may need the AI agent instance's status, or real-time notifications when it changes, so that your business logic can respond promptly and remain stable. You can obtain this information through active API calls or by listening for the corresponding server callbacks.

The information includes the following types:

  • Server exception events: including AI Agent service errors, Real-Time Communication (RTC) related errors, Large Language Model (LLM) related errors, Text-to-Speech (TTS) related errors (such as TTS concurrency limit exceeded), etc.
  • AI agent instance status:
    • Status that can be queried via server API: idle, listening, thinking, speaking, etc.
    • Status that can be monitored via server callbacks: agent instance creation success, interruption, and deletion success events.
  • AI agent average latency data:
    • Large Language Model (LLM) related latency.
    • Text-to-Speech (TTS) related latency.
    • AI Agent server total latency.
    • Client and server latency, which can be obtained through the ZEGO Express SDK.

Listen for Server Exception Events

Note
Please contact ZEGOCLOUD Technical Support to configure the address for receiving AI Agent backend callbacks.

When an exception event occurs on the server, the AI Agent backend sends an exception event notification (Event is Exception) to the address configured above. Here is a sample callback payload:

{
    "AppId": 123456789,
    "Event": "Exception",
    "Nonce": "abcdd22113",
    "Timestamp": 1741221508000,
    "Signature": "XXXXXXX",
    "Sequence": 1921825797275873300,
    "RoomId": "test_room",
    "AgentUserId": "test_agent",
    "AgentInstanceId": "1912124734317838336",
    "Data": {
        "Code": 2203,
        "Message": "The API key in the request is missing or invalid"
    }
}

For more detailed information, please refer to the Receiving Callback and Exception Codes documentation.
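As a starting point, the callback receiver can be sketched as a small dispatcher that parses the payload and routes on the Event field. The field names below (Event, RoomId, Data.Code, Data.Message) follow the sample above; the handler function name and the returned alert string are illustrative, not part of the ZEGOCLOUD API.

```python
import json

def handle_agent_callback(body: str) -> str:
    """Parse an AI Agent backend callback and route it by Event type."""
    payload = json.loads(body)
    event = payload.get("Event")
    if event == "Exception":
        data = payload.get("Data", {})
        # Surface the error code/message to your own monitoring pipeline.
        return (f"exception in room {payload.get('RoomId')}: "
                f"code={data.get('Code')} message={data.get('Message')}")
    # Other events (e.g. AgentInstanceDeleted) can be routed here.
    return f"unhandled event: {event}"

# Exercise the dispatcher with the sample payload from the docs.
sample = json.dumps({
    "AppId": 123456789,
    "Event": "Exception",
    "RoomId": "test_room",
    "AgentInstanceId": "1912124734317838336",
    "Data": {"Code": 2203,
             "Message": "The API key in the request is missing or invalid"},
})
print(handle_agent_callback(sample))
```

Remember to verify the callback's Signature before trusting the payload; the signing rules are described in the Receiving Callback documentation.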

Get Agent Instance Status

Via Server API

Call the Query Agent Instance Status API (QueryAgentInstanceStatus), pass in the corresponding AgentInstanceId, and the server will return the current status of the AI agent instance (such as idle, listening, thinking, speaking, etc.).

Note
The AgentInstanceId field is included in the successful response when you create an agent instance (CreateAgentInstance).
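A status query can be sketched as building a signed request with Action set to QueryAgentInstanceStatus and the AgentInstanceId in the body. The base URL, common query parameters, and signature value below are placeholders based on the usual ZEGOCLOUD server API pattern; generate a real signature per the server API signing rules before sending.

```python
import json
from urllib.parse import urlencode

# Assumed endpoint; confirm against the server API reference.
BASE_URL = "https://aigc-aiagent-api.zegotech.cn"

def build_query_status_request(app_id: int, agent_instance_id: str,
                               signature: str, nonce: str, timestamp: int):
    """Return (url, json_body) for a QueryAgentInstanceStatus call."""
    params = {
        "Action": "QueryAgentInstanceStatus",
        "AppId": app_id,
        "SignatureNonce": nonce,
        "Timestamp": timestamp,
        "Signature": signature,  # placeholder; compute per the signing rules
    }
    body = json.dumps({"AgentInstanceId": agent_instance_id})
    return f"{BASE_URL}/?{urlencode(params)}", body

url, body = build_query_status_request(
    123456789, "1912124734317838336", "sig_placeholder", "abcdd22113", 1741221508)
```

The response's Status field then tells you whether the agent is idle, listening, thinking, or speaking, so you can gate business actions (for example, only allowing a new question once the agent is idle).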

Get Agent Latency Data

Note
Please contact ZEGOCLOUD Technical Support to configure the address for receiving AI Agent backend callbacks.

When an agent instance is successfully deleted, the AgentInstanceDeleted event will be triggered, which includes average latency data for conversations with the agent instance.

AgentInstanceDeleted callback data example
{
    "AppId": 1234567,
    "AgentInstanceId": "1912124734317838336",
    "AgentUserId": "agent_user_1",
    "RoomId": "room_1",
    "Sequence": 1234567890,
    "Data": {
        "Code": 0,
        "DeletedTimestamp": 1745502345138,
        "LatencyData": {
            "LLMTTFT": 613,
            "LLMTPS": 11.493,
            "TTSAudioFirstFrameTime": 783,
            "TotalCost": 1693
        }
    },
    "Event": "AgentInstanceDeleted",
    "Nonce": "7450395512627324902",
    "Signature": "fd9c1ce54e85bd92f48b0a805e82a52b0c0c6445",
    "Timestamp": 1745502313000
}

The latency data (average values) are defined as follows:

(Figure: AI Agent latency analysis)
  • LLMTTFT (Int): LLM first-token average latency in milliseconds. The time from sending the request to the Large Language Model until it returns the first non-empty token.
  • LLMTPS (Float64): LLM average output speed in tokens/second. The average number of tokens the Large Language Model outputs per second.
  • TTSAudioFirstFrameTime (Int): TTS audio first-frame average latency in milliseconds. The time from the first non-empty LLM token to the first non-silent TTS frame (including request establishment time).
  • TotalCost (Int): AI Agent server average total latency in milliseconds:
    • User speaking: the time from when the AI Agent server pulls the stream and determines the user has finished speaking to when TTS returns the first non-silent frame and stream pushing starts. This covers all server-side latency, including at least Automatic Speech Recognition (ASR), Large Language Model (LLM), and Text-to-Speech (TTS) latency.
    • Custom LLM/TTS calls: the time from the API call to the start of stream pushing.
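Putting the definitions above to work, a small helper can pull the average latency figures out of an AgentInstanceDeleted payload for logging or dashboards. The field names match the callback example earlier; the snake_case output keys are illustrative.

```python
def extract_latency(payload: dict) -> dict:
    """Map LatencyData fields from an AgentInstanceDeleted callback
    to descriptive metric names (all averages, times in ms)."""
    latency = payload.get("Data", {}).get("LatencyData", {})
    return {
        "llm_first_token_ms": latency.get("LLMTTFT"),
        "llm_tokens_per_sec": latency.get("LLMTPS"),
        "tts_first_frame_ms": latency.get("TTSAudioFirstFrameTime"),
        "server_total_ms": latency.get("TotalCost"),
    }

# Sample values taken from the callback example above.
deleted_event = {
    "Event": "AgentInstanceDeleted",
    "Data": {
        "Code": 0,
        "LatencyData": {"LLMTTFT": 613, "LLMTPS": 11.493,
                        "TTSAudioFirstFrameTime": 783, "TotalCost": 1693},
    },
}
metrics = extract_latency(deleted_event)
```

Because these values are averaged over the whole conversation, they are best used for trend monitoring rather than per-turn diagnostics.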
