Getting MetaInfo During AI Broadcast
Feature Overview
MetaInfo enables the AI Agent to notify the business service when its broadcast reaches specified keywords or key nodes. Based on this capability, you can trigger business logic at the moment users actually hear certain nodes or key information, for example:
- In AI digital human intelligent tutoring scenarios, when the digital human teacher responds to a student with "You're right, teacher gives you a thumbs up," the moment the AI starts broadcasting "thumbs up" triggers the digital human to perform the corresponding thumbs-up action.
- In AI e-commerce live streaming, when the anchor says "link is up", the business side receives a callback notification and triggers the display of the shopping link or a popup.
Because the timing of the voice broadcast must be aligned with the delivery of the notification, this feature places requirements on the TTS vendor. Currently, only the following TTS vendors are supported:
- MiniMax: MiniMax TTS
- ByteDance: Volcano Engine unidirectional streaming TTS
Refer to the Configure TTS documentation to learn how to set up a TTS vendor.

Implement Exposing MetaInfo When AI Speaks at Fixed Nodes
Prerequisites
- Enable AI Agent service and complete the basic process according to Quick Start.
- Configure TTS vendor to MiniMax or ByteDance according to the feature requirements.
Configure wrapper characters and MetaInfo format
By configuring the AdvancedConfig.LLMMetaInfo parameter of the Create Agent Instance API, you specify the wrapper characters and the format of MetaInfo.
| Parameter Name | Type | Description | Example |
|---|---|---|---|
| BeginCharacters | string | The starting symbol that marks metadata in text. Content from this symbol until the EndCharacters symbol is metadata. Cannot be empty or consist only of spaces. Do not use common characters or sentence separators. | [[ |
| EndCharacters | string | The ending symbol that marks metadata in text. Content between the BeginCharacters symbol and this symbol is metadata. Cannot be empty or consist only of spaces. Do not use common characters or sentence separators. | ]] |
| SendMetaKeys | string array | Specifies the keys to send in MetaInfo. Important: only keys and values within this range are sent to the business side, via room signaling (room message with Cmd = 102) or server callback (event "AgentInstanceMetaInfo"). | ["emotion", "action"] |
Let's take specifying text wrapped in "[[" and "]]" in the LLM output as metadata as an example. The configuration is as follows:

```json
"LLMMetaInfo": {
    "BeginCharacters": "[[",
    "EndCharacters": "]]",
    "SendMetaKeys": ["action"]
}
```

After this configuration, when the AI speaks text wrapped with "[[" and "]]", the AI Agent service sends the metadata with key "action" to the business side via room signaling or callback.
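For context, the configuration above sits inside the Create Agent Instance request body. The sketch below shows one way to assemble such a body; the fields surrounding AdvancedConfig.LLMMetaInfo (such as AgentId) are illustrative placeholders, not the full real request schema.

```python
import json

def build_create_agent_body(agent_id: str) -> dict:
    """Build a partial Create Agent Instance request body containing
    the AdvancedConfig.LLMMetaInfo block described above."""
    return {
        "AgentId": agent_id,  # illustrative placeholder field
        "AdvancedConfig": {
            "LLMMetaInfo": {
                "BeginCharacters": "[[",
                "EndCharacters": "]]",
                "SendMetaKeys": ["action"],
            }
        },
    }

body = build_create_agent_body("agent_demo")
print(json.dumps(body["AdvancedConfig"]["LLMMetaInfo"], indent=2))
```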
Let AI speak text containing MetaInfo
There are two ways to make the AI speak text containing MetaInfo:
- Instruct the LLM (for example, via the system prompt) to output text with metadata wrapped in the configured characters.
- Actively drive the AI to broadcast text that already contains the wrapped metadata.
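As an illustration of the LLM-driven route, the snippet below shows a hypothetical system-prompt fragment that asks the model to wrap metadata in the configured characters, together with a sample reply. The prompt wording and the "action" payload format are assumptions, not an official template.

```python
# Hypothetical system-prompt fragment (wording is an assumption):
SYSTEM_PROMPT = (
    "When your reply implies an action, append it as metadata in the form "
    '[[{"action": "<action-name>"}]] immediately after the sentence.'
)

# Example of what a reply following that instruction could look like:
sample_reply = (
    "You're right, teacher gives you a thumbs up. "
    '[[{"action": "thumbs_up"}]]'
)
print(sample_reply)
```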
Get MetaInfo
There are two ways to get MetaInfo:
- The client gets MetaInfo through real-time audio and video (RTC) room signaling.
- The server gets MetaInfo through callbacks.
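To make the delivery mechanism concrete, here is a minimal sketch of extracting metadata wrapped between the configured characters from a piece of broadcast text, keeping only the keys listed in SendMetaKeys. This mirrors locally what the AI Agent service does before sending the room message (Cmd = 102) or the "AgentInstanceMetaInfo" callback; the JSON payload format inside the wrappers is an assumption for illustration.

```python
import json

BEGIN, END = "[[", "]]"          # the configured wrapper characters
SEND_META_KEYS = {"action"}      # mirrors SendMetaKeys in the config

def extract_meta(text: str) -> list:
    """Collect every metadata object wrapped between BEGIN and END,
    keeping only keys listed in SEND_META_KEYS."""
    found, pos = [], 0
    while (start := text.find(BEGIN, pos)) != -1:
        stop = text.find(END, start + len(BEGIN))
        if stop == -1:
            break  # unterminated wrapper; stop scanning
        raw = text[start + len(BEGIN):stop]
        try:
            meta = json.loads(raw)
            filtered = {k: v for k, v in meta.items() if k in SEND_META_KEYS}
            if filtered:
                found.append(filtered)
        except (json.JSONDecodeError, AttributeError):
            pass  # ignore malformed fragments
        pos = stop + len(END)
    return found

text = 'Teacher gives you a thumbs up. [[{"action": "thumbs_up"}]]'
print(extract_meta(text))  # [{'action': 'thumbs_up'}]
```

On the client, the same kind of filtered dictionary arrives in the Cmd = 102 room message, so business logic (e.g., playing the thumbs-up animation) can be keyed off the "action" value.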
