Configuring the LLM
Depending on your use case, you can plug in any third-party LLM (Volcano Ark, MiniMax, Qwen, Stepfun, DeepSeek, and so on) or your own in-house model. This guide walks you through configuring each of these options and highlights key considerations.
LLM Parameter Description
When using a third-party or custom LLM service, you need to configure the LLM parameters.
Parameter | Type | Required | Description |
---|---|---|---|
Url | String | Yes | The LLM endpoint address that the AI Agent calls; must be compatible with the OpenAI protocol. |
ApiKey | String | No | Authentication credential for the models and related services provided by the LLM vendor. |
Model | String | Yes | The model to call. Supported values differ by LLM provider; refer to the corresponding documentation. |
SystemPrompt | String | No | System prompt. Can include role settings, instructions, and response examples. |
Temperature | Float | No | Higher values make the output more random; lower values make it more focused and deterministic. |
TopP | Float | No | Sampling parameter. Smaller values produce more deterministic output; larger values produce more random output. |
Params | Object | No | Other LLM parameters, such as a maximum token limit. Supported options differ by LLM provider; refer to the corresponding documentation and fill in as needed. Note: parameter names must match those used by each vendor's LLM. |
AddAgentInfo | Bool | No | If true, the AI Agent backend includes agent information (agent_info) in requests sent to custom LLM services. Defaults to false. When using a custom LLM, you can implement additional business logic based on this content. |
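For orientation, a minimal configuration that sets only the most common fields might look like this (all values are placeholders):

"LLM": {
    "Url": "https://your-llm-service/chat/completions",
    "ApiKey": "your_api_key",
    "Model": "your_model"
}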
Using Third-party LLMs
Contact ZEGOCLOUD Technical Support first to activate third-party LLM services and obtain the access Url and ApiKey.
Third-party LLMs must be compatible with the OpenAI protocol.
You can set LLM parameters when registering an AI agent (RegisterAgent) or creating an AI agent instance (CreateAgentInstance).
Here are configuration samples for common LLM vendors:
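For instance, configurations for Qwen (via DashScope's OpenAI-compatible mode) and DeepSeek might look like the following. The endpoint paths and model names are illustrative; verify them against each vendor's current documentation:

"LLM": {
    "Url": "https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions",
    "ApiKey": "your_dashscope_api_key",
    "Model": "qwen-plus"
}

"LLM": {
    "Url": "https://api.deepseek.com/chat/completions",
    "ApiKey": "your_deepseek_api_key",
    "Model": "deepseek-chat"
}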
Using a Custom LLM
The ZEGOCLOUD AI Agent server uses the OpenAI API protocol to call LLM services, so you can use any custom LLM that is compatible with that protocol. Under the hood, your custom service can even orchestrate multiple sub-models, or perform RAG and web searches before aggregating the results into its output.
The implementation steps are as follows:
1. Provide an interface compatible with the OpenAI protocol (platform.openai.com). Key points are as follows:
- Endpoint: Define a Url that the AI Agent can call, for example `https://your-custom-llm-service/chat/completions`.
- Request Format: Accept request headers and a request body compatible with the OpenAI protocol.
- Response Format: Return streaming response data that is compatible with the OpenAI protocol and conforms to the SSE specification.
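As an illustration, here is a minimal sketch of such an endpoint in Python using FastAPI. This is a hypothetical skeleton, not ZEGOCLOUD-provided code: the `generate_reply` stub stands in for your actual model, downstream LLM calls, or RAG pipeline, and the route path is an assumption you can change.

import json
import time
import uuid

from fastapi import FastAPI, Request
from fastapi.responses import StreamingResponse

app = FastAPI()

def generate_reply(messages):
    # Placeholder: yield tokens from your actual model or RAG pipeline here.
    for token in ["Hello", ", ", "world", "!"]:
        yield token

@app.post("/chat/completions")
async def chat_completions(request: Request):
    body = await request.json()
    completion_id = f"chatcmpl-{uuid.uuid4().hex}"
    created = int(time.time())
    model = body.get("model", "custom-model")

    def sse_stream():
        # Each content chunk follows the OpenAI chat.completion.chunk shape.
        for token in generate_reply(body.get("messages", [])):
            chunk = {
                "id": completion_id,
                "object": "chat.completion.chunk",
                "created": created,
                "model": model,
                "choices": [{"index": 0, "delta": {"content": token}, "finish_reason": None}],
            }
            yield f"data: {json.dumps(chunk)}\n\n"
        # The last valid data entry must carry "finish_reason":"stop".
        final = {
            "id": completion_id,
            "object": "chat.completion.chunk",
            "created": created,
            "model": model,
            "choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}],
        }
        yield f"data: {json.dumps(final)}\n\n"
        # A termination entry must close the stream.
        yield "data: [DONE]\n\n"

    return StreamingResponse(sse_stream(), media_type="text/event-stream")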
Custom LLM streaming data format considerations:
- Each data entry must start with `data: ` (note the space after the colon).
- The last valid data entry must contain `"finish_reason":"stop"`.
- A termination data entry must be sent at the end: `data: [DONE]`.
An incorrect format may cause the agent to produce no output or incomplete output.
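For reference, a compliant stream looks like this on the wire (chunk payloads abbreviated to the relevant fields):

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]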
2. When registering an AI agent (RegisterAgent) or creating an AI agent instance (CreateAgentInstance), set the LLM configuration to point at your custom service. For example:
"LLM": {
"Url": "https://your-custom-llm-service/chat/completions",
"ApiKey": "your_api_key",
"Model": "your_model",
"SystemPrompt": "You are Xiaozhi, an adult woman, a companion assistant **created by ZEGOCLOUD**. knowledgeable in everything, intelligent, wise, enthusiastic, and friendly. \nDialogue requirements: 1. Dialogue with users according to the requirements of the persona. \n2.No more than 100 words.",
"Temperature": 1,
"TopP": 0.7,
"Params": {
"max_tokens": 1024
}
}