
Configuring LLM


Depending on your use case, you can plug in any third-party LLM, such as Volcano Ark, MiniMax, Qwen, Stepfun, or DeepSeek, or your own in-house model. This guide walks you through configuring each of these kinds of LLMs and highlights key considerations.

LLM Parameter Description

When using a third-party or custom LLM service, you need to configure the following LLM parameters.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| Url | String | Yes | LLM callback address, which must be compatible with the OpenAI protocol. |
| ApiKey | String | No | Authentication credential for accessing the models and related services provided by the LLM vendor. |
| Model | String | Yes | The model to call. Different LLM providers support different models; refer to the corresponding documentation. |
| SystemPrompt | String | No | System prompt. Can include role settings, prompts, and response examples. |
| Temperature | Float | No | Higher values make the output more random, while lower values make it more focused and deterministic. |
| TopP | Float | No | Sampling parameter. Smaller values produce more deterministic output; larger values produce more random output. |
| Params | Object | No | Other LLM parameters, such as a maximum token limit. Different LLM providers support different parameters; refer to the corresponding documentation and fill in as needed. Note: parameter names must match those used by each vendor's LLM. |
| AddAgentInfo | Bool | No | If true, the requests that the AI Agent backend sends to a custom LLM service include the agent information field agent_info. Defaults to false. When using a custom LLM, you can implement additional business logic based on this content. |
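
For reference, a complete LLM configuration combining these parameters might look like the following sketch. Every value here is a placeholder to be replaced with your own service details:

"LLM": {
    "Url": "https://your-llm-service/chat/completions",
    "ApiKey": "your_api_key",
    "Model": "your_model",
    "SystemPrompt": "You are a helpful assistant.",
    "Temperature": 0.9,
    "TopP": 0.7,
    "Params": {
        "max_tokens": 1024
    },
    "AddAgentInfo": false
}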

Using Third-party LLMs

Note

Please contact ZEGOCLOUD Technical Support first to activate third-party LLM services and obtain the access Url and API Key.

Third-party LLMs must be compatible with the OpenAI protocol.

You can set LLM parameters when registering an AI agent (RegisterAgent) or creating an AI agent instance (CreateAgentInstance).

Here are configuration samples for common LLM vendors:
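
For example, a configuration for a vendor such as DeepSeek might look like the sketch below. The Url and Model values are illustrative; use the access Url and API Key provided by ZEGOCLOUD Technical Support and the model name from the vendor's documentation:

"LLM": {
    "Url": "https://api.deepseek.com/chat/completions",
    "ApiKey": "your_api_key",
    "Model": "deepseek-chat",
    "SystemPrompt": "You are a helpful assistant.",
    "Temperature": 0.9,
    "TopP": 0.7,
    "Params": {
        "max_tokens": 1024
    }
}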

Using a Custom LLM

The ZEGOCLOUD AI Agent server uses the OpenAI API protocol to call LLM services, so you can also use any custom LLM that is compatible with the OpenAI protocol. Under the hood, your custom LLM can even call multiple sub-models, or perform RAG retrieval and web search, before aggregating the results into its output.

The implementation steps are as follows:

1. Create a service compatible with the OpenAI API protocol

Provide an interface compatible with platform.openai.com. Key points are as follows:

  • Endpoint: Define a Url that can be called by the AI Agent, for example https://your-custom-llm-service/chat/completions.
  • Request Format: Accept request headers and request body compatible with the OpenAI protocol.
  • Response Format: Return streaming response data that is compatible with the OpenAI protocol and conforms to the SSE specification.
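
For reference, the request body your endpoint receives follows the OpenAI chat completions format, roughly as sketched below. The exact field set depends on your LLM configuration (and, if AddAgentInfo is true, the request also carries agent_info); all values shown are placeholders:

{
    "model": "your_model",
    "messages": [
        { "role": "system", "content": "You are a helpful assistant." },
        { "role": "user", "content": "Hello!" }
    ],
    "stream": true,
    "temperature": 0.9,
    "top_p": 0.7,
    "max_tokens": 1024
}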
Note

Custom LLM streaming data format considerations:

  • Each data entry must start with data: (note the space after the colon).
  • The last valid data entry must contain "finish_reason":"stop".
  • A termination data entry must be sent at the end: data: [DONE].

An incorrect format may cause the agent to produce no output or incomplete output.
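
As a concrete illustration, here is a minimal sketch of such a service written in Python with Flask. The endpoint path, model handling, and hard-coded reply are hypothetical placeholders; a real implementation would generate chunks from your own model or RAG pipeline while keeping the same streaming format:

import json

from flask import Flask, Response, request

app = Flask(__name__)

@app.route("/chat/completions", methods=["POST"])
def chat_completions():
    body = request.get_json()  # OpenAI-style request: model, messages, stream, ...

    def stream():
        # Placeholder reply; replace with output from your own model(s).
        for chunk in ["Hello", ", how can I help you?"]:
            event = {
                "id": "chatcmpl-1",
                "object": "chat.completion.chunk",
                "model": body.get("model", ""),
                "choices": [{"index": 0, "delta": {"content": chunk}, "finish_reason": None}],
            }
            # Each entry must start with "data: " (note the space after the colon).
            yield f"data: {json.dumps(event)}\n\n"
        # The last valid entry must contain "finish_reason":"stop".
        final_event = {
            "id": "chatcmpl-1",
            "object": "chat.completion.chunk",
            "model": body.get("model", ""),
            "choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}],
        }
        yield f"data: {json.dumps(final_event)}\n\n"
        # A termination entry must be sent at the end.
        yield "data: [DONE]\n\n"

    return Response(stream(), mimetype="text/event-stream")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)

Pointing the Url parameter at this service (for example, https://your-custom-llm-service/chat/completions) is then enough for the AI Agent server to consume its streamed responses.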

2. Configure the custom LLM

When registering an AI agent (RegisterAgent) or creating an AI agent instance (CreateAgentInstance), set the configuration for the custom LLM.

"LLM": {
    "Url": "https://your-custom-llm-service/chat/completions",
    "ApiKey": "your_api_key",
    "Model": "your_model",
    "SystemPrompt": "You are Xiaozhi, an adult woman, a companion assistant **created by ZEGOCLOUD**. knowledgeable in everything, intelligent, wise, enthusiastic, and friendly. \nDialogue requirements: 1. Dialogue with users according to the requirements of the persona. \n2.No more than 100 words.",
    "Temperature": 1,
    "TopP": 0.7,
    "Params": {
        "max_tokens": 1024
    }
}

