
Configuring LLM


Depending on your use case, you can plug in any third-party LLM, whether it's Volcano Ark, MiniMax, Qwen, Stepfun, DeepSeek, or your own in-house model. This guide walks you through configuring each of these kinds of LLMs and highlights key considerations.

LLM Parameter Description

When using a third-party LLM service or a custom LLM service, you need to configure the following LLM parameters.

  • Vendor (String, optional): The vendor type of the LLM interface.
      • OpenAIChat: OpenAI's Chat Completions interface type (default).
      • OpenAIResponses: OpenAI's Responses API interface type.
  • Url (String, required): The LLM endpoint URL, which must be compatible with the OpenAI protocol.
  • ApiKey (String, optional): Authentication credential for accessing the models and related services provided by the LLM.
  • Model (String, required): The model to call. Different LLM providers support different models; refer to the corresponding documentation.
  • SystemPrompt (String, optional): System prompt. Can include role settings, instructions, and response examples.
  • Temperature (Float, optional): Higher values make the output more random; lower values make it more focused and deterministic.
  • TopP (Float, optional): Sampling parameter. Smaller values make the output more deterministic; larger values make it more random.
  • Params (Object, optional): Other LLM parameters, such as the maximum token limit. Different LLM providers support different parameters; refer to the corresponding documentation and fill in as needed. Note: parameter names must match those used by each vendor's LLM.
  • AddAgentInfo (Bool, optional): If true, the AI Agent backend includes agent information (agent_info) in the request parameters when it calls a custom LLM service. Defaults to false. A custom LLM can implement additional business logic based on this content. Note: only effective when Vendor is "OpenAIChat".
  • AgentExtraInfo (Object, optional): Extra agent information passed to the LLM service when the server requests it. For an example of the passed parameters, see Using Custom LLM. You can use this parameter to run additional business logic in your custom LLM service. Note: only effective when Vendor is "OpenAIChat".
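
Putting these parameters together, a minimal LLM configuration might look like the following sketch. All values are placeholders; substitute your own endpoint, key, and model:

"LLM": {
    "Url": "https://your-llm-service/chat/completions",
    "ApiKey": "your_api_key",
    "Model": "your_model",
    "SystemPrompt": "You are a helpful assistant.",
    "Temperature": 0.7,
    "TopP": 0.9,
    "Params": {
        "max_tokens": 1024
    }
}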

Using Third-party LLMs

Note

Third-party LLMs must be compatible with the OpenAI protocol.

Warning

The maximum value of the max_tokens parameter varies across vendors and models. Set a reasonable value according to your actual needs; if you are unsure what this parameter means, leave it unset and use the default value.

  • If max_tokens exceeds the model limit, the request may fail.
  • If max_tokens is set too small, the output may be incomplete and the response may be truncated.

You can set LLM parameters when registering an AI agent (RegisterAgent) or creating an AI agent instance (CreateAgentInstance).

Here are configuration samples for common LLM vendors:
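
For example, a Qwen configuration through Alibaba Cloud's OpenAI-compatible mode might look like the following sketch. The ApiKey is a placeholder and the model name is illustrative; confirm the Url and Model against the vendor's documentation:

"LLM": {
    "Url": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions",
    "ApiKey": "your_api_key",
    "Model": "qwen-plus",
    "SystemPrompt": "You are a helpful assistant.",
    "Temperature": 0.7,
    "TopP": 0.9,
    "Params": {
        "max_tokens": 1024
    }
}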

Using Custom LLM

The ZEGOCLOUD AI Agent server calls LLM services using the OpenAI API protocol, so you can use any custom LLM that is compatible with the OpenAI protocol. Under the hood, a custom LLM can even call multiple sub-models, or run RAG searches and web searches, before aggregating the results into its output.

Step 1: Implement Custom LLM

Create an interface that conforms to the OpenAI Chat Completions API protocol, as sketched below.
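
The following is a minimal sketch of such a service, assuming Node.js with Express; the route, port, and canned reply are all illustrative. It streams OpenAI-style chat.completion.chunk events over SSE and ends with the [DONE] sentinel:

import express from "express";

const app = express();
app.use(express.json());

// OpenAI-compatible Chat Completions endpoint.
app.post("/chat/completions", (req, res) => {
    const { model, messages } = req.body;
    // A canned reply stands in for a real model call, RAG search, web search, etc.
    const lastUserMessage = messages?.at(-1)?.content ?? "";
    const reply = `You said: "${lastUserMessage}". This answer comes from a custom LLM service.`;

    // This sketch always streams, regardless of the request's stream flag.
    res.setHeader("Content-Type", "text/event-stream");
    const id = `chatcmpl-${Date.now()}`;
    const created = Math.floor(Date.now() / 1000);

    // Emit the reply word by word in the OpenAI streaming chunk format.
    for (const token of reply.split(" ")) {
        const chunk = {
            id, object: "chat.completion.chunk", created, model,
            choices: [{ index: 0, delta: { content: token + " " }, finish_reason: null }]
        };
        res.write(`data: ${JSON.stringify(chunk)}\n\n`);
    }

    // The final chunk carries finish_reason, followed by the [DONE] sentinel.
    const done = {
        id, object: "chat.completion.chunk", created, model,
        choices: [{ index: 0, delta: {}, finish_reason: "stop" }]
    };
    res.write(`data: ${JSON.stringify(done)}\n\n`);
    res.write("data: [DONE]\n\n");
    res.end();
});

app.listen(3000);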

Step 2: Register Agent and Use Custom LLM

When registering the agent (RegisterAgent), set the custom LLM Url, and use the SystemPrompt to require the LLM to answer the user's question based on the provided knowledge base content.

Register Agent Call Example
// Please replace the LLM and TTS authentication parameters such as ApiKey, appid, token, etc. with your actual authentication parameters.
async registerAgent(agentId: string, agentName: string) {
    // Request interface: https://aigc-aiagent-api.zegotech.cn?Action=RegisterAgent
    const action = 'RegisterAgent';
    const body = {
        AgentId: agentId,
        Name: agentName,
        LLM: {
            Url: "https://your-custom-llm-service/chat/completions",
            ApiKey: "your_api_key",
            Model: "your_model",
            SystemPrompt: "Please answer the user's question in a friendly manner based on the knowledge base content provided by the user. If the user's question is not in the knowledge base, please politely tell the user that we do not have related knowledge base content."
        },
        TTS: {
            Vendor: "ByteDance",
            Params: {
                "app": {
                    "appid": "zego_test",
                    "token": "zego_test",
                    "cluster": "volcano_tts"
                },
                "audio": {
                    "voice_type": "zh_female_wanwanxiaohe_moon_bigtts"
                }
            }
        }
    };
    // The sendRequest method encapsulates the request URL and public parameters. For details, please refer to: https://doc-zh.zego.im/aiagent-server/api-reference/accessing-server-apis
    return this.sendRequest<any>(action, body);
}

You can now chat with your custom LLM.
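
Before wiring the service into an agent, you can sanity-check the endpoint directly. Here is a sketch using the built-in fetch of Node.js 18+; the Url, ApiKey, and Model are the same placeholders as in the registration example above:

// Send an OpenAI-style streaming request to the custom endpoint
// and print the raw SSE chunks as they arrive.
const res = await fetch("https://your-custom-llm-service/chat/completions", {
    method: "POST",
    headers: {
        "Content-Type": "application/json",
        "Authorization": "Bearer your_api_key"
    },
    body: JSON.stringify({
        model: "your_model",
        stream: true,
        messages: [{ role: "user", content: "Hello, please introduce yourself" }]
    })
});

const reader = res.body!.getReader();
const decoder = new TextDecoder();
while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    process.stdout.write(decoder.decode(value));
}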

Best Practices

For detailed usage examples, refer to Use AI Agent with RAG.

How to Configure a Multimodal LLM: Text Input and Audio Output

Compared with a pure text-in, text-out LLM request, a request with audio output carries an additional modalities field to specify the output modalities and an audio field to specify the output voice and format. Both fields are set in the LLM.Params parameter.

Note

When calling CreateAgentInstance or CreateDigitalHumanAgentInstance, set DisableTTS to true in AdvancedConfig to disable TTS functionality, since the audio is produced by the LLM itself.
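
In the request body, the relevant fragment looks like the following sketch; confirm the exact shape of AdvancedConfig against the CreateAgentInstance reference:

"AdvancedConfig": {
    "DisableTTS": true
}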

The following is an example of LLM audio output parameter configuration:

"LLM": {
    "Url": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/chat/completions",
    "ApiKey": "your_api_key",
    "Model": "qwen3-omni-flash",
    "SystemPrompt": "You are Xiao Zhi, an adult female, a **companion assistant created by ZEGOCLOUD Technology**, knowledgeable in astronomy and geography, smart, wise, enthusiastic, and friendly.\nDialogue requirements: 1. Interact with users according to the character requirements.\n2. Do not exceed 100 words.",
    "Temperature": 1,
    "TopP": 0.7,
    "Params": {
        "max_tokens": 1024,
        "stream":true,
        "stream_options":{"include_usage":true},
        "modalities":["text","audio"],
        "audio":{"voice":"Cherry","format":"pcm"}
    }
}

Validate LLM Parameters

If you are unsure whether your LLM parameters are correct, you can verify them with the LLM parameter validator:

  • Select the appropriate LLM vendor, choose the development language, fill in the LLM parameters, and click the validate button.
  • The user message field can hold a test message to check whether the LLM correctly understands the user's intent and returns a proper response. If left empty, the default user message "Hello, please introduce yourself" is used.
  • The "Sample Code" tab shows sample code for the selected language.
  • The "Actual Request LLM Parameters" tab shows the parameters actually sent to the LLM; compare them with the selected vendor's official documentation to make sure they are filled in correctly.
(Screenshot: LLM parameter validator)
