Modern apps need chatbots that understand users and respond naturally. Whether you’re building customer support, educational tools, or interactive assistants, users now expect a chatbot to handle both text messages and voice interactions. This guide walks you through building an AI chatbot with text and voice capabilities using ZEGOCLOUD. Step by step, you’ll create a responsive interface that works across devices, integrate real-time AI processing, and deploy a chatbot that feels natural and engaging.
How to Make an AI Chatbot in Easy Steps
Chatbot development traditionally means connecting several services: natural language processing, text-to-speech, speech recognition, and real-time messaging. Each one demands its own integration, authentication, and error handling, so keeping latency low and reliability high quickly becomes a complex challenge.
ZEGOCLOUD’s AI Agent platform simplifies this process with an all-in-one chatbot solution. Instead of juggling multiple integrations, you can build intelligent chatbots through a single SDK that seamlessly manages text conversations, processes voice input, and delivers natural speech responses. This approach not only reduces development complexity but also ensures a smoother, real-time experience for users.

Prerequisites
Before you begin, gather these essential components:
- ZEGOCLOUD developer account with an active AppID and ServerSecret from the console.
- Node.js 20.19+ (or 22.12+) installed locally for both backend and frontend development; the client uses Vite 7, which no longer supports Node.js 18.
- OpenAI API key or compatible language model provider for intelligent responses.
- Code editor with TypeScript support for a better development experience.
- Testing device with microphone access, since the chatbot's voice features require actual hardware.
- Basic familiarity with React hooks and Express.js for building the user interface and API endpoints.
Once you have these prerequisites in place, you can proceed with the steps below.
1. Chatbot Backend
Begin by creating the server infrastructure that manages chatbot sessions and handles AI agent communication. This backend serves as the bridge between your frontend interface and ZEGOCLOUD’s AI services.
Create your project structure and initialize the backend:
mkdir ai-chatbot
cd ai-chatbot
mkdir server client
cd server
npm init -y
Configure the server dependencies in server/package.json:
{
"name": "ai-chatbot-server",
"version": "1.0.0",
"type": "module",
"scripts": {
"dev": "tsx watch src/server.ts",
"build": "tsc",
"start": "node dist/server.js"
},
"dependencies": {
"express": "^5.1.0",
"cors": "^2.8.5",
"dotenv": "^17.2.1",
"axios": "^1.11.0"
},
"devDependencies": {
"@types/express": "^5.0.3",
"@types/cors": "^2.8.19",
"@types/node": "^24.3.0",
"typescript": "^5.9.2",
"tsx": "^4.20.4"
}
}
This configuration establishes a TypeScript-enabled Express server with development hot reloading and production build capabilities.
Set up environment variables in server/.env:
# ZEGOCLOUD Configuration
ZEGO_APP_ID=your_zego_app_id
ZEGO_SERVER_SECRET=your_zego_server_secret
ZEGO_API_BASE_URL=https://aigc-aiagent-api.zegotech.cn
# AI Model Configuration
LLM_URL=https://api.openai.com/v1/chat/completions
LLM_API_KEY=your_openai_api_key
LLM_MODEL=gpt-4o-mini
# Server Settings
PORT=8080
These variables configure your ZEGOCLOUD credentials, AI model settings, and server port. Replace placeholder values with your actual credentials from the respective service dashboards.
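The variables above can also be gathered into one typed object at startup. The following is a minimal sketch, not part of the original server code: `loadConfig` and `ServerConfig` are hypothetical names, and treating `ZEGO_API_BASE_URL` and `LLM_URL` as required is an assumption based on the server reading both values.

```typescript
// Hypothetical helper: collects the variables from server/.env into one typed object.
interface ServerConfig {
  zegoAppId: number
  zegoServerSecret: string
  zegoApiBaseUrl: string
  llmUrl: string
  llmApiKey: string
  llmModel: string
  port: number
}

type Env = Record<string, string | undefined>

function loadConfig(env: Env): ServerConfig {
  // Assumption: every value the server reads without a fallback is required.
  const required = ['ZEGO_APP_ID', 'ZEGO_SERVER_SECRET', 'ZEGO_API_BASE_URL', 'LLM_URL', 'LLM_API_KEY']
  const missing = required.filter((name) => !env[name])
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`)
  }

  // The token generator expects a numeric app ID, so parse it once here.
  const zegoAppId = Number.parseInt(env.ZEGO_APP_ID as string, 10)
  if (!Number.isInteger(zegoAppId) || zegoAppId <= 0) {
    throw new Error('ZEGO_APP_ID must be a positive integer')
  }

  return {
    zegoAppId,
    zegoServerSecret: env.ZEGO_SERVER_SECRET as string,
    zegoApiBaseUrl: env.ZEGO_API_BASE_URL as string,
    llmUrl: env.LLM_URL as string,
    llmApiKey: env.LLM_API_KEY as string,
    llmModel: env.LLM_MODEL || 'gpt-4o-mini', // same default as the .env example
    port: Number.parseInt(env.PORT || '8080', 10),
  }
}
```

In a server file this could be called as `loadConfig(process.env)` right after `dotenv.config()`, so that a missing or malformed value fails at startup rather than on the first request.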
Install dependencies and create the TypeScript configuration:
npm install
Create server/tsconfig.json:
{
"compilerOptions": {
"target": "ES2022",
"module": "ESNext",
"moduleResolution": "bundler",
"allowSyntheticDefaultImports": true,
"esModuleInterop": true,
"strict": true,
"outDir": "./dist",
"rootDir": "./src",
"declaration": true,
"sourceMap": true,
"resolveJsonModule": true
},
"include": ["src/**/*"],
"exclude": ["node_modules", "dist"]
}
This TypeScript configuration enables modern JavaScript features with strict type checking for reliable chatbot backend development.
2. ZEGOCLOUD Token Generation
Create the authentication token generator at server/zego-token.cjs:
"use strict";
const crypto = require("crypto");

// Builds a ZEGOCLOUD token04 string for the given app, user, and lifetime.
function generateToken04(appId, userId, secret, effectiveTimeInSeconds, payload) {
    if (!appId || typeof appId !== 'number') {
        throw new Error('appID invalid');
    }
    if (!userId || typeof userId !== 'string' || userId.length > 64) {
        throw new Error('userId invalid');
    }
    if (!secret || typeof secret !== 'string' || secret.length !== 32) {
        throw new Error('secret must be a 32 byte string');
    }
    if (!(effectiveTimeInSeconds > 0)) {
        throw new Error('effectiveTimeInSeconds invalid');
    }
    const VERSION_FLAG = '04';
    const createTime = Math.floor(Date.now() / 1000);
    const tokenInfo = {
        app_id: appId,
        user_id: userId,
        // Random signed 32-bit integer
        nonce: Math.floor(Math.random() * (Math.pow(2, 31) - (-Math.pow(2, 31)) + 1)) + (-Math.pow(2, 31)),
        ctime: createTime,
        expire: createTime + effectiveTimeInSeconds,
        payload: payload || ''
    };
    const plainText = JSON.stringify(tokenInfo);
    // AES-GCM encryption. createCipheriv takes the key and an explicit 12-byte IV;
    // the legacy createCipher has no IV parameter and is gone from current Node.js releases.
    const nonce = crypto.randomBytes(12);
    const cipher = crypto.createCipheriv('aes-256-gcm', secret, nonce);
    cipher.setAutoPadding(true);
    const encrypted = cipher.update(plainText, 'utf8');
    // Ciphertext followed by the GCM authentication tag
    const encryptBuf = Buffer.concat([encrypted, cipher.final(), cipher.getAuthTag()]);
    // Binary token assembly (big-endian):
    // expire (8 bytes) | nonce length (2) | nonce | ciphertext length (2) | ciphertext | mode (1)
    const b1 = new Uint8Array(8);
    const b2 = new Uint8Array(2);
    const b3 = new Uint8Array(2);
    const b4 = new Uint8Array(1);
    new DataView(b1.buffer).setBigInt64(0, BigInt(tokenInfo.expire), false);
    new DataView(b2.buffer).setUint16(0, nonce.byteLength, false);
    new DataView(b3.buffer).setUint16(0, encryptBuf.byteLength, false);
    new DataView(b4.buffer).setUint8(0, 1);
    const buf = Buffer.concat([
        Buffer.from(b1),
        Buffer.from(b2),
        Buffer.from(nonce),
        Buffer.from(b3),
        Buffer.from(encryptBuf),
        Buffer.from(b4),
    ]);
    return VERSION_FLAG + buf.toString('base64');
}

module.exports = { generateToken04 };
This token generator creates secure authentication tokens using ZEGOCLOUD’s token04 format with AES-GCM encryption for chatbot session security.
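When a token is rejected, it can help to inspect its binary layout without decrypting it. The sketch below reads back the fields in the order the generator writes them. `parseToken04Layout` and `Token04Layout` are hypothetical names introduced for illustration, and the 16-byte tag length assumes Node.js's default for AES-GCM.

```typescript
import { Buffer } from 'node:buffer'

// Layout written by generateToken04 (all integers big-endian):
// "04" + base64( expire:int64 | nonceLen:uint16 | nonce | cipherLen:uint16 | cipherText | mode:uint8 )
interface Token04Layout {
  version: string
  expire: number // Unix time in seconds
  nonce: Buffer // AES-GCM IV
  cipherText: Buffer // encrypted JSON followed by the auth tag (16 bytes by default)
  mode: number // trailing byte; the generator writes 1
}

// Hypothetical debugging helper: splits a token into its fields without decrypting it.
function parseToken04Layout(token: string): Token04Layout {
  const version = token.slice(0, 2)
  if (version !== '04') {
    throw new Error(`Unexpected token version: ${version}`)
  }
  const buf = Buffer.from(token.slice(2), 'base64')

  let offset = 0
  const expire = Number(buf.readBigInt64BE(offset))
  offset += 8
  const nonceLen = buf.readUInt16BE(offset)
  offset += 2
  const nonce = buf.subarray(offset, offset + nonceLen)
  offset += nonceLen
  const cipherLen = buf.readUInt16BE(offset)
  offset += 2
  const cipherText = buf.subarray(offset, offset + cipherLen)
  offset += cipherLen
  // Throws a RangeError if the token was truncated before the mode byte.
  const mode = buf.readUInt8(offset)

  return { version, expire, nonce, cipherText, mode }
}
```

Comparing `expire` with the current Unix time is a quick way to tell an expired token from a malformed one.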
3. Server Implementation
Create the main server file at server/src/server.ts:
import express from 'express'
import cors from 'cors'
import dotenv from 'dotenv'
import axios from 'axios'
import { generateToken04 } from '../zego-token.cjs'
dotenv.config()
const app = express()
const PORT = process.env.PORT || 8080
// Middleware setup
app.use(cors({
origin: ['http://localhost:5173', 'http://localhost:3000'],
credentials: true
}))
app.use(express.json())
// Environment validation
const requiredVars = ['ZEGO_APP_ID', 'ZEGO_SERVER_SECRET', 'ZEGO_API_BASE_URL', 'LLM_URL', 'LLM_API_KEY']
const missingVars = requiredVars.filter(varName => !process.env[varName])
if (missingVars.length > 0) {
console.error('❌ Missing environment variables:', missingVars)
process.exit(1)
}
const ZEGO_APP_ID = parseInt(process.env.ZEGO_APP_ID!, 10)
const ZEGO_SERVER_SECRET = process.env.ZEGO_SERVER_SECRET!
const ZEGO_API_BASE_URL = process.env.ZEGO_API_BASE_URL!
const LLM_URL = process.env.LLM_URL!
const LLM_API_KEY = process.env.LLM_API_KEY!
const LLM_MODEL = process.env.LLM_MODEL || 'gpt-4o-mini'
// Health check endpoint
app.get('/health', (req, res) => {
res.json({
status: 'healthy',
chatbot: 'ready',
timestamp: new Date().toISOString()
})
})
// Generate authentication token
app.get('/api/token', (req, res) => {
try {
const { user_id } = req.query
if (!user_id || typeof user_id !== 'string') {
return res.status(400).json({
success: false,
error: 'User ID required for chatbot session'
})
}
const token = generateToken04(
ZEGO_APP_ID,
user_id,
ZEGO_SERVER_SECRET,
7200, // 2 hours
''
)
console.log(`🔑 Generated chatbot token for user: ${user_id}`)
res.json({
success: true,
token,
expires_in: 7200
})
} catch (error) {
console.error('❌ Token generation failed:', error)
res.status(500).json({
success: false,
error: 'Failed to generate chatbot token'
})
}
})
// Start chatbot session
app.post('/api/chatbot/start', async (req, res) => {
try {
const { room_id, user_id } = req.body
if (!room_id || !user_id) {
return res.status(400).json({
success: false,
error: 'Room ID and User ID required for chatbot'
})
}
console.log(`🤖 Starting chatbot session for room: ${room_id}`)
const response = await axios.post(`${ZEGO_API_BASE_URL}/v1/ai_agent/start`, {
app_id: ZEGO_APP_ID,
room_id: room_id,
user_id: user_id,
user_stream_id: `${user_id}_stream`,
ai_agent_config: {
llm_config: {
url: LLM_URL,
api_key: LLM_API_KEY,
model: LLM_MODEL,
context: [
{
role: "system",
content: "You are a helpful AI chatbot assistant. Provide concise, friendly, and informative responses. Keep answers conversational and engaging while being helpful."
}
]
},
tts_config: {
provider: "elevenlabs",
voice_id: "pNInz6obpgDQGcFmaJgB",
model: "eleven_turbo_v2_5"
},
asr_config: {
provider: "deepgram",
language: "en"
}
}
}, {
timeout: 30000
})
if (response.data?.data?.ai_agent_instance_id) {
const agentInstanceId = response.data.data.ai_agent_instance_id
console.log(`✅ Chatbot started: ${agentInstanceId}`)
res.json({
success: true,
chatbotId: agentInstanceId,
room_id,
user_id
})
} else {
throw new Error('Invalid response from ZEGOCLOUD')
}
} catch (error: any) {
console.error('❌ Chatbot start failed:', error.response?.data || error.message)
res.status(500).json({
success: false,
error: 'Failed to start chatbot session'
})
}
})
// Send message to chatbot
app.post('/api/chatbot/message', async (req, res) => {
try {
const { chatbot_id, message } = req.body
if (!chatbot_id || !message) {
return res.status(400).json({
success: false,
error: 'Chatbot ID and message required'
})
}
console.log(`💬 Sending to chatbot ${chatbot_id}: ${message.substring(0, 50)}...`)
await axios.post(`${ZEGO_API_BASE_URL}/v1/ai_agent/chat`, {
ai_agent_instance_id: chatbot_id,
messages: [
{
role: "user",
content: message
}
]
}, {
timeout: 30000
})
res.json({
success: true,
message: 'Message sent to chatbot'
})
} catch (error: any) {
console.error('❌ Chatbot message failed:', error.response?.data || error.message)
res.status(500).json({
success: false,
error: 'Failed to send message to chatbot'
})
}
})
// Stop chatbot session
app.post('/api/chatbot/stop', async (req, res) => {
try {
const { chatbot_id } = req.body
if (!chatbot_id) {
return res.status(400).json({
success: false,
error: 'Chatbot ID required'
})
}
console.log(`🛑 Stopping chatbot: ${chatbot_id}`)
await axios.post(`${ZEGO_API_BASE_URL}/v1/ai_agent/stop`, {
ai_agent_instance_id: chatbot_id
}, {
timeout: 30000
})
res.json({
success: true,
message: 'Chatbot session stopped'
})
} catch (error: any) {
console.error('❌ Chatbot stop failed:', error.response?.data || error.message)
res.status(500).json({
success: false,
error: 'Failed to stop chatbot'
})
}
})
app.listen(PORT, () => {
console.log(`🚀 Chatbot server running on port ${PORT}`)
console.log(`🤖 Ready to serve AI chatbot requests`)
})
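The route handlers above check `req.body` fields inline. As a minimal sketch of an alternative, the check for `/api/chatbot/start` can be moved into a type guard that returns either a typed value or an error message. `parseStartRequest`, `StartRequest`, and `ParseResult` are hypothetical names, and the 64-character limit mirrors the `userId` check in the token generator.

```typescript
// Shape the /api/chatbot/start handler expects in req.body.
interface StartRequest {
  room_id: string
  user_id: string
}

type ParseResult<T> = { ok: true; value: T } | { ok: false; error: string }

// Hypothetical helper: narrows an untyped request body to StartRequest.
function parseStartRequest(body: unknown): ParseResult<StartRequest> {
  if (typeof body !== 'object' || body === null) {
    return { ok: false, error: 'Request body must be a JSON object' }
  }
  const { room_id, user_id } = body as Record<string, unknown>
  if (typeof room_id !== 'string' || room_id.length === 0) {
    return { ok: false, error: 'room_id must be a non-empty string' }
  }
  // generateToken04 rejects user IDs longer than 64 characters, so reject them here too.
  if (typeof user_id !== 'string' || user_id.length === 0 || user_id.length > 64) {
    return { ok: false, error: 'user_id must be a string of 1 to 64 characters' }
  }
  return { ok: true, value: { room_id, user_id } }
}
```

A handler could respond with status 400 and `result.error` when `result.ok` is false, and otherwise use `result.value` with full type information.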
4. Frontend Chatbot Setup
Navigate to the client directory and initialize the React chatbot interface:
cd ../client
npm init -y
Configure the frontend dependencies in client/package.json:
{
"name": "ai-chatbot-client",
"private": true,
"version": "0.0.0",
"type": "module",
"scripts": {
"dev": "vite",
"build": "tsc -b && vite build",
"preview": "vite preview"
},
"dependencies": {
"@tailwindcss/vite": "^4.1.11",
"axios": "^1.11.0",
"framer-motion": "^12.23.12",
"lucide-react": "^0.536.0",
"react": "^19.1.0",
"react-dom": "^19.1.0",
"tailwindcss": "^4.1.11",
"zego-express-engine-webrtc": "^3.10.0",
"zod": "^4.0.15"
},
"devDependencies": {
"@types/react": "^19.1.8",
"@types/react-dom": "^19.1.6",
"@vitejs/plugin-react": "^4.6.0",
"typescript": "~5.8.3",
"vite": "^7.0.4"
}
}
This configuration includes essential packages for building a modern React chatbot with real-time communication capabilities.
Install dependencies and create Vite configuration:
npm install
Create client/vite.config.ts:
import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'
import tailwindcss from '@tailwindcss/vite'
export default defineConfig({
plugins: [react(), tailwindcss()],
define: {
global: 'globalThis',
},
optimizeDeps: {
include: ['zego-express-engine-webrtc'],
}
})
This Vite configuration optimizes the ZEGOCLOUD SDK and enables Tailwind CSS for styling the chatbot interface.
5. TypeScript Types and Configuration
Now, create the TypeScript configuration at client/tsconfig.json:
{
"compilerOptions": {
"target": "ES2022",
"useDefineForClassFields": true,
"lib": ["ES2022", "DOM", "DOM.Iterable"],
"module": "ESNext",
"skipLibCheck": true,
"moduleResolution": "bundler",
"allowImportingTsExtensions": true,
"verbatimModuleSyntax": true,
"noEmit": true,
"jsx": "react-jsx",
"strict": true,
"noUnusedLocals": true,
"noUnusedParameters": true,
"noFallthroughCasesInSwitch": true
},
"include": ["src"]
}
Define chatbot types at client/src/types/index.ts:
export interface ChatMessage {
id: string
content: string
sender: 'user' | 'bot'
timestamp: number
type: 'text' | 'voice'
isStreaming?: boolean
transcript?: string
}
export interface ChatbotSession {
roomId: string
userId: string
chatbotId?: string
isActive: boolean
voiceEnabled: boolean
}
export interface ChatbotConfig {
personality: string
responseStyle: 'concise' | 'detailed' | 'friendly'
voiceSettings: {
enabled: boolean
autoPlay: boolean
speechRate: number
}
}
These types define the core data structures for chatbot messages, sessions, and configuration options.
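To show how these types are meant to be filled in, here is a minimal sketch of a factory for user messages. `createUserMessage` is a hypothetical helper, and the `ChatMessage` interface is repeated locally only so the example stands alone. The ID scheme (`text_` or `voice_` plus a timestamp) follows the one used by the chat component later in this guide.

```typescript
// Local copy of ChatMessage from client/src/types/index.ts, repeated so the sketch is self-contained.
interface ChatMessage {
  id: string
  content: string
  sender: 'user' | 'bot'
  timestamp: number
  type: 'text' | 'voice'
  isStreaming?: boolean
  transcript?: string
}

// Hypothetical factory. `now` is injectable so the result is deterministic in tests.
function createUserMessage(
  content: string,
  type: 'text' | 'voice',
  now: number = Date.now()
): ChatMessage {
  const trimmed = content.trim()
  const message: ChatMessage = {
    id: `${type}_${now}`,
    content: trimmed,
    sender: 'user',
    timestamp: now,
    type,
  }
  // Voice messages also keep the recognized text as their transcript.
  if (type === 'voice') {
    message.transcript = trimmed
  }
  return message
}
```

Centralizing message creation like this keeps the `id`, `timestamp`, and `transcript` fields consistent wherever messages are added.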
6. ZEGOCLOUD Service Integration
Create the ZEGOCLOUD service at client/src/services/zego.ts:
import { ZegoExpressEngine } from 'zego-express-engine-webrtc'
export class ChatbotZegoService {
private static instance: ChatbotZegoService
private zg: ZegoExpressEngine | null = null
private isInitialized = false
private currentRoomId: string | null = null
private currentUserId: string | null = null
private localStream: any = null
private messageCallback: ((message: any) => void) | null = null
static getInstance(): ChatbotZegoService {
if (!ChatbotZegoService.instance) {
ChatbotZegoService.instance = new ChatbotZegoService()
}
return ChatbotZegoService.instance
}
async initialize(appId: string, server: string): Promise<void> {
if (this.isInitialized) return
try {
this.zg = new ZegoExpressEngine(parseInt(appId), server)
this.setupEventListeners()
this.isInitialized = true
console.log('✅ Chatbot ZEGO service initialized')
} catch (error) {
console.error('❌ ZEGO initialization failed:', error)
throw error
}
}
private setupEventListeners(): void {
if (!this.zg) return
// Handle chatbot messages
this.zg.on('recvExperimentalAPI', (result: any) => {
const { method, content } = result
if (method === 'onRecvRoomChannelMessage') {
try {
const message = JSON.parse(content.msgContent)
console.log('🤖 Chatbot message received:', message)
if (this.messageCallback) {
this.messageCallback(message)
}
} catch (error) {
console.error('Failed to parse chatbot message:', error)
}
}
})
// Handle audio streams for voice responses
this.zg.on('roomStreamUpdate', async (_roomID: string, updateType: string, streamList: any[]) => {
if (updateType === 'ADD' && streamList.length > 0) {
for (const stream of streamList) {
if (stream.streamID !== `${this.currentUserId}_stream`) {
try {
const mediaStream = await this.zg!.startPlayingStream(stream.streamID)
if (mediaStream) {
const audioElement = document.getElementById('chatbot-audio') as HTMLAudioElement
if (audioElement) {
audioElement.srcObject = mediaStream
// Await playback so an autoplay rejection is caught by the surrounding try/catch
await audioElement.play()
}
}
} catch (error) {
console.error('❌ Failed to play chatbot audio:', error)
}
}
}
}
})
}
async joinRoom(roomId: string, userId: string, token: string): Promise<boolean> {
if (!this.zg) {
console.error('❌ ZEGO not initialized')
return false
}
try {
this.currentRoomId = roomId
this.currentUserId = userId
await this.zg.loginRoom(roomId, token, {
userID: userId,
userName: userId
})
// Enable message reception
this.zg.callExperimentalAPI({
method: 'onRecvRoomChannelMessage',
params: {}
})
// Create local stream for voice input
const localStream = await this.zg.createZegoStream({
camera: {
video: false,
audio: true
}
})
if (localStream) {
this.localStream = localStream
await this.zg.startPublishingStream(`${userId}_stream`, localStream)
}
console.log('✅ Joined chatbot room successfully')
return true
} catch (error) {
console.error('❌ Failed to join chatbot room:', error)
return false
}
}
async enableVoiceInput(enabled: boolean): Promise<boolean> {
if (!this.localStream) return false
try {
const audioTrack = this.localStream.getAudioTracks()[0]
if (audioTrack) {
audioTrack.enabled = enabled
console.log(`🎤 Voice input ${enabled ? 'enabled' : 'disabled'}`)
return true
}
return false
} catch (error) {
console.error('❌ Failed to toggle voice input:', error)
return false
}
}
async leaveRoom(): Promise<void> {
if (!this.zg || !this.currentRoomId) return
try {
if (this.localStream) {
await this.zg.stopPublishingStream(`${this.currentUserId}_stream`)
this.zg.destroyStream(this.localStream)
this.localStream = null
}
await this.zg.logoutRoom()
this.currentRoomId = null
this.currentUserId = null
console.log('✅ Left chatbot room')
} catch (error) {
console.error('❌ Failed to leave chatbot room:', error)
}
}
onMessage(callback: (message: any) => void): void {
this.messageCallback = callback
}
isInRoom(): boolean {
return !!this.currentRoomId && !!this.currentUserId
}
}
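The service passes each parsed room message to its callback as `any`. A minimal sketch of narrowing that payload into a typed event follows. The payload shape (`Cmd`, `Data.Text`, `Data.MessageId`, `Data.EndFlag`, with `Cmd` 3 for voice transcripts and `Cmd` 4 for bot replies) is an assumption inferred from the fields the chat component in step 8 reads, and `parseAgentEvent` and `AgentEvent` are hypothetical names.

```typescript
// Assumed event shapes, inferred from the fields the chat component destructures.
type AgentEvent =
  | { kind: 'transcript'; text: string; messageId: string | undefined; final: boolean }
  | { kind: 'reply'; text: string; messageId: string; final: boolean }
  | { kind: 'unknown' }

// Hypothetical helper: converts an untyped room message into an AgentEvent.
function parseAgentEvent(raw: unknown): AgentEvent {
  if (typeof raw !== 'object' || raw === null) return { kind: 'unknown' }
  const { Cmd, Data } = raw as { Cmd?: unknown; Data?: unknown }
  if (typeof Data !== 'object' || Data === null) return { kind: 'unknown' }

  const { Text, MessageId, EndFlag } = Data as Record<string, unknown>
  if (typeof Text !== 'string') return { kind: 'unknown' }
  const messageId = typeof MessageId === 'string' ? MessageId : undefined

  if (Cmd === 3) {
    // Voice transcript of what the user said; the ID is optional here.
    return { kind: 'transcript', text: Text, messageId, final: Boolean(EndFlag) }
  }
  if (Cmd === 4 && messageId !== undefined) {
    // Bot reply chunk; an ID is needed to match chunks of the same reply.
    return { kind: 'reply', text: Text, messageId, final: Boolean(EndFlag) }
  }
  return { kind: 'unknown' }
}
```

With a discriminated union like this, a `switch` on `event.kind` gives the UI code compile-time checks instead of relying on numeric command codes.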
7. API Service Layer
Create the API service at client/src/services/api.ts. It wraps each backend endpoint in a typed function with error handling and logging:
import axios from 'axios'
const API_BASE_URL = import.meta.env.VITE_API_BASE_URL || 'http://localhost:8080'
const api = axios.create({
baseURL: API_BASE_URL,
timeout: 30000,
headers: {
'Content-Type': 'application/json'
}
})
export const chatbotAPI = {
async getToken(userId: string): Promise<{ token: string }> {
try {
const response = await api.get(`/api/token?user_id=${encodeURIComponent(userId)}`)
if (!response.data?.token) {
throw new Error('No token received')
}
return { token: response.data.token }
} catch (error: any) {
console.error('❌ Get token failed:', error.response?.data || error.message)
throw new Error('Failed to get chatbot token')
}
},
async startChatbot(roomId: string, userId: string): Promise<{ chatbotId: string }> {
try {
const response = await api.post('/api/chatbot/start', {
room_id: roomId,
user_id: userId
})
if (!response.data?.success || !response.data?.chatbotId) {
throw new Error('Failed to start chatbot')
}
return { chatbotId: response.data.chatbotId }
} catch (error: any) {
console.error('❌ Start chatbot failed:', error.response?.data || error.message)
throw new Error('Failed to start chatbot session')
}
},
async sendMessage(chatbotId: string, message: string): Promise<void> {
try {
const response = await api.post('/api/chatbot/message', {
chatbot_id: chatbotId,
message: message.trim()
})
if (!response.data?.success) {
throw new Error('Message send failed')
}
} catch (error: any) {
console.error('❌ Send message failed:', error.response?.data || error.message)
throw new Error('Failed to send message to chatbot')
}
},
async stopChatbot(chatbotId: string): Promise<void> {
try {
await api.post('/api/chatbot/stop', {
chatbot_id: chatbotId
})
} catch (error: any) {
console.error('❌ Stop chatbot failed:', error.response?.data || error.message)
throw new Error('Failed to stop chatbot')
}
},
async healthCheck(): Promise<{ status: string }> {
try {
const response = await api.get('/health')
return response.data
} catch (error: any) {
throw new Error('Health check failed')
}
}
}
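Each method above logs `error.response?.data || error.message` from a `catch (error: any)` block. As a minimal sketch of the same idea with `unknown` instead of `any`, the logic can be pulled into one function. `describeError` is a hypothetical helper, and the `response.data` shape is an assumption matching what the code above reads from a failed request.

```typescript
// Hypothetical helper: turns any thrown value into a loggable string.
function describeError(error: unknown): string {
  if (typeof error === 'object' && error !== null) {
    // Assumed shape of an HTTP client error: the server's reply lives in response.data.
    const response = (error as { response?: { data?: unknown } }).response
    if (response?.data !== undefined) {
      return typeof response.data === 'string' ? response.data : JSON.stringify(response.data)
    }
    if (error instanceof Error) {
      return error.message
    }
  }
  // Strings, numbers, and anything else that was thrown.
  return String(error)
}
```

Each `catch` block could then log `describeError(error)` without opting out of type checking. axios also exports an `isAxiosError` type guard that can serve the same purpose when only axios errors are of interest.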
8. Core Chatbot Components
Now that our API service is ready, let’s create the chatbot components.
Start by creating the message display component at client/src/components/MessageBubble.tsx:
import { motion } from 'framer-motion'
import type { ChatMessage } from '../types'
import { Bot, User, Volume2 } from 'lucide-react'
interface MessageBubbleProps {
message: ChatMessage
}
export const MessageBubble = ({ message }: MessageBubbleProps) => {
const isBot = message.sender === 'bot'
const isVoice = message.type === 'voice'
return (
<motion.div
initial={{ opacity: 0, y: 20 }}
animate={{ opacity: 1, y: 0 }}
transition={{ duration: 0.3 }}
className={`flex w-full mb-4 ${isBot ? 'justify-start' : 'justify-end'}`}
>
<div className={`flex items-end space-x-2 max-w-[80%] ${isBot ? 'flex-row' : 'flex-row-reverse space-x-reverse'}`}>
{/* Avatar */}
<div className={`w-8 h-8 rounded-full flex items-center justify-center flex-shrink-0 ${
isBot
? 'bg-gradient-to-br from-purple-500 to-purple-600'
: 'bg-gradient-to-br from-blue-500 to-blue-600'
}`}>
{isBot ? (
<Bot className="w-4 h-4 text-white" />
) : (
<User className="w-4 h-4 text-white" />
)}
</div>
{/* Message Content */}
<div className={`px-4 py-2 rounded-2xl ${
isBot
? 'bg-white border border-gray-200 text-gray-900 rounded-bl-md'
: 'bg-blue-600 text-white rounded-br-md'
} ${message.isStreaming ? 'animate-pulse' : ''}`}>
{/* Voice indicator */}
{isVoice && (
<div className={`flex items-center space-x-1 mb-1 text-xs ${
isBot ? 'text-purple-600' : 'text-blue-200'
}`}>
<Volume2 className="w-3 h-3" />
<span>Voice</span>
</div>
)}
<p className="text-sm leading-relaxed">
{isVoice ? message.transcript || message.content : message.content}
</p>
</div>
</div>
</motion.div>
)
}
Create the main chatbot interface at client/src/components/Chatbot.tsx:
import { useState, useEffect, useRef } from 'react'
import { motion, AnimatePresence } from 'framer-motion'
import { Send, Mic, MicOff, Bot, MessageSquare } from 'lucide-react'
import type { ChatMessage, ChatbotSession } from '../types'
import { MessageBubble } from './MessageBubble'
import { ChatbotZegoService } from '../services/zego'
import { chatbotAPI } from '../services/api'
export const Chatbot = () => {
const [messages, setMessages] = useState<ChatMessage[]>([])
const [inputMessage, setInputMessage] = useState('')
const [session, setSession] = useState<ChatbotSession | null>(null)
const [isConnected, setIsConnected] = useState(false)
const [isLoading, setIsLoading] = useState(false)
const [isRecording, setIsRecording] = useState(false)
const [currentTranscript, setCurrentTranscript] = useState('')
const [botStatus, setBotStatus] = useState<'idle' | 'listening' | 'thinking' | 'speaking'>('idle')
const messagesEndRef = useRef<HTMLDivElement>(null)
const zegoService = useRef(ChatbotZegoService.getInstance())
const processedMessageIds = useRef(new Set<string>())
// Environment configuration
const ZEGO_APP_ID = import.meta.env.VITE_ZEGO_APP_ID
const ZEGO_SERVER = import.meta.env.VITE_ZEGO_SERVER || 'wss://webliveroom-api.zegocloud.com/ws'
useEffect(() => {
scrollToBottom()
}, [messages])
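useEffect(() => {
// Initialize ZEGO service
if (ZEGO_APP_ID && ZEGO_SERVER) {
zegoService.current.initialize(ZEGO_APP_ID, ZEGO_SERVER)
setupMessageHandlers()
}
return () => {
// With an empty dependency array this cleanup only ever sees the initial
// (null) session, so leave the room unconditionally on unmount;
// leaveRoom() returns early when no room is joined.
void zegoService.current.leaveRoom()
}
}, [])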
const scrollToBottom = () => {
messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' })
}
const setupMessageHandlers = () => {
zegoService.current.onMessage((data: any) => {
try {
const { Cmd, Data: msgData } = data
if (Cmd === 3) { // Voice transcript
const { Text: transcript, EndFlag, MessageId } = msgData
if (transcript?.trim()) {
setCurrentTranscript(transcript)
setBotStatus('listening')
if (EndFlag) {
const messageId = MessageId || `voice_${Date.now()}`
addMessage({
id: messageId,
content: transcript.trim(),
sender: 'user',
timestamp: Date.now(),
type: 'voice',
transcript: transcript.trim()
})
setCurrentTranscript('')
setBotStatus('thinking')
}
}
} else if (Cmd === 4) { // Bot response
const { Text: content, MessageId, EndFlag } = msgData
if (content && MessageId) {
// Create the bubble on the first chunk, even when that chunk is also the
// final one, so that a single-chunk reply is not dropped
if (!processedMessageIds.current.has(MessageId)) {
addMessage({
id: MessageId,
content: content,
sender: 'bot',
timestamp: Date.now(),
type: 'text',
isStreaming: !EndFlag
})
processedMessageIds.current.add(MessageId)
} else {
updateMessageContent(MessageId, content, !EndFlag)
}
setBotStatus(EndFlag ? 'idle' : 'speaking')
}
}
} catch (error) {
console.error('Error handling chatbot message:', error)
}
})
}
const addMessage = (message: ChatMessage) => {
setMessages(prev => [...prev, message])
}
const updateMessageContent = (messageId: string, newContent: string, isStreaming: boolean) => {
setMessages(prev => prev.map(msg =>
msg.id === messageId
? { ...msg, content: newContent, isStreaming }
: msg
))
}
const startChatbot = async () => {
if (isLoading) return
setIsLoading(true)
setBotStatus('idle')
try {
const roomId = `chatbot_${Date.now()}`
const userId = `user_${Date.now()}`
// Initialize ZEGO connection
const { token } = await chatbotAPI.getToken(userId)
const joinSuccess = await zegoService.current.joinRoom(roomId, userId, token)
if (!joinSuccess) {
throw new Error('Failed to join chatbot room')
}
// Start chatbot agent
const { chatbotId } = await chatbotAPI.startChatbot(roomId, userId)
const newSession: ChatbotSession = {
roomId,
userId,
chatbotId,
isActive: true,
voiceEnabled: true
}
setSession(newSession)
setIsConnected(true)
// Add welcome message
addMessage({
id: 'welcome',
content: 'Hello! I\'m your AI assistant. You can type messages or use voice input to chat with me.',
sender: 'bot',
timestamp: Date.now(),
type: 'text'
})
console.log('✅ Chatbot started successfully')
} catch (error) {
console.error('❌ Failed to start chatbot:', error)
setBotStatus('idle')
} finally {
setIsLoading(false)
}
}
const stopChatbot = async () => {
if (!session) return
try {
if (session.chatbotId) {
await chatbotAPI.stopChatbot(session.chatbotId)
}
await zegoService.current.leaveRoom()
setSession(null)
setIsConnected(false)
setBotStatus('idle')
setIsRecording(false)
setCurrentTranscript('')
console.log('✅ Chatbot stopped')
} catch (error) {
console.error('❌ Failed to stop chatbot:', error)
}
}
const sendTextMessage = async () => {
if (!inputMessage.trim() || !session?.chatbotId) return
const message: ChatMessage = {
id: `text_${Date.now()}`,
content: inputMessage.trim(),
sender: 'user',
timestamp: Date.now(),
type: 'text'
}
addMessage(message)
setInputMessage('')
setBotStatus('thinking')
try {
await chatbotAPI.sendMessage(session.chatbotId, message.content)
} catch (error) {
console.error('❌ Failed to send message:', error)
setBotStatus('idle')
}
}
const toggleVoiceInput = async () => {
if (!isConnected) return
try {
const newRecordingState = !isRecording
const success = await zegoService.current.enableVoiceInput(newRecordingState)
if (success) {
setIsRecording(newRecordingState)
setBotStatus(newRecordingState ? 'listening' : 'idle')
}
} catch (error) {
console.error('❌ Failed to toggle voice input:', error)
}
}
const handleKeyPress = (e: React.KeyboardEvent) => {
if (e.key === 'Enter' && !e.shiftKey) {
e.preventDefault()
sendTextMessage()
}
}
const getStatusText = () => {
switch (botStatus) {
case 'listening':
return 'Listening...'
case 'thinking':
return 'Processing...'
case 'speaking':
return 'Responding...'
default:
return isConnected ? 'Ready to chat' : 'Click start to begin'
}
}
const getStatusColor = () => {
switch (botStatus) {
case 'listening':
return 'text-green-600'
case 'thinking':
return 'text-blue-600'
case 'speaking':
return 'text-purple-600'
default:
return isConnected ? 'text-green-600' : 'text-gray-500'
}
}
return (
<div className="flex flex-col h-screen bg-gray-50">
{/* Hidden audio element for voice playback */}
<audio
id="chatbot-audio"
autoPlay
style={{ display: 'none' }}
controls={false}
/>
{/* Header */}
<motion.div
initial={{ y: -20, opacity: 0 }}
animate={{ y: 0, opacity: 1 }}
className="bg-white border-b border-gray-200 px-6 py-4"
>
<div className="flex items-center justify-between">
<div className="flex items-center space-x-3">
<div className="w-10 h-10 bg-gradient-to-br from-purple-500 to-purple-600 rounded-full flex items-center justify-center">
<Bot className="w-5 h-5 text-white" />
</div>
<div>
<h1 className="text-xl font-semibold text-gray-900">AI Chatbot</h1>
<p className={`text-sm ${getStatusColor()}`}>
{getStatusText()}
</p>
</div>
</div>
{isConnected ? (
<button
onClick={stopChatbot}
className="px-4 py-2 bg-red-600 text-white rounded-lg hover:bg-red-700 transition-colors"
>
Stop Chat
</button>
) : (
<button
onClick={startChatbot}
disabled={isLoading}
className="px-4 py-2 bg-purple-600 text-white rounded-lg hover:bg-purple-700 transition-colors disabled:opacity-50"
>
{isLoading ? 'Starting...' : 'Start Chat'}
</button>
)}
</div>
</motion.div>
{/* Messages Area */}
<div className="flex-1 overflow-y-auto px-4 py-6">
{messages.length === 0 && !isConnected && (
<motion.div
initial={{ opacity: 0, y: 20 }}
animate={{ opacity: 1, y: 0 }}
className="flex flex-col items-center justify-center h-full text-center"
>
<div className="w-16 h-16 bg-gradient-to-br from-purple-500 to-purple-600 rounded-full flex items-center justify-center mb-4">
<MessageSquare className="w-8 h-8 text-white" />
</div>
<h3 className="text-lg font-semibold text-gray-900 mb-2">
Welcome to AI Chatbot
</h3>
<p className="text-gray-600 mb-6 max-w-md">
Start chatting with our intelligent AI assistant. You can type messages or use voice input for natural conversations.
</p>
<div className="space-y-2 text-sm text-gray-500">
<p>💬 Natural text conversations</p>
<p>🎤 Voice input and responses</p>
<p>🧠 Context-aware assistance</p>
</div>
</motion.div>
)}
<AnimatePresence>
{messages.map((message) => (
<MessageBubble key={message.id} message={message} />
))}
</AnimatePresence>
{/* Transcript display */}
{currentTranscript && (
<motion.div
initial={{ opacity: 0, y: 10 }}
animate={{ opacity: 1, y: 0 }}
className="mb-4 p-3 bg-green-50 border border-green-200 rounded-lg"
>
<div className="flex items-center space-x-2">
<div className="w-2 h-2 bg-green-500 rounded-full animate-pulse" />
<p className="text-sm text-green-700">{currentTranscript}</p>
</div>
</motion.div>
)}
{/* Thinking indicator */}
{botStatus === 'thinking' && (
<motion.div
initial={{ opacity: 0, y: 10 }}
animate={{ opacity: 1, y: 0 }}
className="flex justify-start mb-4"
>
<div className="flex items-center space-x-2">
<div className="w-8 h-8 bg-gradient-to-br from-purple-500 to-purple-600 rounded-full flex items-center justify-center">
<Bot className="w-4 h-4 text-white" />
</div>
<div className="bg-white border border-gray-200 rounded-2xl px-4 py-2">
<div className="flex space-x-1">
<div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce" />
<div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce" style={{ animationDelay: '0.1s' }} />
<div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce" style={{ animationDelay: '0.2s' }} />
</div>
</div>
</div>
</motion.div>
)}
<div ref={messagesEndRef} />
</div>
{/* Input Area */}
{isConnected && (
<motion.div
initial={{ y: 20, opacity: 0 }}
animate={{ y: 0, opacity: 1 }}
className="bg-white border-t border-gray-200 p-4"
>
<div className="flex items-center space-x-3">
<div className="flex-1">
<input
type="text"
value={inputMessage}
onChange={(e) => setInputMessage(e.target.value)}
onKeyDown={handleKeyPress}
placeholder="Type your message..."
disabled={botStatus === 'thinking'}
className="w-full px-4 py-3 border border-gray-300 rounded-xl focus:outline-none focus:ring-2 focus:ring-purple-500 focus:border-transparent disabled:opacity-50 disabled:cursor-not-allowed"
/>
</div>
{/* Voice button */}
<button
onClick={toggleVoiceInput}
disabled={botStatus === 'thinking'}
className={`p-3 rounded-xl transition-all duration-200 ${
isRecording
? 'bg-red-500 text-white shadow-lg scale-110'
: 'bg-gray-100 text-gray-600 hover:bg-gray-200'
} disabled:opacity-50 disabled:cursor-not-allowed`}
>
{isRecording ? <MicOff className="w-5 h-5" /> : <Mic className="w-5 h-5" />}
</button>
{/* Send button */}
<button
type="button"
onClick={sendTextMessage}
disabled={!inputMessage.trim() || botStatus === 'thinking'}
aria-label="Send message"
className="p-3 bg-purple-600 text-white rounded-xl hover:bg-purple-700 transition-colors disabled:opacity-50 disabled:cursor-not-allowed"
>
<Send className="w-5 h-5" />
</button>
</div>
</motion.div>
)}
</div>
)
}
This main chatbot component provides a complete interface with message display, text input, voice recording, and real-time status indicators.
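The input area above applies two rules: every control is locked while the bot is thinking, and the send button also stays disabled when the message is blank. As a minimal sketch, those rules could be gathered into one pure function. The name `deriveInputControls` and the `InputControls` shape are hypothetical and not part of the tutorial's component; only the `'thinking'` status value comes from the code above.

```typescript
// Hypothetical helper: mirrors the `disabled` expressions used in the JSX above.
interface InputControls {
  inputDisabled: boolean
  voiceDisabled: boolean
  sendDisabled: boolean
}

function deriveInputControls(inputMessage: string, botStatus: string): InputControls {
  // The component locks all three controls while a reply is being generated.
  const thinking = botStatus === 'thinking'
  return {
    inputDisabled: thinking,
    voiceDisabled: thinking,
    // Sending additionally requires a non-blank message.
    sendDisabled: thinking || inputMessage.trim() === '',
  }
}
```

Keeping the rules in one place makes them easy to unit test without rendering the component.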
9. Application Assembly and Styling
Create the environment configuration at client/.env:
VITE_ZEGO_APP_ID=your_zego_app_id
VITE_ZEGO_SERVER=wss://webliveroom-api.zegocloud.com/ws
VITE_API_BASE_URL=http://localhost:8080
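A placeholder left in this file (for example `your_zego_app_id`) tends to surface later as a confusing connection error. The sketch below shows one way to validate the three variables at startup. The helper `loadClientConfig` and the `ClientConfig` type are hypothetical names, and the check assumes the AppID is numeric, as it appears in the ZEGOCLOUD console.

```typescript
// Hypothetical helper (not part of the original tutorial): validates the
// three variables defined in client/.env before the chatbot uses them.
interface ClientConfig {
  appId: number
  server: string
  apiBaseUrl: string
}

function loadClientConfig(env: Record<string, unknown>): ClientConfig {
  // Read a required variable and reject missing or blank values.
  const read = (key: string): string => {
    const value = env[key]
    if (typeof value !== 'string' || value.trim() === '') {
      throw new Error(`Missing environment variable: ${key}`)
    }
    return value.trim()
  }

  // Assumption: the AppID is a positive integer.
  const appId = Number(read('VITE_ZEGO_APP_ID'))
  if (!Number.isInteger(appId) || appId <= 0) {
    throw new Error('VITE_ZEGO_APP_ID must be a positive integer')
  }

  const server = read('VITE_ZEGO_SERVER')
  if (!/^wss?:\/\//.test(server)) {
    throw new Error('VITE_ZEGO_SERVER must be a ws:// or wss:// URL')
  }

  // Drop trailing slashes so callers can safely append request paths.
  const apiBaseUrl = read('VITE_API_BASE_URL').replace(/\/+$/, '')

  return { appId, server, apiBaseUrl }
}
```

In a Vite project this would typically be called once as `loadClientConfig(import.meta.env)`, so a misconfigured file fails immediately with a readable message.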
Create the main application at client/src/App.tsx:
import { Chatbot } from './components/Chatbot'
import './index.css'
function App() {
return (
<div className="w-full h-screen">
<Chatbot />
</div>
)
}
export default App
Create the styling at client/src/index.css:
@import "tailwindcss";
body {
margin: 0;
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', 'Roboto', sans-serif;
background-color: #f9fafb;
}
* {
box-sizing: border-box;
}
.animate-bounce {
animation: bounce 1s infinite;
}
@keyframes bounce {
0%, 20%, 53%, 80%, 100% {
transform: translate3d(0,0,0);
}
40%, 43% {
transform: translate3d(0, -10px, 0);
}
70% {
transform: translate3d(0, -5px, 0);
}
90% {
transform: translate3d(0, -2px, 0);
}
}
Create the entry point at client/src/main.tsx:
import { StrictMode } from 'react'
import { createRoot } from 'react-dom/client'
import App from './App.tsx'
createRoot(document.getElementById('root')!).render(
<StrictMode>
<App />
</StrictMode>,
)
Create the HTML template at client/index.html:
<!doctype html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>AI Chatbot with ZEGOCLOUD</title>
</head>
<body>
<div id="root"></div>
<script type="module" src="/src/main.tsx"></script>
</body>
</html>
10. Testing and Deployment
Start the backend server:
cd server
npm run dev
In a separate terminal, start the frontend:
cd client
npm run dev
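Before testing in the browser, it can help to confirm that the backend is reachable at the address set in `VITE_API_BASE_URL`. The following is a minimal sketch of such a smoke check. The function `checkBackend` is hypothetical, and the `/health` path is an assumption, so substitute a route your Express server actually exposes.

```typescript
// Hypothetical smoke check; the "/health" path is an assumption, so point it
// at whichever route your Express server actually exposes.
type FetchLike = (url: string) => Promise<{ ok: boolean; status: number }>

async function checkBackend(
  baseUrl: string,
  fetchFn: FetchLike,
  path = '/health',
): Promise<string> {
  // Normalize the base URL so a trailing slash does not produce "//health".
  const url = `${baseUrl.replace(/\/+$/, '')}${path}`
  try {
    const res = await fetchFn(url)
    return res.ok ? `OK ${url}` : `FAIL ${url} (HTTP ${res.status})`
  } catch (err) {
    // Network-level failures (server not started, wrong port) land here.
    const reason = err instanceof Error ? err.message : String(err)
    return `UNREACHABLE ${url} (${reason})`
  }
}
```

On Node.js 18+, where `fetch` is available globally, it could be run as `checkBackend('http://localhost:8080', fetch).then(console.log)`. Passing the fetch function as a parameter keeps the check testable without a running server.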
Conclusion
Now you have a working AI chatbot that can understand speech, respond intelligently, and manage real-time conversations. Users are free to type or speak naturally, and the chatbot delivers appropriate replies in text or voice.
What once required complex integrations of multiple AI services, audio processing pipelines, and real-time synchronization has been made simple with ZEGOCLOUD. With a single platform, you built a chatbot that feels both responsive and natural to use.
This solid foundation can power customer support, education, virtual assistants, or any application where intelligent conversation is essential. From here, you can refine the chatbot’s personality, extend its features, or integrate with external services while maintaining the same reliable communication core.
FAQ
Q1: Can I make my own AI chatbot?
Yes. With platforms like ZEGOCLOUD, you can build an AI chatbot that handles both text and voice interactions. The process no longer requires stitching together multiple services, so even individual developers can create powerful chatbots.
Q2: How much does it cost to build an AI chatbot?
The cost depends on scale and features. Simple chatbots can be built at low cost, while advanced real-time conversational bots may require cloud usage fees. ZEGOCLOUD offers flexible pricing so you can start small and scale as your user base grows.
Q3: Can I create my own AI like ChatGPT?
You can build applications powered by large language models similar to ChatGPT, but instead of training one from scratch, most developers integrate existing APIs and SDKs. This saves time, cost, and computing resources.
Q4: Is it hard to develop an AI chatbot?
Traditionally, it was difficult because you had to integrate natural language processing, speech recognition, text-to-speech, and real-time messaging yourself. With ZEGOCLOUD’s all-in-one AI agent SDK, the process is much easier and faster.