How to Create an AI Chatbot

Modern apps need chatbots that understand users and respond naturally. Whether you’re building customer support, educational tools, or interactive assistants, users now expect a chatbot to handle both text messages and voice interactions. This guide walks through how to build an AI chatbot with text and voice capabilities using ZEGOCLOUD. Step by step, you’ll create a responsive interface that works across devices, integrate real-time AI processing, and deploy a chatbot that feels natural and engaging.

How to Make an AI Chatbot in Easy Steps

Traditionally, chatbot development requires connecting multiple services: natural language processing, text-to-speech, speech recognition, and real-time messaging. Each service demands its own integration, authentication, and error handling, so keeping latency low and reliability high quickly becomes a complex challenge.

ZEGOCLOUD’s AI Agent platform simplifies this process with an all-in-one chatbot solution. Instead of juggling multiple integrations, you can build intelligent chatbots through a single SDK that seamlessly manages text conversations, processes voice input, and delivers natural speech responses. This approach not only reduces development complexity but also ensures a smoother, real-time experience for users.

Prerequisites

Before you begin, gather these essential components:

  • A ZEGOCLOUD developer account with an active AppID and ServerSecret from the console.
  • Node.js 20.19 or later installed locally for both backend and frontend development (Vite 7, used for the client, does not support Node.js 18).
  • An OpenAI API key, or a key for a compatible language model provider, for intelligent responses.
  • A code editor with TypeScript support for a better development experience.
  • A testing device with microphone access, since the chatbot’s voice features require actual hardware.
  • Basic familiarity with React hooks and Express.js for building the user interface and API endpoints.

Once you have these prerequisites in place, you can proceed with the steps below.

1. Chatbot Backend

Begin by creating the server infrastructure that manages chatbot sessions and handles AI agent communication. This backend serves as the bridge between your frontend interface and ZEGOCLOUD’s AI services.

Create your project structure and initialize the backend:

mkdir ai-chatbot
cd ai-chatbot
mkdir server client
cd server
npm init -y

Configure the server dependencies in server/package.json:

{
  "name": "ai-chatbot-server",
  "version": "1.0.0",
  "type": "module",
  "scripts": {
    "dev": "tsx watch src/server.ts",
    "build": "tsc",
    "start": "node dist/server.js"
  },
  "dependencies": {
    "express": "^5.1.0",
    "cors": "^2.8.5",
    "dotenv": "^17.2.1",
    "axios": "^1.11.0"
  },
  "devDependencies": {
    "@types/express": "^5.0.3",
    "@types/cors": "^2.8.19",
    "@types/node": "^24.3.0",
    "typescript": "^5.9.2",
    "tsx": "^4.20.4"
  }
}

This configuration establishes a TypeScript-enabled Express server with development hot reloading and production build capabilities.

Set up environment variables in server/.env:

# ZEGOCLOUD Configuration
ZEGO_APP_ID=your_zego_app_id
ZEGO_SERVER_SECRET=your_zego_server_secret
ZEGO_API_BASE_URL=https://aigc-aiagent-api.zegotech.cn

# AI Model Configuration  
LLM_URL=https://api.openai.com/v1/chat/completions
LLM_API_KEY=your_openai_api_key
LLM_MODEL=gpt-4o-mini

# Server Settings
PORT=8080

These variables configure your ZEGOCLOUD credentials, AI model settings, and server port. Replace placeholder values with your actual credentials from the respective service dashboards.
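
The variables above can also be read and validated in one place before the server starts. The following is a minimal sketch, not part of the original project: `loadConfig` and the `ServerConfig` shape are hypothetical names, and the defaults mirror the values shown in the `.env` file.

```typescript
// Hypothetical helper: collects the variables from server/.env into one typed object.
interface ServerConfig {
  zegoAppId: number
  zegoServerSecret: string
  zegoApiBaseUrl: string
  llmUrl: string
  llmApiKey: string
  llmModel: string
  port: number
}

type Env = Record<string, string | undefined>

function loadConfig(env: Env): ServerConfig {
  // Every variable without a sensible default is treated as required.
  const required = ['ZEGO_APP_ID', 'ZEGO_SERVER_SECRET', 'ZEGO_API_BASE_URL', 'LLM_URL', 'LLM_API_KEY']
  const missing = required.filter(name => !env[name])
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`)
  }

  // A placeholder such as "your_zego_app_id" parses to NaN and is rejected here.
  const zegoAppId = Number.parseInt(env.ZEGO_APP_ID as string, 10)
  if (Number.isNaN(zegoAppId)) {
    throw new Error('ZEGO_APP_ID must be a number')
  }

  return {
    zegoAppId,
    zegoServerSecret: env.ZEGO_SERVER_SECRET as string,
    zegoApiBaseUrl: env.ZEGO_API_BASE_URL as string,
    llmUrl: env.LLM_URL as string,
    llmApiKey: env.LLM_API_KEY as string,
    llmModel: env.LLM_MODEL ?? 'gpt-4o-mini',
    port: Number.parseInt(env.PORT ?? '8080', 10)
  }
}
```

On the server this would be called once as `loadConfig(process.env)` after `dotenv.config()`, so a missing or placeholder value fails at startup rather than on the first request.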

Install dependencies and create the TypeScript configuration:

npm install

Create server/tsconfig.json:

{
  "compilerOptions": {
    "target": "ES2022",
    "module": "ESNext",
    "moduleResolution": "bundler",
    "allowSyntheticDefaultImports": true,
    "esModuleInterop": true,
    "strict": true,
    "outDir": "./dist",
    "rootDir": "./src",
    "declaration": true,
    "sourceMap": true,
    "resolveJsonModule": true
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules", "dist"]
}

This TypeScript configuration enables modern JavaScript features with strict type checking for reliable chatbot backend development.

2. ZEGOCLOUD Token Generation

Create the authentication token generator at server/zego-token.cjs:

"use strict";
const crypto = require("crypto");

function generateToken04(appId, userId, secret, effectiveTimeInSeconds, payload) {
  if (!appId || typeof appId !== 'number') {
    throw new Error('appID invalid');
  }
  if (!userId || typeof userId !== 'string' || userId.length > 64) {
    throw new Error('userId invalid');
  }
  if (!secret || typeof secret !== 'string' || secret.length !== 32) {
    throw new Error('secret must be a 32 byte string');
  }
  if (!(effectiveTimeInSeconds > 0)) {
    throw new Error('effectiveTimeInSeconds invalid');
  }

  const VERSION_FLAG = '04';
  const createTime = Math.floor(new Date().getTime() / 1000);

  const tokenInfo = {
    app_id: appId,
    user_id: userId,
    nonce: Math.floor(Math.random() * (Math.pow(2, 31) - (-Math.pow(2, 31)) + 1)) + (-Math.pow(2, 31)),
    ctime: createTime,
    expire: createTime + effectiveTimeInSeconds,
    payload: payload || ''
  };

  const plaintText = JSON.stringify(tokenInfo);

  // AES-GCM encryption. createCipheriv takes the key and nonce (IV) explicitly;
  // the deprecated createCipher derives both from a password instead.
  const nonce = crypto.randomBytes(12);
  const cipher = crypto.createCipheriv('aes-256-gcm', secret, nonce);
  cipher.setAutoPadding(true);
  const encrypted = cipher.update(plaintText, 'utf8');
  const encryptBuf = Buffer.concat([encrypted, cipher.final(), cipher.getAuthTag()]);

  // Binary token assembly
  const b1 = new Uint8Array(8);
  const b2 = new Uint8Array(2);
  const b3 = new Uint8Array(2);
  const b4 = new Uint8Array(1);

  new DataView(b1.buffer).setBigInt64(0, BigInt(tokenInfo.expire), false);
  new DataView(b2.buffer).setUint16(0, nonce.byteLength, false);
  new DataView(b3.buffer).setUint16(0, encryptBuf.byteLength, false);
  new DataView(b4.buffer).setUint8(0, 1);

  const buf = Buffer.concat([
    Buffer.from(b1),
    Buffer.from(b2),
    Buffer.from(nonce),
    Buffer.from(b3),
    Buffer.from(encryptBuf),
    Buffer.from(b4),
  ]);

  return VERSION_FLAG + Buffer.from(buf).toString('base64');
}

module.exports = { generateToken04 };

This token generator creates secure authentication tokens using ZEGOCLOUD’s token04 format with AES-GCM encryption for chatbot session security.
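
The binary layout assembled above can be read back without the secret, which is handy when debugging expiry problems. The sketch below is an illustration only: `inspectToken04` and `Token04Header` are hypothetical names, and the field order is taken from the `Buffer.concat` call in the generator (8-byte expiry, 2-byte nonce length, nonce, 2-byte ciphertext length, ciphertext, 1-byte mode). It does not decrypt or verify the token.

```typescript
// Reads the unencrypted framing of a token04 string; it does not decrypt or verify it.
interface Token04Header {
  expire: number        // expiry as Unix time in seconds
  nonceLength: number   // length of the AES-GCM nonce in bytes
  cipherLength: number  // length of ciphertext plus auth tag in bytes
  mode: number          // trailing mode byte (1 in the generator above)
}

function inspectToken04(token: string): Token04Header {
  if (!token.startsWith('04')) {
    throw new Error('not a token04 string')
  }
  // Everything after the two-character version flag is base64.
  const buf = Buffer.from(token.slice(2), 'base64')

  let offset = 0
  const expire = Number(buf.readBigInt64BE(offset))
  offset += 8

  const nonceLength = buf.readUInt16BE(offset)
  offset += 2 + nonceLength // skip the length field and the nonce itself

  const cipherLength = buf.readUInt16BE(offset)
  offset += 2 + cipherLength // skip the length field and the ciphertext

  const mode = buf.readUInt8(offset)
  return { expire, nonceLength, cipherLength, mode }
}
```

For a token produced by the generator, `nonceLength` should be 12 and `expire` should be the creation time plus `effectiveTimeInSeconds`.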

3. Server Implementation

Create the main server file at server/src/server.ts:

import express from 'express'
import cors from 'cors'
import dotenv from 'dotenv'
import axios from 'axios'
import { generateToken04 } from '../zego-token.cjs'

dotenv.config()

const app = express()
const PORT = process.env.PORT || 8080

// Middleware setup
app.use(cors({
  origin: ['http://localhost:5173', 'http://localhost:3000'],
  credentials: true
}))
app.use(express.json())

// Environment validation (every variable read with `!` below must be listed here)
const requiredVars = ['ZEGO_APP_ID', 'ZEGO_SERVER_SECRET', 'ZEGO_API_BASE_URL', 'LLM_URL', 'LLM_API_KEY']
const missingVars = requiredVars.filter(varName => !process.env[varName])

if (missingVars.length > 0) {
  console.error('❌ Missing environment variables:', missingVars)
  process.exit(1)
}

const ZEGO_APP_ID = parseInt(process.env.ZEGO_APP_ID!, 10)
const ZEGO_SERVER_SECRET = process.env.ZEGO_SERVER_SECRET!
const ZEGO_API_BASE_URL = process.env.ZEGO_API_BASE_URL!
const LLM_URL = process.env.LLM_URL!
const LLM_API_KEY = process.env.LLM_API_KEY!
const LLM_MODEL = process.env.LLM_MODEL || 'gpt-4o-mini'

// Health check endpoint
app.get('/health', (req, res) => {
  res.json({ 
    status: 'healthy',
    chatbot: 'ready',
    timestamp: new Date().toISOString()
  })
})

// Generate authentication token
app.get('/api/token', (req, res) => {
  try {
    const { user_id } = req.query

    if (!user_id || typeof user_id !== 'string') {
      return res.status(400).json({ 
        success: false, 
        error: 'User ID required for chatbot session' 
      })
    }

    const token = generateToken04(
      ZEGO_APP_ID,
      user_id,
      ZEGO_SERVER_SECRET,
      7200, // 2 hours
      ''
    )

    console.log(`🔑 Generated chatbot token for user: ${user_id}`)

    res.json({ 
      success: true, 
      token,
      expires_in: 7200
    })
  } catch (error) {
    console.error('❌ Token generation failed:', error)
    res.status(500).json({ 
      success: false, 
      error: 'Failed to generate chatbot token' 
    })
  }
})

// Start chatbot session
app.post('/api/chatbot/start', async (req, res) => {
  try {
    const { room_id, user_id } = req.body

    if (!room_id || !user_id) {
      return res.status(400).json({
        success: false,
        error: 'Room ID and User ID required for chatbot'
      })
    }

    console.log(`🤖 Starting chatbot session for room: ${room_id}`)

    const response = await axios.post(`${ZEGO_API_BASE_URL}/v1/ai_agent/start`, {
      app_id: ZEGO_APP_ID,
      room_id: room_id,
      user_id: user_id,
      user_stream_id: `${user_id}_stream`,
      ai_agent_config: {
        llm_config: {
          url: LLM_URL,
          api_key: LLM_API_KEY,
          model: LLM_MODEL,
          context: [
            {
              role: "system",
              content: "You are a helpful AI chatbot assistant. Provide concise, friendly, and informative responses. Keep answers conversational and engaging while being helpful."
            }
          ]
        },
        tts_config: {
          provider: "elevenlabs",
          voice_id: "pNInz6obpgDQGcFmaJgB",
          model: "eleven_turbo_v2_5"
        },
        asr_config: {
          provider: "deepgram",
          language: "en"
        }
      }
    }, {
      timeout: 30000
    })

    if (response.data?.data?.ai_agent_instance_id) {
      const agentInstanceId = response.data.data.ai_agent_instance_id
      console.log(`✅ Chatbot started: ${agentInstanceId}`)

      res.json({
        success: true,
        chatbotId: agentInstanceId,
        room_id,
        user_id
      })
    } else {
      throw new Error('Invalid response from ZEGOCLOUD')
    }
  } catch (error: any) {
    console.error('❌ Chatbot start failed:', error.response?.data || error.message)
    res.status(500).json({
      success: false,
      error: 'Failed to start chatbot session'
    })
  }
})

// Send message to chatbot
app.post('/api/chatbot/message', async (req, res) => {
  try {
    const { chatbot_id, message } = req.body

    if (!chatbot_id || !message) {
      return res.status(400).json({
        success: false,
        error: 'Chatbot ID and message required'
      })
    }

    console.log(`💬 Sending to chatbot ${chatbot_id}: ${message.substring(0, 50)}...`)

    await axios.post(`${ZEGO_API_BASE_URL}/v1/ai_agent/chat`, {
      ai_agent_instance_id: chatbot_id,
      messages: [
        {
          role: "user",
          content: message
        }
      ]
    }, {
      timeout: 30000
    })

    res.json({
      success: true,
      message: 'Message sent to chatbot'
    })
  } catch (error: any) {
    console.error('❌ Chatbot message failed:', error.response?.data || error.message)
    res.status(500).json({
      success: false,
      error: 'Failed to send message to chatbot'
    })
  }
})

// Stop chatbot session
app.post('/api/chatbot/stop', async (req, res) => {
  try {
    const { chatbot_id } = req.body

    if (!chatbot_id) {
      return res.status(400).json({
        success: false,
        error: 'Chatbot ID required'
      })
    }

    console.log(`🛑 Stopping chatbot: ${chatbot_id}`)

    await axios.post(`${ZEGO_API_BASE_URL}/v1/ai_agent/stop`, {
      ai_agent_instance_id: chatbot_id
    }, {
      timeout: 30000
    })

    res.json({
      success: true,
      message: 'Chatbot session stopped'
    })
  } catch (error: any) {
    console.error('❌ Chatbot stop failed:', error.response?.data || error.message)
    res.status(500).json({
      success: false,
      error: 'Failed to stop chatbot'
    })
  }
})

app.listen(PORT, () => {
  console.log(`🚀 Chatbot server running on port ${PORT}`)
  console.log(`🤖 Ready to serve AI chatbot requests`)
})
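
The start endpoint above only checks that `room_id` and `user_id` are truthy. A stricter check could be factored into a pure function, as in the sketch below. `parseStartRequest` and its result type are hypothetical names; the 64-character limit comes from the `userId` check in the token generator.

```typescript
// Hypothetical validator for the body of POST /api/chatbot/start.
interface StartRequest {
  room_id: string
  user_id: string
}

type ParseResult =
  | { ok: true; value: StartRequest }
  | { ok: false; error: string }

function parseStartRequest(body: unknown): ParseResult {
  if (typeof body !== 'object' || body === null) {
    return { ok: false, error: 'Request body must be a JSON object' }
  }

  const { room_id, user_id } = body as Record<string, unknown>

  if (typeof room_id !== 'string' || room_id.length === 0) {
    return { ok: false, error: 'room_id must be a non-empty string' }
  }
  // The token generator rejects user IDs longer than 64 characters.
  if (typeof user_id !== 'string' || user_id.length === 0 || user_id.length > 64) {
    return { ok: false, error: 'user_id must be a string of 1 to 64 characters' }
  }

  return { ok: true, value: { room_id, user_id } }
}
```

In the route handler, a failed result would map to the existing 400 response, and a successful one gives typed values to pass on to the agent request.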

4. Frontend Chatbot Setup

Navigate to the client directory and initialize the React chatbot interface:

cd ../client
npm init -y

Configure the frontend dependencies in client/package.json:

{
  "name": "ai-chatbot-client",
  "private": true,
  "version": "0.0.0",
  "type": "module",
  "scripts": {
    "dev": "vite",
    "build": "tsc -b && vite build",
    "preview": "vite preview"
  },
  "dependencies": {
    "@tailwindcss/vite": "^4.1.11",
    "axios": "^1.11.0",
    "framer-motion": "^12.23.12",
    "lucide-react": "^0.536.0",
    "react": "^19.1.0",
    "react-dom": "^19.1.0",
    "tailwindcss": "^4.1.11",
    "zego-express-engine-webrtc": "^3.10.0",
    "zod": "^4.0.15"
  },
  "devDependencies": {
    "@types/react": "^19.1.8",
    "@types/react-dom": "^19.1.6",
    "@vitejs/plugin-react": "^4.6.0",
    "typescript": "~5.8.3",
    "vite": "^7.0.4"
  }
}

This configuration includes essential packages for building a modern React chatbot with real-time communication capabilities.

Install dependencies and create Vite configuration:

npm install

Create client/vite.config.ts:

import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'
import tailwindcss from '@tailwindcss/vite'

export default defineConfig({
  plugins: [react(), tailwindcss()],
  define: {
    global: 'globalThis',
  },
  optimizeDeps: {
    include: ['zego-express-engine-webrtc'],
  }
})

This Vite configuration optimizes the ZEGOCLOUD SDK and enables Tailwind CSS for styling the chatbot interface.

5. TypeScript Types and Configuration

Now, create the TypeScript configuration at client/tsconfig.json:

{
  "compilerOptions": {
    "target": "ES2022",
    "useDefineForClassFields": true,
    "lib": ["ES2022", "DOM", "DOM.Iterable"],
    "module": "ESNext",
    "skipLibCheck": true,
    "moduleResolution": "bundler",
    "allowImportingTsExtensions": true,
    "verbatimModuleSyntax": true,
    "noEmit": true,
    "jsx": "react-jsx",
    "strict": true,
    "noUnusedLocals": true,
    "noUnusedParameters": true,
    "noFallthroughCasesInSwitch": true
  },
  "include": ["src"]
}

Define chatbot types at client/src/types/index.ts:

export interface ChatMessage {
  id: string
  content: string
  sender: 'user' | 'bot'
  timestamp: number
  type: 'text' | 'voice'
  isStreaming?: boolean
  transcript?: string
}

export interface ChatbotSession {
  roomId: string
  userId: string
  chatbotId?: string
  isActive: boolean
  voiceEnabled: boolean
}

export interface ChatbotConfig {
  personality: string
  responseStyle: 'concise' | 'detailed' | 'friendly'
  voiceSettings: {
    enabled: boolean
    autoPlay: boolean
    speechRate: number
  }
}

These types define the core data structures for chatbot messages, sessions, and configuration options.
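
As an illustration of how these types are used, the sketch below builds a typed text message. `createTextMessage` is a hypothetical helper, not part of the original project; the `text_` id prefix follows the pattern the Chatbot component uses later in this guide, and the interface is repeated so the example is self-contained.

```typescript
// Copy of ChatMessage from client/src/types/index.ts so the sketch stands alone.
interface ChatMessage {
  id: string
  content: string
  sender: 'user' | 'bot'
  timestamp: number
  type: 'text' | 'voice'
  isStreaming?: boolean
  transcript?: string
}

// Hypothetical factory: fills in the id and timestamp for a text message.
// `now` is a parameter so callers (and tests) can supply a fixed clock value.
function createTextMessage(
  content: string,
  sender: ChatMessage['sender'],
  now: number = Date.now()
): ChatMessage {
  return {
    id: `text_${now}`,
    content: content.trim(),
    sender,
    timestamp: now,
    type: 'text'
  }
}
```

Because `sender` and `type` are string-literal unions, a typo such as `'bott'` is rejected at compile time rather than producing a message the UI cannot render.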

6. ZEGOCLOUD Service Integration

Create the ZEGOCLOUD service at client/src/services/zego.ts:

import { ZegoExpressEngine } from 'zego-express-engine-webrtc'

export class ChatbotZegoService {
  private static instance: ChatbotZegoService
  private zg: ZegoExpressEngine | null = null
  private isInitialized = false
  private currentRoomId: string | null = null
  private currentUserId: string | null = null
  private localStream: any = null
  private messageCallback: ((message: any) => void) | null = null

  static getInstance(): ChatbotZegoService {
    if (!ChatbotZegoService.instance) {
      ChatbotZegoService.instance = new ChatbotZegoService()
    }
    return ChatbotZegoService.instance
  }

  async initialize(appId: string, server: string): Promise<void> {
    if (this.isInitialized) return

    try {
      this.zg = new ZegoExpressEngine(parseInt(appId), server)
      this.setupEventListeners()
      this.isInitialized = true
      console.log('✅ Chatbot ZEGO service initialized')
    } catch (error) {
      console.error('❌ ZEGO initialization failed:', error)
      throw error
    }
  }

  private setupEventListeners(): void {
    if (!this.zg) return

    // Handle chatbot messages
    this.zg.on('recvExperimentalAPI', (result: any) => {
      const { method, content } = result
      if (method === 'onRecvRoomChannelMessage') {
        try {
          const message = JSON.parse(content.msgContent)
          console.log('🤖 Chatbot message received:', message)
          if (this.messageCallback) {
            this.messageCallback(message)
          }
        } catch (error) {
          console.error('Failed to parse chatbot message:', error)
        }
      }
    })

    // Handle audio streams for voice responses
    this.zg.on('roomStreamUpdate', async (_roomID: string, updateType: string, streamList: any[]) => {
      if (updateType === 'ADD' && streamList.length > 0) {
        for (const stream of streamList) {
          if (stream.streamID !== `${this.currentUserId}_stream`) {
            try {
              const mediaStream = await this.zg!.startPlayingStream(stream.streamID)
              if (mediaStream) {
                const audioElement = document.getElementById('chatbot-audio') as HTMLAudioElement
                if (audioElement) {
                  audioElement.srcObject = mediaStream
                  // play() returns a promise; await it so an autoplay rejection reaches the catch below
                  await audioElement.play()
                }
              }
            } catch (error) {
              console.error('❌ Failed to play chatbot audio:', error)
            }
          }
        }
      }
    })
  }

  async joinRoom(roomId: string, userId: string, token: string): Promise<boolean> {
    if (!this.zg) {
      console.error('❌ ZEGO not initialized')
      return false
    }

    try {
      this.currentRoomId = roomId
      this.currentUserId = userId

      await this.zg.loginRoom(roomId, token, {
        userID: userId,
        userName: userId
      })

      // Enable message reception
      this.zg.callExperimentalAPI({ 
        method: 'onRecvRoomChannelMessage', 
        params: {} 
      })

      // Create local stream for voice input
      const localStream = await this.zg.createZegoStream({
        camera: { 
          video: false, 
          audio: true
        }
      })

      if (localStream) {
        this.localStream = localStream
        await this.zg.startPublishingStream(`${userId}_stream`, localStream)
      }

      console.log('✅ Joined chatbot room successfully')
      return true
    } catch (error) {
      console.error('❌ Failed to join chatbot room:', error)
      return false
    }
  }

  async enableVoiceInput(enabled: boolean): Promise<boolean> {
    if (!this.localStream) return false

    try {
      const audioTrack = this.localStream.getAudioTracks()[0]
      if (audioTrack) {
        audioTrack.enabled = enabled
        console.log(`🎤 Voice input ${enabled ? 'enabled' : 'disabled'}`)
        return true
      }
      return false
    } catch (error) {
      console.error('❌ Failed to toggle voice input:', error)
      return false
    }
  }

  async leaveRoom(): Promise<void> {
    if (!this.zg || !this.currentRoomId) return

    try {
      if (this.localStream) {
        await this.zg.stopPublishingStream(`${this.currentUserId}_stream`)
        this.zg.destroyStream(this.localStream)
        this.localStream = null
      }

      await this.zg.logoutRoom()
      this.currentRoomId = null
      this.currentUserId = null

      console.log('✅ Left chatbot room')
    } catch (error) {
      console.error('❌ Failed to leave chatbot room:', error)
    }
  }

  onMessage(callback: (message: any) => void): void {
    this.messageCallback = callback
  }

  isInRoom(): boolean {
    return !!this.currentRoomId && !!this.currentUserId
  }
}
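
The service hands each parsed payload to the callback as `any`. A typed envelope makes the later handling easier to follow. The sketch below is an assumption-laden illustration: `parseAgentMessage` and `AgentEvent` are hypothetical names, and the payload fields (`Cmd` 3 for a voice transcript, `Cmd` 4 for a bot reply, plus `Data.Text`, `Data.MessageId`, `Data.EndFlag`) are inferred from what the Chatbot component reads later in this guide, not from a published schema.

```typescript
// Assumed payload shape, based on the fields the Chatbot component reads.
interface AgentEvent {
  kind: 'transcript' | 'reply'
  text: string
  messageId?: string
  isFinal: boolean
}

// Returns null for anything that is not a recognized transcript or reply message.
function parseAgentMessage(raw: string): AgentEvent | null {
  let parsed: unknown
  try {
    parsed = JSON.parse(raw)
  } catch {
    return null // not valid JSON
  }
  if (typeof parsed !== 'object' || parsed === null) return null

  const { Cmd, Data } = parsed as { Cmd?: unknown; Data?: unknown }
  if (Cmd !== 3 && Cmd !== 4) return null
  if (typeof Data !== 'object' || Data === null) return null

  const { Text, MessageId, EndFlag } = Data as Record<string, unknown>
  if (typeof Text !== 'string') return null

  return {
    kind: Cmd === 3 ? 'transcript' : 'reply',
    text: Text,
    messageId: typeof MessageId === 'string' ? MessageId : undefined,
    isFinal: Boolean(EndFlag)
  }
}
```

With a parser like this, the `recvExperimentalAPI` handler could pass `content.msgContent` through it and forward only non-null events.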

7. API Service Layer

Create the API service at client/src/services/api.ts. The API service provides clean interfaces for all chatbot operations with comprehensive error handling and logging.

import axios from 'axios'

const API_BASE_URL = import.meta.env.VITE_API_BASE_URL || 'http://localhost:8080'

const api = axios.create({
  baseURL: API_BASE_URL,
  timeout: 30000,
  headers: {
    'Content-Type': 'application/json'
  }
})

export const chatbotAPI = {
  async getToken(userId: string): Promise<{ token: string }> {
    try {
      const response = await api.get(`/api/token?user_id=${encodeURIComponent(userId)}`)

      if (!response.data?.token) {
        throw new Error('No token received')
      }

      return { token: response.data.token }
    } catch (error: any) {
      console.error('❌ Get token failed:', error.response?.data || error.message)
      throw new Error('Failed to get chatbot token')
    }
  },

  async startChatbot(roomId: string, userId: string): Promise<{ chatbotId: string }> {
    try {
      const response = await api.post('/api/chatbot/start', {
        room_id: roomId,
        user_id: userId
      })

      if (!response.data?.success || !response.data?.chatbotId) {
        throw new Error('Failed to start chatbot')
      }

      return { chatbotId: response.data.chatbotId }
    } catch (error: any) {
      console.error('❌ Start chatbot failed:', error.response?.data || error.message)
      throw new Error('Failed to start chatbot session')
    }
  },

  async sendMessage(chatbotId: string, message: string): Promise<void> {
    try {
      const response = await api.post('/api/chatbot/message', {
        chatbot_id: chatbotId,
        message: message.trim()
      })

      if (!response.data?.success) {
        throw new Error('Message send failed')
      }
    } catch (error: any) {
      console.error('❌ Send message failed:', error.response?.data || error.message)
      throw new Error('Failed to send message to chatbot')
    }
  },

  async stopChatbot(chatbotId: string): Promise<void> {
    try {
      await api.post('/api/chatbot/stop', {
        chatbot_id: chatbotId
      })
    } catch (error: any) {
      console.error('❌ Stop chatbot failed:', error.response?.data || error.message)
      throw new Error('Failed to stop chatbot')
    }
  },

  async healthCheck(): Promise<{ status: string }> {
    try {
      const response = await api.get('/health')
      return response.data
    } catch (error: any) {
      throw new Error('Health check failed')
    }
  }
}
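
The methods above are meant to be called in sequence: fetch a token, join the room with it, then start the agent. The sketch below shows that order in isolation. `openSession`, `ChatbotApi`, and `JoinRoom` are hypothetical names; the interface is a subset of `chatbotAPI`, and the room join is injected as a callback so the flow can run against fakes.

```typescript
// Subset of the chatbotAPI object above, declared as an interface so a fake can stand in.
interface ChatbotApi {
  getToken(userId: string): Promise<{ token: string }>
  startChatbot(roomId: string, userId: string): Promise<{ chatbotId: string }>
}

// Matches the signature of ChatbotZegoService.joinRoom.
type JoinRoom = (roomId: string, userId: string, token: string) => Promise<boolean>

// Hypothetical helper: token first, then room join, then agent start.
async function openSession(
  api: ChatbotApi,
  joinRoom: JoinRoom,
  roomId: string,
  userId: string
): Promise<string> {
  const { token } = await api.getToken(userId)

  const joined = await joinRoom(roomId, userId, token)
  if (!joined) {
    // Do not start an agent for a room the user never entered.
    throw new Error('Failed to join chatbot room')
  }

  const { chatbotId } = await api.startChatbot(roomId, userId)
  return chatbotId
}
```

Keeping the sequence in one function means a failed join never leaves an agent instance running on the server.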

8. Core Chatbot Components

Now that our API service is ready, let’s create the chatbot components.

Start by creating the message display component at client/src/components/MessageBubble.tsx:

import { motion } from 'framer-motion'
import type { ChatMessage } from '../types'
import { Bot, User, Volume2 } from 'lucide-react'

interface MessageBubbleProps {
  message: ChatMessage
}

export const MessageBubble = ({ message }: MessageBubbleProps) => {
  const isBot = message.sender === 'bot'
  const isVoice = message.type === 'voice'

  return (
    <motion.div
      initial={{ opacity: 0, y: 20 }}
      animate={{ opacity: 1, y: 0 }}
      transition={{ duration: 0.3 }}
      className={`flex w-full mb-4 ${isBot ? 'justify-start' : 'justify-end'}`}
    >
      <div className={`flex items-end space-x-2 max-w-[80%] ${isBot ? 'flex-row' : 'flex-row-reverse space-x-reverse'}`}>
        {/* Avatar */}
        <div className={`w-8 h-8 rounded-full flex items-center justify-center flex-shrink-0 ${
          isBot 
            ? 'bg-gradient-to-br from-purple-500 to-purple-600' 
            : 'bg-gradient-to-br from-blue-500 to-blue-600'
        }`}>
          {isBot ? (
            <Bot className="w-4 h-4 text-white" />
          ) : (
            <User className="w-4 h-4 text-white" />
          )}
        </div>

        {/* Message Content */}
        <div className={`px-4 py-2 rounded-2xl ${
          isBot 
            ? 'bg-white border border-gray-200 text-gray-900 rounded-bl-md' 
            : 'bg-blue-600 text-white rounded-br-md'
        } ${message.isStreaming ? 'animate-pulse' : ''}`}>
          {/* Voice indicator */}
          {isVoice && (
            <div className={`flex items-center space-x-1 mb-1 text-xs ${
              isBot ? 'text-purple-600' : 'text-blue-200'
            }`}>
              <Volume2 className="w-3 h-3" />
              <span>Voice</span>
            </div>
          )}

          <p className="text-sm leading-relaxed">
            {isVoice ? message.transcript || message.content : message.content}
          </p>
        </div>
      </div>
    </motion.div>
  )
}

Create the main chatbot interface at client/src/components/Chatbot.tsx:

import { useState, useEffect, useRef } from 'react'
import { motion, AnimatePresence } from 'framer-motion'
import { Send, Mic, MicOff, Bot, MessageSquare } from 'lucide-react'
import type { ChatMessage, ChatbotSession } from '../types'
import { MessageBubble } from './MessageBubble'
import { ChatbotZegoService } from '../services/zego'
import { chatbotAPI } from '../services/api'

export const Chatbot = () => {
  const [messages, setMessages] = useState<ChatMessage[]>([])
  const [inputMessage, setInputMessage] = useState('')
  const [session, setSession] = useState<ChatbotSession | null>(null)
  const [isConnected, setIsConnected] = useState(false)
  const [isLoading, setIsLoading] = useState(false)
  const [isRecording, setIsRecording] = useState(false)
  const [currentTranscript, setCurrentTranscript] = useState('')
  const [botStatus, setBotStatus] = useState<'idle' | 'listening' | 'thinking' | 'speaking'>('idle')

  const messagesEndRef = useRef<HTMLDivElement>(null)
  const zegoService = useRef(ChatbotZegoService.getInstance())
  const processedMessageIds = useRef(new Set<string>())

  // Environment configuration
  const ZEGO_APP_ID = import.meta.env.VITE_ZEGO_APP_ID
  const ZEGO_SERVER = import.meta.env.VITE_ZEGO_SERVER || 'wss://webliveroom-api.zegocloud.com/ws'

  useEffect(() => {
    scrollToBottom()
  }, [messages])

  useEffect(() => {
    // Initialize ZEGO service
    if (ZEGO_APP_ID && ZEGO_SERVER) {
      zegoService.current.initialize(ZEGO_APP_ID, ZEGO_SERVER)
      setupMessageHandlers()
    }

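    return () => {
      // This effect runs once, so `session` captured here is always the initial null.
      // Leave the room through the service instead; leaveRoom returns early if no room is joined.
      void zegoService.current.leaveRoom()
    }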
  }, [])

  const scrollToBottom = () => {
    messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' })
  }

  const setupMessageHandlers = () => {
    zegoService.current.onMessage((data: any) => {
      try {
        const { Cmd, Data: msgData } = data

        if (Cmd === 3) { // Voice transcript
          const { Text: transcript, EndFlag, MessageId } = msgData

          if (transcript?.trim()) {
            setCurrentTranscript(transcript)
            setBotStatus('listening')

            if (EndFlag) {
              const messageId = MessageId || `voice_${Date.now()}`
              addMessage({
                id: messageId,
                content: transcript.trim(),
                sender: 'user',
                timestamp: Date.now(),
                type: 'voice',
                transcript: transcript.trim()
              })
              setCurrentTranscript('')
              setBotStatus('thinking')
            }
          }
        } else if (Cmd === 4) { // Bot response
          const { Text: content, MessageId, EndFlag } = msgData

          if (content && MessageId) {
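            if (EndFlag) {
              if (processedMessageIds.current.has(MessageId)) {
                updateMessageContent(MessageId, content, false)
              } else {
                // A reply that arrives as a single final chunk has no bubble yet, so add one
                addMessage({
                  id: MessageId,
                  content: content,
                  sender: 'bot',
                  timestamp: Date.now(),
                  type: 'text'
                })
                processedMessageIds.current.add(MessageId)
              }
              setBotStatus('idle')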
            } else {
              if (!processedMessageIds.current.has(MessageId)) {
                addMessage({
                  id: MessageId,
                  content: content,
                  sender: 'bot',
                  timestamp: Date.now(),
                  type: 'text',
                  isStreaming: true
                })
                processedMessageIds.current.add(MessageId)
              } else {
                updateMessageContent(MessageId, content, true)
              }
              setBotStatus('speaking')
            }
          }
        }
      } catch (error) {
        console.error('Error handling chatbot message:', error)
      }
    })
  }

  const addMessage = (message: ChatMessage) => {
    setMessages(prev => [...prev, message])
  }

  const updateMessageContent = (messageId: string, newContent: string, isStreaming: boolean) => {
    setMessages(prev => prev.map(msg => 
      msg.id === messageId 
        ? { ...msg, content: newContent, isStreaming }
        : msg
    ))
  }

  const startChatbot = async () => {
    if (isLoading) return

    setIsLoading(true)
    setBotStatus('idle')

    try {
      const roomId = `chatbot_${Date.now()}`
      const userId = `user_${Date.now()}`

      // Initialize ZEGO connection
      const { token } = await chatbotAPI.getToken(userId)
      const joinSuccess = await zegoService.current.joinRoom(roomId, userId, token)

      if (!joinSuccess) {
        throw new Error('Failed to join chatbot room')
      }

      // Start chatbot agent
      const { chatbotId } = await chatbotAPI.startChatbot(roomId, userId)

      const newSession: ChatbotSession = {
        roomId,
        userId,
        chatbotId,
        isActive: true,
        voiceEnabled: true
      }

      setSession(newSession)
      setIsConnected(true)

      // Add welcome message
      addMessage({
        id: 'welcome',
        content: 'Hello! I\'m your AI assistant. You can type messages or use voice input to chat with me.',
        sender: 'bot',
        timestamp: Date.now(),
        type: 'text'
      })

      console.log('✅ Chatbot started successfully')
    } catch (error) {
      console.error('❌ Failed to start chatbot:', error)
      setBotStatus('idle')
    } finally {
      setIsLoading(false)
    }
  }

  const stopChatbot = async () => {
    if (!session) return

    try {
      if (session.chatbotId) {
        await chatbotAPI.stopChatbot(session.chatbotId)
      }

      await zegoService.current.leaveRoom()

      setSession(null)
      setIsConnected(false)
      setBotStatus('idle')
      setIsRecording(false)
      setCurrentTranscript('')

      console.log('✅ Chatbot stopped')
    } catch (error) {
      console.error('❌ Failed to stop chatbot:', error)
    }
  }

  const sendTextMessage = async () => {
    if (!inputMessage.trim() || !session?.chatbotId) return

    const message: ChatMessage = {
      id: `text_${Date.now()}`,
      content: inputMessage.trim(),
      sender: 'user',
      timestamp: Date.now(),
      type: 'text'
    }

    addMessage(message)
    setInputMessage('')
    setBotStatus('thinking')

    try {
      await chatbotAPI.sendMessage(session.chatbotId, message.content)
    } catch (error) {
      console.error('❌ Failed to send message:', error)
      setBotStatus('idle')
    }
  }

  const toggleVoiceInput = async () => {
    if (!isConnected) return

    try {
      const newRecordingState = !isRecording
      const success = await zegoService.current.enableVoiceInput(newRecordingState)

      if (success) {
        setIsRecording(newRecordingState)
        setBotStatus(newRecordingState ? 'listening' : 'idle')
      }
    } catch (error) {
      console.error('❌ Failed to toggle voice input:', error)
    }
  }

  const handleKeyPress = (e: React.KeyboardEvent) => {
    if (e.key === 'Enter' && !e.shiftKey) {
      e.preventDefault()
      sendTextMessage()
    }
  }

  const getStatusText = () => {
    switch (botStatus) {
      case 'listening':
        return 'Listening...'
      case 'thinking':
        return 'Processing...'
      case 'speaking':
        return 'Responding...'
      default:
        return isConnected ? 'Ready to chat' : 'Click start to begin'
    }
  }

  const getStatusColor = () => {
    switch (botStatus) {
      case 'listening':
        return 'text-green-600'
      case 'thinking':
        return 'text-blue-600'
      case 'speaking':
        return 'text-purple-600'
      default:
        return isConnected ? 'text-green-600' : 'text-gray-500'
    }
  }

  return (
    <div className="flex flex-col h-screen bg-gray-50">
      {/* Hidden audio element for voice playback */}
      <audio 
        id="chatbot-audio" 
        autoPlay 
        style={{ display: 'none' }}
        controls={false}
      />

      {/* Header */}
      <motion.div 
        initial={{ y: -20, opacity: 0 }}
        animate={{ y: 0, opacity: 1 }}
        className="bg-white border-b border-gray-200 px-6 py-4"
      >
        <div className="flex items-center justify-between">
          <div className="flex items-center space-x-3">
            <div className="w-10 h-10 bg-gradient-to-br from-purple-500 to-purple-600 rounded-full flex items-center justify-center">
              <Bot className="w-5 h-5 text-white" />
            </div>
            <div>
              <h1 className="text-xl font-semibold text-gray-900">AI Chatbot</h1>
              <p className={`text-sm ${getStatusColor()}`}>
                {getStatusText()}
              </p>
            </div>
          </div>

          {isConnected ? (
            <button
              onClick={stopChatbot}
              className="px-4 py-2 bg-red-600 text-white rounded-lg hover:bg-red-700 transition-colors"
            >
              Stop Chat
            </button>
          ) : (
            <button
              onClick={startChatbot}
              disabled={isLoading}
              className="px-4 py-2 bg-purple-600 text-white rounded-lg hover:bg-purple-700 transition-colors disabled:opacity-50"
            >
              {isLoading ? 'Starting...' : 'Start Chat'}
            </button>
          )}
        </div>
      </motion.div>

      {/* Messages Area */}
      <div className="flex-1 overflow-y-auto px-4 py-6">
        {messages.length === 0 && !isConnected && (
          <motion.div
            initial={{ opacity: 0, y: 20 }}
            animate={{ opacity: 1, y: 0 }}
            className="flex flex-col items-center justify-center h-full text-center"
          >
            <div className="w-16 h-16 bg-gradient-to-br from-purple-500 to-purple-600 rounded-full flex items-center justify-center mb-4">
              <MessageSquare className="w-8 h-8 text-white" />
            </div>
            <h3 className="text-lg font-semibold text-gray-900 mb-2">
              Welcome to AI Chatbot
            </h3>
            <p className="text-gray-600 mb-6 max-w-md">
              Start chatting with our intelligent AI assistant. You can type messages or use voice input for natural conversations.
            </p>
            <div className="space-y-2 text-sm text-gray-500">
              <p>💬 Natural text conversations</p>
              <p>🎤 Voice input and responses</p>
              <p>🧠 Context-aware assistance</p>
            </div>
          </motion.div>
        )}

        <AnimatePresence>
          {messages.map((message) => (
            <MessageBubble key={message.id} message={message} />
          ))}
        </AnimatePresence>

        {/* Transcript display */}
        {currentTranscript && (
          <motion.div
            initial={{ opacity: 0, y: 10 }}
            animate={{ opacity: 1, y: 0 }}
            className="mb-4 p-3 bg-green-50 border border-green-200 rounded-lg"
          >
            <div className="flex items-center space-x-2">
              <div className="w-2 h-2 bg-green-500 rounded-full animate-pulse" />
              <p className="text-sm text-green-700">{currentTranscript}</p>
            </div>
          </motion.div>
        )}

        {/* Thinking indicator */}
        {botStatus === 'thinking' && (
          <motion.div
            initial={{ opacity: 0, y: 10 }}
            animate={{ opacity: 1, y: 0 }}
            className="flex justify-start mb-4"
          >
            <div className="flex items-center space-x-2">
              <div className="w-8 h-8 bg-gradient-to-br from-purple-500 to-purple-600 rounded-full flex items-center justify-center">
                <Bot className="w-4 h-4 text-white" />
              </div>
              <div className="bg-white border border-gray-200 rounded-2xl px-4 py-2">
                <div className="flex space-x-1">
                  <div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce" />
                  <div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce" style={{ animationDelay: '0.1s' }} />
                  <div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce" style={{ animationDelay: '0.2s' }} />
                </div>
              </div>
            </div>
          </motion.div>
        )}

        <div ref={messagesEndRef} />
      </div>

      {/* Input Area */}
      {isConnected && (
        <motion.div 
          initial={{ y: 20, opacity: 0 }}
          animate={{ y: 0, opacity: 1 }}
          className="bg-white border-t border-gray-200 p-4"
        >
          <div className="flex items-center space-x-3">
            <div className="flex-1">
              <input
                type="text"
                value={inputMessage}
                onChange={(e) => setInputMessage(e.target.value)}
                onKeyDown={handleKeyPress}
                placeholder="Type your message..."
                disabled={botStatus === 'thinking'}
                className="w-full px-4 py-3 border border-gray-300 rounded-xl focus:outline-none focus:ring-2 focus:ring-purple-500 focus:border-transparent disabled:opacity-50 disabled:cursor-not-allowed"
              />
            </div>

            {/* Voice button */}
            <button
              onClick={toggleVoiceInput}
              disabled={botStatus === 'thinking'}
              className={`p-3 rounded-xl transition-all duration-200 ${
                isRecording
                  ? 'bg-red-500 text-white shadow-lg scale-110'
                  : 'bg-gray-100 text-gray-600 hover:bg-gray-200'
              } disabled:opacity-50 disabled:cursor-not-allowed`}
            >
              {isRecording ? <MicOff className="w-5 h-5" /> : <Mic className="w-5 h-5" />}
            </button>

            {/* Send button */}
            <button
              onClick={sendTextMessage}
              disabled={!inputMessage.trim() || botStatus === 'thinking'}
              className="p-3 bg-purple-600 text-white rounded-xl hover:bg-purple-700 transition-colors disabled:opacity-50 disabled:cursor-not-allowed"
            >
              <Send className="w-5 h-5" />
            </button>
          </div>
        </motion.div>
      )}
    </div>
  )
}

This main chatbot component provides a complete interface with message display, text input, voice recording, and real-time status indicators.
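
Since getStatusText and getStatusColor depend only on botStatus and isConnected, one option is to move that mapping out of the component so it can be unit tested without rendering anything. The sketch below is one way to do that. The names BotStatus, StatusView, and getStatusView are hypothetical, and the status union is an assumption based on the values used in the component above.

```typescript
// Hypothetical helper: names are illustrative and not part of the tutorial code.
// Assumes botStatus is one of the four values used in the component above.
type BotStatus = 'idle' | 'listening' | 'thinking' | 'speaking'

interface StatusView {
  text: string   // label shown under the header title
  color: string  // Tailwind text color class
}

// Labels and classes copied from getStatusText / getStatusColor.
const ACTIVE_STATUS: Record<Exclude<BotStatus, 'idle'>, StatusView> = {
  listening: { text: 'Listening...', color: 'text-green-600' },
  thinking: { text: 'Processing...', color: 'text-blue-600' },
  speaking: { text: 'Responding...', color: 'text-purple-600' },
}

function getStatusView(status: BotStatus, isConnected: boolean): StatusView {
  // Any non-idle status has a fixed label and color.
  if (status !== 'idle') return ACTIVE_STATUS[status]
  // When idle, the label depends on whether a session is active.
  return isConnected
    ? { text: 'Ready to chat', color: 'text-green-600' }
    : { text: 'Click start to begin', color: 'text-gray-500' }
}
```

Inside the component, the header could then call getStatusView(botStatus, isConnected) once and read both text and color from the result, which keeps the label and its color from drifting apart.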

9. Application Assembly and Styling

Create the environment configuration at client/.env:

VITE_ZEGO_APP_ID=your_zego_app_id
VITE_ZEGO_SERVER=wss://webliveroom-api.zegocloud.com/ws
VITE_API_BASE_URL=http://localhost:8080
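
Vite exposes only variables prefixed with VITE_ to client code, through import.meta.env, and everything in this file ends up in the browser bundle, so the ServerSecret should stay on the backend. As a minimal sketch of how the client might validate these values at startup, the helper below fails early if a variable is missing or if the AppID placeholder was never replaced. ClientConfig, loadClientConfig, and requireVar are hypothetical names, and treating the AppID as a positive integer is an assumption.

```typescript
// Hypothetical helper: names are illustrative and not part of the tutorial code.
interface ClientConfig {
  appId: number
  server: string
  apiBaseUrl: string
}

// Accepts any key/value bag so it can be called with import.meta.env in Vite
// or with a plain object in tests.
type EnvSource = Record<string, unknown>

function requireVar(env: EnvSource, key: string): string {
  const value = env[key]
  if (typeof value !== 'string' || value.trim() === '') {
    throw new Error(`Missing environment variable: ${key}`)
  }
  return value.trim()
}

function loadClientConfig(env: EnvSource): ClientConfig {
  // Assumption: the AppID is a positive integer, so an unreplaced
  // placeholder such as "your_zego_app_id" is rejected here.
  const appId = Number(requireVar(env, 'VITE_ZEGO_APP_ID'))
  if (!Number.isInteger(appId) || appId <= 0) {
    throw new Error('VITE_ZEGO_APP_ID must be a positive integer')
  }
  return {
    appId,
    server: requireVar(env, 'VITE_ZEGO_SERVER'),
    // Strip trailing slashes so request paths can be appended safely.
    apiBaseUrl: requireVar(env, 'VITE_API_BASE_URL').replace(/\/+$/, ''),
  }
}
```

In a Vite project this would typically be called once as loadClientConfig(import.meta.env), with the result shared by the API client and the ZEGOCLOUD service.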

Create the main application at client/src/App.tsx:

import { Chatbot } from './components/Chatbot'
import './index.css'

function App() {
  return (
    <div className="w-full h-screen">
      <Chatbot />
    </div>
  )
}

export default App

Create the styling at client/src/index.css:

@import "tailwindcss";

body {
  margin: 0;
  font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', 'Roboto', sans-serif;
  background-color: #f9fafb;
}

* {
  box-sizing: border-box;
}

.animate-bounce {
  animation: bounce 1s infinite;
}

@keyframes bounce {
  0%, 20%, 53%, 80%, 100% {
    transform: translate3d(0,0,0);
  }
  40%, 43% {
    transform: translate3d(0, -10px, 0);
  }
  70% {
    transform: translate3d(0, -5px, 0);
  }
  90% {
    transform: translate3d(0, -2px, 0);
  }
}

Create the entry point at client/src/main.tsx:

import { StrictMode } from 'react'
import { createRoot } from 'react-dom/client'
import App from './App.tsx'

createRoot(document.getElementById('root')!).render(
  <StrictMode>
    <App />
  </StrictMode>,
)

Create the HTML template at client/index.html:

<!doctype html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>AI Chatbot with ZEGOCLOUD</title>
  </head>
  <body>
    <div id="root"></div>
    <script type="module" src="/src/main.tsx"></script>
  </body>
</html>

10. Testing and Deployment

Start the backend server:

cd server
npm run dev

In a separate terminal, start the frontend:

cd client
npm run dev

Open the local URL that Vite prints in the terminal, click Start Chat, and allow microphone access when the browser asks. Then send a text message and try the voice button to confirm that both input modes work.

Conclusion

Now you have a working AI chatbot that can understand speech, respond intelligently, and manage real-time conversations. Users are free to type or speak naturally, and the chatbot delivers appropriate replies in text or voice.

What once required complex integrations of multiple AI services, audio processing pipelines, and real-time synchronization has been made simple with ZEGOCLOUD. With a single platform, you built a chatbot that feels both responsive and natural to use.

This solid foundation can power customer support, education, virtual assistants, or any application where intelligent conversation is essential. From here, you can refine the chatbot’s personality, extend its features, or integrate with external services while maintaining the same reliable communication core.

FAQ

Q1: Can I make my own AI chatbot?

Yes. With platforms like ZEGOCLOUD, you can build an AI chatbot that handles both text and voice interactions. The process no longer requires stitching together multiple services, so even individual developers can create powerful chatbots.

Q2: How much does it cost to build an AI chatbot?

The cost depends on scale and features. Simple chatbots can be built at low cost, while advanced real-time conversational bots may require cloud usage fees. ZEGOCLOUD offers flexible pricing so you can start small and scale as your user base grows.

Q3: Can I create my own AI like ChatGPT?

You can build applications powered by large language models similar to ChatGPT, but instead of training one from scratch, most developers integrate existing APIs and SDKs. This saves time, cost, and computing resources.

Q4: Is it hard to develop an AI chatbot?

Traditionally, it was difficult because you had to integrate natural language processing, speech recognition, text-to-speech, and real-time messaging yourself. With ZEGOCLOUD’s all-in-one AI Agent SDK, a single integration covers all of these, which makes the process much easier and faster.
