
How to Create a Conversational AI

Voice assistants are everywhere now. People talk to their phones, ask questions of smart speakers, and expect apps to understand what they say. Building a conversational AI used to be genuinely hard: you needed separate services for speech recognition, natural-language understanding, and speech synthesis, and managing audio streams while keeping voices clear across devices was a nightmare.

ZEGOCLOUD's conversational AI solution makes this much simpler. You can add voice conversations to your app without building complex audio-processing pipelines or expensive backend systems. Your app can listen to users, understand what they want, and respond with natural-sounding speech.

👉 Schedule a Demo

This guide shows you how to build a conversational AI that actually works. Users will be able to have real voice conversations with your app.

Conversational AI Solutions Built by ZEGOCLOUD

ZEGOCLOUD treats AI agents like real participants in your app. Instead of building separate chatbots, you invite AI directly into voice calls, video rooms, or live streams. The AI joins as an active participant and talks with users in real-time.

Multiple people can speak with the same AI agent during group calls. The AI recognizes different voices, gives personalized responses, and even suggests topics to keep conversations flowing. It handles interruptions naturally and responds just like a human participant would.

This approach makes conversational AI feel more natural. Users don’t switch between talking to people and talking to bots. The AI agent participates in the same conversation using the same voice streams as everyone else in the room.



Prerequisites

Before building the conversational AI functionality, ensure you have these essential components:

  • ZEGOCLOUD developer account with the AI Agent service activated – sign up here.
  • Node.js 18+ with npm for package management and development tooling.
  • Valid AppID and ServerSecret credentials from ZEGOCLOUD admin console for authentication.
  • OpenAI API key for AI responses, or any OpenAI-compatible LLM provider.
  • Physical device with microphone access for voice testing, as browser simulators cannot provide reliable audio capabilities.

Steps for Building a Conversational AI

1. Project Structure and Backend Setup

Begin by creating the complete project structure with separated client and server components. This organization enables independent development of frontend and backend while maintaining clean separation of concerns.

Create a new project directory and initialize the backend server:

mkdir conversational-ai
cd conversational-ai
mkdir server client
cd server
npm init -y

Update your server/package.json with the required dependencies and configuration:

{
  "name": "conversational-ai-server",
  "version": "1.0.0",
  "type": "module",
  "scripts": {
    "dev": "tsx watch src/server.ts",
    "build": "tsc",
    "start": "node dist/server.js",
    "type-check": "tsc --noEmit"
  },
  "dependencies": {
    "express": "^5.1.0",
    "cors": "^2.8.5",
    "dotenv": "^17.2.1",
    "axios": "^1.11.0"
  },
  "devDependencies": {
    "@types/express": "^5.0.3",
    "@types/cors": "^2.8.19",
    "@types/node": "^24.3.0",
    "typescript": "^5.9.2",
    "tsx": "^4.20.4"
  }
}

This package configuration establishes a modern Node.js server with TypeScript support, hot-reloading capabilities for development, and essential packages for building REST APIs and integrating with ZEGOCLOUD services.

Install the dependencies:

npm install

Create the TypeScript configuration at server/tsconfig.json:

{
  "compilerOptions": {
    "target": "ES2022",
    "module": "ESNext",
    "moduleResolution": "bundler",
    "allowSyntheticDefaultImports": true,
    "esModuleInterop": true,
    "forceConsistentCasingInFileNames": true,
    "strict": true,
    "noImplicitAny": true,
    "strictNullChecks": true,
    "strictFunctionTypes": true,
    "noImplicitReturns": true,
    "noFallthroughCasesInSwitch": true,
    "noUncheckedIndexedAccess": true,
    "skipLibCheck": true,
    "declaration": true,
    "declarationMap": true,
    "sourceMap": true,
    "outDir": "./dist",
    "rootDir": "./src",
    "removeComments": false,
    "resolveJsonModule": true,
    "isolatedModules": true,
    "moduleDetection": "force"
  },
  "include": ["src/**/*"],
  "exclude": ["node_modules", "dist"]
}

This TypeScript configuration enables modern JavaScript features, strict type checking, and proper module resolution for the latest Node.js environments while generating source maps for debugging.

2. Environment Configuration and ZEGOCLOUD Credentials

Set up your environment variables by creating server/.env:

# ZEGOCLOUD Configuration
ZEGO_APP_ID=your_zego_app_id_here
ZEGO_SERVER_SECRET=your_zego_server_secret_here
ZEGO_API_BASE_URL=https://aigc-aiagent-api.zegotech.cn

# LLM Provider Configuration
LLM_URL=https://api.openai.com/v1/chat/completions
LLM_API_KEY=your_openai_api_key_here
LLM_MODEL=gpt-4o-mini

# Server Configuration
PORT=8080

Replace the placeholder values with your actual ZEGOCLOUD App ID and Server Secret from the console, plus your OpenAI API key.

3. ZEGOCLOUD Token Generation Implementation

Create the ZEGOCLOUD token generator at server/zego-token.cjs:

"use strict";
Object.defineProperty(exports, "__esModule", { value: true });
exports.generateToken04 = generateToken04;
var crypto_1 = require("crypto");

// Generate random number in int32 range
function makeNonce() {
    var min = -Math.pow(2, 31); // -2^31
    var max = Math.pow(2, 31) - 1; // 2^31 - 1
    return Math.floor(Math.random() * (max - min + 1)) + min;
}

// AES encryption using GCM mode
function aesGcmEncrypt(plainText, key) {
    // Ensure valid key length (16, 24 or 32 bytes)
    if (![16, 24, 32].includes(key.length)) {
        throw createError(5, 'Invalid Secret length. Key must be 16, 24, or 32 bytes.');
    }

    // Generate random 12-byte nonce for AES encryption
    var nonce = (0, crypto_1.randomBytes)(12);
    var cipher = (0, crypto_1.createCipheriv)('aes-256-gcm', key, nonce);
    cipher.setAutoPadding(true);
    var encrypted = cipher.update(plainText, 'utf8');
    var encryptBuf = Buffer.concat([encrypted, cipher.final(), cipher.getAuthTag()]);
    return { encryptBuf: encryptBuf, nonce: nonce };
}

function createError(errorCode, errorMessage) {
    return {
        errorCode: errorCode,
        errorMessage: errorMessage
    };
}

function generateToken04(appId, userId, secret, effectiveTimeInSeconds, payload) {
    if (!appId || typeof appId !== 'number') {
        throw createError(1, 'appID invalid');
    }
    if (!userId || typeof userId !== 'string' || userId.length > 64) {
        throw createError(3, 'userId invalid');
    }
    if (!secret || typeof secret !== 'string' || secret.length !== 32) {
        throw createError(5, 'secret must be a 32 byte string');
    }
    if (!(effectiveTimeInSeconds > 0)) {
        throw createError(6, 'effectiveTimeInSeconds invalid');
    }

    var VERSION_FLAG = '04';
    var createTime = Math.floor(new Date().getTime() / 1000);
    var tokenInfo = {
        app_id: appId,
        user_id: userId,
        nonce: makeNonce(),
        ctime: createTime,
        expire: createTime + effectiveTimeInSeconds,
        payload: payload || ''
    };

    // Convert token info to JSON
    var plainText = JSON.stringify(tokenInfo);

    // Perform encryption
    var _a = aesGcmEncrypt(plainText, secret), encryptBuf = _a.encryptBuf, nonce = _a.nonce;

    // Binary token assembly: expire time + Base64(nonce length + nonce + encrypted info length + encrypted info + encryption mode)
    var _b = [new Uint8Array(8), new Uint8Array(2), new Uint8Array(2), new Uint8Array(1)], b1 = _b[0], b2 = _b[1], b3 = _b[2], b4 = _b[3];
    new DataView(b1.buffer).setBigInt64(0, BigInt(tokenInfo.expire), false);
    new DataView(b2.buffer).setUint16(0, nonce.byteLength, false);
    new DataView(b3.buffer).setUint16(0, encryptBuf.byteLength, false);
    new DataView(b4.buffer).setUint8(0, 1);

    var buf = Buffer.concat([
        Buffer.from(b1),
        Buffer.from(b2),
        Buffer.from(nonce),
        Buffer.from(b3),
        Buffer.from(encryptBuf),
        Buffer.from(b4),
    ]);
    var dv = new DataView(Uint8Array.from(buf).buffer);

    return VERSION_FLAG + Buffer.from(dv.buffer).toString('base64');
}

This token generator creates secure, time-limited authentication tokens using ZEGOCLOUD’s token04 format. It encrypts user session data with AES-GCM encryption and packages it in a binary format that ZEGOCLOUD’s servers can verify and decode.
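To make that binary layout concrete, here is a small illustrative sketch (not part of the server code) that reads the expiry timestamp back out of a token04 string. The bytes after the 8-byte expiry field are filled with dummy zeros purely for demonstration; a real token carries the nonce and encrypted payload there:

```javascript
// Illustrative only: decode the expiry from a token04 string.
// Layout per the generator above:
//   "04" + Base64( expire(8B, big-endian) + nonceLen(2B) + nonce + dataLen(2B) + data + mode(1B) )
function readTokenExpiry(token) {
  const buf = Buffer.from(token.slice(2), 'base64') // strip the "04" version flag
  return Number(buf.readBigInt64BE(0))              // first 8 bytes: Unix expiry time
}

// Build a minimal dummy token to exercise the parser.
const expire = Math.floor(Date.now() / 1000) + 7200
const head = Buffer.alloc(8)
head.writeBigInt64BE(BigInt(expire))
const dummyToken = '04' + Buffer.concat([head, Buffer.alloc(5)]).toString('base64')

console.log(readTokenExpiry(dummyToken) === expire) // true
```

This kind of parser is handy when debugging "token expired" errors: you can check the embedded expiry without round-tripping through ZEGOCLOUD's servers.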

4. Main Server Implementation

Create the main server file at server/src/server.ts:

import express from 'express'
import cors from 'cors'
import dotenv from 'dotenv'
import axios from 'axios'
import { generateToken04 } from '../zego-token.cjs'

dotenv.config()

const app = express()
const PORT = process.env.PORT || 8080

// Middleware
app.use(cors({
  origin: ['http://localhost:5173', 'http://localhost:3000'],
  credentials: true
}))
app.use(express.json())

// Environment validation
const requiredEnvVars = ['ZEGO_APP_ID', 'ZEGO_SERVER_SECRET', 'LLM_API_KEY']
const missingVars = requiredEnvVars.filter(varName => !process.env[varName])

if (missingVars.length > 0) {
  console.error('❌ Missing required environment variables:', missingVars)
  process.exit(1)
}

const ZEGO_APP_ID = parseInt(process.env.ZEGO_APP_ID!)
const ZEGO_SERVER_SECRET = process.env.ZEGO_SERVER_SECRET!
const ZEGO_API_BASE_URL = process.env.ZEGO_API_BASE_URL!
const LLM_URL = process.env.LLM_URL!
const LLM_API_KEY = process.env.LLM_API_KEY!
const LLM_MODEL = process.env.LLM_MODEL || 'gpt-4o-mini'

// Health check endpoint
app.get('/health', (req, res) => {
  res.json({ 
    status: 'healthy', 
    timestamp: new Date().toISOString(),
    environment: {
      hasZegoAppId: !!process.env.ZEGO_APP_ID,
      hasZegoSecret: !!process.env.ZEGO_SERVER_SECRET,
      hasLLMKey: !!process.env.LLM_API_KEY,
      nodeVersion: process.version
    }
  })
})

// Generate ZEGO token for authentication
app.get('/api/token', (req, res) => {
  try {
    const { user_id } = req.query

    if (!user_id || typeof user_id !== 'string') {
      return res.status(400).json({ 
        success: false, 
        error: 'user_id is required and must be a string' 
      })
    }

    const effectiveTimeInSeconds = 7200 // 2 hours
    const payload = ''

    const token = generateToken04(
      ZEGO_APP_ID,
      user_id,
      ZEGO_SERVER_SECRET,
      effectiveTimeInSeconds,
      payload
    )

    console.log(`✅ Generated token for user: ${user_id}`)

    res.json({ 
      success: true, 
      token,
      expires_in: effectiveTimeInSeconds
    })
  } catch (error) {
    console.error('❌ Token generation failed:', error)
    res.status(500).json({ 
      success: false, 
      error: 'Failed to generate token' 
    })
  }
})

// Start AI agent session
app.post('/api/start', async (req, res) => {
  try {
    const { room_id, user_id, user_stream_id } = req.body

    if (!room_id || !user_id) {
      return res.status(400).json({
        success: false,
        error: 'room_id and user_id are required'
      })
    }

    console.log(`🚀 Starting AI session for room: ${room_id}, user: ${user_id}`)

    const response = await axios.post(`${ZEGO_API_BASE_URL}/v1/ai_agent/start`, {
      app_id: ZEGO_APP_ID,
      room_id: room_id,
      user_id: user_id,
      user_stream_id: user_stream_id || `${user_id}_stream`,
      ai_agent_config: {
        llm_config: {
          url: LLM_URL,
          api_key: LLM_API_KEY,
          model: LLM_MODEL,
          context: [
            {
              role: "system",
              content: "You are a helpful AI assistant. Be conversational, friendly, and helpful. Keep responses concise but informative. You can engage in natural conversation and help with various topics."
            }
          ]
        },
        tts_config: {
          provider: "elevenlabs",
          voice_id: "pNInz6obpgDQGcFmaJgB",
          model: "eleven_turbo_v2_5"
        },
        asr_config: {
          provider: "deepgram",
          language: "en"
        }
      }
    }, {
      headers: {
        'Content-Type': 'application/json'
      },
      timeout: 30000
    })

    if (response.data && response.data.data && response.data.data.ai_agent_instance_id) {
      const agentInstanceId = response.data.data.ai_agent_instance_id
      console.log(`✅ AI agent started successfully: ${agentInstanceId}`)

      res.json({
        success: true,
        agentInstanceId: agentInstanceId,
        room_id: room_id,
        user_id: user_id
      })
    } else {
      throw new Error('Invalid response from ZEGO API')
    }
  } catch (error: any) {
    console.error('❌ Failed to start AI session:', error.response?.data || error.message)
    res.status(500).json({
      success: false,
      error: error.response?.data?.message || error.message || 'Failed to start AI session'
    })
  }
})

// Send message to AI agent
app.post('/api/send-message', async (req, res) => {
  try {
    const { agent_instance_id, message } = req.body

    if (!agent_instance_id || !message) {
      return res.status(400).json({
        success: false,
        error: 'agent_instance_id and message are required'
      })
    }

    console.log(`💬 Sending message to agent ${agent_instance_id}: ${message.substring(0, 50)}...`)

    const response = await axios.post(`${ZEGO_API_BASE_URL}/v1/ai_agent/chat`, {
      ai_agent_instance_id: agent_instance_id,
      messages: [
        {
          role: "user",
          content: message
        }
      ]
    }, {
      headers: {
        'Content-Type': 'application/json'
      },
      timeout: 30000
    })

    console.log(`✅ Message sent successfully to agent: ${agent_instance_id}`)

    res.json({
      success: true,
      message: 'Message sent successfully'
    })
  } catch (error: any) {
    console.error('❌ Failed to send message:', error.response?.data || error.message)
    res.status(500).json({
      success: false,
      error: error.response?.data?.message || error.message || 'Failed to send message'
    })
  }
})

// Stop AI agent session
app.post('/api/stop', async (req, res) => {
  try {
    const { agent_instance_id } = req.body

    if (!agent_instance_id) {
      return res.status(400).json({
        success: false,
        error: 'agent_instance_id is required'
      })
    }

    console.log(`🛑 Stopping AI session: ${agent_instance_id}`)

    const response = await axios.post(`${ZEGO_API_BASE_URL}/v1/ai_agent/stop`, {
      ai_agent_instance_id: agent_instance_id
    }, {
      headers: {
        'Content-Type': 'application/json'
      },
      timeout: 30000
    })

    console.log(`✅ AI session stopped successfully: ${agent_instance_id}`)

    res.json({
      success: true,
      message: 'AI session stopped successfully'
    })
  } catch (error: any) {
    console.error('❌ Failed to stop AI session:', error.response?.data || error.message)
    res.status(500).json({
      success: false,
      error: error.response?.data?.message || error.message || 'Failed to stop AI session'
    })
  }
})

app.listen(PORT, () => {
  console.log(`🚀 Server running on port ${PORT}`)
  console.log(`🏥 Health check: http://localhost:${PORT}/health`)
})

The /api/start endpoint creates ZEGOCLOUD rooms and initializes AI agents with configured language models, text-to-speech, and speech recognition providers.

The /api/send-message endpoint forwards user messages to active AI agents, while /api/stop properly terminates sessions and cleans up resources.
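The client will mirror this contract when calling /api/start. A tiny hypothetical helper (the function name is ours, not part of any ZEGOCLOUD SDK) captures the expected payload shape and the default stream-ID rule from the handler above:

```javascript
// Hypothetical helper mirroring the /api/start request contract above.
function buildStartPayload({ roomId, userId, userStreamId }) {
  if (!roomId || !userId) {
    throw new Error('room_id and user_id are required')
  }
  return {
    room_id: roomId,
    user_id: userId,
    // The server falls back to `${user_id}_stream` when no stream ID is given
    user_stream_id: userStreamId || `${userId}_stream`,
  }
}

console.log(buildStartPayload({ roomId: 'room1', userId: 'alice' }))
// → { room_id: 'room1', user_id: 'alice', user_stream_id: 'alice_stream' }
```

Keeping this derivation in one place matters: the frontend must publish its microphone stream under the same `${userId}_stream` ID, or the agent will treat the user's audio as a stranger's stream.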

5. Frontend Project Initialization

Now set up the React frontend. Navigate to the root directory and create the client Vite project:

cd ..
npm create vite@latest client -- --template react-ts
cd client

Update client/package.json with the required frontend dependencies:

{
  "name": "zego-convo-ai",
  "private": true,
  "version": "0.0.0",
  "type": "module",
  "scripts": {
    "dev": "vite",
    "build": "tsc -b && vite build",
    "lint": "eslint .",
    "preview": "vite preview"
  },
  "dependencies": {
    "@tailwindcss/vite": "^4.1.11",
    "@types/dom-speech-recognition": "^0.0.6",
    "@types/node": "^24.2.0",
    "axios": "^1.11.0",
    "framer-motion": "^12.23.12",
    "lucide-react": "^0.536.0",
    "react": "^19.1.0",
    "react-dom": "^19.1.0",
    "react-speech-kit": "^3.0.1",
    "tailwindcss": "^4.1.11",
    "zego-express-engine-webrtc": "^3.10.0",
    "zod": "^4.0.15"
  },
  "devDependencies": {
    "@eslint/js": "^9.30.1",
    "@types/react": "^19.1.8",
    "@types/react-dom": "^19.1.6",
    "@vitejs/plugin-react": "^4.6.0",
    "eslint": "^9.30.1",
    "eslint-plugin-react-hooks": "^5.2.0",
    "eslint-plugin-react-refresh": "^0.4.20",
    "globals": "^16.3.0",
    "typescript": "~5.8.3",
    "typescript-eslint": "^8.35.1",
    "vite": "^7.0.4"
  }
}

Install the frontend dependencies:

npm install

Update the Vite configuration at client/vite.config.ts:

import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'
import tailwindcss from '@tailwindcss/vite'

export default defineConfig({
  plugins: [react(), tailwindcss()],
  define: {
    global: 'globalThis',
  },
  server: {
    host: true,
  },
  optimizeDeps: {
    include: ['zego-express-engine-webrtc'],
  }
})

6. Frontend Type Definitions

Now create the type definitions at client/src/types/index.ts:

export interface Message {
  id: string
  content: string
  sender: 'user' | 'ai'
  timestamp: number
  type: 'text' | 'voice'
  isStreaming?: boolean
  audioUrl?: string
  duration?: number
  transcript?: string
}

export interface ConversationMemory {
  id: string
  title: string
  messages: Message[]
  createdAt: number
  updatedAt: number
  metadata: {
    totalMessages: number
    lastAIResponse: string
    topics: string[]
  }
}

export interface VoiceSettings {
  isEnabled: boolean
  autoPlay: boolean
  speechRate: number
  speechPitch: number
  preferredVoice?: string
}

export interface ChatSession {
  roomId: string
  userId: string
  agentInstanceId?: string
  isActive: boolean
  conversationId?: string
  voiceSettings: VoiceSettings
}

export interface AIAgent {
  id: string
  name: string
  personality: string
  voiceCharacteristics: {
    language: 'en-US' | 'en-GB'
    gender: 'male' | 'female'
    speed: number
    pitch: number
  }
}

These TypeScript interfaces define the data structures for messages, conversations, chat sessions, and voice settings.

The Message interface supports both text and voice messages with streaming capabilities. ConversationMemory handles local storage of chat history with metadata for organization and search functionality.
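As an illustration of how ConversationMemory.metadata can be kept in sync with the message list (this helper is a sketch of one possible approach, not part of the tutorial's codebase):

```javascript
// Sketch: derive ConversationMemory.metadata from a Message array.
// Topic extraction is left empty here; a real app might tag topics via the LLM.
function buildMetadata(messages) {
  const aiMessages = messages.filter((m) => m.sender === 'ai')
  const last = aiMessages[aiMessages.length - 1]
  return {
    totalMessages: messages.length,
    lastAIResponse: last ? last.content : '',
    topics: [],
  }
}

const meta = buildMetadata([
  { id: '1', content: 'Hi', sender: 'user', timestamp: 1, type: 'text' },
  { id: '2', content: 'Hello! How can I help?', sender: 'ai', timestamp: 2, type: 'text' },
])
console.log(meta.totalMessages, meta.lastAIResponse)
// → 2 'Hello! How can I help?'
```

Recomputing the metadata on every save keeps it consistent with the stored messages, which is simpler than updating the counters incrementally.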

7. Environment Configuration and Service Setup

Create the frontend environment configuration at client/.env:

VITE_ZEGO_APP_ID=your_zego_app_id_here
VITE_ZEGO_SERVER=wss://webliveroom-api.zegocloud.com/ws
# Preferably deploy your backend and use its URL here instead of localhost
VITE_API_BASE_URL=http://localhost:8080

Create the configuration service at client/src/config.ts:

import { z } from 'zod'

const configSchema = z.object({
  ZEGO_APP_ID: z.string().min(1, 'ZEGO App ID is required'),
  ZEGO_SERVER: z.string().url('Valid ZEGO server URL required'),
  API_BASE_URL: z.string().url('Valid API base URL required'),
})

const rawConfig = {
  ZEGO_APP_ID: import.meta.env.VITE_ZEGO_APP_ID,
  ZEGO_SERVER: import.meta.env.VITE_ZEGO_SERVER,
  API_BASE_URL: import.meta.env.VITE_API_BASE_URL,
}

export const config = configSchema.parse(rawConfig)

export const STORAGE_KEYS = {
  CONVERSATIONS: 'ai_conversations',
  USER_PREFERENCES: 'ai_user_preferences',
  SESSION_HISTORY: 'ai_session_history',
} as const

This configuration module validates environment variables using Zod schemas to ensure required ZEGOCLOUD credentials are present and properly formatted. It also defines localStorage keys for persisting conversation data and user preferences across browser sessions.
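If you want to see exactly what the schema enforces, a dependency-free equivalent looks roughly like this (illustrative only; the actual module uses configSchema.parse() from zod):

```javascript
// Dependency-free sketch of the checks the Zod schema performs above.
function validateConfig(raw) {
  if (!raw.ZEGO_APP_ID) throw new Error('ZEGO App ID is required')
  new URL(raw.ZEGO_SERVER)    // throws if the ZEGO server URL is malformed
  new URL(raw.API_BASE_URL)   // throws if the API base URL is malformed
  return raw
}

const ok = validateConfig({
  ZEGO_APP_ID: '123456789',
  ZEGO_SERVER: 'wss://webliveroom-api.zegocloud.com/ws',
  API_BASE_URL: 'http://localhost:8080',
})
console.log(ok.ZEGO_APP_ID) // → 123456789
```

Failing fast like this at module load time turns a misconfigured .env into an immediate, readable error instead of a confusing connection failure deep inside the SDK.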

8. API Service Layer Implementation

Create the API service layer at client/src/services/api.ts:

import axios from 'axios'
import { config } from '../config'

const api = axios.create({
  baseURL: config.API_BASE_URL,
  timeout: 30000,
  headers: {
    'Content-Type': 'application/json'
  }
})

api.interceptors.request.use(
  (config) => {
    console.log('🌐 API Request:', config.method?.toUpperCase(), config.url)
    if (config.data && config.method !== 'get') {
      console.log('📤 Request Data:', config.data)
    }
    return config
  },
  (error) => {
    console.error('❌ API Request Error:', error)
    return Promise.reject(error)
  }
)

api.interceptors.response.use(
  (response) => {
    console.log('✅ API Response:', response.status, response.config.url)
    if (response.data) {
      console.log('📥 Response Data:', response.data)
    }
    return response
  },
  (error) => {
    console.error('❌ API Response Error:', {
      status: error.response?.status,
      statusText: error.response?.statusText,
      data: error.response?.data,
      url: error.config?.url,
      method: error.config?.method
    })
    return Promise.reject(error)
  }
)

export const agentAPI = {
  async startSession(roomId: string, userId: string): Promise<{ agentInstanceId: string }> {
    try {
      const requestData = {
        room_id: roomId,
        user_id: userId,
        user_stream_id: `${userId}_stream`,
      }

      console.log('🚀 Starting session with data:', requestData)

      const response = await api.post('/api/start', requestData)

      if (!response.data || !response.data.success) {
        throw new Error(response.data?.error || 'Session start failed')
      }

      if (!response.data.agentInstanceId) {
        throw new Error('No agent instance ID returned')
      }

      console.log('✅ Session started successfully:', response.data.agentInstanceId)

      return {
        agentInstanceId: response.data.agentInstanceId
      }
    } catch (error: any) {
      console.error('❌ Start session failed:', error.response?.data || error.message)
      throw new Error(error.response?.data?.error || error.message || 'Failed to start session')
    }
  },

  async sendMessage(agentInstanceId: string, message: string): Promise<void> {
    if (!agentInstanceId) {
      throw new Error('Agent instance ID is required')
    }

    if (!message || !message.trim()) {
      throw new Error('Message content is required')
    }

    try {
      const requestData = {
        agent_instance_id: agentInstanceId,
        message: message.trim(),
      }

      console.log('💬 Sending message:', {
        agentInstanceId,
        messageLength: message.length,
        messagePreview: message.substring(0, 50) + (message.length > 50 ? '...' : '')
      })

      const response = await api.post('/api/send-message', requestData)

      if (!response.data || !response.data.success) {
        throw new Error(response.data?.error || 'Message send failed')
      }

      console.log('✅ Message sent successfully')
    } catch (error: any) {
      console.error('❌ Send message failed:', error.response?.data || error.message)
      throw new Error(error.response?.data?.error || error.message || 'Failed to send message')
    }
  },

  async stopSession(agentInstanceId: string): Promise<void> {
    if (!agentInstanceId) {
      console.warn('⚠️ No agent instance ID provided for stop session')
      return
    }

    try {
      const requestData = {
        agent_instance_id: agentInstanceId,
      }

      console.log('🛑 Stopping session:', agentInstanceId)

      const response = await api.post('/api/stop', requestData)

      if (!response.data || !response.data.success) {
        console.warn('⚠️ Session stop returned non-success:', response.data)
      } else {
        console.log('✅ Session stopped successfully')
      }
    } catch (error: any) {
      console.error('❌ Stop session failed:', error.response?.data || error.message)
      throw new Error(error.response?.data?.error || error.message || 'Failed to stop session')
    }
  },

  async getToken(userId: string): Promise<{ token: string }> {
    if (!userId) {
      throw new Error('User ID is required')
    }

    try {
      console.log('🔑 Getting token for user:', userId)

      const response = await api.get(`/api/token?user_id=${encodeURIComponent(userId)}`)

      if (!response.data || !response.data.token) {
        throw new Error('No token returned')
      }

      console.log('✅ Token received successfully')

      return { token: response.data.token }
    } catch (error: any) {
      console.error('❌ Get token failed:', error.response?.data || error.message)
      throw new Error(error.response?.data?.error || error.message || 'Failed to get token')
    }
  },

  async healthCheck(): Promise<{ status: string }> {
    try {
      console.log('🏥 Checking backend health')

      const response = await api.get('/health')

      console.log('✅ Backend health check successful:', response.data)

      return response.data
    } catch (error: any) {
      console.error('❌ Backend health check failed:', error.response?.data || error.message)
      throw new Error(error.response?.data?.error || error.message || 'Backend health check failed')
    }
  }
}

This API service provides a clean interface for communicating with your backend server, with comprehensive error handling and request/response logging on every call.

The service handles session management, message sending, and token generation while providing detailed console output for debugging.

9. ZEGOCLOUD Real-Time Communication Service

Create the ZEGOCLOUD service at client/src/services/zego.ts:

import { ZegoExpressEngine } from 'zego-express-engine-webrtc'
import { config } from '../config'
import { agentAPI } from './api'

export class ZegoService {
  private static instance: ZegoService
  private zg: ZegoExpressEngine | null = null
  private isInitialized = false
  private currentRoomId: string | null = null
  private currentUserId: string | null = null
  private localStream: any = null
  private isJoining = false
  private audioElement: HTMLAudioElement | null = null

  static getInstance(): ZegoService {
    if (!ZegoService.instance) {
      ZegoService.instance = new ZegoService()
    }
    return ZegoService.instance
  }

  async initialize(): Promise<void> {
    if (this.isInitialized || this.isJoining) return

    this.isJoining = true
    try {
      this.zg = new ZegoExpressEngine(
        parseInt(config.ZEGO_APP_ID), 
        config.ZEGO_SERVER
      )

      this.setupEventListeners()
      this.setupAudioElement()
      this.isInitialized = true
      console.log('✅ ZEGO initialized successfully')
    } catch (error) {
      console.error('❌ ZEGO initialization failed:', error)
      throw error
    } finally {
      this.isJoining = false
    }
  }

  private setupAudioElement(): void {
    this.audioElement = document.getElementById('ai-audio-output') as HTMLAudioElement
    if (!this.audioElement) {
      this.audioElement = document.createElement('audio')
      this.audioElement.id = 'ai-audio-output'
      this.audioElement.autoplay = true
      this.audioElement.controls = false
      this.audioElement.style.display = 'none'
      document.body.appendChild(this.audioElement)
    }

    this.audioElement.volume = 0.8
    this.audioElement.muted = false

    this.audioElement.addEventListener('loadstart', () => {
      console.log('🔊 Audio loading started')
    })

    this.audioElement.addEventListener('canplay', () => {
      console.log('🔊 Audio ready to play')
    })

    this.audioElement.addEventListener('play', () => {
      console.log('🔊 Audio playback started')
    })

    this.audioElement.addEventListener('error', (e) => {
      console.error('❌ Audio error:', e)
    })
  }

  private setupEventListeners(): void {
    if (!this.zg) return

    this.zg.on('recvExperimentalAPI', (result: any) => {
      const { method, content } = result
      if (method === 'onRecvRoomChannelMessage') {
        try {
          const message = JSON.parse(content.msgContent)
          console.log('🎯 Room message received:', message)
          this.handleRoomMessage(message)
        } catch (error) {
          console.error('Failed to parse room message:', error)
        }
      }
    })

    this.zg.on('roomStreamUpdate', async (_roomID: string, updateType: string, streamList: any[]) => {
      console.log('📡 Stream update:', updateType, streamList.length, 'streams')

      if (updateType === 'ADD' && streamList.length > 0) {
        for (const stream of streamList) {
          const userStreamId = this.currentUserId ? `${this.currentUserId}_stream` : null

          if (userStreamId && stream.streamID === userStreamId) {
            console.log('🚫 Skipping user\'s own stream:', stream.streamID)
            continue
          }

          try {
            console.log('🔗 Playing AI agent stream:', stream.streamID)

            const mediaStream = await this.zg!.startPlayingStream(stream.streamID)
            if (mediaStream) {
              console.log('✅ Media stream received:', mediaStream)

              const remoteView = await this.zg!.createRemoteStreamView(mediaStream)
              if (remoteView && this.audioElement) {
                try {
                  await remoteView.play(this.audioElement, { 
                    enableAutoplayDialog: false,
                    muted: false
                  })
                  console.log('✅ AI agent audio connected and playing')

                  this.audioElement.muted = false
                  this.audioElement.volume = 0.8
                } catch (playError) {
                  console.error('❌ Failed to play audio through element:', playError)

                  try {
                    if (this.audioElement) {
                      this.audioElement.srcObject = mediaStream
                      await this.audioElement.play()
                      console.log('✅ Fallback audio play successful')
                    }
                  } catch (fallbackError) {
                    console.error('❌ Fallback audio play failed:', fallbackError)
                  }
                }
              }
            }
          } catch (error) {
            console.error('❌ Failed to play agent stream:', error)
          }
        }
      } else if (updateType === 'DELETE') {
        console.log('📴 Agent stream disconnected')
        if (this.audioElement) {
          this.audioElement.srcObject = null
        }
      }
    })

    this.zg.on('roomUserUpdate', (_roomID: string, updateType: string, userList: any[]) => {
      console.log('👥 Room user update:', updateType, userList.length, 'users')
    })

    this.zg.on('roomStateChanged', (roomID: string, reason: string, errorCode: number) => {
      console.log('🏠 Room state changed:', { roomID, reason, errorCode })
    })

    this.zg.on('networkQuality', (userID: string, upstreamQuality: number, downstreamQuality: number) => {
      if (upstreamQuality > 2 || downstreamQuality > 2) {
        console.warn('📶 Network quality issues:', { userID, upstreamQuality, downstreamQuality })
      }
    })

    this.zg.on('publisherStateUpdate', (result: any) => {
      console.log('📤 Publisher state update:', result)
    })

    this.zg.on('playerStateUpdate', (result: any) => {
      console.log('📥 Player state update:', result)
    })
  }

  private messageCallback: ((message: any) => void) | null = null

  private handleRoomMessage(message: any): void {
    if (this.messageCallback) {
      this.messageCallback(message)
    }
  }

  async joinRoom(roomId: string, userId: string): Promise<boolean> {
    if (!this.zg) {
      console.error('❌ ZEGO not initialized')
      return false
    }

    if (this.currentRoomId === roomId && this.currentUserId === userId) {
      console.log('ℹ️ Already in the same room')
      return true
    }

    try {
      if (this.currentRoomId) {
        console.log('🔄 Leaving previous room before joining new one')
        await this.leaveRoom()
      }

      this.currentRoomId = roomId
      this.currentUserId = userId

      console.log('🔑 Getting token for user:', userId)
      const { token } = await agentAPI.getToken(userId)

      console.log('🚪 Logging into room:', roomId)
      await this.zg.loginRoom(roomId, token, {
        userID: userId,
        userName: userId
      })

      console.log('📢 Enabling room message reception')
      this.zg.callExperimentalAPI({ 
        method: 'onRecvRoomChannelMessage', 
        params: {} 
      })

      console.log('🎤 Creating local audio-only stream')
      const localStream = await this.zg.createZegoStream({
        camera: { 
          video: false, 
          audio: true
        }
      })

      if (localStream) {
        this.localStream = localStream
        const streamId = `${userId}_stream`

        console.log('📤 Publishing stream:', streamId)
        await this.zg.startPublishingStream(streamId, localStream, {
          enableAutoSwitchVideoCodec: true
        })

        console.log('✅ Room joined successfully')
        return true
      } else {
        throw new Error('Failed to create local stream')
      }
    } catch (error) {
      console.error('❌ Failed to join room:', error)
      this.currentRoomId = null
      this.currentUserId = null
      return false
    }
  }

  async enableMicrophone(enabled: boolean): Promise<boolean> {
    if (!this.zg || !this.localStream) {
      console.warn('⚠️ Cannot toggle microphone: no stream available')
      return false
    }

    try {
      if (this.localStream.getAudioTracks) {
        const audioTrack = this.localStream.getAudioTracks()[0]
        if (audioTrack) {
          audioTrack.enabled = enabled
          console.log(`🎤 Microphone ${enabled ? 'enabled' : 'disabled'}`)
          return true
        }
      }

      console.warn('⚠️ No audio track found in local stream')
      return false
    } catch (error) {
      console.error('❌ Failed to toggle microphone:', error)
      return false
    }
  }

  async leaveRoom(): Promise<void> {
    if (!this.zg || !this.currentRoomId) {
      console.log('ℹ️ No room to leave')
      return
    }

    try {
      console.log('🚪 Leaving room:', this.currentRoomId)

      if (this.currentUserId && this.localStream) {
        const streamId = `${this.currentUserId}_stream`
        console.log('📤 Stopping stream publication:', streamId)
        await this.zg.stopPublishingStream(streamId)
      }

      if (this.localStream) {
        console.log('🗑️ Destroying local stream')
        this.zg.destroyStream(this.localStream)
        this.localStream = null
      }

      await this.zg.logoutRoom()

      if (this.audioElement) {
        this.audioElement.srcObject = null
      }

      this.currentRoomId = null
      this.currentUserId = null

      console.log('✅ Left room successfully')
    } catch (error) {
      console.error('❌ Failed to leave room:', error)
      this.currentRoomId = null
      this.currentUserId = null
      this.localStream = null
    }
  }

  onRoomMessage(callback: (message: any) => void): void {
    this.messageCallback = callback
  }

  getCurrentRoomId(): string | null {
    return this.currentRoomId
  }

  getCurrentUserId(): string | null {
    return this.currentUserId
  }

  getEngine(): ZegoExpressEngine | null {
    return this.zg
  }

  isInRoom(): boolean {
    return !!this.currentRoomId && !!this.currentUserId
  }

  destroy(): void {
    if (this.zg) {
      // leaveRoom() is async -- tear the engine down only after it settles,
      // otherwise its awaited SDK calls would run against a null engine
      this.leaveRoom().finally(() => {
        this.zg = null
        this.isInitialized = false
        if (this.audioElement && this.audioElement.parentNode) {
          this.audioElement.parentNode.removeChild(this.audioElement)
          this.audioElement = null
        }
        console.log('🗑️ ZEGO service destroyed')
      })
    }
  }
}

This service encapsulates all ZEGOCLOUD WebRTC communication: room login, local stream publishing, remote audio playback, and real-time room message reception.
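Room messages arrive in a simple command envelope; in this integration, `Cmd` 3 carries speech-to-text transcripts and `Cmd` 4 carries streamed AI replies, which is how the chat hook in section 11 interprets them. As a hedged, standalone sketch of that routing (the `RoomMessage` and `MessageHandlers` types here are illustrative names of my own, not SDK types):

```typescript
// Illustrative types -- the real payloads come from ZEGOCLOUD room messages.
interface RoomMessage {
  Cmd: number
  Data: { Text?: string; MessageId?: string; EndFlag?: boolean }
}

interface MessageHandlers {
  onTranscript: (text: string, isFinal: boolean) => void
  onAIReply: (messageId: string, chunk: string, isFinal: boolean) => void
}

// Route a room message to the right handler based on its command code.
function routeRoomMessage(msg: RoomMessage, handlers: MessageHandlers): void {
  if (msg.Cmd === 3 && msg.Data.Text) {
    // Cmd 3: live transcript of the user's speech
    handlers.onTranscript(msg.Data.Text, !!msg.Data.EndFlag)
  } else if (msg.Cmd === 4 && msg.Data.Text && msg.Data.MessageId) {
    // Cmd 4: a chunk of the AI's streamed reply
    handlers.onAIReply(msg.Data.MessageId, msg.Data.Text, !!msg.Data.EndFlag)
  }
  // Other command codes are ignored in this sketch.
}
```

Keeping this routing in one place makes it easy to add handling for other command codes later without touching the UI layer.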

10. Conversation Memory Service

Now, create the memory service at client/src/services/memory.ts:

import type { ConversationMemory, Message } from '../types'
import { STORAGE_KEYS } from '../config'

class MemoryService {
  private static instance: MemoryService
  private conversations: Map<string, ConversationMemory> = new Map()

  static getInstance(): MemoryService {
    if (!MemoryService.instance) {
      MemoryService.instance = new MemoryService()
    }
    return MemoryService.instance
  }

  constructor() {
    this.loadFromStorage()
  }

  private loadFromStorage(): void {
    try {
      const stored = localStorage.getItem(STORAGE_KEYS.CONVERSATIONS)
      if (stored) {
        const conversations: ConversationMemory[] = JSON.parse(stored)
        conversations.forEach(conv => {
          this.conversations.set(conv.id, conv)
        })
      }
    } catch (error) {
      console.error('Failed to load conversations from storage:', error)
    }
  }

  private saveToStorage(): void {
    try {
      const conversations = Array.from(this.conversations.values())
      localStorage.setItem(STORAGE_KEYS.CONVERSATIONS, JSON.stringify(conversations))
    } catch (error) {
      console.error('Failed to save conversations to storage:', error)
    }
  }

  createOrGetConversation(id?: string): ConversationMemory {
    const conversationId = id || this.generateConversationId()

    if (this.conversations.has(conversationId)) {
      return this.conversations.get(conversationId)!
    }

    const newConversation: ConversationMemory = {
      id: conversationId,
      title: 'New Conversation',
      messages: [],
      createdAt: Date.now(),
      updatedAt: Date.now(),
      metadata: {
        totalMessages: 0,
        lastAIResponse: '',
        topics: []
      }
    }

    this.conversations.set(conversationId, newConversation)
    this.saveToStorage()
    return newConversation
  }

  addMessage(conversationId: string, message: Message): void {
    const conversation = this.conversations.get(conversationId)
    if (!conversation) return

    const existingIndex = conversation.messages.findIndex(m => m.id === message.id)
    if (existingIndex >= 0) {
      conversation.messages[existingIndex] = message
    } else {
      conversation.messages.push(message)
    }

    conversation.updatedAt = Date.now()
    conversation.metadata.totalMessages = conversation.messages.length

    if (message.sender === 'ai') {
      conversation.metadata.lastAIResponse = message.content
    }

    if (conversation.messages.length === 1 && message.sender === 'user') {
      conversation.title = message.content.slice(0, 50) + (message.content.length > 50 ? '...' : '')
    }

    this.saveToStorage()
  }

  deleteMessage(conversationId: string, messageId: string): void {
    const conversation = this.conversations.get(conversationId)
    if (!conversation) return

    conversation.messages = conversation.messages.filter(m => m.id !== messageId)
    conversation.updatedAt = Date.now()
    conversation.metadata.totalMessages = conversation.messages.length

    if (conversation.messages.length > 0) {
      const lastAIMessage = conversation.messages
        .filter(m => m.sender === 'ai')
        .pop()
      conversation.metadata.lastAIResponse = lastAIMessage?.content || ''
    } else {
      conversation.metadata.lastAIResponse = ''
    }

    this.saveToStorage()
  }

  getConversation(conversationId: string): ConversationMemory | null {
    return this.conversations.get(conversationId) || null
  }

  getAllConversations(): ConversationMemory[] {
    return Array.from(this.conversations.values())
      .sort((a, b) => b.updatedAt - a.updatedAt)
  }

  deleteConversation(conversationId: string): void {
    this.conversations.delete(conversationId)
    this.saveToStorage()
  }

  updateConversation(conversationId: string, updates: Partial<ConversationMemory>): void {
    const conversation = this.conversations.get(conversationId)
    if (!conversation) return

    Object.assign(conversation, updates, { updatedAt: Date.now() })
    this.saveToStorage()
  }

  clearAllConversations(): void {
    this.conversations.clear()
    this.saveToStorage()
  }

  private generateConversationId(): string {
    return `conv_${Date.now()}_${Math.random().toString(36).substr(2, 9)}`
  }
}

export const memoryService = MemoryService.getInstance()

This memory service persists conversation history to localStorage so users can resume chats across browser sessions. You can swap localStorage for any storage backend of your choice, such as IndexedDB or a remote database.
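To make that swap straightforward, you could extract the two localStorage calls the service makes behind a small interface. This is an illustrative sketch under that assumption; `StorageAdapter` and `InMemoryStorage` are hypothetical names, not part of the project above:

```typescript
// Minimal interface mirroring the localStorage calls MemoryService makes.
interface StorageAdapter {
  getItem(key: string): string | null
  setItem(key: string, value: string): void
}

// In-memory implementation -- handy for unit tests or SSR environments
// where localStorage is unavailable.
class InMemoryStorage implements StorageAdapter {
  private store = new Map<string, string>()

  getItem(key: string): string | null {
    return this.store.get(key) ?? null
  }

  setItem(key: string, value: string): void {
    this.store.set(key, value)
  }
}
```

`MemoryService` could then accept a `StorageAdapter` in its constructor and call it in `loadFromStorage`/`saveToStorage` instead of referencing `localStorage` directly.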

11. React Chat Hook Implementation

After the ZEGOCLOUD service, the chat hook is the most important piece of the integration: it orchestrates the entire chat flow. Create the main chat hook at client/src/hooks/useChat.ts:

import { useCallback, useRef, useEffect, useReducer } from 'react'
import type { Message, ChatSession, ConversationMemory, VoiceSettings } from '../types'
import { ZegoService } from '../services/zego'
import { agentAPI } from '../services/api'
import { memoryService } from '../services/memory'

interface ChatState {
  messages: Message[]
  session: ChatSession | null
  conversation: ConversationMemory | null
  isLoading: boolean
  isConnected: boolean
  isRecording: boolean
  currentTranscript: string
  agentStatus: 'idle' | 'listening' | 'thinking' | 'speaking'
  error: string | null
}

type ChatAction = 
  | { type: 'SET_MESSAGES'; payload: Message[] }
  | { type: 'ADD_MESSAGE'; payload: Message }
  | { type: 'UPDATE_MESSAGE'; payload: { id: string; updates: Partial<Message> } }
  | { type: 'SET_SESSION'; payload: ChatSession | null }
  | { type: 'SET_CONVERSATION'; payload: ConversationMemory | null }
  | { type: 'SET_LOADING'; payload: boolean }
  | { type: 'SET_CONNECTED'; payload: boolean }
  | { type: 'SET_RECORDING'; payload: boolean }
  | { type: 'SET_TRANSCRIPT'; payload: string }
  | { type: 'SET_AGENT_STATUS'; payload: 'idle' | 'listening' | 'thinking' | 'speaking' }
  | { type: 'SET_ERROR'; payload: string | null }
  | { type: 'RESET_CHAT' }

const initialState: ChatState = {
  messages: [],
  session: null,
  conversation: null,
  isLoading: false,
  isConnected: false,
  isRecording: false,
  currentTranscript: '',
  agentStatus: 'idle',
  error: null
}

function chatReducer(state: ChatState, action: ChatAction): ChatState {
  switch (action.type) {
    case 'SET_MESSAGES':
      return { ...state, messages: action.payload }

    case 'ADD_MESSAGE': {
      const exists = state.messages.some(m => m.id === action.payload.id)
      if (exists) {
        return {
          ...state,
          messages: state.messages.map(m => 
            m.id === action.payload.id ? action.payload : m
          )
        }
      }
      return { ...state, messages: [...state.messages, action.payload] }
    }

    case 'UPDATE_MESSAGE':
      return {
        ...state,
        messages: state.messages.map(m => 
          m.id === action.payload.id ? { ...m, ...action.payload.updates } : m
        )
      }

    case 'SET_SESSION':
      return { ...state, session: action.payload }

    case 'SET_CONVERSATION':
      return { ...state, conversation: action.payload }

    case 'SET_LOADING':
      return { ...state, isLoading: action.payload }

    case 'SET_CONNECTED':
      return { ...state, isConnected: action.payload }

    case 'SET_RECORDING':
      return { ...state, isRecording: action.payload }

    case 'SET_TRANSCRIPT':
      return { ...state, currentTranscript: action.payload }

    case 'SET_AGENT_STATUS':
      return { ...state, agentStatus: action.payload }

    case 'SET_ERROR':
      return { ...state, error: action.payload }

    case 'RESET_CHAT':
      return {
        ...initialState,
        isLoading: state.isLoading
      }

    default:
      return state
  }
}

export const useChat = () => {
  const [state, dispatch] = useReducer(chatReducer, initialState)

  const zegoService = useRef(ZegoService.getInstance())
  const processedMessageIds = useRef(new Set<string>())
  const messageHandlerSetup = useRef(false)
  const cleanupFunctions = useRef<(() => void)[]>([])
  const currentConversationRef = useRef<string | null>(null)
  const streamingMessages = useRef(new Map<string, string>())

  const defaultVoiceSettings: VoiceSettings = {
    isEnabled: true,
    autoPlay: true,
    speechRate: 1.0,
    speechPitch: 1.0,
  }

  const cleanup = useCallback(() => {
    cleanupFunctions.current.forEach(fn => fn())
    cleanupFunctions.current = []
    processedMessageIds.current.clear()
    messageHandlerSetup.current = false
    streamingMessages.current.clear()
  }, [])

  const addMessageSafely = useCallback((message: Message, conversationId: string) => {
    if (processedMessageIds.current.has(message.id)) {
      console.log('Skipping duplicate message:', message.id)
      return
    }

    processedMessageIds.current.add(message.id)
    dispatch({ type: 'ADD_MESSAGE', payload: message })

    try {
      memoryService.addMessage(conversationId, message)
    } catch (error) {
      console.error('Failed to save message to memory:', error)
    }
  }, [])

  const initializeConversation = useCallback((conversationId?: string) => {
    try {
      const conv = memoryService.createOrGetConversation(conversationId)
      dispatch({ type: 'SET_CONVERSATION', payload: conv })
      dispatch({ type: 'SET_MESSAGES', payload: [...conv.messages] })
      processedMessageIds.current.clear()
      streamingMessages.current.clear()

      conv.messages.forEach(msg => {
        processedMessageIds.current.add(msg.id)
      })

      dispatch({ type: 'SET_ERROR', payload: null })
      currentConversationRef.current = conv.id
      return conv
    } catch (error) {
      console.error('Failed to initialize conversation:', error)
      dispatch({ type: 'SET_ERROR', payload: 'Failed to load conversation' })
      return null
    }
  }, [])

  const resetConversation = useCallback(() => {
    cleanup()
    dispatch({ type: 'RESET_CHAT' })
    currentConversationRef.current = null
  }, [cleanup])

  const setupMessageHandlers = useCallback((conv: ConversationMemory) => {
    if (messageHandlerSetup.current) {
      console.log('Message handlers already setup')
      return
    }

    console.log('Setting up message handlers for conversation:', conv.id)
    messageHandlerSetup.current = true

    const handleRoomMessage = (data: any) => {
      try {
        const { Cmd, Data: msgData } = data
        console.log('Room message received:', { Cmd, msgData })

        if (currentConversationRef.current !== conv.id) {
          console.log('Ignoring message for different conversation')
          return
        }

        if (Cmd === 3) {
          const { Text: transcript, EndFlag, MessageId } = msgData

          if (transcript && transcript.trim()) {
            dispatch({ type: 'SET_TRANSCRIPT', payload: transcript })
            dispatch({ type: 'SET_AGENT_STATUS', payload: 'listening' })

            if (EndFlag) {
              const messageId = MessageId || `voice_${Date.now()}_${Math.random().toString(36).substr(2, 6)}`

              const userMessage: Message = {
                id: messageId,
                content: transcript.trim(),
                sender: 'user',
                timestamp: Date.now(),
                type: 'voice',
                transcript: transcript.trim()
              }

              addMessageSafely(userMessage, conv.id)
              dispatch({ type: 'SET_TRANSCRIPT', payload: '' })
              dispatch({ type: 'SET_AGENT_STATUS', payload: 'thinking' })
            }
          }
        } else if (Cmd === 4) {
          const { Text: content, MessageId, EndFlag } = msgData
          if (!content || !MessageId) return

          if (EndFlag) {
            const currentStreaming = streamingMessages.current.get(MessageId) || ''
            const finalContent = currentStreaming + content

            dispatch({ type: 'UPDATE_MESSAGE', payload: {
              id: MessageId,
              updates: { 
                content: finalContent, 
                isStreaming: false 
              }
            }})

            streamingMessages.current.delete(MessageId)
            dispatch({ type: 'SET_AGENT_STATUS', payload: 'idle' })

            try {
              const finalMessage: Message = {
                id: MessageId,
                content: finalContent,
                sender: 'ai',
                timestamp: Date.now(),
                type: 'text'
              }
              memoryService.addMessage(conv.id, finalMessage)
            } catch (error) {
              console.error('Failed to save final message to memory:', error)
            }
          } else {
            const currentStreaming = streamingMessages.current.get(MessageId) || ''
            const updatedContent = currentStreaming + content
            streamingMessages.current.set(MessageId, updatedContent)

            if (!processedMessageIds.current.has(MessageId)) {
              const streamingMessage: Message = {
                id: MessageId,
                content: updatedContent,
                sender: 'ai',
                timestamp: Date.now(),
                type: 'text',
                isStreaming: true
              }

              processedMessageIds.current.add(MessageId)
              dispatch({ type: 'ADD_MESSAGE', payload: streamingMessage })
            } else {
              dispatch({ type: 'UPDATE_MESSAGE', payload: {
                id: MessageId,
                updates: { content: updatedContent, isStreaming: true }
              }})
            }

            dispatch({ type: 'SET_AGENT_STATUS', payload: 'speaking' })
          }
        }
      } catch (error) {
        console.error('Error handling room message:', error)
        dispatch({ type: 'SET_AGENT_STATUS', payload: 'idle' })
      }
    }

    zegoService.current.onRoomMessage(handleRoomMessage)

    cleanupFunctions.current.push(() => {
      zegoService.current.onRoomMessage(() => {})
    })
  }, [addMessageSafely])

  const startSession = useCallback(async (existingConversationId?: string): Promise<boolean> => {
    if (state.isLoading || state.isConnected) {
      console.log('Session start blocked - already loading or connected')
      return false
    }

    dispatch({ type: 'SET_LOADING', payload: true })
    dispatch({ type: 'SET_ERROR', payload: null })

    try {
      if (state.session?.isActive) {
        console.log('Ending existing session before starting new one')
        await endSession()
        await new Promise(resolve => setTimeout(resolve, 1000))
      }

      const roomId = `room_${Date.now()}_${Math.random().toString(36).substr(2, 6)}`
      const userId = `user_${Date.now()}_${Math.random().toString(36).substr(2, 6)}`

      console.log('Initializing ZEGO service...')
      await zegoService.current.initialize()

      console.log('Joining room:', roomId)
      const joinResult = await zegoService.current.joinRoom(roomId, userId)
      if (!joinResult) throw new Error('Failed to join ZEGO room')

      console.log('Starting AI agent session...')
      const result = await agentAPI.startSession(roomId, userId)

      const conv = initializeConversation(existingConversationId)
      if (!conv) throw new Error('Failed to initialize conversation')

      const newSession: ChatSession = {
        roomId,
        userId,
        agentInstanceId: result.agentInstanceId,
        isActive: true,
        conversationId: conv.id,
        voiceSettings: defaultVoiceSettings
      }

      dispatch({ type: 'SET_SESSION', payload: newSession })
      dispatch({ type: 'SET_CONNECTED', payload: true })

      setupMessageHandlers(conv)

      console.log('Session started successfully')
      return true
    } catch (error) {
      console.error('Failed to start session:', error)
      dispatch({ type: 'SET_ERROR', payload: error instanceof Error ? error.message : 'Failed to start session' })
      return false
    } finally {
      dispatch({ type: 'SET_LOADING', payload: false })
    }
  }, [state.isLoading, state.isConnected, state.session, initializeConversation, setupMessageHandlers])

  const sendTextMessage = useCallback(async (content: string) => {
    if (!state.session?.agentInstanceId || !state.conversation) {
      dispatch({ type: 'SET_ERROR', payload: 'No active session' })
      return
    }

    const trimmedContent = content.trim()
    if (!trimmedContent) return

    try {
      const messageId = `text_${Date.now()}_${Math.random().toString(36).substr(2, 6)}`

      const userMessage: Message = {
        id: messageId,
        content: trimmedContent,
        sender: 'user',
        timestamp: Date.now(),
        type: 'text'
      }

      addMessageSafely(userMessage, state.conversation.id)
      dispatch({ type: 'SET_AGENT_STATUS', payload: 'thinking' })

      await agentAPI.sendMessage(state.session.agentInstanceId, trimmedContent)

    } catch (error) {
      console.error('Failed to send message:', error)
      dispatch({ type: 'SET_ERROR', payload: 'Failed to send message' })
      dispatch({ type: 'SET_AGENT_STATUS', payload: 'idle' })
    }
  }, [state.session, state.conversation, addMessageSafely])

  const toggleVoiceRecording = useCallback(async () => {
    if (!state.isConnected) return

    try {
      if (state.isRecording) {
        await zegoService.current.enableMicrophone(false)
        dispatch({ type: 'SET_RECORDING', payload: false })
        dispatch({ type: 'SET_AGENT_STATUS', payload: 'idle' })
      } else {
        const success = await zegoService.current.enableMicrophone(true)
        if (success) {
          dispatch({ type: 'SET_RECORDING', payload: true })
          dispatch({ type: 'SET_AGENT_STATUS', payload: 'listening' })
        }
      }
    } catch (error) {
      console.error('Failed to toggle recording:', error)
      dispatch({ type: 'SET_RECORDING', payload: false })
      dispatch({ type: 'SET_AGENT_STATUS', payload: 'idle' })
    }
  }, [state.isConnected, state.isRecording])

  const toggleVoiceSettings = useCallback(() => {
    if (state.session) {
      const updatedSession = {
        ...state.session,
        voiceSettings: {
          ...state.session.voiceSettings,
          isEnabled: !state.session.voiceSettings.isEnabled
        }
      }
      dispatch({ type: 'SET_SESSION', payload: updatedSession })
    }
  }, [state.session])

  const endSession = useCallback(async () => {
    if (!state.session && !state.isConnected) return

    try {
      dispatch({ type: 'SET_LOADING', payload: true })

      if (state.isRecording) {
        await zegoService.current.enableMicrophone(false)
        dispatch({ type: 'SET_RECORDING', payload: false })
      }

      if (state.session?.agentInstanceId) {
        await agentAPI.stopSession(state.session.agentInstanceId)
      }

      await zegoService.current.leaveRoom()

      cleanup()
      dispatch({ type: 'SET_SESSION', payload: null })
      dispatch({ type: 'SET_CONNECTED', payload: false })
      dispatch({ type: 'SET_AGENT_STATUS', payload: 'idle' })
      dispatch({ type: 'SET_TRANSCRIPT', payload: '' })
      dispatch({ type: 'SET_ERROR', payload: null })
      currentConversationRef.current = null

      console.log('Session ended successfully')
    } catch (error) {
      console.error('Failed to end session:', error)
    } finally {
      dispatch({ type: 'SET_LOADING', payload: false })
    }
  }, [state.session, state.isConnected, state.isRecording, cleanup])

  const clearError = useCallback(() => {
    dispatch({ type: 'SET_ERROR', payload: null })
  }, [])

  useEffect(() => {
    const handleConversationChange = async () => {
      if (currentConversationRef.current === (state.conversation?.id || null)) {
        return
      }

      if (state.isConnected) {
        await endSession()
        if (state.conversation?.id) {
          await startSession(state.conversation.id)
        } else {
          resetConversation()
        }
      }
    }

    handleConversationChange()
  }, [state.conversation?.id])

  useEffect(() => {
    return () => {
      if (state.session?.isActive || state.isConnected) {
        endSession()
      }
      cleanup()
    }
  }, [])

  return {
    ...state,
    startSession,
    sendTextMessage,
    toggleVoiceRecording,
    toggleVoiceSettings,
    endSession,
    initializeConversation,
    resetConversation,
    clearError
  }
}

This hook manages the complete chat state: session lifecycle, message handling, voice recording, and the ZEGOCLOUD integration.
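The trickiest part of the hook is the Cmd 4 path: partial AI replies arrive as chunks keyed by `MessageId` and must be accumulated until `EndFlag` arrives. Distilled into a standalone sketch of the `streamingMessages` Map logic above (the `StreamAccumulator` name is mine, for illustration only):

```typescript
// Accumulates streamed reply chunks per message, mirroring the
// streamingMessages Map handling in useChat's room message handler.
class StreamAccumulator {
  private buffers = new Map<string, string>()

  // Append a chunk and return the full text so far. The buffer is
  // cleared once the final chunk (EndFlag) arrives, matching the
  // delete-on-EndFlag behavior in the hook.
  push(messageId: string, chunk: string, isFinal: boolean): string {
    const soFar = (this.buffers.get(messageId) ?? '') + chunk
    if (isFinal) {
      this.buffers.delete(messageId)
    } else {
      this.buffers.set(messageId, soFar)
    }
    return soFar
  }
}
```

In the hook, the non-final branch renders `soFar` as a streaming message bubble, and the final branch persists the completed text to the memory service.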

12. UI Components Implementation

Next, create the shared UI components used throughout the app. Start with the button component at client/src/components/UI/Button.tsx:

import { motion } from 'framer-motion'
import { forwardRef } from 'react'

interface ButtonProps extends React.ButtonHTMLAttributes<HTMLButtonElement> {
  variant?: 'primary' | 'secondary' | 'ghost'
  size?: 'sm' | 'md' | 'lg'
  isLoading?: boolean
}

export const Button = forwardRef<HTMLButtonElement, ButtonProps>(
  ({ variant = 'primary', size = 'md', isLoading, children, className = '', ...props }, ref) => {
    const baseClasses = 'inline-flex items-center justify-center rounded-lg font-medium transition-colors focus:outline-none focus:ring-2'

    const variants = {
      primary: 'bg-blue-600 text-white hover:bg-blue-700 focus:ring-blue-500',
      secondary: 'bg-gray-200 text-gray-900 hover:bg-gray-300 focus:ring-gray-500',
      ghost: 'text-gray-600 hover:text-gray-900 hover:bg-gray-100 focus:ring-gray-500'
    }

    const sizes = {
      sm: 'px-3 py-2 text-sm',
      md: 'px-4 py-2.5 text-sm',
      lg: 'px-6 py-3 text-base'
    }

    // Spread props before `disabled` so the spread cannot override the isLoading lock
    return (
      <motion.button
        ref={ref}
        whileHover={{ scale: 1.02 }}
        whileTap={{ scale: 0.98 }}
        className={`${baseClasses} ${variants[variant]} ${sizes[size]} ${className}`}
        {...(props as any)}
        disabled={isLoading || props.disabled}
      >
        {isLoading ? (
          <div className="animate-spin rounded-full h-4 w-4 border-2 border-current border-t-transparent mr-2" />
        ) : null}
        {children}
      </motion.button>
    )
  }
)

Create the message bubble component at client/src/components/Chat/MessageBubble.tsx:

import { motion } from 'framer-motion'
import type { Message } from '../../types'
import { Volume2, User, Bot, Clock } from 'lucide-react'

interface MessageBubbleProps {
  message: Message
  onPlayVoice?: (messageId: string) => void
  showTimestamp?: boolean
}

export const MessageBubble = ({ message, onPlayVoice, showTimestamp = false }: MessageBubbleProps) => {
  const isUser = message.sender === 'user'
  const isVoice = message.type === 'voice'

  const formatTime = (timestamp: number) => {
    return new Date(timestamp).toLocaleTimeString([], { hour: '2-digit', minute: '2-digit' })
  }

  return (
    <motion.div
      initial={{ opacity: 0, y: 20, scale: 0.95 }}
      animate={{ opacity: 1, y: 0, scale: 1 }}
      transition={{ duration: 0.3, ease: "easeOut" }}
      className={`flex w-full mb-6 group ${isUser ? 'justify-end' : 'justify-start'}`}
    >
      <div className={`flex items-end space-x-3 max-w-[75%] ${isUser ? 'flex-row-reverse space-x-reverse' : 'flex-row'}`}>
        {/* Avatar */}
        <motion.div 
          whileHover={{ scale: 1.05 }}
          className={`flex-shrink-0 w-10 h-10 rounded-full flex items-center justify-center shadow-md ${
            isUser 
              ? 'bg-gradient-to-br from-blue-500 to-blue-600' 
              : 'bg-gradient-to-br from-gray-700 to-gray-800'
          }`}
        >
          {isUser ? (
            <User className="w-5 h-5 text-white" />
          ) : (
            <Bot className="w-5 h-5 text-white" />
          )}
        </motion.div>

        {/* Message Content */}
        <div className={`flex flex-col ${isUser ? 'items-end' : 'items-start'}`}>
          <motion.div
            className={`px-4 py-3 rounded-2xl shadow-sm break-words ${
              isUser 
                ? 'bg-blue-600 text-white rounded-br-md' 
                : 'bg-white text-gray-900 border border-gray-200 rounded-bl-md'
            } ${message.isStreaming ? 'animate-pulse' : ''} ${
              isVoice ? 'border-2 border-dashed border-purple-300' : ''
            }`}
            layout
            whileHover={{ scale: 1.02 }}
          >
            {/* Voice indicator */}
            {isVoice && (
              <div className={`flex items-center space-x-2 mb-2 ${
                isUser ? 'text-blue-200' : 'text-purple-600'
              }`}>
                <Volume2 className="w-4 h-4" />
                <span className="text-xs font-medium">Voice Message</span>
                {message.duration && (
                  <span className="text-xs opacity-75">{message.duration}s</span>
                )}
              </div>
            )}

            {/* Message text */}
            <p className="text-sm leading-relaxed whitespace-pre-wrap">
              {isVoice ? message.transcript || message.content : message.content}
            </p>

            {/* Voice playback button */}
            {isVoice && message.audioUrl && (
              <button 
                onClick={() => onPlayVoice?.(message.id)}
                className={`mt-3 flex items-center space-x-2 text-xs transition-opacity duration-200 hover:opacity-100 ${
                  isUser ? 'text-blue-200 opacity-75' : 'text-purple-600 opacity-75'
                }`}
              >
                <Volume2 className="w-3 h-3" />
                <span>Play Audio</span>
              </button>
            )}
          </motion.div>

          {/* Timestamp */}
          {showTimestamp && (
            <motion.div 
              initial={{ opacity: 0 }}
              animate={{ opacity: 1 }}
              transition={{ delay: 0.2 }}
              className={`flex items-center space-x-1 mt-1 text-xs text-gray-500 opacity-0 group-hover:opacity-100 transition-opacity ${
                isUser ? 'flex-row-reverse space-x-reverse' : 'flex-row'
              }`}
            >
              <Clock className="w-3 h-3" />
              <span>{formatTime(message.timestamp)}</span>
            </motion.div>
          )}
        </div>
      </div>
    </motion.div>
  )
}

This message component displays both text and voice messages with smooth animations, avatar icons, and optional timestamps. It supports streaming message updates and voice message playback controls.

Next, create the voice input component at client/src/components/Voice/VoiceMessageInput.tsx:

import { useState, useEffect, useRef } from 'react'
import { motion, AnimatePresence } from 'framer-motion'
import { Send, Mic, MicOff, Volume2, VolumeX } from 'lucide-react'
import { Button } from '../UI/Button'

interface VoiceMessageInputProps {
  onSendMessage: (content: string) => Promise<void>
  isRecording: boolean
  onToggleRecording: () => void
  currentTranscript: string
  isConnected: boolean
  voiceEnabled: boolean
  onToggleVoice: () => void
  agentStatus?: 'idle' | 'listening' | 'thinking' | 'speaking'
}

export const VoiceMessageInput = ({ 
  onSendMessage, 
  isRecording, 
  onToggleRecording,
  currentTranscript,
  isConnected,
  voiceEnabled,
  onToggleVoice,
  agentStatus = 'idle'
}: VoiceMessageInputProps) => {
  const [message, setMessage] = useState('')
  const [isFocused, setIsFocused] = useState(false)
  const [isSending, setIsSending] = useState(false)
  const textareaRef = useRef<HTMLTextAreaElement>(null)

  // Auto-resize textarea
  useEffect(() => {
    if (textareaRef.current) {
      // Reset height to auto to get the correct scrollHeight
      textareaRef.current.style.height = 'auto'
      // Set height based on content, with min and max limits
      const scrollHeight = textareaRef.current.scrollHeight
      const minHeight = 44 // Minimum height (about 1 line)
      const maxHeight = 120 // Maximum height (about 5 lines)

      textareaRef.current.style.height = Math.min(Math.max(scrollHeight, minHeight), maxHeight) + 'px'
    }
  }, [message])

  const handleSubmit = async (e: React.FormEvent) => {
    e.preventDefault()
    const trimmedMessage = message.trim()
    if (!trimmedMessage || !isConnected || isSending) return

    setIsSending(true)
    try {
      await onSendMessage(trimmedMessage)
      setMessage('')
    } catch (error) {
      console.error('Failed to send message:', error)
    } finally {
      setIsSending(false)
    }
  }

  const handleKeyPress = (e: React.KeyboardEvent) => {
    if (e.key === 'Enter' && !e.shiftKey && !isSending) {
      e.preventDefault()
      handleSubmit(e as any)
    }
  }

  const isDisabled = !isConnected || agentStatus === 'thinking' || agentStatus === 'speaking'
  const isVoiceDisabled = isDisabled || !voiceEnabled

  const getPlaceholderText = () => {
    if (!isConnected) return "Connect to start chatting..."
    if (agentStatus === 'thinking') return "AI is processing..."
    if (agentStatus === 'speaking') return "AI is responding..."
    if (isRecording) return "Recording... speak now"
    return "Type your message or use voice..."
  }

  const getRecordingButtonState = () => {
    if (isVoiceDisabled) return 'disabled'
    if (agentStatus === 'listening' || isRecording) return 'recording'
    return 'idle'
  }

  const recordingState = getRecordingButtonState()

  return (
    <motion.div 
      initial={{ y: 20, opacity: 0 }}
      animate={{ y: 0, opacity: 1 }}
      className="bg-white border-t border-gray-200 p-4"
    >
      <AnimatePresence>
        {(currentTranscript || agentStatus === 'listening') && (
          <motion.div
            initial={{ height: 0, opacity: 0 }}
            animate={{ height: 'auto', opacity: 1 }}
            exit={{ height: 0, opacity: 0 }}
            className="mb-3 p-3 bg-green-50 rounded-lg border border-green-200"
          >
            <div className="flex items-center space-x-2">
              <motion.div
                animate={{ scale: [1, 1.2, 1] }}
                transition={{ repeat: Infinity, duration: 1.5 }}
                className="flex-shrink-0"
              >
                <div className="w-3 h-3 rounded-full bg-green-500" />
              </motion.div>
              <p className="text-sm text-green-700 flex-1">
                {currentTranscript || 'Listening... speak now'}
              </p>
            </div>
          </motion.div>
        )}
      </AnimatePresence>

      <form onSubmit={handleSubmit} className="flex items-end space-x-3">
        {/* Text Input Container */}
        <div className="flex-1 min-w-0">
          <div className={`relative rounded-xl border-2 transition-all duration-200 ${
            isFocused ? 'border-blue-500 bg-blue-50' : 'border-gray-200 bg-gray-50'
          } ${isDisabled ? 'opacity-50' : ''}`}>
            <textarea
              ref={textareaRef}
              value={message}
              onChange={(e) => setMessage(e.target.value)}
              onKeyDown={handleKeyPress}
              onFocus={() => setIsFocused(true)}
              onBlur={() => setIsFocused(false)}
              placeholder={getPlaceholderText()}
              disabled={isDisabled || isSending}
              className="w-full px-4 py-3 bg-transparent border-none focus:outline-none resize-none placeholder-gray-500 disabled:cursor-not-allowed text-sm leading-relaxed"
              style={{ 
                minHeight: '44px',
                maxHeight: '120px',
                overflow: 'hidden'
              }}
              rows={1}
              maxLength={1000}
            />

            {/* Character counter */}
            {message.length > 800 && (
              <div className="absolute bottom-2 right-2 text-xs text-gray-400 bg-white px-1 rounded">
                {message.length}/1000
              </div>
            )}
          </div>
        </div>

        {/* Control Buttons */}
        <div className="flex items-center space-x-2">
          {/* Voice Toggle */}
          <Button
            type="button"
            variant="ghost"
            size="md"
            onClick={onToggleVoice}
            disabled={!isConnected}
            className="text-gray-600 hover:text-gray-900 disabled:opacity-50"
            title={voiceEnabled ? "Disable voice" : "Enable voice"}
          >
            {voiceEnabled ? <Volume2 className="w-5 h-5" /> : <VolumeX className="w-5 h-5" />}
          </Button>

          {/* Voice Recording Button */}
          <Button
            type="button"
            variant="ghost"
            size="md"
            onClick={onToggleRecording}
            disabled={recordingState === 'disabled'}
            className={`transition-all duration-200 ${
              recordingState === 'recording'
                ? 'bg-red-500 text-white hover:bg-red-600 shadow-lg scale-110' 
                : recordingState === 'disabled'
                ? 'text-gray-400 cursor-not-allowed opacity-50'
                : 'text-gray-600 hover:text-blue-600 hover:bg-blue-50'
            }`}
            title={
              recordingState === 'disabled' 
                ? "Voice not available" 
                : recordingState === 'recording'
                ? "Stop recording"
                : "Start voice input"
            }
          >
            <motion.div
              animate={recordingState === 'recording' ? { scale: [1, 1.1, 1] } : {}}
              transition={{ repeat: Infinity, duration: 1 }}
            >
              {recordingState === 'recording' ? (
                <MicOff className="w-5 h-5" />
              ) : (
                <Mic className="w-5 h-5" />
              )}
            </motion.div>
          </Button>

          {/* Send Button */}
          <Button
            type="submit"
            disabled={!message.trim() || isDisabled || isSending}
            size="md"
            className="bg-blue-600 hover:bg-blue-700 text-white px-6 disabled:opacity-50 disabled:cursor-not-allowed min-w-[60px]"
            isLoading={isSending}
          >
            <Send className="w-4 h-4" />
          </Button>
        </div>
      </form>

      {!isConnected && (
        <p className="text-xs text-gray-500 mt-2 text-center">
          Start a conversation to enable voice and text input
        </p>
      )}
    </motion.div>
  )
}
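The auto-resize effect in this component reduces to clamping the textarea's measured `scrollHeight` between a one-line minimum and a five-line maximum. Extracted into a pure function (a sketch; the component inlines this logic), it is easy to unit-test:

```typescript
// Clamp a measured scrollHeight between the min (~1 line) and max (~5 lines) heights
const clampHeight = (scrollHeight: number, min = 44, max = 120): number =>
  Math.min(Math.max(scrollHeight, min), max)
```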

13. Main Chat Container Component

Create the chat container at client/src/components/Chat/ChatContainer.tsx:

import { useEffect, useRef } from 'react'
import { motion, AnimatePresence } from 'framer-motion'
import { MessageBubble } from './MessageBubble'
import { VoiceMessageInput } from '../Voice/VoiceMessageInput'
import { Button } from '../UI/Button'
import { useChat } from '../../hooks/useChat'
import { Phone, PhoneOff, Bot } from 'lucide-react'

interface ChatContainerProps {
  conversationId?: string
  onConversationUpdate?: () => void
  onNewConversation?: () => void
}

export const ChatContainer = ({ conversationId, onConversationUpdate, onNewConversation }: ChatContainerProps) => {
  const messagesEndRef = useRef<HTMLDivElement>(null)
  const { 
    messages, 
    isLoading, 
    isConnected, 
    isRecording,
    currentTranscript,
    agentStatus,
    session,
    conversation,
    startSession, 
    sendTextMessage, 
    toggleVoiceRecording,
    toggleVoiceSettings,
    endSession,
    resetConversation,
    initializeConversation
  } = useChat()

  const scrollToBottom = () => {
    messagesEndRef.current?.scrollIntoView({ behavior: 'smooth' })
  }

  useEffect(() => {
    scrollToBottom()
  }, [messages])

  useEffect(() => {
    if (onConversationUpdate) {
      onConversationUpdate()
    }
  }, [messages, onConversationUpdate])

  useEffect(() => {
    if (conversationId && conversationId !== conversation?.id) {
      initializeConversation(conversationId)
    } else if (!conversationId && conversation) {
      resetConversation()
    }
  }, [conversationId, conversation?.id, initializeConversation, resetConversation])

  const handleStartChat = async () => {
    const success = await startSession(conversationId)
    if (success && onNewConversation && !conversationId) {
      onNewConversation()
    }
  }

  const handleEndChat = async () => {
    await endSession()
    if (onNewConversation) {
      onNewConversation()
    }
  }

  const getStatusText = () => {
    if (!isConnected) return 'Click Start Chat to begin'

    switch (agentStatus) {
      case 'listening':
        return 'Listening for your voice...'
      case 'thinking':
        return 'AI is processing your message...'
      case 'speaking':
        return 'AI is responding...'
      default:
        return 'Connected - Ready to chat'
    }
  }

  const getStatusColor = () => {
    if (!isConnected) return 'text-gray-500'

    switch (agentStatus) {
      case 'listening':
        return 'text-green-600'
      case 'thinking':
        return 'text-blue-600'
      case 'speaking':
        return 'text-purple-600'
      default:
        return 'text-green-600'
    }
  }

  return (
    <motion.div 
      initial={{ opacity: 0 }}
      animate={{ opacity: 1 }}
      className="flex flex-col h-full bg-gray-50"
    >
      <audio 
        id="ai-audio-output" 
        autoPlay 
        style={{ display: 'none' }}
        controls={false}
        playsInline
      />

      <motion.div 
        initial={{ y: -20 }}
        animate={{ y: 0 }}
        className="bg-white border-b border-gray-200 px-6 py-4"
      >
        <div className="flex items-center justify-between">
          <div className="flex items-center space-x-3">
            <div className="w-10 h-10 bg-gradient-to-br from-blue-500 to-blue-600 rounded-full flex items-center justify-center">
              <Bot className="w-5 h-5 text-white" />
            </div>
            <div>
              <h1 className="text-xl font-semibold text-gray-900">AI Assistant</h1>
              <p className={`text-sm ${getStatusColor()}`}>
                {getStatusText()}
              </p>
            </div>
          </div>

          {isConnected ? (
            <Button onClick={handleEndChat} variant="secondary" size="sm" disabled={isLoading}>
              <PhoneOff className="w-4 h-4 mr-2" />
              End Chat
            </Button>
          ) : (
            <Button onClick={handleStartChat} isLoading={isLoading} size="sm">
              <Phone className="w-4 h-4 mr-2" />
              Start Chat
            </Button>
          )}
        </div>
      </motion.div>

      <div className="flex-1 overflow-y-auto px-4 py-6">
        {messages.length === 0 && (
          <motion.div
            initial={{ opacity: 0, y: 20 }}
            animate={{ opacity: 1, y: 0 }}
            className="flex flex-col items-center justify-center h-full text-center"
          >
            <div className="w-16 h-16 bg-gradient-to-br from-blue-500 to-blue-600 rounded-full flex items-center justify-center mb-4">
              <Bot className="w-8 h-8 text-white" />
            </div>
            <h3 className="text-lg font-semibold text-gray-900 mb-2">
              {isConnected ? 'Ready to Chat' : 'Welcome to AI Assistant'}
            </h3>
            <p className="text-gray-600 mb-6 max-w-md">
              {isConnected 
                ? 'You can type messages or use voice input to start chatting with the AI assistant.'
                : 'Start a conversation with our AI assistant. You can type messages or use voice input for a more natural experience.'
              }
            </p>
            {!isConnected && (
              <div className="space-y-2 text-sm text-gray-500 mb-6">
                <p>🎤 Voice conversations with real-time responses</p>
                <p>💬 Natural interruption support</p>
                <p>🧠 Context-aware conversations</p>
              </div>
            )}
            {!isConnected && (
              <Button onClick={handleStartChat} isLoading={isLoading}>
                <Phone className="w-4 h-4 mr-2" />
                Start New Conversation
              </Button>
            )}
          </motion.div>
        )}

        <AnimatePresence mode="popLayout">
          {messages.map((message) => (
            <MessageBubble 
              key={message.id} 
              message={message} 
              showTimestamp={true}
            />
          ))}
        </AnimatePresence>

        {agentStatus === 'thinking' && (
          <motion.div
            initial={{ opacity: 0, y: 20 }}
            animate={{ opacity: 1, y: 0 }}
            exit={{ opacity: 0, y: -20 }}
            className="flex justify-start mb-6"
          >
            <div className="flex items-center space-x-3">
              <div className="w-10 h-10 bg-gradient-to-br from-gray-700 to-gray-800 rounded-full flex items-center justify-center">
                <Bot className="w-5 h-5 text-white" />
              </div>
              <div className="bg-white border border-gray-200 rounded-2xl px-5 py-3">
                <div className="flex items-center space-x-2">
                  <div className="flex space-x-1">
                    <div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce" />
                    <div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce" style={{ animationDelay: '0.1s' }} />
                    <div className="w-2 h-2 bg-gray-400 rounded-full animate-bounce" style={{ animationDelay: '0.2s' }} />
                  </div>
                  <span className="text-sm text-gray-500">AI is thinking...</span>
                </div>
              </div>
            </div>
          </motion.div>
        )}

        <div ref={messagesEndRef} />
      </div>

      {isConnected && (
        <VoiceMessageInput 
          onSendMessage={sendTextMessage}
          isRecording={isRecording}
          onToggleRecording={toggleVoiceRecording}
          currentTranscript={currentTranscript}
          isConnected={isConnected}
          voiceEnabled={session?.voiceSettings.isEnabled || false}
          onToggleVoice={toggleVoiceSettings}
          agentStatus={agentStatus}
        />
      )}
    </motion.div>
  )
}

14. Conversation List Component

Now create the conversation list at client/src/components/Memory/ConversationList.tsx to display stored conversations:

import { motion, AnimatePresence } from 'framer-motion'
import type { ConversationMemory } from '../../types'
import { MessageSquare, Clock, Trash2 } from 'lucide-react'
import { Button } from '../UI/Button'

interface ConversationListProps {
  conversations: ConversationMemory[]
  onSelectConversation: (id: string) => void
  onDeleteConversation: (id: string) => void
  currentConversationId?: string
}

export const ConversationList = ({ 
  conversations, 
  onSelectConversation, 
  onDeleteConversation,
  currentConversationId 
}: ConversationListProps) => {
  const formatDate = (timestamp: number) => {
    const date = new Date(timestamp)
    const now = new Date()
    const diffInHours = (now.getTime() - date.getTime()) / (1000 * 60 * 60)

    if (diffInHours < 24) {
      return date.toLocaleTimeString([], { hour: '2-digit', minute: '2-digit' })
    } else if (diffInHours < 24 * 7) {
      return date.toLocaleDateString([], { weekday: 'short' })
    } else {
      return date.toLocaleDateString([], { month: 'short', day: 'numeric' })
    }
  }

  return (
    <div className="w-80 bg-gray-50 border-r border-gray-200 flex flex-col">
      <div className="p-4 border-b border-gray-200">
        <h2 className="font-semibold text-gray-900 flex items-center">
          <MessageSquare className="w-5 h-5 mr-2" />
          Conversations
        </h2>
        <p className="text-sm text-gray-500 mt-1">{conversations.length} total</p>
      </div>

      <div className="flex-1 overflow-y-auto">
        <AnimatePresence>
          {conversations.map((conv) => (
            <motion.div
              key={conv.id}
              initial={{ opacity: 0, x: -20 }}
              animate={{ opacity: 1, x: 0 }}
              exit={{ opacity: 0, x: -20 }}
              whileHover={{ backgroundColor: '#f8fafc' }}
              className={`p-4 border-b border-gray-100 cursor-pointer transition-colors group ${
                currentConversationId === conv.id ? 'bg-blue-50 border-blue-200' : ''
              }`}
              onClick={() => onSelectConversation(conv.id)}
            >
              <div className="flex items-start justify-between">
                <div className="flex-1 min-w-0">
                  <h3 className="font-medium text-gray-900 truncate mb-1">
                    {conv.title}
                  </h3>
                  <p className="text-sm text-gray-600 line-clamp-2 mb-2">
                    {conv.metadata.lastAIResponse || 'No messages yet'}
                  </p>
                  <div className="flex items-center space-x-3 text-xs text-gray-500">
                    <span className="flex items-center">
                      <MessageSquare className="w-3 h-3 mr-1" />
                      {conv.metadata.totalMessages}
                    </span>
                    <span className="flex items-center">
                      <Clock className="w-3 h-3 mr-1" />
                      {formatDate(conv.updatedAt)}
                    </span>
                  </div>
                </div>

                <Button
                  variant="ghost"
                  size="sm"
                  onClick={(e) => {
                    e.stopPropagation()
                    onDeleteConversation(conv.id)
                  }}
                  className="text-gray-400 hover:text-red-600 opacity-0 group-hover:opacity-100 transition-opacity"
                >
                  <Trash2 className="w-4 h-4" />
                </Button>
              </div>
            </motion.div>
          ))}
        </AnimatePresence>

        {conversations.length === 0 && (
          <div className="p-8 text-center text-gray-500">
            <MessageSquare className="w-12 h-12 mx-auto mb-4 opacity-50" />
            <p className="text-sm">No conversations yet</p>
            <p className="text-xs mt-1">Start a new chat to begin</p>
          </div>
        )}
      </div>
    </div>
  )
}
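The `formatDate` helper above picks one of three display formats based on message age: clock time for today, weekday for the past week, and month/day for anything older. Isolated from locale formatting, the bucketing logic looks like this (a sketch for testing; the component inlines it):

```typescript
// Pick a display bucket for a conversation timestamp, relative to `now`
const dateBucket = (
  timestamp: number,
  now: number = Date.now()
): 'time' | 'weekday' | 'date' => {
  const diffInHours = (now - timestamp) / (1000 * 60 * 60)
  if (diffInHours < 24) return 'time'        // today: show clock time
  if (diffInHours < 24 * 7) return 'weekday' // this week: show weekday
  return 'date'                              // older: show month and day
}
```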

15. Main Application Component

Now, create the main app component at client/src/App.tsx:

import { useState, useEffect, useCallback } from 'react'
import { motion, AnimatePresence } from 'framer-motion'
import { ChatContainer } from './components/Chat/ChatContainer'
import { ConversationList } from './components/Memory/ConversationList'
import { memoryService } from './services/memory'
import type { ConversationMemory } from './types'
import { Plus, Menu, X, MessageSquare } from 'lucide-react'
import { Button } from './components/UI/Button'

function App() {
  const [conversations, setConversations] = useState<ConversationMemory[]>([])
  const [currentConversationId, setCurrentConversationId] = useState<string | undefined>(undefined)
  const [sidebarOpen, setSidebarOpen] = useState(window.innerWidth >= 1024) // Desktop open by default
  const [isCreatingNewConversation, setIsCreatingNewConversation] = useState(false)


  // Load conversations on mount and set up periodic refresh
  useEffect(() => {
    const loadConversations = () => {
      try {
        const allConversations = memoryService.getAllConversations()
        setConversations(allConversations)
      } catch (error) {
        console.error('Failed to load conversations:', error)
      }
    }

    loadConversations()

    const interval = setInterval(() => {
      const current = memoryService.getAllConversations()
      if (current.length !== conversations.length || 
          current.some((conv, index) => conv.updatedAt !== conversations[index]?.updatedAt)) {
        loadConversations()
      }
    }, 5000) 

    return () => clearInterval(interval)
  }, [conversations.length])

  // Handle responsive sidebar behavior
  useEffect(() => {
    const handleResize = () => {
      if (window.innerWidth >= 1024) {
        setSidebarOpen(true)
      }
    }

    window.addEventListener('resize', handleResize)
    return () => window.removeEventListener('resize', handleResize)
  }, [])

  const handleNewConversation = useCallback(async () => {
    if (isCreatingNewConversation) return

    setIsCreatingNewConversation(true)
    try {
      // Clear current conversation
      setCurrentConversationId(undefined)

      // On mobile, close sidebar after action
      if (window.innerWidth < 1024) {
        setSidebarOpen(false)
      }
    } finally {
      setIsCreatingNewConversation(false)
    }
  }, [isCreatingNewConversation])

  const handleSelectConversation = useCallback((id: string) => {
    if (id !== currentConversationId && !isCreatingNewConversation) {
      setCurrentConversationId(id)

      // On mobile, close sidebar after selection
      if (window.innerWidth < 1024) {
        setSidebarOpen(false)
      }
    }
  }, [currentConversationId, isCreatingNewConversation])

  const handleDeleteConversation = useCallback((id: string) => {
    try {
      memoryService.deleteConversation(id)

      // Force refresh conversations
      const updatedConversations = memoryService.getAllConversations()
      setConversations(updatedConversations)

      // If we deleted the current conversation, clear it
      if (currentConversationId === id) {
        setCurrentConversationId(undefined)
      }
    } catch (error) {
      console.error('Failed to delete conversation:', error)
    }
  }, [currentConversationId])

  const refreshConversations = useCallback(() => {
    try {
      const updatedConversations = memoryService.getAllConversations()
      setConversations(updatedConversations)
    } catch (error) {
      console.error('Failed to refresh conversations:', error)
    }
  }, [])

  const handleConversationCreated = useCallback(() => {
    try {
      const latestConversations = memoryService.getAllConversations()
      setConversations(latestConversations)

      // Auto-select the newest conversation
      if (latestConversations.length > 0) {
        const newestConv = latestConversations[0]
        if (newestConv.id !== currentConversationId) {
          setCurrentConversationId(newestConv.id)
        }
      }
    } catch (error) {
      console.error('Failed to handle conversation creation:', error)
    }
  }, [currentConversationId])

  const toggleSidebar = useCallback(() => {
    setSidebarOpen(prev => !prev)
  }, [])

  const closeSidebar = useCallback(() => {
    // Only allow closing on mobile
    if (window.innerWidth < 1024) {
      setSidebarOpen(false)
    }
  }, [])

  return (
    <div className="flex h-screen bg-gray-50 overflow-hidden">
      {/* Mobile overlay */}
      <AnimatePresence>
        {sidebarOpen && window.innerWidth < 1024 && (
          <motion.div
            initial={{ opacity: 0 }}
            animate={{ opacity: 1 }}
            exit={{ opacity: 0 }}
            className="fixed inset-0 bg-black bg-opacity-50 z-40 lg:hidden"
            onClick={closeSidebar}
          />
        )}
      </AnimatePresence>

      {/* Sidebar */}
      <AnimatePresence>
        {sidebarOpen && (
          <motion.div
            initial={{ x: window.innerWidth < 1024 ? -320 : 0 }}
            animate={{ x: 0 }}
            exit={{ x: -320 }}
            transition={{ type: "spring", damping: 25, stiffness: 300 }}
            className="fixed left-0 top-0 h-full w-80 bg-white z-50 lg:relative lg:z-auto shadow-xl border-r border-gray-200 flex flex-col"
          >
            {/* Sidebar Header */}
            <div className="p-4 border-b border-gray-200 bg-gradient-to-r from-blue-50 to-indigo-50">
              <div className="flex items-center justify-between mb-4">
                <div className="flex items-center space-x-3">
                  <div className="w-10 h-10 bg-gradient-to-br from-blue-500 to-blue-600 rounded-xl flex items-center justify-center shadow-md">
                    <MessageSquare className="w-5 h-5 text-white" />
                  </div>
                  <div>
                    <h1 className="text-lg font-bold text-gray-900">AI Assistant</h1>
                    <p className="text-xs text-gray-600">{conversations.length} conversations</p>
                  </div>
                </div>

                {/* Close button - only on mobile */}
                <Button
                  variant="ghost"
                  size="sm"
                  onClick={closeSidebar}
                  className="lg:hidden text-gray-500 hover:text-gray-700"
                >
                  <X className="w-5 h-5" />
                </Button>
              </div>

              {/* New Conversation Button */}
              <Button
                onClick={handleNewConversation}
                className="w-full bg-gradient-to-r from-blue-600 to-blue-700 hover:from-blue-700 hover:to-blue-800 text-white shadow-md"
                disabled={isCreatingNewConversation}
                isLoading={isCreatingNewConversation}
              >
                <Plus className="w-4 h-4 mr-2" />
                New Conversation
              </Button>
            </div>

            {/* Conversation List */}
            <div className="flex-1 overflow-hidden">
              <ConversationList
                conversations={conversations}
                onSelectConversation={handleSelectConversation}
                onDeleteConversation={handleDeleteConversation}
                currentConversationId={currentConversationId}
              />
            </div>

            {/* Sidebar Footer */}
            <div className="p-4 border-t border-gray-200 bg-gray-50">
              <p className="text-xs text-gray-500 text-center">
                {conversations.length} conversations
              </p>
            </div>
          </motion.div>
        )}
      </AnimatePresence>

      {/* Main Content */}
      <div className="flex-1 flex flex-col min-w-0">
        {/* Mobile Header - Always Visible */}
        <div className="bg-white border-b border-gray-200 p-3 flex items-center justify-between lg:hidden">
          <Button
            variant="ghost"
            size="sm"
            onClick={toggleSidebar}
            className="text-gray-600 hover:text-gray-900"
          >
            <Menu className="w-5 h-5" />
            <span className="ml-2 text-sm font-medium">
              {sidebarOpen ? 'Close' : 'Conversations'}
            </span>
          </Button>

          <div className="flex items-center space-x-2">
            <div className="w-6 h-6 bg-gradient-to-br from-blue-500 to-blue-600 rounded-lg flex items-center justify-center">
              <MessageSquare className="w-3 h-3 text-white" />
            </div>
            <h1 className="font-semibold text-gray-900">AI Assistant</h1>
          </div>

          <Button
            variant="ghost"
            size="sm"
            onClick={handleNewConversation}
            className="text-blue-600 hover:text-blue-700"
            disabled={isCreatingNewConversation}
            isLoading={isCreatingNewConversation}
          >
            <Plus className="w-5 h-5" />
          </Button>
        </div>

        {/* Desktop Header - Only visible when sidebar is closed */}
        {!sidebarOpen && (
          <div className="bg-white border-b border-gray-200 p-4 hidden lg:flex items-center justify-between">
            <Button
              variant="ghost"
              size="sm"
              onClick={toggleSidebar}
              className="text-gray-600 hover:text-gray-900"
            >
              <Menu className="w-5 h-5 mr-2" />
              Show Conversations
            </Button>

            <Button
              onClick={handleNewConversation}
              className="bg-blue-600 hover:bg-blue-700 text-white"
              disabled={isCreatingNewConversation}
              isLoading={isCreatingNewConversation}
            >
              <Plus className="w-4 h-4 mr-2" />
              New Chat
            </Button>
          </div>
        )}

        {/* Chat Container */}
        <div className="flex-1 overflow-hidden">
          <ChatContainer
            key={currentConversationId || 'new'}
            conversationId={currentConversationId}
            onConversationUpdate={refreshConversations}
            onNewConversation={handleConversationCreated}
          />
        </div>
      </div>
    </div>
  )
}

export default App

This main application component coordinates the entire UI, managing sidebar state, conversation selection, and mobile responsiveness.
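The periodic refresh in `App` only reloads state when the stored list actually differs from the rendered one, comparing lengths and `updatedAt` timestamps. That comparison can be factored out into a pure function (a sketch; the component inlines it inside the interval callback):

```typescript
// Minimal shape needed for the comparison (the app's ConversationMemory has more fields)
interface ConvStamp { updatedAt: number }

// True when the stored list differs from the rendered one
const conversationsChanged = (prev: ConvStamp[], next: ConvStamp[]): boolean =>
  next.length !== prev.length ||
  next.some((conv, index) => conv.updatedAt !== prev[index]?.updatedAt)
```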

16. Styling Setup

Finally, create the CSS entry point at client/src/index.css:

@import "tailwindcss";

body {
  margin: 0;
  font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', 'Roboto', sans-serif;
}

This minimal CSS imports Tailwind’s utility classes and sets up the base font family for consistent typography across the application.
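Note that the bare `@import "tailwindcss"` line is Tailwind CSS v4 syntax, which expects the official Vite plugin to be registered in the build. Assuming that setup, your client/vite.config.ts would look roughly like this:

```typescript
// vite.config.ts — sketch assuming Tailwind CSS v4 with the official Vite plugin
import { defineConfig } from 'vite'
import react from '@vitejs/plugin-react'
import tailwindcss from '@tailwindcss/vite'

export default defineConfig({
  plugins: [react(), tailwindcss()],
})
```

If you are on Tailwind v3 instead, keep the classic `@tailwind base/components/utilities` directives and a tailwind.config.js with the content paths set.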

17. Running and Testing the Application

Start the backend server first:

cd server
npm run dev

The server will start on port 8080 and display health check information. You should see confirmation that all environment variables are properly configured.

In a new terminal, start the frontend development server:

cd client
npm run dev

The frontend dev server starts on port 5173. Open http://localhost:5173 in your browser to test the complete conversational AI system.

Run a Demo

When you open the frontend, you’ll see the chat interface with a conversation sidebar and a welcome screen. Click “Start Chat” to connect to the AI agent and begin a conversation.

Conclusion

That’s it! You’ve successfully built a complete conversational AI application using ZEGOCLOUD’s real-time communication platform. The system handles voice recognition, AI response generation, and natural conversation flow with persistent memory across sessions.

The application treats AI agents as real participants in voice calls, enabling natural interruption and real-time responses. Users can seamlessly switch between text and voice input while maintaining conversation context.

The modular architecture makes it easy to extend functionality and customize the experience for specific use cases. Your conversational AI system now provides professional-grade voice communication with the intelligence and responsiveness users expect from modern AI applications.

FAQ

Q1. What technologies are required to build a conversational AI?

You typically need natural language processing (NLP/LLM), automatic speech recognition (ASR), text-to-speech (TTS), and real-time communication (RTC) technologies. Together, they create a seamless loop for listening, understanding, and responding.

Q2. How do I integrate conversational AI into existing apps or platforms?

Most providers offer SDKs and APIs that support cross-platform integration (iOS, Android, Web). A well-documented, all-in-one SDK can significantly speed up development.

Q3. What are the main use cases of conversational AI?

Typical use cases include customer service bots, AI voice assistants, live streaming interactions, in-game NPCs, virtual classrooms, healthcare assistants, and enterprise collaboration tools.

Q4. How do I choose the right conversational AI provider?

Evaluate providers based on latency performance, global coverage, ease of integration, scalability, security standards, cost transparency, and proven case studies.
