How to Build a React Native Voice Call App

Voice chat is now a familiar experience in apps such as WhatsApp, X (Twitter) Spaces, Clubhouse, and Discord, because it makes communication more natural and engaging. If you are building your own product, integrating a React Native voice call feature can greatly enhance the experience, but handling audio hardware, network quality, codecs, and echo control on your own is complex. With ZEGOCLOUD providing the core real-time voice layer for React Native voice call development, you can focus on designing how the feature fits smoothly into your app.

How to Create a Voice Call App with React Native

ZEGOCLOUD provides the complete real-time voice infrastructure, handling everything from audio capture and network optimization to cross-platform delivery. This lets you focus on building your app’s experience instead of managing complex audio processing and connection logic.

For this project, you’ll use ZEGOCLOUD’s ZegoExpressEngine React Native SDK, which handles voice rooms, microphone publishing, and remote audio streaming.

The following sections guide you through setting up the token server, SDK integration, and React Native UI components step by step.

Prerequisites

Before you run the React Native voice app, you should have:

A ZEGOCLOUD account with a project created and a valid AppID and ServerSecret.
Node.js 18+ and npm installed.
Expo CLI available via npx expo (it ships with the expo package; the global expo-cli package is deprecated).
A physical Android or iOS device or a simulator/emulator.
This project cloned locally (ZegoVoiceApp/) with dependencies installed.

1. Project Setup

The implementation in ZegoVoiceApp is split into a Node token server and an Expo app. The complete code is available in the ZegoVoiceApp GitHub repo.

1.1 Architecture Overview

Token server (server/)
- Small Express app exposing /api/token and /health.
- Generates ZEGOCLOUD token04 values using your AppID + ServerSecret.
- Runs on http://<your-machine-ip>:3001 so mobile devices on the same LAN can reach it.

React Native app (Expo root)
- Uses expo-router with a (tabs) layout.
- Wraps the app in a VoiceCallProvider context (contexts/VoiceCallContext.tsx).
- Uses a ZegoService singleton (services/ZegoService.ts) to wrap zego-express-engine-reactnative.
- Renders a single main screen, VoiceCallScreen (components/VoiceCallScreen.tsx), from the app/(tabs)/index.tsx route.

The call flow is:

VoiceCallScreen → useVoiceCall (from VoiceCallContext) to start/join a room.
VoiceCallContext → ZegoService to initialize engine, log in, publish mic, and subscribe to remote streams.
ZEGOCLOUD SDK → triggers room/user/stream events → VoiceCallContext updates React state.

1.2 Installing Dependencies and Environment

From the project root (ZegoVoiceApp):

npm install

The Expo app and token server each have their own dependencies:

# Token server
cd server
npm install

# Back to root for the Expo app (already covered by root npm install)
cd ..

Create a .env file at the project root modeled after .env.example:

# Replace with your computer's LAN IP (reachable by your phone/emulator)
REACT_APP_SERVER_URL=http://YOUR_COMPUTER_IP:3001

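On mobile, localhost points to the device, not your development machine. Use the same LAN IP that you would use to access the server from a browser on your phone. Note that Expo SDK 49 and later only inline variables prefixed with EXPO_PUBLIC_ into the app bundle, so if process.env.REACT_APP_SERVER_URL is undefined at runtime, rename the variable with that prefix or check the project's Babel configuration.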

This demo hardcodes the AppID and ServerSecret in server/server.js. In your own project, move them into a server-side .env file and keep that file out of version control.
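
As one way to move the credentials out of the source file, the sketch below reads them from environment variables and fails fast when they are missing. The variable names ZEGO_APP_ID and ZEGO_SERVER_SECRET and the helper loadZegoConfig are assumptions made for this example; the demo project does not define them. How the variables get into process.env (for example with the dotenv package) is up to you.

```javascript
// Hypothetical helper: ZEGO_APP_ID and ZEGO_SERVER_SECRET are illustrative
// names chosen for this sketch, not names used by the demo project.
function loadZegoConfig(env = process.env) {
  const appId = Number(env.ZEGO_APP_ID)
  const serverSecret = env.ZEGO_SERVER_SECRET || ''

  // Number(undefined) is NaN and Number('') is 0, so both are rejected here
  if (!Number.isInteger(appId) || appId <= 0) {
    throw new Error('ZEGO_APP_ID must be set to a positive integer')
  }
  // The demo's server.js describes the ServerSecret as a 32-character string
  if (serverSecret.length !== 32) {
    throw new Error('ZEGO_SERVER_SECRET must be a 32-character string')
  }
  return { appId, serverSecret }
}

// Usage in server.js would replace the two hardcoded constants:
// const { appId: APP_ID, serverSecret: SERVER_SECRET } = loadZegoConfig()
```

Failing at startup is a deliberate choice here: a missing secret is easier to diagnose from a crash message than from rejected tokens on the device.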

2. Building the Token Server

The token server is a small Node + Express app that generates token04 strings which ZegoExpressEngine uses to authenticate your users.

2.1 Token generator (token04)

server/token-generator.js contains the logic to build an encrypted token:

// server/token-generator.js
const crypto = require('crypto')

function makeNonce() {
  const min = -Math.pow(2, 31)
  const max = Math.pow(2, 31) - 1
  return Math.floor(Math.random() * (max - min + 1)) + min
}

function aesGcmEncrypt(plainText, key) {
  if (![16, 24, 32].includes(key.length)) {
    throw new Error('Invalid Secret length. Key must be 16, 24, or 32 bytes.')
  }

  // Pick the AES variant that matches the key size (a 32-byte secret gives aes-256-gcm)
  const algorithm = `aes-${key.length * 8}-gcm`
  const nonce = crypto.randomBytes(12) // 12-byte nonce, the usual size for GCM
  const cipher = crypto.createCipheriv(algorithm, key, nonce)

  // Output layout: ciphertext followed by the authentication tag
  const encrypted = cipher.update(plainText, 'utf8')
  const encryptBuf = Buffer.concat([encrypted, cipher.final(), cipher.getAuthTag()])

  return { encryptBuf, nonce }
}

function generateToken04(appId, userId, secret, effectiveTimeInSeconds, payload = '') {
  // ... build token info with app_id, user_id, nonce, ctime, expire, payload
  // ... encrypt JSON with AES-GCM and pack into "04" + base64 string
}

module.exports = { generateToken04 }

The full implementation follows ZEGOCLOUD’s recommended token04 format (AES-GCM + packed metadata). You usually don’t have to touch this file: copy it as is, then set your AppID and ServerSecret in server/server.js and the server URL in .env.
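
To make the AES-GCM step less opaque, here is a small round-trip sketch. It mirrors the buffer layout of aesGcmEncrypt for a 32-byte key (ciphertext followed by the authentication tag) and adds a decrypt helper for local debugging. The decrypt function and the placeholder key are assumptions for illustration; they are not part of the demo project, and this is not the full token04 packing.

```javascript
const crypto = require('crypto')

// Same layout as aesGcmEncrypt: ciphertext followed by the authentication tag
function encrypt(plainText, key) {
  const nonce = crypto.randomBytes(12)
  const cipher = crypto.createCipheriv('aes-256-gcm', key, nonce)
  const encryptBuf = Buffer.concat([
    cipher.update(plainText, 'utf8'),
    cipher.final(),
    cipher.getAuthTag()
  ])
  return { encryptBuf, nonce }
}

// Hypothetical debugging helper: splits off the tag and verifies it
function decrypt(encryptBuf, nonce, key) {
  const TAG_LENGTH = 16 // Node's default GCM tag length in bytes
  const tag = encryptBuf.subarray(encryptBuf.length - TAG_LENGTH)
  const cipherText = encryptBuf.subarray(0, encryptBuf.length - TAG_LENGTH)
  const decipher = crypto.createDecipheriv('aes-256-gcm', key, nonce)
  decipher.setAuthTag(tag)
  // final() throws if the ciphertext or tag was modified
  return Buffer.concat([decipher.update(cipherText), decipher.final()]).toString('utf8')
}

// 32-byte placeholder key for the demo run, not a real ServerSecret
const key = Buffer.from('0123456789abcdef0123456789abcdef')
const { encryptBuf, nonce } = encrypt(JSON.stringify({ user_id: 'alice' }), key)
console.log(decrypt(encryptBuf, nonce, key)) // prints {"user_id":"alice"}
```

Because GCM is authenticated, any change to the encrypted bytes makes decryption fail outright, which is why a token cannot be edited after the server issues it.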

2.2 /api/token endpoint

server/server.js exposes a single POST endpoint that generates tokens for your app:

// server/server.js
const express = require('express')
const cors = require('cors')
const { generateToken04 } = require('./token-generator')

const app = express()
const PORT = 3001

// Hardcoded for this demo only; in production, read these from process.env
const APP_ID = 0 // replace with your ZEGOCLOUD AppID (a number)
const SERVER_SECRET = '' // replace with your 32-character ServerSecret

app.use(cors())
app.use(express.json())

app.post('/api/token', (req, res) => {
  try {
    const { userId, roomId, effectiveTimeInSeconds = 3600 } = req.body
    if (!userId) {
      return res.status(400).json({ error: 'userId is required' })
    }

    let payload = ''
    if (roomId) {
      payload = JSON.stringify({
        room_id: roomId,
        privilege: { 1: 1, 2: 1 },
        stream_id_list: null
      })
    }

    const token = generateToken04(APP_ID, userId, SERVER_SECRET, effectiveTimeInSeconds, payload)

    res.json({ token, appId: APP_ID, userId, roomId: roomId || null })
  } catch (error) {
    res.status(500).json({ error: error.message })
  }
})

app.get('/health', (_req, res) => {
  res.json({ status: 'Token server is running' })
})

app.listen(PORT, () => {
  console.log(`Token server running on http://localhost:${PORT}`)
})

The React Native app will call this endpoint to get both the token and the appId it should pass into ZegoExpressEngine.
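
The privilege object in the endpoint is terse, so the sketch below spells it out. It assumes the key meanings used in ZEGOCLOUD's token samples (1 = log in to the room, 2 = publish a stream, with 1 meaning allowed and 0 meaning denied). The helper name buildRoomPayload and its canPublish option are hypothetical and only illustrate how a listen-only token could be requested.

```javascript
// Privilege keys as used in the endpoint above (meanings assumed from
// ZEGOCLOUD's token samples): 1 = log in to the room, 2 = publish a stream
const PRIVILEGE_LOGIN = 1
const PRIVILEGE_PUBLISH = 2

// Hypothetical helper: builds the token payload for a speaker or a listener
function buildRoomPayload(roomId, { canPublish = true } = {}) {
  if (!roomId) return '' // same fallback as the endpoint: no room, empty payload
  return JSON.stringify({
    room_id: roomId,
    privilege: {
      [PRIVILEGE_LOGIN]: 1,
      [PRIVILEGE_PUBLISH]: canPublish ? 1 : 0
    },
    stream_id_list: null
  })
}

console.log(buildRoomPayload('test-room-1', { canPublish: false }))
// prints {"room_id":"test-room-1","privilege":{"1":1,"2":0},"stream_id_list":null}
```

Whether the platform actually enforces these privileges can depend on your ZEGOCLOUD project settings, so treat the payload as a request for restrictions and verify the behavior in a test room.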

3. Frontend Implementation

On the React Native side, everything flows through three layers:

ZegoService (services/ZegoService.ts) – thin wrapper around zego-express-engine-reactnative.
VoiceCallContext (contexts/VoiceCallContext.tsx) – central call state + actions.
VoiceCallScreen (components/VoiceCallScreen.tsx) – UI that lets you join/leave a room and toggle the mic.

3.1 ZegoExpressEngine wrapper (ZegoService)

ZegoService is a singleton that hides the raw Zego SDK calls behind a friendlier TypeScript API:

// services/ZegoService.ts
import { Platform } from 'react-native'
import ZegoExpressEngine from 'zego-express-engine-reactnative'

export interface TokenResponse {
  token: string
  appId: number
  userId: string
  roomId: string | null
}

export interface UserInfo {
  userID: string
  userName: string
}

export interface RoomConfig {
  token: string
  isUserStatusNotify?: boolean
}

class ZegoService {
  private static instance: ZegoService
  private engine: any = null
  private isInitialized = false

  static getInstance(): ZegoService {
    if (!ZegoService.instance) {
      ZegoService.instance = new ZegoService()
    }
    return ZegoService.instance
  }

  async getToken(userId: string, roomId?: string): Promise<TokenResponse> {
    // Use LAN IP so your device can reach the token server
    const baseUrl = Platform.OS === 'web'
      ? 'http://localhost:3001'
      : process.env.REACT_APP_SERVER_URL || 'http://YOUR_COMPUTER_IP:3001'

    const response = await fetch(`${baseUrl}/api/token`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ userId, roomId })
    })

    if (!response.ok) throw new Error('Failed to get token')
    return response.json()
  }

  async initEngine(appId: number): Promise<void> {
    if (this.isInitialized) return

    if (Platform.OS === 'web') {
      throw new Error('Zego SDK is not supported on web. Please use Android or iOS.')
    }

    const profile = {
      appID: appId,
      appSign: '', // using token authentication instead of appSign
      scenario: 0  // 0 = General in the SDK's ZegoScenario enum; check that enum for voice-call presets
    }

    await ZegoExpressEngine.createEngineWithProfile(profile)
    this.engine = ZegoExpressEngine.instance()
    this.isInitialized = true
  }

  async loginRoom(roomId: string, user: UserInfo, config: RoomConfig) {
    return this.engine.loginRoom(roomId, user, config)
  }

  async logoutRoom(roomId: string) {
    return this.engine.logoutRoom(roomId)
  }

  async startPublishingStream(streamId: string) {
    return this.engine.startPublishingStream(streamId)
  }

  async stopPublishingStream() {
    return this.engine.stopPublishingStream()
  }

  async startPlayingStream(streamId: string) {
    return this.engine.startPlayingStream(streamId)
  }

  async stopPlayingStream(streamId: string) {
    return this.engine.stopPlayingStream(streamId)
  }

  // ... event handlers (roomStateUpdate, roomUserUpdate, roomStreamUpdate, etc.)
  // ... audio enhancements (enableAEC, enableAGC, enableANS)
}

export default ZegoService

The real file also wires up:

Room events (roomStateUpdate, roomUserUpdate, roomStreamUpdate).
Publisher/player events (publisherStateUpdate, playerStateUpdate).
Token renewal (roomTokenWillExpire).

Those callbacks are all forwarded into the VoiceCallContext reducer.
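
As an illustration of what one of those callbacks might do, the sketch below reacts to a roomStreamUpdate by playing newly added remote streams and stopping removed ones. It is written against any object that exposes the startPlayingStream and stopPlayingStream methods shown in ZegoService. The function name handleRoomStreamUpdate, the ownStreamId parameter, and the update type value (0 = added) are assumptions; confirm the values against the SDK's ZegoUpdateType enum.

```javascript
// Assumed from the SDK's ZegoUpdateType enum: 0 = streams added, 1 = streams deleted
const UPDATE_ADD = 0

// Hypothetical handler: `service` needs startPlayingStream and stopPlayingStream,
// and each stream object is expected to carry a `streamID` property.
async function handleRoomStreamUpdate(service, updateType, streamList, ownStreamId) {
  for (const stream of streamList) {
    // Defensive check so the local microphone stream is never played back
    if (stream.streamID === ownStreamId) continue

    if (updateType === UPDATE_ADD) {
      await service.startPlayingStream(stream.streamID)
    } else {
      await service.stopPlayingStream(stream.streamID)
    }
  }
}

// Usage with a stand-in service that only logs the calls:
const loggingService = {
  startPlayingStream: async (id) => console.log('play', id),
  stopPlayingStream: async (id) => console.log('stop', id)
}
handleRoomStreamUpdate(loggingService, UPDATE_ADD, [{ streamID: 'bob' }], 'alice')
```

Keeping the handler independent of the engine makes it easy to exercise without a device, which helps when the audio itself is hard to test automatically.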

3.2 Voice call context and permissions

contexts/VoiceCallContext.tsx keeps all call state in one place and exposes a useVoiceCall() hook for components to use.

Key parts:

// contexts/VoiceCallContext.tsx
import React, { createContext, useContext, useEffect, useReducer, useRef } from 'react'
import ZegoService from '../services/ZegoService'
import { requestAudioPermissions } from '../utils/permissions'

interface User { userID: string; userName: string }

interface VoiceCallState {
  isConnected: boolean
  currentRoom: string | null
  currentUser: User | null
  isPublishing: boolean
  isPlaying: boolean
  roomUsers: User[]
  streamList: string[]
  connectionState: 'disconnected' | 'connecting' | 'connected' | 'error'
  error: string | null
}

// reducer, initialState, and actions omitted here (see repo)

export const VoiceCallProvider: React.FC<{ children: React.ReactNode }> = ({ children }) => {
  const [state, dispatch] = useReducer(voiceCallReducer, initialState)
  const zegoService = ZegoService.getInstance()
  const currentUserRef = useRef<User | null>(null)
  const currentRoomRef = useRef<string | null>(null)

  useEffect(() => {
    currentUserRef.current = state.currentUser
    currentRoomRef.current = state.currentRoom
  }, [state.currentUser, state.currentRoom])

  const joinRoom = async (userId: string, userName: string, roomId: string) => {
    try {
      dispatch({ type: 'SET_CONNECTION_STATE', payload: 'connecting' })
      dispatch({ type: 'SET_ERROR', payload: null })

      const hasPermission = await requestAudioPermissions()
      if (!hasPermission) throw new Error('Microphone permission denied')

      const user = { userID: userId, userName }
      dispatch({ type: 'SET_USER', payload: user })

      const tokenResponse = await zegoService.getToken(userId, roomId)
      await zegoService.initEngine(tokenResponse.appId)

      // Set up SDK event listeners AFTER engine initialization
      setupEventListeners()

      await zegoService.enableAEC(true)
      await zegoService.enableAGC(true)
      await zegoService.enableANS(true)

      const roomConfig = { token: tokenResponse.token, isUserStatusNotify: true }
      await zegoService.loginRoom(roomId, user, roomConfig)
      dispatch({ type: 'SET_ROOM', payload: roomId })

      await zegoService.startPublishingStream(userId)
      dispatch({ type: 'SET_PUBLISHING', payload: true })
    } catch (error) {
      dispatch({ type: 'SET_CONNECTION_STATE', payload: 'error' })
      dispatch({ type: 'SET_ERROR', payload: `Failed to join room: ${error}` })
      throw error
    }
  }

  const leaveRoom = async () => {
    try {
      if (state.currentRoom) {
        await zegoService.stopPublishingStream()
        await zegoService.logoutRoom(state.currentRoom)
      }
      dispatch({ type: 'RESET' })
    } catch (error) {
      dispatch({ type: 'SET_ERROR', payload: `Failed to leave room: ${error}` })
      throw error
    }
  }

  const toggleMicrophone = async () => {
    try {
      if (state.isPublishing) {
        await zegoService.stopPublishingStream()
        dispatch({ type: 'SET_PUBLISHING', payload: false })
      } else if (state.currentUser) {
        await zegoService.startPublishingStream(state.currentUser.userID)
        dispatch({ type: 'SET_PUBLISHING', payload: true })
      }
    } catch (error) {
      dispatch({ type: 'SET_ERROR', payload: `Failed to toggle microphone: ${error}` })
      throw error
    }
  }

  // ... startListening / stopListening use startPlayingStream / stopPlayingStream
  // ... setupEventListeners wires all ZegoService events into dispatch()

  return (
    <VoiceCallContext.Provider value={{ state, joinRoom, leaveRoom, toggleMicrophone, startListening, stopListening }}>
      {children}
    </VoiceCallContext.Provider>
  )
}
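
The reducer and initialState are omitted from the listing above. A minimal sketch that covers only the action types the provider dispatches might look like the following. It is written as plain JavaScript so it runs under Node, the real reducer in the repo may differ, and deriving isConnected from the connection state is an assumption.

```javascript
const initialState = {
  isConnected: false,
  currentRoom: null,
  currentUser: null,
  isPublishing: false,
  isPlaying: false,
  roomUsers: [],
  streamList: [],
  connectionState: 'disconnected',
  error: null
}

function voiceCallReducer(state, action) {
  switch (action.type) {
    case 'SET_CONNECTION_STATE':
      // Assumption: isConnected simply mirrors the 'connected' state
      return {
        ...state,
        connectionState: action.payload,
        isConnected: action.payload === 'connected'
      }
    case 'SET_ERROR':
      return { ...state, error: action.payload }
    case 'SET_USER':
      return { ...state, currentUser: action.payload }
    case 'SET_ROOM':
      return { ...state, currentRoom: action.payload }
    case 'SET_PUBLISHING':
      return { ...state, isPublishing: action.payload }
    case 'RESET':
      return initialState
    default:
      return state // unknown actions leave the state untouched
  }
}

const next = voiceCallReducer(initialState, { type: 'SET_PUBLISHING', payload: true })
console.log(next.isPublishing) // true
```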

The utils/permissions.ts helper centralizes Android audio (and optional camera) permission handling so your call logic stays clean.
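
The helper itself is not shown here, so the following is only a sketch of how requestAudioPermissions could be written with React Native's PermissionsAndroid API. The factory createAudioPermissionRequester is hypothetical: it takes Platform and PermissionsAndroid as arguments so the logic can be exercised outside a device, whereas the repo's helper presumably imports them directly.

```javascript
// Hypothetical factory: in the app you would pass in Platform and
// PermissionsAndroid imported from 'react-native'.
function createAudioPermissionRequester(Platform, PermissionsAndroid) {
  return async function requestAudioPermissions() {
    // Only Android needs an explicit runtime request from JavaScript here.
    // On iOS the system prompts on first microphone use, provided
    // NSMicrophoneUsageDescription is set in Info.plist.
    if (Platform.OS !== 'android') return true

    const result = await PermissionsAndroid.request(
      PermissionsAndroid.PERMISSIONS.RECORD_AUDIO
    )
    return result === PermissionsAndroid.RESULTS.GRANTED
  }
}

// In the app (not runnable under plain Node):
// const { Platform, PermissionsAndroid } = require('react-native')
// const requestAudioPermissions = createAudioPermissionRequester(Platform, PermissionsAndroid)
```

Returning a plain boolean keeps joinRoom simple: it only has to decide whether to continue or throw.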

3.3 VoiceCallScreen: join form and in-call UI

components/VoiceCallScreen.tsx renders both the join form and the in-call controls using the useVoiceCall() hook:

// components/VoiceCallScreen.tsx
import React, { useState } from 'react'
import { Alert } from 'react-native'
import { useVoiceCall } from '../contexts/VoiceCallContext'

const VoiceCallScreen = () => {
  const { state, joinRoom, leaveRoom, toggleMicrophone } = useVoiceCall()
  const [userId, setUserId] = useState('')
  const [userName, setUserName] = useState('')
  const [roomId, setRoomId] = useState('')

  const handleJoinRoom = async () => {
    if (!userId.trim() || !userName.trim() || !roomId.trim()) {
      Alert.alert('Error', 'Please fill in all fields')
      return
    }
    try {
      await joinRoom(userId.trim(), userName.trim(), roomId.trim())
    } catch (error) {
      // joinRoom rethrows after recording the error in state, so surface it here
      Alert.alert('Error', String(error))
    }
  }

  // On web, show a warning that voice SDK is not supported
  // On native, show three text inputs (User ID, User Name, Room ID) and a Join button

  // Once connected, render room info, list of users, mic toggle, and End Call button
  return null // placeholder; the full markup is in the repo
}

export default VoiceCallScreen

3.4 Navigation and app shell

The Expo router layout wires everything into a tabbed app:

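// app/_layout.tsx
import { DarkTheme, DefaultTheme, ThemeProvider } from '@react-navigation/native'
import { Stack } from 'expo-router'
import { useColorScheme } from 'react-native'
import { VoiceCallProvider } from '../contexts/VoiceCallContext'

export default function RootLayout() {
  // Follow the device's light/dark setting
  const colorScheme = useColorScheme()

  return (
    <VoiceCallProvider>
      <ThemeProvider value={colorScheme === 'dark' ? DarkTheme : DefaultTheme}>
        <Stack>
          <Stack.Screen name="(tabs)" options={{ headerShown: false }} />
        </Stack>
      </ThemeProvider>
    </VoiceCallProvider>
  )
}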

// app/(tabs)/index.tsx
import VoiceCallScreen from '../../components/VoiceCallScreen'

export default function HomeScreen() {
  return <VoiceCallScreen />
}

4. Run It Locally and Test Calls

With the token server and app configured, you can run a full voice call locally.

4.1 Start the token server

   cd server
   node server.js

Visit http://localhost:3001/health in your browser to confirm: { status: "Token server is running" }.
Make sure REACT_APP_SERVER_URL in your .env points to http://<your-machine-ip>:3001.

4.2 Start the Expo app

From the project root:

   npm run start

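Launch the app on an Android emulator, iOS simulator, or a physical device via QR code. Because the ZEGOCLOUD SDK ships native code, this most likely requires a development build (npx expo run:android or npx expo run:ios) rather than the Expo Go app.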
Remember: voice calls won’t work in a web browser; the app will show a warning instead.

4.3 Join a test room from two devices

On device A, enter a User ID, User Name, and a Room ID (for example, test-room-1), then tap Join Room.
On device B, use a different User ID and the same Room ID, then join.
You should see:
- Both users are listed under “Users in Room”.
- The mic button starts/stops publishing for your user.
- Audio from the other device when its mic is enabled.

4.4 End the call

Tap the End Call button to leave the ZEGOCLOUD room.
The provider stops publishing, logs out of the room, and resets the state.

If something fails:

Check that the token server is reachable from the device: open http://YOUR_IP:3001/health in the device's browser. On the Android emulator, the host machine is reachable at 10.0.2.2 rather than localhost.
Verify that the AppID + ServerSecret in the server match your ZEGOCLOUD project.
Look at device logs for ZEGOCLOUD error codes on roomStateUpdate and publisherStateUpdate.
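
The first check in that list can also be scripted. The sketch below probes the /health route and reports why it failed; the helper name checkTokenServer and the shape of its return value are assumptions for this example. It relies on the global fetch available in Node.js 18+ and in React Native, and accepts a replacement fetch for testing.

```javascript
// Hypothetical helper: checks that the token server's /health route answers.
// `fetchImpl` defaults to the global fetch (Node.js 18+ or React Native).
async function checkTokenServer(baseUrl, fetchImpl = fetch) {
  try {
    const response = await fetchImpl(`${baseUrl}/health`)
    if (!response.ok) {
      return { reachable: false, reason: `HTTP ${response.status}` }
    }
    const body = await response.json()
    return { reachable: true, status: body.status }
  } catch (error) {
    // Network-level failures (wrong IP, firewall, server not running) land here
    return { reachable: false, reason: String(error) }
  }
}

// Usage, with the same placeholder address as the .env file:
// checkTokenServer('http://YOUR_COMPUTER_IP:3001').then(console.log)
```

Calling this from the app before joinRoom separates "the server is unreachable" from "the token was rejected", which are otherwise easy to confuse.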

Conclusion

You now have a working React Native voice call app using ZEGOCLOUD:

A token server that safely generates token04 values for your users.
A ZegoService wrapper that hides the raw React Native SDK details.
A VoiceCallProvider context and VoiceCallScreen UI for joining/leaving rooms and toggling the mic.

From here you can:

Add muting controls, speaker selection, or audio device switching.
Expose call quality indicators by reading more stats from the SDK.
Extend the token server to support authenticated users and per‑room permissions.
Evolve this voice app into a full audio/video calling experience by reusing the same architecture and adding camera support.

For all omitted reducer logic, detailed event handlers, and full UI markup, check the project repository (ZegoVoiceApp/).