Product features
Communication capabilities
Basic features
| Basic features | Feature description | Business scenarios |
|---|---|---|
| Voice call | Users join the same room and make audio calls with each other. | |
| Voice live streaming | A room contains hosts and audience members: hosts stream live audio, and audience members in the room listen to the live stream. | |
| User permission control | Use Tokens to control user permissions, such as which users may enter or exit a room and which users may speak or must stay muted. | Video conference |
| Pre-call detection | Before an audio or video call or a live stream, test devices such as cameras, microphones, and monitors to make sure the call or live stream will work properly. | Checking that calling works normally |
| Call quality monitoring | Monitor audio and video quality metrics such as resolution, frame rate, bitrate, and sampling rate to keep quality stable. | Scenarios with strict audio and video quality requirements, such as bank account opening and remote authentication |
| Network speed testing | Before users publish/play streams, detect uplink and downlink network speeds to determine what bitrate of audio and video streams is suitable for publishing/playing under the current network environment. | Call scenarios, education scenarios, live streaming scenarios |
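The network speed testing row describes measuring uplink and downlink speed and then deciding which bitrate suits the current network. A minimal sketch of that decision step follows. It does not call the ZEGO SDK: the bitrate ladder, the `headroom` factor, and the function name `pick_bitrate_kbps` are assumptions made for illustration only.

```python
from typing import Optional, Sequence

# Hypothetical audio bitrate ladder in kbps (assumed values, not ZEGO presets).
AUDIO_BITRATE_LADDER_KBPS = (16, 32, 48, 64, 128)


def pick_bitrate_kbps(
    measured_kbps: float,
    ladder: Sequence[int] = AUDIO_BITRATE_LADDER_KBPS,
    headroom: float = 0.7,
) -> Optional[int]:
    """Return the highest ladder bitrate that fits the measured speed.

    Only a fraction (headroom) of the measured speed is budgeted, so that
    short dips in bandwidth do not immediately starve the stream.
    Returns None when even the lowest bitrate does not fit.
    """
    budget = measured_kbps * headroom
    candidates = [rate for rate in ladder if rate <= budget]
    return max(candidates) if candidates else None
```

The same function can be applied to the uplink result before publishing and to the downlink result before playing. A result of `None` suggests warning the user instead of starting the stream.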
Advanced features
| Advanced features | Feature description | Business scenarios |
|---|---|---|
| Live co-hosting | Multiple hosts in a room can appear together and co-host a live stream on the same screen. | |
| Multi-source capture | Provides flexible, easy-to-use audio and video capture sources and channel management, reducing development and maintenance costs for developers. | Video conferences, online education |
| Publish multiple streams simultaneously | A user can publish multiple audio and video streams at once, such as sending the camera's video stream while sharing the screen. | Seeing the speaker's camera image while slides are presented in a video conference |
| Supplemental Enhancement Information (SEI) | Text information is packaged with audio and video content and transmitted through the streaming media channel to achieve precise synchronization between text data and audio and video content. | |
| Traffic control | ZEGO's industry-leading technology. The SDK dynamically adjusts the bitrate, frame rate, and resolution of video publishing streams, as well as audio bitrate, based on its own and the peer's current network environment status, automatically adapting to the current network environment and network fluctuations, thereby ensuring smooth video publishing. | All scenarios that require high-quality real-time audio and video services |
| Cloud proxy | When the SDK's cloud proxy interface is set, all SDK traffic is forwarded through the cloud proxy server, which allows communication with RTC and L3 (Ultra-low latency live streaming). | Restricted network environments such as the intranets of hospitals, government agencies, and companies |
| Geofencing | Restricts the transmission of audio and video and signaling data to a certain region to meet regional data privacy and security-related regulations, that is, restricts access to audio and video services in a specific region. | Call scenarios |
| Audio and video stream encryption | Streams are encrypted when they are published. To play a stream, the player must use a decryption key that matches the encryption key. | Scenarios that need to encrypt stream information to protect communication security |
| Game voice | Simulates the real world, where what a listener hears depends on factors such as the direction and distance of the sound source; for example, the farther away the source, the quieter the sound. Listeners can also be grouped and restricted; for example, users in one room can hold group discussions in which different groups cannot hear each other. | |
| Mass-scale audio and video | ZEGO's industry-leading technology. Automatically plays remote audio and video within the listening range based on the user's location in the cloud and provides spatial audio effects (by default, plays the 12 closest streams). A single scenario supports 10,000 users with microphones and cameras enabled at the same time. | Virtual offices, virtual exhibitions, open virtual worlds and other virtual scenarios |
| Real-time synchronization of multiple users' status | ZEGO's industry-leading technology. Provides an ordered, high-frequency, low-latency, large-scale status synchronization service that helps developers quickly implement real-time synchronization of information such as player positions, actions, and images in virtual gameplay. It also supports 10,000 concurrent online users in a single scenario. | Metaverse scenarios such as virtual offices, virtual exhibitions, virtual social networking, and virtual KTV, and general scenarios that require ultra-high-frequency, low-latency, large-scale synchronization of information or control commands |
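The game voice and mass-scale rows describe two related ideas: sound gets quieter with distance, and only the closest streams inside the listening range are played (12 by default, per the table). The sketch below illustrates both under stated assumptions. It is not the SDK's algorithm: the linear falloff curve, the 3D tuple positions, and the names `nearest_streams` and `distance_gain` are all hypothetical.

```python
import math
from typing import Dict, List, Tuple

Position = Tuple[float, float, float]


def nearest_streams(
    listener: Position,
    sources: Dict[str, Position],
    listen_range: float,
    limit: int = 12,
) -> List[str]:
    """Return IDs of the closest streams within listen_range, nearest first."""
    in_range = []
    for stream_id, position in sources.items():
        distance = math.dist(listener, position)
        if distance <= listen_range:
            in_range.append((distance, stream_id))
    in_range.sort()  # sorts by distance, then by stream ID for ties
    return [stream_id for _, stream_id in in_range[:limit]]


def distance_gain(distance: float, listen_range: float) -> float:
    """Assumed linear falloff: 1.0 at the listener, 0.0 at or beyond the range edge."""
    if listen_range <= 0:
        return 0.0
    return max(0.0, 1.0 - distance / listen_range)
```

A production system would likely use a spatial index rather than scanning every source, and a perceptually tuned attenuation curve rather than a straight line. The linear version is used here only because it is easy to check by hand.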
Room capabilities
Basic features
| Basic features | Feature description | Business scenarios |
|---|---|---|
| Room connection status description | Determine the user's connection status in the room and the transitions between connection states. | - |
| Real-time messaging and signaling | Real-time messaging sends and receives plain-text messages. Users can send broadcast messages and barrage messages to other users in the same room, or custom messages to specified users, and developers can build interactive features such as likes, gifts, and quizzes on top of it. | |
Advanced features
| Advanced features | Feature description | Business scenarios |
|---|---|---|
| Login to multiple rooms | A user can enter multiple rooms at the same time to conduct audio and video calls or watch live streams. | A teacher teaching multiple classes online at the same time |
Audio capabilities
Basic features
| Basic features | Feature description | Business scenarios |
|---|---|---|
| Audio spectrum and sound level changes | Audio spectrum: the energy value of digital audio signals at each frequency point. Sound level changes: the volume of a certain stream. | |
| Headphone monitor and channel settings | | |
| Audio 3A processing | During real-time audio and video calls or live streaming, 3A processing (acoustic echo cancellation, automatic gain control, and noise suppression) can be applied to audio to improve call or live streaming quality and the user experience. | All scenarios that require high-quality real-time audio and video services |
| Voice changer/reverb/stereo | To make sessions more fun and interactive, users can apply a voice changer for comic effect, reverb to enhance the atmosphere, and stereo to make the sound more three-dimensional. ZEGO Express SDK provides a variety of preset voice changer, reverb, reverb echo, and stereo effects, and developers can flexibly configure the sound they want. | |
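The table defines the audio spectrum as the energy at each frequency point and the sound level as the volume of a stream. The sketch below shows one common way to compute both from raw samples. It is an illustration only and does not reflect how the SDK computes or reports these values: the function names, the normalized `[-1.0, 1.0]` sample range, and the dBFS unit are assumptions.

```python
import cmath
import math
from typing import List, Sequence


def sound_level_dbfs(samples: Sequence[float]) -> float:
    """RMS level of samples in [-1.0, 1.0], in dB relative to full scale."""
    if not samples:
        return float("-inf")
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(rms) if rms > 0 else float("-inf")


def spectrum_magnitudes(samples: Sequence[float]) -> List[float]:
    """Magnitude of each frequency bin from 0 to N/2, via a naive DFT.

    This is O(N^2) and meant for illustration; real code would use an FFT.
    Bin k corresponds to the frequency k * sample_rate / N.
    """
    n = len(samples)
    if n == 0:
        return []
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = sum(
            s * cmath.exp(-2j * cmath.pi * k * i / n)
            for i, s in enumerate(samples)
        )
        magnitudes.append(abs(acc) / n)
    return magnitudes
```

Feeding a pure sine tone into `spectrum_magnitudes` produces a single dominant bin at the tone's frequency, which is a convenient sanity check when drawing a spectrum visualizer.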
Advanced features
| Advanced features | Feature description | Business scenarios |
|---|---|---|
| Audio mixing | The SDK takes audio data supplied by the App and mixes it with the audio captured by the SDK into a single audio stream, so that custom sounds and music files can be played during a call or live stream and heard by others in the room. | |
| Scenario-based AI noise reduction | Automatically recognizes the current scenario in real time and adjusts the AI noise reduction strategy to give the best noise reduction and sound quality. In call scenarios, every sound except the human voice is treated as noise and removed. In music scenarios, noise reduction is adjusted automatically to preserve the sound quality of the music. | Voice rooms, conferences, voice gaming and other 1v1 or multi-person audio and video call scenarios, as well as live streaming or online KTV scenarios such as sound cards, singing along, and near-field music |
| Custom audio capture | Developers capture audio data themselves and hand it to the SDK for transmission. | |
| Custom audio rendering | Developers render and play audio themselves. | When developers have their own special rendering requirements |
| Custom audio processing | Developers apply their own special audio processing. | When there are special sound processing requirements that the SDK cannot meet, such as special voice changers |
| Get original audio data | Obtains the original recorded audio data. The data is in PCM format. | Audio data retention or special processing |
| AI voice changer | Works like the "Conan voice-changer bow tie" for real-time calls: it reproduces the target character's timbre and rhythm while retaining the user's speaking speed, emotion, and tone. Timbres can be switched at will with ultra-low latency. | |
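The audio mixing row describes combining audio supplied by the App with audio captured by the SDK into one signal. The sketch below shows the basic arithmetic on raw PCM, the format named in the "Get original audio data" row. It makes several assumptions that the table does not state: both buffers are 16-bit little-endian mono at the same sample rate, and the function name `mix_pcm16` and its `app_gain` parameter are hypothetical.

```python
import struct


def mix_pcm16(captured: bytes, app_audio: bytes, app_gain: float = 1.0) -> bytes:
    """Mix two 16-bit little-endian mono PCM buffers sample by sample.

    Sums are clamped to the 16-bit range so that loud passages clip
    instead of wrapping around. Captured audio beyond the end of the
    shorter buffer is passed through unchanged.
    """
    n = min(len(captured), len(app_audio)) // 2  # whole samples both buffers share
    mic = struct.unpack(f"<{n}h", captured[: n * 2])
    app = struct.unpack(f"<{n}h", app_audio[: n * 2])
    mixed = [
        max(-32768, min(32767, round(m + a * app_gain)))
        for m, a in zip(mic, app)
    ]
    return struct.pack(f"<{n}h", *mixed) + captured[n * 2:]
```

Lowering `app_gain` below 1.0 keeps background music under the speaker's voice and makes clipping less likely.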
Live streaming capabilities
Basic features
| Basic features | Feature description | Business scenarios |
|---|---|---|
| Stream mixing | Mixes the streams of multiple users into one stream, so that playing a single stream is enough to see the video and hear the audio of every member in the room. | Multi-person calls, host co-hosting |
| Use CDN for live streaming | Unify the access capabilities of multiple CDNs. This function supports publishing to CDN, connecting RTC products and CDN live streaming products, making it convenient for users to watch live content directly from web pages or third-party players. | Basic live streaming with high concurrency, scenarios without strong requirements for live streaming latency |
| CDN publishing authentication | Attackers may steal a developer's publishing URL to use it elsewhere, or impersonate the developer's server to generate publishing URLs, either of which causes traffic loss. To prevent this, configure CDN publishing authentication in the ZEGOCLOUD Console. Once authentication is enabled, the relevant authentication parameters must be appended to the publishing URL; otherwise publishing fails. | - |
| Playing stream by URL | When the publisher pushes a stream to the CDN with a third-party tool (such as OBS or an IP camera), or uses the ZEGO SDK to relay audio and video to a third-party CDN, the stream can be played by passing in its URL directly. | Obtaining live video from third-party sources |
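CDN publishing authentication works by adding authentication parameters to the publishing URL. The exact parameter names and signing rule depend on the CDN and on the Console configuration, and the table does not specify them. The sketch below therefore uses an invented scheme purely to show the general shape: the `sign` and `expire` parameter names, the HMAC-SHA256 rule, and the function name `sign_publish_url` are all assumptions and should be replaced with whatever the configured CDN documents.

```python
import hashlib
import hmac
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def sign_publish_url(url: str, secret_key: str, expires_at: int) -> str:
    """Append hypothetical `sign` and `expire` parameters to a publishing URL.

    Assumed rule: sign = HMAC-SHA256(secret_key, path + expire_hex),
    where expire_hex is the expiry Unix timestamp in lowercase hex.
    """
    parts = urlsplit(url)
    expire_hex = format(expires_at, "x")
    message = (parts.path + expire_hex).encode("utf-8")
    signature = hmac.new(
        secret_key.encode("utf-8"), message, hashlib.sha256
    ).hexdigest()
    # Keep any query parameters the URL already carries.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query += [("sign", signature), ("expire", expire_hex)]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(query), parts.fragment)
    )
```

Because the signature covers the stream path and an expiry time, a stolen URL cannot be reused for a different stream or after it expires. The secret key belongs on the developer's server, not in the client app.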
Advanced features
| Advanced features | Feature description | Business scenarios |
|---|---|---|
| Ultra-low latency live streaming | Focuses on providing stable and reliable live streaming services. Compared with standard video live streaming products, it has lower audio and video latency, stronger synchronization, better weak network resistance, and can bring users a millisecond-level live streaming experience. | |
| Direct-to-CDN | Pushes audio and video streams directly from the local client to the CDN. Viewers can then watch through the playing stream URL from a web page or a third-party player. | For developers who already work with a third-party CDN for audio and video distribution |
Other capabilities
Basic features
| Basic features | Feature description | Business scenarios |
|---|---|---|
| Media player | Provides the ability to play audio and video media files and supports publishing the audio and video data of the played media files. | |
| Audio effect player | Provides an audio effect player, manages audio effects uniformly, and achieves effects such as enhancing realism or setting the atmosphere by playing short sound effects. | |
| Audio and video recording | During video calls, live streaming, and online teaching, users often need to record and save videos for subsequent on-demand viewing by other users. ZEGO provides multiple recording solutions to meet recording needs in different scenarios. | |
