Video Call

Implementation Flow

2023-07-26

Overview

This document describes the principles and steps for using the ZIM SDK together with the ZEGO Express SDK's object segmentation and Alpha data transmission and rendering capabilities to implement multi-user, same-scene real-time video interaction and mic position management in a room.

In most video interactions, each participant is confined to their own rectangular video area, and the subject often occupies less than half of the frame. The differing backgrounds of different users then give the overall picture a cluttered look, making it difficult to create an immersive interaction experience, as shown in the following figure:

To improve the interaction experience, in addition to the object segmentation capability, the ZEGO Express SDK pioneered an Alpha data transmission function. The principle: the Alpha information obtained from object segmentation is spliced below the original video, producing a frame of twice the original height. After encoding, it is transmitted to the playing end, which separates the original video from the Alpha data and uses the Alpha data for rendering, creating the visual effect of showing only the subject in the picture.

As shown in the following figure, the Alpha information spliced below the original video at the publishing end uses black to mark transparent content. After decoding at the playing end, the black region is normalized into Alpha data, so the corresponding area of the original video is rendered as transparent. In the view, only the person (the subject) is displayed; the user's real background is not.

For information on the implementation principle of the rendering part, please refer to Play Transparent Gift Effects.

(Figure: publishing picture, playing picture, and the display effect after rendering.)

Through this method, the subjects of multiple users can be rendered onto the same background image or background video; although they are in different physical spaces, they can still interact in real time in the same scene.
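The splice-and-split principle above can be sketched with plain pixel data. This is a conceptual illustration only — the double-height layout and 8-bit luma representation here are assumptions for demonstration, not the SDK's internal frame format:

```swift
// Conceptual sketch: the transmitted frame stacks the original video on top
// and the segmentation mask below, doubling the height. The playing end
// splits the frame and normalizes the mask (black = 0 = transparent) into
// per-pixel alpha values in 0.0...1.0.
typealias Frame = [[UInt8]]   // rows of 8-bit luma values, for simplicity

func splitAlphaFrame(_ transmitted: Frame) -> (color: Frame, alpha: [[Float]]) {
    let h = transmitted.count / 2            // original video height
    let color = Array(transmitted[0..<h])    // top half: original picture
    let alpha = transmitted[h...].map { row in row.map { Float($0) / 255.0 } }
    return (color, alpha)
}
```

With the alpha plane recovered, the renderer can make the masked region of the original picture fully transparent, which is what produces the subject-only effect described above.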

Application Scenarios

Application scenarios include:

  • Immersive conferences and watching movies together
  • Anchor co-hosting
  • Large events, such as press conferences

Solution Architecture

The overall architecture of the in-room business of this best practice is shown in the following figure. Since the developer's business backend only manages the room list and is not involved in in-room business, it is not shown in the figure.

Prerequisites

Implementation Flow

The implementation flow consists of six steps: initializing the SDKs, joining the room, managing mic positions, using object segmentation, playing streams, and leaving the room.

1 Initialize SDK

2 Join room

3 Manage mic positions

  • After joining the ZIM room, users can learn the current mic position information by querying all room attributes. For interface call details, please refer to "Get Room Attributes" in Instant Messaging - Room Attribute Management.
  • To get on a mic, a user modifies the mic position information by setting room attributes. For interface call details, please refer to "Set Room Attributes" in Instant Messaging - Room Attribute Management.
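The seats-as-room-attributes model above can be sketched as plain key-value state. This is a simplified stand-in for ZIM room attributes, not the SDK API; the `seat_<n>` key naming and first-writer-wins semantics are assumptions for illustration:

```swift
// Simplified model of mic positions stored as room attributes.
// In the real flow, ZIM room attributes play the role of `attributes`
// and the server arbitrates concurrent writes.
struct MicSeatRoom {
    private(set) var attributes: [String: String] = [:]

    // Querying all attributes reveals current seat occupancy.
    func occupant(ofSeat index: Int) -> String? { attributes["seat_\(index)"] }

    // Taking a seat sets the attribute; it fails if the seat is occupied.
    mutating func takeSeat(_ index: Int, userID: String) -> Bool {
        let key = "seat_\(index)"
        guard attributes[key] == nil else { return false }
        attributes[key] = userID
        return true
    }

    // Leaving a seat deletes the attribute, freeing it for others.
    mutating func leaveSeat(_ index: Int, userID: String) {
        let key = "seat_\(index)"
        if attributes[key] == userID { attributes[key] = nil }
    }
}
```

Storing seats as room attributes means every room member sees the same authoritative seat map and receives attribute-change notifications when someone gets on or off a mic.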

4 Use object segmentation

  1. To transmit the object-segmented image when publishing a stream, you need to enable the alpha channel. Please refer to "Use Alpha Channel to Transmit Segmented Subject" in Real-time Audio and Video - Object Segmentation to learn how to call enableAlphaChannelVideoEncoder to set the alpha channel.

  2. Since the picture captured by the phone's front camera is left-right reversed, you need to enable mirroring so that the preview and the played stream show the correct orientation. For interface call details, please refer to "Set Mirror Mode" in Real-time Audio and Video - Common Video Configuration.

  3. To keep the picture correctly oriented when the phone screen rotates, set the orientation of the captured video. For interface call details, please refer to Real-time Audio and Video - Video Capture Rotation.

  4. If you need to preview the local picture, first set the backgroundColor property of the View used for rendering to clearColor (transparent). For interface call details, please refer to "Special Settings for View" in Real-time Audio and Video - Object Segmentation.

    Then preview and publish the local picture. For interface call details, please refer to "Start Preview and Publishing Stream" in Real-time Audio and Video - Object Segmentation.

  5. Enable object segmentation and receive segmentation results. For interface call details, please refer to "Listen to Object Segmentation Status Callback" and "Use Object Segmentation to Implement Multiple Business Functions" in Real-time Audio and Video - Object Segmentation.

    Note

    Since the RTC room is only required for publishing streams, developers can also preview the object segmentation effect outside the RTC room.
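Step 2 above (mirroring) can be illustrated with plain pixel data: the front camera's raw picture is left-right reversed, and mirroring simply flips each pixel row. This is a conceptual sketch of what the SDK does internally when a mirror mode is enabled, not SDK code:

```swift
// Conceptual sketch of horizontal mirroring: flipping each row of the
// front camera's left-right-reversed picture restores the natural
// orientation the user expects to see in the preview.
func mirrorHorizontally(_ frame: [[UInt8]]) -> [[UInt8]] {
    frame.map { Array($0.reversed()) }
}
```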

5 Play stream

After the user receives the ZIM event notification and learns that someone has gotten on the mic in the room:

  1. Call the real-time audio and video interface to set the backgroundColor property of the View used for rendering to clearColor (transparent color). For interface call details, please refer to "Special Settings for View" in Real-time Audio and Video - Object Segmentation.

    Note

    If the View settings were implemented during preview, this step can be omitted.

  2. Play the other party's object-segmented video stream, achieving the visual effect of the two users facing each other in the "same space". For interface call details, please refer to "Start Playing Stream" in Real-time Audio and Video - Object Segmentation.

  3. To stop playing the stream, please refer to "Stop Audio and Video Call" in Real-time Audio and Video - Implementing Video Call for interface call details.
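The "same space" effect comes from alpha compositing at render time: each played subject is blended over a shared background using the transmitted alpha. A minimal per-pixel sketch (single-channel luma for simplicity; not the SDK's renderer):

```swift
// Per-pixel alpha compositing: out = alpha * subject + (1 - alpha) * background.
// Where alpha is 0 (the segmented-out real background), the shared scene
// shows through; where alpha is 1, the user's subject is shown.
func composite(subject: [[UInt8]], alpha: [[Float]], background: [[UInt8]]) -> [[UInt8]] {
    zip(zip(subject, alpha), background).map { rowPair, bgRow in
        let (subjRow, alphaRow) = rowPair
        return zip(zip(subjRow, alphaRow), bgRow).map { pixPair, bg in
            let (s, a) = pixPair
            return UInt8(a * Float(s) + (1 - a) * Float(bg))
        }
    }
}
```

Rendering every participant's subject over the same background image or video this way is what makes users in different physical spaces appear in one shared scene.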

6 Leave room

  1. Stop preview and publishing stream. For interface call details, please refer to "Stop Audio and Video Call" in Real-time Audio and Video - Implementing Video Call.
  2. Leave the ZIM room. For interface call details, please refer to "Leave Room" in Instant Messaging - Room Management.
  3. Log out of the RTC room. For interface call details, please refer to "Stop Audio and Video Call" in Real-time Audio and Video - Implementing Video Call.
