
Implementation Process

2023-07-26

Overview

This document introduces the principles and steps for using the object segmentation and alpha data transmission and rendering capabilities of the ZIM SDK and ZEGO Express SDK to implement multi-user real-time video interaction in the same scene, as well as mic position management in the Room.

In most video interactions, the participating users are separated into their respective rectangular video areas, and the subject often occupies less than half of each picture. In this case, the users' differing backgrounds can easily make the overall picture look cluttered, making it difficult to form an immersive interaction experience, as shown in the following figure:

To improve the interaction experience, in addition to the object segmentation capability, ZEGO Express SDK innovatively provides the Alpha data transmission feature. The principle is to stack the alpha information produced by object segmentation beneath the original video, yielding a frame of twice the original height. After encoding, this frame is transmitted to the stream-playing end, which separates the original video from the alpha data and applies the alpha data during rendering, so that visually only the subject is displayed in the picture.
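The stacking principle above can be illustrated with a small, purely conceptual sketch. This is not the SDK's actual codec path; it only shows how a frame of twice the height, with a grayscale alpha plane in the lower half, can be recombined into ARGB pixels on the playing end:

```java
// Conceptual sketch (not the SDK's internal implementation): the publisher
// stacks the alpha plane under the color frame, doubling the height; the
// player splits the frame and folds the gray values back into the alpha
// channel of each pixel.
public class AlphaStacking {
    // Split a stacked frame (height = 2 * h) into ARGB pixels.
    // Color pixels in the upper half are 0xRRGGBB; each pixel in the lower
    // half stores a gray value 0..255 (0 = fully transparent, 255 = opaque).
    public static int[] toArgb(int[] stacked, int width, int h) {
        int[] argb = new int[width * h];
        for (int i = 0; i < width * h; i++) {
            int alpha = stacked[width * h + i] & 0xFF; // gray from lower half
            argb[i] = (alpha << 24) | (stacked[i] & 0xFFFFFF);
        }
        return argb;
    }
}
```

A fully black lower-half pixel (0x000000) therefore yields alpha 0, which is why the black regions in the figure below render as transparent.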

As shown in the following figure, in the alpha information below the original video at the stream-publishing end, black indicates that the corresponding content is transparent. After decoding at the stream-playing end, the black portion is normalized into alpha data, so the corresponding area of the original video is rendered transparent. In the view, only the person's subject area is displayed, and the user's real background is not shown.

For information on the implementation principle of the rendering part, please refer to Play Transparent Gift Effects.

Publishing Stream Picture | Playing Stream Picture | Display Effect After Rendering

In this way, the main bodies of multiple users can be rendered onto the same background picture or background video. Although they are in different spaces, they can still interact in real time in the same scenario.

Application Scenarios

Application scenarios include:

  • Immersive meetings and watching movies together
  • Host co-hosting
  • Large events, such as press conferences

Solution Architecture

The overall architecture of the in-Room business of this best practice is shown in the following figure. Since the developer's business backend only manages the Room list and is not involved in in-Room business, it is not reflected in the figure.

Prerequisites

Implementation Process

The implementation process includes 6 steps: initializing the SDKs, joining the Room, managing mic positions, using object segmentation, playing streams, and leaving the Room.

Initialize SDK

  • Before managing mic positions in the Room, you need to initialize the ZIM SDK first and set the notification callback to listen for ZIM events. For interface call details, please refer to "2. Create ZIM Instance" and "3. Use EventHandler Protocol" in Instant Messaging - Implement Basic Message Send and Receive.
  • Before implementing object segmentation, you need to initialize the ZEGO Express SDK first and set the notification callback at the same time to listen for Express events. For interface call details, please refer to "Initialization" in Video Call - Implementing Video Call.
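The two initialization steps above can be sketched as follows for the Android Java editions of both SDKs. The class and method names (`ZIMAppConfig`, `ZegoEngineProfile`, the handler base classes) are assumptions based on the 2.x Java APIs and should be verified against your SDK versions:

```java
// Sketch: initialize the ZIM SDK and the ZEGO Express SDK and register
// event handlers (Android Java editions assumed; verify names against
// your SDK versions).
import android.app.Application;
import im.zego.zim.ZIM;
import im.zego.zim.callback.ZIMEventHandler;
import im.zego.zim.entity.ZIMAppConfig;
import im.zego.zegoexpress.ZegoExpressEngine;
import im.zego.zegoexpress.callback.IZegoEventHandler;
import im.zego.zegoexpress.constants.ZegoScenario;
import im.zego.zegoexpress.entity.ZegoEngineProfile;

public class SdkBootstrap {
    public static ZIM initZim(Application app, long appID, String appSign) {
        // Create the ZIM instance; Room attribute changes (mic positions)
        // arrive through the event handler registered here.
        ZIMAppConfig zimConfig = new ZIMAppConfig();
        zimConfig.appID = appID;
        zimConfig.appSign = appSign;
        ZIM zim = ZIM.create(zimConfig, app);
        zim.setEventHandler(new ZIMEventHandler() {
            // Override onRoomAttributesUpdated etc. as needed.
        });
        return zim;
    }

    public static ZegoExpressEngine initExpress(Application app, long appID,
                                                String appSign) {
        // Create the Express engine; stream add/remove events arrive
        // through the event handler registered here.
        ZegoEngineProfile profile = new ZegoEngineProfile();
        profile.appID = appID;
        profile.appSign = appSign;
        profile.scenario = ZegoScenario.DEFAULT;
        profile.application = app;
        return ZegoExpressEngine.createEngine(profile, new IZegoEventHandler() {
            // Override onRoomStreamUpdate etc. as needed.
        });
    }
}
```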

Join Room
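This step is covered in detail by the guides referenced elsewhere in this document; as the later sections show, it involves both the ZIM Room (which carries the mic position attributes) and the RTC Room (required for publishing and playing streams). A hedged sketch, assuming the Android Java APIs and that `zim.login(...)` has already succeeded:

```java
// Sketch: join the ZIM Room and log in to the RTC Room (Android Java;
// signatures assumed — verify against your SDK versions). Assumes the
// ZIM login step has already completed successfully.
import im.zego.zim.ZIM;
import im.zego.zim.entity.ZIMRoomAdvancedConfig;
import im.zego.zim.entity.ZIMRoomInfo;
import im.zego.zegoexpress.ZegoExpressEngine;
import im.zego.zegoexpress.entity.ZegoUser;

public class RoomJoiner {
    public static void join(ZIM zim, ZegoExpressEngine engine,
                            String roomID, String userID, String userName) {
        // ZIM Room: carries the mic position state as Room attributes.
        ZIMRoomInfo roomInfo = new ZIMRoomInfo();
        roomInfo.roomID = roomID;
        roomInfo.roomName = roomID;
        zim.enterRoom(roomInfo, new ZIMRoomAdvancedConfig(),
                (fullInfo, errorInfo) -> { /* check errorInfo.code */ });

        // RTC Room: required before publishing or playing streams.
        engine.loginRoom(roomID, new ZegoUser(userID, userName));
    }
}
```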

Manage Mic Positions

  • After joining the ZIM Room, users can obtain the mic position information in the Room by querying all Room attributes. For interface call details, please refer to "Get Room Attributes" in Instant Messaging - Room Attribute Management.
  • To get on a mic, users can update the mic position information by modifying Room attributes. For interface call details, please refer to "Set Room Attributes" in Instant Messaging - Room Attribute Management.
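The two operations above can be sketched as follows. The `"seat_N"` key scheme is an assumption for illustration, not an SDK convention, and the callback signatures should be checked against your ZIM version:

```java
// Sketch: mic positions modeled as ZIM Room attributes, one key per seat
// (Android Java; the "seat_N" naming is a hypothetical scheme).
import java.util.HashMap;
import im.zego.zim.ZIM;
import im.zego.zim.entity.ZIMRoomAttributesSetConfig;

public class MicSeatManager {
    // Query all Room attributes to learn the current seat occupancy.
    public static void querySeats(ZIM zim, String roomID) {
        zim.queryRoomAllAttributes(roomID,
                (queriedRoomID, attributes, errorInfo) -> {
            // attributes is a map such as {"seat_0": "alice"}.
        });
    }

    // Take a seat by writing a Room attribute. isForce=false avoids
    // overwriting an occupied seat; isDeleteAfterOwnerLeft=true releases
    // the seat automatically when its owner leaves the Room.
    public static void takeSeat(ZIM zim, String roomID, int seatIndex,
                                String userID) {
        HashMap<String, String> attrs = new HashMap<>();
        attrs.put("seat_" + seatIndex, userID);
        ZIMRoomAttributesSetConfig config = new ZIMRoomAttributesSetConfig();
        config.isForce = false;
        config.isDeleteAfterOwnerLeft = true;
        zim.setRoomAttributes(attrs, roomID, config,
                (opRoomID, errorKeys, errorInfo) -> {
            // errorKeys lists attributes that failed to set (seat taken).
        });
    }
}
```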

Use Object Segmentation

  1. To transmit the image after object segmentation when publishing stream, you need to set the alpha channel. Please refer to "Use Alpha Channel to Transmit Segmented Main Body" in Video Call - Object Segmentation to learn how to call enableAlphaChannelVideoEncoder to set the alpha channel.

  2. Since the image captured by a phone's front camera is flipped left-to-right relative to reality, you need to enable mirroring so that the preview and the played stream show the correct orientation. For interface call details, please refer to "Set Mirror Mode" in Video Call - Common Video Configuration.

  3. To keep the picture properly oriented when the phone screen is rotated, you need to set the orientation of the captured video. For interface call details, please refer to Video Call - Video Capture Rotation.

  4. If you need to preview the local picture, you need to set the backgroundColor property of the View used for rendering to clearColor (transparent color). For interface call details, please refer to "Make Special Settings for View" in Video Call - Object Segmentation.

    Preview and publish local picture. For interface call details, please refer to "Start Preview and Publishing Stream" in Video Call - Object Segmentation.

  5. Enable object segmentation and receive segmentation results. For interface call details, please refer to "Listen to Object Segmentation Status Callback" and "Use Object Segmentation to Implement Multiple Business Functions" in Video Call - Object Segmentation.

    Notes

    Since the RTC Room is only related to publishing stream, developers can also preview the effect of object segmentation outside the RTC Room.
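Steps 1 through 5 on the publish side can be sketched as follows. The enum values (`ZegoAlphaLayoutType.BOTTOM`, `ZegoObjectSegmentationType.ANY_BACKGROUND`) and exact signatures are assumptions based on the Android Java API and should be verified against your ZEGO Express SDK version:

```java
// Sketch: publish-side setup for object segmentation with alpha
// transmission (Android Java; names assumed — verify against your SDK).
import android.view.TextureView;
import im.zego.zegoexpress.ZegoExpressEngine;
import im.zego.zegoexpress.constants.ZegoAlphaLayoutType;
import im.zego.zegoexpress.constants.ZegoObjectSegmentationType;
import im.zego.zegoexpress.constants.ZegoVideoMirrorMode;
import im.zego.zegoexpress.entity.ZegoCanvas;

public class SegmentedPublisher {
    public static void start(ZegoExpressEngine engine, TextureView view,
                             String streamID) {
        // 1. Encode the segmented alpha plane beneath the color frame,
        //    matching the "alpha below the original video" layout.
        engine.enableAlphaChannelVideoEncoder(true, ZegoAlphaLayoutType.BOTTOM);

        // 2. Mirror both preview and the published stream so the front
        //    camera's left/right matches reality.
        engine.setVideoMirrorMode(ZegoVideoMirrorMode.BOTH_MIRROR);

        // 3. The render view must not paint its own background, or the
        //    transparent areas will show the view color instead.
        view.setOpaque(false);

        // 4. Preview and publish.
        engine.startPreview(new ZegoCanvas(view));
        engine.startPublishingStream(streamID);

        // 5. Enable object segmentation on the captured video.
        engine.enableVideoObjectSegmentation(true,
                ZegoObjectSegmentationType.ANY_BACKGROUND);
    }
}
```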

Playing Stream

After the user receives the ZIM event notification and learns that someone has gotten on the mic in the Room:

  1. Set the opaque property of the TextureView used for rendering to false, so that its background is transparent. For interface call details, please refer to "Make Special Settings for View" in Video Call - Object Segmentation.

    Notes

    If the settings for TextureView have been implemented during preview, this step can be omitted.

  2. Play the video stream that has implemented object segmentation from the other party, thereby achieving the visual effect of two users having a face-to-face dialogue in "the same space". For interface call details, please refer to "Start Playing Stream" in Video Call - Object Segmentation.

  3. To stop playing the stream, please refer to "Stop Audio and Video Call" in Video Call - Implementing Video Call for interface call details.
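The play-side steps above can be sketched as follows (Android Java; names should be verified against your ZEGO Express SDK version):

```java
// Sketch: play-side rendering of a stream published with an alpha
// channel (Android Java; signatures assumed — verify against your SDK).
import android.view.TextureView;
import im.zego.zegoexpress.ZegoExpressEngine;
import im.zego.zegoexpress.entity.ZegoCanvas;

public class SegmentedPlayer {
    public static void start(ZegoExpressEngine engine, TextureView view,
                             String streamID) {
        // The TextureView must be non-opaque so the decoded alpha lets the
        // shared background show through behind the remote subject.
        view.setOpaque(false);
        engine.startPlayingStream(streamID, new ZegoCanvas(view));
    }

    public static void stop(ZegoExpressEngine engine, String streamID) {
        engine.stopPlayingStream(streamID);
    }
}
```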

Leave Room

  1. Stop preview and publishing stream. For interface call details, please refer to "Stop Audio and Video Call" in Video Call - Implementing Video Call.
  2. Leave the ZIM Room. For interface call details, please refer to "Leave Room" in Instant Messaging - Room Management.
  3. Leave the RTC Room. For interface call details, please refer to "Stop Audio and Video Call" in Video Call - Implementing Video Call.
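The three tear-down steps above can be sketched as follows (Android Java; method names assumed, verify against your SDK versions):

```java
// Sketch: tear down in the order described above — stop preview and
// publishing, leave the ZIM Room, then leave the RTC Room
// (Android Java; verify names against your SDK versions).
import im.zego.zim.ZIM;
import im.zego.zegoexpress.ZegoExpressEngine;

public class RoomLeaver {
    public static void leave(ZIM zim, ZegoExpressEngine engine,
                             String roomID) {
        engine.stopPreview();           // 1. stop local preview
        engine.stopPublishingStream();  // 1. stop publishing
        zim.leaveRoom(roomID,           // 2. leave the ZIM Room
                (leftRoomID, errorInfo) -> { /* check errorInfo.code */ });
        engine.logoutRoom(roomID);      // 3. leave the RTC Room
    }
}
```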
