
Implementation Process

2023-07-26

Overview

This document introduces the principles and steps for using the object segmentation and alpha data transmission and rendering capabilities of the ZIM SDK and ZEGO Express SDK to implement multi-user real-time video interaction in the same scene, as well as mic position management in the Room.

In most video interactions, the participating users are separated into their respective rectangular video areas, and the subject often occupies less than half of each picture. In this case, the users' differing backgrounds can easily make the overall picture look cluttered, making it difficult to form an immersive interaction experience, as shown in the following figure:

To improve the interaction experience, in addition to the object segmentation capability, ZEGO Express SDK innovatively provides the Alpha data transmission feature. The principle is to stack the alpha information produced by object segmentation beneath the original video, yielding a frame of twice the original height. After encoding, this frame is transmitted to the stream-playing end, which separates the original video from the alpha data and applies the alpha data during rendering, so that visually only the subject is displayed in the picture.
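The stacking principle above can be illustrated with a small, purely conceptual sketch. This is not the SDK's actual codec path; it only shows how a frame of twice the height, with a grayscale alpha plane in the lower half, can be recombined into ARGB pixels on the playing end:

```java
// Conceptual sketch (not the SDK's internal implementation): the publisher
// stacks the alpha plane under the color frame, doubling the height; the
// player splits the frame and folds the gray values back into the alpha
// channel of each pixel.
public class AlphaStacking {
    // Split a stacked frame (height = 2 * h) into ARGB pixels.
    // Color pixels in the upper half are 0xRRGGBB; each pixel in the lower
    // half stores a gray value 0..255 (0 = fully transparent, 255 = opaque).
    public static int[] toArgb(int[] stacked, int width, int h) {
        int[] argb = new int[width * h];
        for (int i = 0; i < width * h; i++) {
            int alpha = stacked[width * h + i] & 0xFF; // gray from lower half
            argb[i] = (alpha << 24) | (stacked[i] & 0xFFFFFF);
        }
        return argb;
    }
}
```

A fully black lower-half pixel (0x000000) therefore yields alpha 0, which is why the black regions in the figure below render as transparent.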

As shown in the following figure, in the alpha information below the original video at the stream-publishing end, black indicates that the corresponding content is transparent. After decoding at the stream-playing end, the black portion is normalized into alpha data, so the corresponding area of the original video is rendered transparent. In the view, only the person's subject area is displayed, and the user's real background is not shown.

For information on the implementation principle of the rendering part, please refer to Play Transparent Gift Effects.

Publishing Stream Picture | Playing Stream Picture | Display Effect After Rendering

In this way, the main bodies of multiple users can be rendered onto the same background picture or background video. Although they are in different spaces, they can still interact in real time in the same scenario.

Application Scenarios

Application scenarios include:

  • Immersive meetings and watching movies together
  • Host co-hosting
  • Large events, such as press conferences

Solution Architecture

The overall architecture of the in-Room business of this best practice is shown in the following figure. Since the developer's business backend only manages the Room list and is not involved in in-Room business, it is not reflected in the figure.

Prerequisites

Implementation Process

The implementation process includes 6 steps: initializing the SDKs, joining the Room, managing mic positions, using object segmentation, playing streams, and leaving the Room.

Initialize SDK

  • Before managing mic positions in the Room, you need to initialize the ZIM SDK first and set the notification callback to listen for ZIM events. For interface call details, please refer to "2. Create ZIM Instance" and "3. Use EventHandler Protocol" in Instant Messaging - Implement Basic Message Send and Receive.
  • Before implementing object segmentation, you need to initialize the ZEGO Express SDK first and set the notification callback at the same time to listen for Express events. For interface call details, please refer to "Initialization" in Video Call - Implementing Video Call.
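The two initialization steps above can be sketched as follows for the Android Java editions of both SDKs. The class and method names (`ZIMAppConfig`, `ZegoEngineProfile`, the handler base classes) are assumptions based on the 2.x Java APIs and should be verified against your SDK versions:

```java
// Sketch: initialize the ZIM SDK and the ZEGO Express SDK and register
// event handlers (Android Java editions assumed; verify names against
// your SDK versions).
import android.app.Application;
import im.zego.zim.ZIM;
import im.zego.zim.callback.ZIMEventHandler;
import im.zego.zim.entity.ZIMAppConfig;
import im.zego.zegoexpress.ZegoExpressEngine;
import im.zego.zegoexpress.callback.IZegoEventHandler;
import im.zego.zegoexpress.constants.ZegoScenario;
import im.zego.zegoexpress.entity.ZegoEngineProfile;

public class SdkBootstrap {
    public static ZIM initZim(Application app, long appID, String appSign) {
        // Create the ZIM instance; Room attribute changes (mic positions)
        // arrive through the event handler registered here.
        ZIMAppConfig zimConfig = new ZIMAppConfig();
        zimConfig.appID = appID;
        zimConfig.appSign = appSign;
        ZIM zim = ZIM.create(zimConfig, app);
        zim.setEventHandler(new ZIMEventHandler() {
            // Override onRoomAttributesUpdated etc. as needed.
        });
        return zim;
    }

    public static ZegoExpressEngine initExpress(Application app, long appID,
                                                String appSign) {
        // Create the Express engine; stream add/remove events arrive
        // through the event handler registered here.
        ZegoEngineProfile profile = new ZegoEngineProfile();
        profile.appID = appID;
        profile.appSign = appSign;
        profile.scenario = ZegoScenario.DEFAULT;
        profile.application = app;
        return ZegoExpressEngine.createEngine(profile, new IZegoEventHandler() {
            // Override onRoomStreamUpdate etc. as needed.
        });
    }
}
```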

Join Room
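This step is covered in detail by the guides referenced elsewhere in this document; as the later sections show, it involves both the ZIM Room (which carries the mic position attributes) and the RTC Room (required for publishing and playing streams). A hedged sketch, assuming the Android Java APIs and that `zim.login(...)` has already succeeded:

```java
// Sketch: join the ZIM Room and log in to the RTC Room (Android Java;
// signatures assumed — verify against your SDK versions). Assumes the
// ZIM login step has already completed successfully.
import im.zego.zim.ZIM;
import im.zego.zim.entity.ZIMRoomAdvancedConfig;
import im.zego.zim.entity.ZIMRoomInfo;
import im.zego.zegoexpress.ZegoExpressEngine;
import im.zego.zegoexpress.entity.ZegoUser;

public class RoomJoiner {
    public static void join(ZIM zim, ZegoExpressEngine engine,
                            String roomID, String userID, String userName) {
        // ZIM Room: carries the mic position state as Room attributes.
        ZIMRoomInfo roomInfo = new ZIMRoomInfo();
        roomInfo.roomID = roomID;
        roomInfo.roomName = roomID;
        zim.enterRoom(roomInfo, new ZIMRoomAdvancedConfig(),
                (fullInfo, errorInfo) -> { /* check errorInfo.code */ });

        // RTC Room: required before publishing or playing streams.
        engine.loginRoom(roomID, new ZegoUser(userID, userName));
    }
}
```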

Manage Mic Positions

  • After joining the ZIM Room, users can obtain the mic position information in the Room by querying all Room attributes. For interface call details, please refer to "Get Room Attributes" in Instant Messaging - Room Attribute Management.
  • To get on a mic, users can update the mic position information by modifying Room attributes. For interface call details, please refer to "Set Room Attributes" in Instant Messaging - Room Attribute Management.
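The two operations above can be sketched as follows. The `"seat_N"` key scheme is an assumption for illustration, not an SDK convention, and the callback signatures should be checked against your ZIM version:

```java
// Sketch: mic positions modeled as ZIM Room attributes, one key per seat
// (Android Java; the "seat_N" naming is a hypothetical scheme).
import java.util.HashMap;
import im.zego.zim.ZIM;
import im.zego.zim.entity.ZIMRoomAttributesSetConfig;

public class MicSeatManager {
    // Query all Room attributes to learn the current seat occupancy.
    public static void querySeats(ZIM zim, String roomID) {
        zim.queryRoomAllAttributes(roomID,
                (queriedRoomID, attributes, errorInfo) -> {
            // attributes is a map such as {"seat_0": "alice"}.
        });
    }

    // Take a seat by writing a Room attribute. isForce=false avoids
    // overwriting an occupied seat; isDeleteAfterOwnerLeft=true releases
    // the seat automatically when its owner leaves the Room.
    public static void takeSeat(ZIM zim, String roomID, int seatIndex,
                                String userID) {
        HashMap<String, String> attrs = new HashMap<>();
        attrs.put("seat_" + seatIndex, userID);
        ZIMRoomAttributesSetConfig config = new ZIMRoomAttributesSetConfig();
        config.isForce = false;
        config.isDeleteAfterOwnerLeft = true;
        zim.setRoomAttributes(attrs, roomID, config,
                (opRoomID, errorKeys, errorInfo) -> {
            // errorKeys lists attributes that failed to set (seat taken).
        });
    }
}
```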

Use Object Segmentation

  1. To transmit the image after object segmentation when publishing stream, you need to set the alpha channel. Please refer to "Use Alpha Channel to Transmit Segmented Main Body" in Video Call - Object Segmentation to learn how to call enableAlphaChannelVideoEncoder to set the alpha channel.

  2. Since the image captured by a phone's front camera is flipped left-to-right relative to reality, you need to enable mirroring so that the preview and the played stream show the correct orientation. For interface call details, please refer to "Set Mirror Mode" in Video Call - Common Video Configuration.

  3. To keep the picture properly oriented when the phone screen is rotated, you need to set the orientation of the captured video. For interface call details, please refer to Video Call - Video Capture Rotation.

  4. If you need to preview the local picture, you need to set the backgroundColor property of the View used for rendering to clearColor (transparent color). For interface call details, please refer to "Make Special Settings for View" in Video Call - Object Segmentation.

    Preview and publish local picture. For interface call details, please refer to "Start Preview and Publishing Stream" in Video Call - Object Segmentation.

  5. Enable object segmentation and receive segmentation results. For interface call details, please refer to "Listen to Object Segmentation Status Callback" and "Use Object Segmentation to Implement Multiple Business Functions" in Video Call - Object Segmentation.

    Notes

    Since the RTC Room is only related to publishing stream, developers can also preview the effect of object segmentation outside the RTC Room.
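Steps 1 through 5 on the publish side can be sketched as follows. The enum values (`ZegoAlphaLayoutType.BOTTOM`, `ZegoObjectSegmentationType.ANY_BACKGROUND`) and exact signatures are assumptions based on the Android Java API and should be verified against your ZEGO Express SDK version:

```java
// Sketch: publish-side setup for object segmentation with alpha
// transmission (Android Java; names assumed — verify against your SDK).
import android.view.TextureView;
import im.zego.zegoexpress.ZegoExpressEngine;
import im.zego.zegoexpress.constants.ZegoAlphaLayoutType;
import im.zego.zegoexpress.constants.ZegoObjectSegmentationType;
import im.zego.zegoexpress.constants.ZegoVideoMirrorMode;
import im.zego.zegoexpress.entity.ZegoCanvas;

public class SegmentedPublisher {
    public static void start(ZegoExpressEngine engine, TextureView view,
                             String streamID) {
        // 1. Encode the segmented alpha plane beneath the color frame,
        //    matching the "alpha below the original video" layout.
        engine.enableAlphaChannelVideoEncoder(true, ZegoAlphaLayoutType.BOTTOM);

        // 2. Mirror both preview and the published stream so the front
        //    camera's left/right matches reality.
        engine.setVideoMirrorMode(ZegoVideoMirrorMode.BOTH_MIRROR);

        // 3. The render view must not paint its own background, or the
        //    transparent areas will show the view color instead.
        view.setOpaque(false);

        // 4. Preview and publish.
        engine.startPreview(new ZegoCanvas(view));
        engine.startPublishingStream(streamID);

        // 5. Enable object segmentation on the captured video.
        engine.enableVideoObjectSegmentation(true,
                ZegoObjectSegmentationType.ANY_BACKGROUND);
    }
}
```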

Playing Stream

After the user receives the ZIM event notification and learns that someone has gotten on the mic in the Room:

  1. Set the opaque property of the TextureView used for rendering to false, so that its background is transparent. For interface call details, please refer to "Make Special Settings for View" in Video Call - Object Segmentation.

    Notes

    If the settings for TextureView have been implemented during preview, this step can be omitted.

  2. Play the video stream that has implemented object segmentation from the other party, thereby achieving the visual effect of two users having a face-to-face dialogue in "the same space". For interface call details, please refer to "Start Playing Stream" in Video Call - Object Segmentation.

  3. To stop playing the stream, please refer to "Stop Audio and Video Call" in Video Call - Implementing Video Call for interface call details.
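The play-side steps above can be sketched as follows (Android Java; names should be verified against your ZEGO Express SDK version):

```java
// Sketch: play-side rendering of a stream published with an alpha
// channel (Android Java; signatures assumed — verify against your SDK).
import android.view.TextureView;
import im.zego.zegoexpress.ZegoExpressEngine;
import im.zego.zegoexpress.entity.ZegoCanvas;

public class SegmentedPlayer {
    public static void start(ZegoExpressEngine engine, TextureView view,
                             String streamID) {
        // The TextureView must be non-opaque so the decoded alpha lets the
        // shared background show through behind the remote subject.
        view.setOpaque(false);
        engine.startPlayingStream(streamID, new ZegoCanvas(view));
    }

    public static void stop(ZegoExpressEngine engine, String streamID) {
        engine.stopPlayingStream(streamID);
    }
}
```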

Leave Room

  1. Stop preview and publishing stream. For interface call details, please refer to "Stop Audio and Video Call" in Video Call - Implementing Video Call.
  2. Leave the ZIM Room. For interface call details, please refer to "Leave Room" in Instant Messaging - Room Management.
  3. Leave the RTC Room. For interface call details, please refer to "Stop Audio and Video Call" in Video Call - Implementing Video Call.
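The three tear-down steps above can be sketched as follows (Android Java; method names assumed, verify against your SDK versions):

```java
// Sketch: tear down in the order described above — stop preview and
// publishing, leave the ZIM Room, then leave the RTC Room
// (Android Java; verify names against your SDK versions).
import im.zego.zim.ZIM;
import im.zego.zegoexpress.ZegoExpressEngine;

public class RoomLeaver {
    public static void leave(ZIM zim, ZegoExpressEngine engine,
                             String roomID) {
        engine.stopPreview();           // 1. stop local preview
        engine.stopPublishingStream();  // 1. stop publishing
        zim.leaveRoom(roomID,           // 2. leave the ZIM Room
                (leftRoomID, errorInfo) -> { /* check errorInfo.code */ });
        engine.logoutRoom(roomID);      // 3. leave the RTC Room
    }
}
```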
