Stream Mixing

2025-05-07

Feature Introduction

Overview

Stream mixing is a technology that merges multiple audio and video streams into a single stream in the cloud. Developers only need to play the mixed stream to see the video and hear the audio of all members in the room, without managing each stream in the room separately.

This document describes how to initiate stream mixing from the client. To initiate stream mixing from your own server, please refer to Server API - Start Stream Mixing.

Stream Mixing Method Classification

ZEGO supports three stream mixing methods: manual stream mixing, automatic stream mixing, and fully automatic stream mixing. The differences between the three stream mixing methods are as follows:

Stream Mixing Method	Manual Stream Mixing	Automatic Stream Mixing	Fully Automatic Stream Mixing
Meaning	You control the stream mixing task and its content, including input streams and stream mixing layout. Supports both video and audio stream mixing.	You specify a room, and all audio streams in that room are mixed automatically. Supports audio stream mixing only.	All audio streams in every room are mixed automatically. Supports audio stream mixing only.
Application Scenarios	Merging multiple video and audio streams, such as broadcasting teacher and student screens in online classrooms, cross-room co-hosting in entertainment scenarios, or mixing specified streams in special scenarios. Also suitable for devices that cannot pull multiple streams simultaneously or have limited performance.	Merging all audio streams in a room into one stream, such as in voice chat rooms or chorus.	Merging all audio streams in a room into one stream without any development work, such as in voice chat rooms or chorus.
Advantages	Highly flexible; logic can be implemented according to business needs.	Reduces integration complexity; no need to manage the lifecycle of audio streams in the specified room.	Very low integration complexity; no need to manage the stream mixing tasks or audio stream lifecycles of any room.
Initiation Method	The user's client or server initiates the stream mixing task, and the user's client maintains the stream lifecycle.	The user's client initiates the stream mixing task, and the ZEGO server automatically maintains the stream lifecycle in the room (i.e., the input stream list).	Contact ZEGOCLOUD Technical Support to enable fully automatic stream mixing. The ZEGO server maintains the stream mixing task and the stream lifecycle in the room (i.e., the input stream list).

Advantages

Reduces development complexity. For example, when N hosts are co-hosting and stream mixing is used, the audience side does not need to pull N video streams at the same time, which saves the work of pulling N streams and laying them out.
Reduces device performance requirements, performance overhead, and network bandwidth usage. For example, without stream mixing, when many hosts are co-hosting, the audience side needs to pull N video streams, so the device hardware must support pulling N streams simultaneously.
Makes relaying to multiple CDNs simple: add output streams as needed when configuring stream mixing.
When the audience side needs to replay multi-host co-hosting videos, you only need to enable recording on the CDN.
When reviewing content, you only need to watch one screen instead of several screens simultaneously.

Sample Source Code Download

Please refer to Download Sample Source Code to get the source code.

For related source code, please check files in the "/ZegoExpressExample/Examples/Others/Mixer" directory.

Prerequisites

Before implementing stream mixing functionality, please ensure:

You have created a project in the ZEGOCLOUD Console and applied for a valid AppID and AppSign. For details, please refer to Console - Project Information.
You have integrated ZEGO Express SDK in the project and implemented basic audio/video publishing and playing functions. For details, please refer to Quick Start - Integration and Quick Start - Implementation Flow.

Warning

Stream mixing is not enabled by default. Before use, enable it in the ZEGOCLOUD Console (for the steps, please refer to "Stream Mixing" in Project Management - Service Configuration), or contact ZEGOCLOUD Technical Support to enable it.

Implementation Flow

The main flow of stream mixing is as follows:

Users in the room publish stream A and stream B to the ZEGOCLOUD server.
As configured, the ZEGOCLOUD server pushes the mixed stream, or streams A and B separately, to the CDN server over RTMP.
The playing end pulls the mixed stream from the CDN server as needed, or pulls streams A and B separately (RTMP, FLV, HLS, and other protocols are supported).

Manual Stream Mixing Usage Steps

Manual stream mixing allows custom control of stream mixing tasks and content, including input streams, stream mixing layout, etc., commonly used in multi-person interactive live streaming and cross-room co-hosting scenarios. Supports manual video and audio stream mixing.

Developers can implement manual stream mixing functionality through SDK or ZEGO server API. For server-related interfaces, please refer to Start Stream Mixing and Stop Stream Mixing.

The following describes how to use SDK to implement manual stream mixing.

1 Initialize and log in to the room

Please refer to "Create Engine" and "Login Room" in Quick Start - Implementation Flow to complete room login.

Warning

The prerequisite for stream mixing is that there must be existing streams in the room.
You can initiate mixing of all existing streams in the room, whether these streams are published by you or by other users.

2 Set stream mixing configuration

ZegoMixerTask is a stream mixing task configuration object defined in the SDK, which contains information such as input stream layout and output stream.

// Stream mixing task object
@interface ZegoMixerTask : NSObject

- (instancetype)init NS_UNAVAILABLE;

// Construct a stream mixing task object through TaskID
- (instancetype)initWithTaskID:(NSString *)taskID;

// Stream mixing task ID
@property (nonatomic, copy, readonly) NSString *taskID;

// Set the audio configuration of the stream mixing task object
- (void)setAudioConfig:(ZegoMixerAudioConfig *)audioConfig;

// Set the video configuration of the stream mixing task object
- (void)setVideoConfig:(ZegoMixerVideoConfig *)videoConfig;

// Set the input stream list of the stream mixing task object
- (void)setInputList:(NSArray<ZegoMixerInput *> *)inputList;

// Set the output list of the stream mixing task object
- (void)setOutputList:(NSArray<ZegoMixerOutput *> *)outputList;

// Set the watermark of the stream mixing task object
- (void)setWatermark:(ZegoWatermark *)watermark;

// Set the background image of the stream mixing task object
- (void)setBackgroundImageURL:(NSString *)backgroundImageURL;

@end

Create stream mixing task object

Create a new stream mixing task object through the constructor initWithTaskID, then call instance methods to set parameters such as input and output respectively.

ZegoMixerTask *task = [[ZegoMixerTask alloc] initWithTaskID:@"task-1"];

// Save this stream mixing task object
self.mixerTask = task;

(Optional) Set stream mixing video configuration

Developers can configure video parameters (frame rate, bitrate, resolution) of stream mixing tasks through the ZegoMixerVideoConfig class.

If all streams to be mixed are pure audio, no setting is required.

The default values for video frame rate, bitrate, and resolution are 15 fps, 600 kbps, and 360p respectively.

Note

By default, the maximum output frame rate of stream mixing is limited to 20 fps. If you need a higher output frame rate, please contact ZEGOCLOUD Technical Support for configuration.

ZegoMixerVideoConfig *videoConfig = [[ZegoMixerVideoConfig alloc] init];
[task setVideoConfig:videoConfig];

(Optional) Set stream mixing audio configuration

Developers can configure the audio bitrate, number of channels, and audio encoding of stream mixing tasks through the ZegoMixerAudioConfig class.

The default audio bitrate is 48 kbps.

[task setAudioConfig:[ZegoMixerAudioConfig defaultConfig]];

Set stream mixing input streams

According to the actual business scenario, define the list of input streams (ZegoMixerInput) and set the "layout" parameter of each input video stream to position its picture. The ZEGOCLOUD server mixes the input streams and outputs them as a single picture in the mixed stream.

Warning

By default, supports up to 9 input streams. If you need to input more streams, please contact ZEGOCLOUD Technical Support to confirm and configure.
When the "contentType" of all stream mixing input streams is set to "audio", the SDK does not process layout fields internally. At this time, there is no need to pay attention to the "layout" parameter.
When the "contentType" of all stream mixing input streams is set to "audio", the SDK internally sets the resolution to 1*1 by default (i.e., the stream mixing output is pure audio). If you want the stream mixing output to have video screens or background images, you need to set the "contentType" of at least one input stream to "video".
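
The two rules above (a default cap of 9 input streams, and an audio-only output when every input is audio) can be checked before a task is started. The following is a minimal sketch in plain C, which also compiles inside an Objective-C source file. The enum and function names are hypothetical and are not part of the ZEGO SDK; treating an empty list as invalid is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the content type of one input stream. */
typedef enum { INPUT_AUDIO, INPUT_VIDEO } InputContentType;

/* Default cap on input streams per task, as stated in the warning above. */
#define DEFAULT_MAX_INPUTS 9

/* True when the list is non-empty (an assumption of this sketch)
   and does not exceed the default cap. */
bool input_count_is_valid(size_t count) {
    return count > 0 && count <= DEFAULT_MAX_INPUTS;
}

/* True when at least one input is video, so the mixed output carries a
   picture; false means the mixed output is audio only. */
bool output_has_video(const InputContentType *types, size_t count) {
    for (size_t i = 0; i < count; i++) {
        if (types[i] == INPUT_VIDEO) {
            return true;
        }
    }
    return false;
}
```

If output_has_video returns false for your input list, the "layout" values are not used, so a background image or watermark will not appear unless at least one input is switched to video.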

The layout of the input stream takes the upper left corner of the output mixed stream screen as the coordinate system origin. Refer to the origin to set the layout of the input stream, that is, pass CGRect(left, top, width, height) to the "layout" parameter of the input stream. In addition, the layer hierarchy of the input stream is determined by the position of the input stream in the input stream list. The later the position in the list, the higher the layer hierarchy.

Rect parameter descriptions are as follows:

Parameter	Description
left	Corresponds to the x coordinate of the upper left corner of the input stream screen.
top	Corresponds to the y coordinate of the upper left corner of the input stream screen.
width	Corresponds to the width of the input stream screen.
height	Corresponds to the height of the input stream screen.

Warning

The above parameters may vary on different development platforms. Please refer to the documentation of each platform for details.

Suppose you start a stream mixing task with an output resolution of 375×667, and the input stream occupies a 150×150 area located 50 from the left and 300 from the top. In that case, pass CGRect(50, 300, 150, 150) to the "layout" parameter of the input stream.
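
As a sanity check on values like these, you may want to confirm that a layout rectangle lies inside the output picture before using it. The sketch below is plain C with a hypothetical LayoutRect struct that mirrors the (left, top, width, height) parameters. This document does not state how the server treats a rectangle that extends past the output picture, so the check is only a local precaution.

```c
#include <stdbool.h>

/* Hypothetical plain-C mirror of the (left, top, width, height) layout. */
typedef struct {
    int left;
    int top;
    int width;
    int height;
} LayoutRect;

/* True when the rectangle has a positive size and lies entirely inside
   an output picture of canvas_w x canvas_h, with the origin at the
   upper left corner. */
bool layout_fits_canvas(LayoutRect r, int canvas_w, int canvas_h) {
    return r.left >= 0 && r.top >= 0 &&
           r.width > 0 && r.height > 0 &&
           r.left + r.width <= canvas_w &&
           r.top + r.height <= canvas_h;
}
```

For the example above, the rectangle (50, 300, 150, 150) ends at x = 200 and y = 450, which is inside the 375×667 output.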

[Figure: position of this input stream in the final output mixed stream]

Developers can refer to the following sample code to implement common stream mixing layouts: two screens horizontally tiled, four screens horizontally and vertically tiled, one large screen filled and two small screens floating.

The following layout examples are explained using 360×640 resolution.

Stream mixing layout example 1: Two screens horizontally tiled


// Fill in the first input stream configuration. Each input stream needs to set Stream ID (the value in this parameter must be the actual ID of the input stream), input stream type, layout, etc.
CGRect firstRect = CGRectMake(0, 0, 180, 640);
ZegoMixerInput *firstInput = [[ZegoMixerInput alloc] initWithStreamID:@"streamID_1" contentType:ZegoMixerInputContentTypeVideo layout:firstRect];
firstInput.renderMode = ZegoMixRenderModeFill;

// Text watermark for input stream
firstInput.label.text = @"text watermark";
firstInput.label.left = 0;
firstInput.label.font.type = ZegoFontTypeSourceHanSans;
firstInput.label.top = 0;
firstInput.label.font.color = 123456;
firstInput.label.font.size = 24;
firstInput.label.font.transparency = 50;

// Fill in the second input stream configuration
CGRect secondRect = CGRectMake(180, 0, 180, 640);
ZegoMixerInput *secondInput = [[ZegoMixerInput alloc] initWithStreamID:@"streamID_2" contentType:ZegoMixerInputContentTypeVideo layout:secondRect];
secondInput.renderMode = ZegoMixRenderModeFill;

// Text watermark for input stream
secondInput.label.text = @"text watermark";
secondInput.label.left = 0;
secondInput.label.font.type = ZegoFontTypeSourceHanSans;
secondInput.label.top = 0;
secondInput.label.font.color = 123456;
secondInput.label.font.size = 24;
secondInput.label.font.transparency = 50;

// Set stream mixing input

NSArray<ZegoMixerInput *> *inputArray = @[firstInput, secondInput];

[task setInputList:inputArray];


Stream mixing layout example 2: Four screens horizontally and vertically tiled

ZegoMixerInput *firstInput = [[ZegoMixerInput alloc] initWithStreamID:@"streamID_1" contentType:ZegoMixerInputContentTypeVideo layout:CGRectMake(0, 0, 180, 320)];
firstInput.renderMode = ZegoMixRenderModeFill;

// Text watermark for input stream
firstInput.label.text = @"text watermark";
firstInput.label.left = 0;
firstInput.label.font.type = ZegoFontTypeSourceHanSans;
firstInput.label.top = 0;
firstInput.label.font.color = 123456;
firstInput.label.font.size = 24;
firstInput.label.font.transparency = 50;

ZegoMixerInput *secondInput = [[ZegoMixerInput alloc] initWithStreamID:@"streamID_2" contentType:ZegoMixerInputContentTypeVideo layout:CGRectMake(180, 0, 180, 320)];
secondInput.renderMode = ZegoMixRenderModeFill;

// Text watermark for input stream
secondInput.label.text = @"text watermark";
secondInput.label.left = 0;
secondInput.label.font.type = ZegoFontTypeSourceHanSans;
secondInput.label.top = 0;
secondInput.label.font.color = 123456;
secondInput.label.font.size = 24;
secondInput.label.font.transparency = 50;

ZegoMixerInput *thirdInput = [[ZegoMixerInput alloc] initWithStreamID:@"streamID_3" contentType:ZegoMixerInputContentTypeVideo layout:CGRectMake(0, 320, 180, 320)];
thirdInput.renderMode = ZegoMixRenderModeFill;

// Text watermark for input stream
thirdInput.label.text = @"text watermark";
thirdInput.label.left = 0;
thirdInput.label.font.type = ZegoFontTypeSourceHanSans;
thirdInput.label.top = 0;
thirdInput.label.font.color = 123456;
thirdInput.label.font.size = 24;
thirdInput.label.font.transparency = 50;

ZegoMixerInput *fourthInput = [[ZegoMixerInput alloc] initWithStreamID:@"streamID_4" contentType:ZegoMixerInputContentTypeVideo layout:CGRectMake(180, 320, 180, 320)];
fourthInput.renderMode = ZegoMixRenderModeFill;

// Text watermark for input stream
fourthInput.label.text = @"text watermark";
fourthInput.label.left = 0;
fourthInput.label.font.type = ZegoFontTypeSourceHanSans;
fourthInput.label.top = 0;
fourthInput.label.font.color = 123456;
fourthInput.label.font.size = 24;
fourthInput.label.font.transparency = 50;

// Set stream mixing input
NSArray<ZegoMixerInput *> *inputArray = @[firstInput, secondInput, thirdInput, fourthInput];
[task setInputList:inputArray];
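
The tiled layouts in examples 1 and 2 follow one pattern: divide the output picture into equal cells and place one input per cell. A minimal sketch of that arithmetic in plain C follows. The LayoutRect struct and the grid_cell function are hypothetical helpers and not part of the SDK; each result maps directly onto CGRectMake(left, top, width, height).

```c
/* Hypothetical plain-C mirror of the (left, top, width, height) layout. */
typedef struct {
    int left;
    int top;
    int width;
    int height;
} LayoutRect;

/* Rectangle of cell `index` (row-major, starting at 0) in a grid of
   `rows` x `cols` cells that tiles a canvas_w x canvas_h output picture.
   Integer division is used, so choose sizes that divide evenly. */
LayoutRect grid_cell(int canvas_w, int canvas_h, int rows, int cols, int index) {
    int cell_w = canvas_w / cols;
    int cell_h = canvas_h / rows;
    LayoutRect r;
    r.left = (index % cols) * cell_w;
    r.top = (index / cols) * cell_h;
    r.width = cell_w;
    r.height = cell_h;
    return r;
}
```

With a 360×640 output, grid_cell(360, 640, 1, 2, 1) gives (180, 0, 180, 640), the second rectangle of example 1, and grid_cell(360, 640, 2, 2, 3) gives (180, 320, 180, 320), the fourth rectangle of example 2.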

Stream mixing layout example 3: One large screen filled and two small screens floating

The layer hierarchy of input streams is determined by the position of the input stream in the input stream list. The later the position in the list, the higher the layer hierarchy. As shown in the following sample code, the layer hierarchy of the second input stream and the third input stream is higher than that of the first input stream. Therefore, the second and third streams float on top of the first stream's screen.

ZegoMixerInput *firstInput = [[ZegoMixerInput alloc] initWithStreamID:@"streamID_1" contentType:ZegoMixerInputContentTypeVideo layout:CGRectMake(0, 0, 360, 640)];


ZegoMixerInput *secondInput = [[ZegoMixerInput alloc] initWithStreamID:@"streamID_2" contentType:ZegoMixerInputContentTypeVideo layout:CGRectMake(230, 200, 110, 200)];

ZegoMixerInput *thirdInput = [[ZegoMixerInput alloc] initWithStreamID:@"streamID_3" contentType:ZegoMixerInputContentTypeVideo layout:CGRectMake(230, 420, 110, 200)];

// Set stream mixing input
NSArray<ZegoMixerInput *> *inputArray = @[firstInput, secondInput, thirdInput];
[task setInputList:inputArray];
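
The floating rectangles in example 3 can be derived rather than hard-coded: both small windows are 110×200, sit 20 from the right and bottom edges, and are separated by a gap of 20. The sketch below reproduces that arithmetic in plain C. The struct, the function name, and the idea of stacking windows upward from the bottom right corner are assumptions made for illustration.

```c
/* Hypothetical plain-C mirror of the (left, top, width, height) layout. */
typedef struct {
    int left;
    int top;
    int width;
    int height;
} LayoutRect;

/* Rectangle of floating window `i` (0 is the lowest) in a column stacked
   upward along the right edge of a canvas_w x canvas_h output picture.
   `margin` is the distance to the right and bottom edges, and `gap` is
   the vertical space between neighbouring windows. */
LayoutRect floating_window(int canvas_w, int canvas_h,
                           int win_w, int win_h,
                           int margin, int gap, int i) {
    LayoutRect r;
    r.left = canvas_w - margin - win_w;
    r.top = canvas_h - margin - win_h - i * (win_h + gap);
    r.width = win_w;
    r.height = win_h;
    return r;
}
```

For a 360×640 output, floating_window(360, 640, 110, 200, 20, 20, 0) gives (230, 420, 110, 200) and index 1 gives (230, 200, 110, 200), matching the third and second inputs of example 3. Remember to place these inputs after the full-screen input in the list so that they are drawn on top.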

Set stream mixing output

Up to 3 stream mixing outputs can be set. When the output target is a URL, only the RTMP format (rtmp://xxxxxxxx) is currently supported, and the same output address cannot be passed in twice.
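
These constraints can be checked locally before the output list is handed to the task. The following plain C sketch is illustrative only: the function names are hypothetical, a target is treated as a URL when it contains "://" (an assumption, since this document does not define how targets are classified), and an empty list is rejected as an additional assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Maximum number of outputs per task, as stated above. */
#define MAX_MIXER_OUTPUTS 3

/* Assumption of this sketch: a target containing "://" is a URL,
   anything else is a stream ID. */
static bool target_is_url(const char *target) {
    return strstr(target, "://") != NULL;
}

/* True when there are 1 to 3 targets, every URL target starts with
   "rtmp://", and no target appears twice. */
bool outputs_are_valid(const char *const *targets, size_t count) {
    if (count == 0 || count > MAX_MIXER_OUTPUTS) {
        return false;
    }
    for (size_t i = 0; i < count; i++) {
        if (target_is_url(targets[i]) &&
            strncmp(targets[i], "rtmp://", 7) != 0) {
            return false;
        }
        for (size_t j = i + 1; j < count; j++) {
            if (strcmp(targets[i], targets[j]) == 0) {
                return false;
            }
        }
    }
    return true;
}
```

A list that mixes a stream ID such as "output-stream" with one RTMP address passes, while a list with a repeated address, a non-RTMP URL, or four entries is rejected.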

The following code demonstrates outputting to the ZEGO server (stream ID is "output-stream"):

NSArray<ZegoMixerOutput *> *outputArray = @[[[ZegoMixerOutput alloc] initWithTarget:@"output-stream"]];
[task setOutputList:outputArray];

(Optional) Set stream mixing image watermark

If you need the URL of the watermark image, please contact ZEGOCLOUD Technical Support to obtain it.

The following code demonstrates setting a ZEGO image watermark in the upper left corner of the screen:

ZegoWatermark *watermark = [[ZegoWatermark alloc] initWithImageURL:@"preset-id://zegowp.png" layout:CGRectMake(0, 0, videoConfig.resolution.width/2, videoConfig.resolution.height/20)];
[task setWatermark:watermark];

(Optional) Set stream mixing background image

If you need the URL of the background image, please contact ZEGOCLOUD Technical Support to obtain it.

[task setBackgroundImageURL:@"preset-id://zegobg.png"];

(Optional) Set stream mixing sound level callback

Warning

In video scenarios, it is not recommended to turn on the sound level switch, otherwise the playing stream end pulling HLS protocol streams may encounter compatibility issues.

You can choose whether to enable stream mixing sound level callback notifications by setting the enableSoundLevel parameter. After enabling (value is "YES"), users can receive sound level information of each single stream through the onMixerSoundLevelUpdate callback when pulling mixed streams.

[task enableSoundLevel:YES];

(Optional) Set advanced configuration

Advanced configuration applies to some customization needs, for example: configuring video encoding format.

If you need to know the specific supported configuration item information, please contact ZEGOCLOUD Technical Support.

Note

Normal scenarios do not need to set advanced configuration.

// Specify the mixed stream output video format as vp8, which takes effect only when using specific publishing protocols.
NSDictionary *config = @{@"video_encode": @"vp8"};
[task setAdvancedConfig: config];

// If the mixed stream output video format is set to vp8, please synchronously set the audio encoding format to LOW3 for the setting to take effect.
ZegoMixerAudioConfig *audioConfig = [ZegoMixerAudioConfig defaultConfig];
audioConfig.codecID = ZegoAudioCodecIDLow3;
[task setAudioConfig:audioConfig];

3 Start stream mixing task

After configuring the ZegoMixerTask object, call the startMixerTask interface to start the stream mixing task, and handle a failure to start the task in the callback block.

Warning

If you need to play mixed stream CDN resources on the Web and you use CDN recording, choose AAC-LC as the audio encoding. Some browsers (such as Google Chrome and Microsoft Edge) do not support the HE-AAC audio encoding format, so files recorded with it cannot be played.

[self.engine startMixerTask:task callback:^(ZegoMixerStartResult * _Nonnull result) {
    if (result.errorCode == 0) {
        NSLog(@"Start mixer task success");
    } else {
        NSLog(@"Start mixer task fail");
    }
}];

4 Update stream mixing task configuration

When the stream mixing information changes, for example when input streams are added or removed or the video output bitrate is adjusted, modify the parameters of the stream mixing task object and call the startMixerTask interface again to update the configuration.

Warning

When updating stream mixing task configuration, "taskID" cannot be changed.

The following code demonstrates adding an input stream during the stream mixing task, with top, middle, and bottom layout:

// Split the output picture into three equal bands: top, middle, and bottom
CGFloat bandHeight = videoConfig.resolution.height / 3;

CGRect firstRect = CGRectMake(0, 0, videoConfig.resolution.width, bandHeight);
ZegoMixerInput *firstInput = [[ZegoMixerInput alloc] initWithStreamID:@"stream-1" contentType:ZegoMixerInputContentTypeVideo layout:firstRect];

CGRect secondRect = CGRectMake(0, bandHeight, videoConfig.resolution.width, bandHeight);
ZegoMixerInput *secondInput = [[ZegoMixerInput alloc] initWithStreamID:@"stream-2" contentType:ZegoMixerInputContentTypeVideo layout:secondRect];

CGRect thirdRect = CGRectMake(0, bandHeight * 2, videoConfig.resolution.width, bandHeight);
ZegoMixerInput *thirdInput = [[ZegoMixerInput alloc] initWithStreamID:@"stream-3" contentType:ZegoMixerInputContentTypeVideo layout:thirdRect];

NSArray<ZegoMixerInput *> *inputArray = @[firstInput, secondInput, thirdInput];

// Reset the input stream list of the previously saved stream mixing object
[self.mixerTask setInputList:inputArray];

// Call the start stream mixing task interface again to update the stream mixing configuration
[self.engine startMixerTask:self.mixerTask callback:^(ZegoMixerStartResult * _Nonnull result) {
    if (result.errorCode == 0) {
        NSLog(@"Start mixer task success");
    } else {
        NSLog(@"Start mixer task fail");
    }
}];
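
In a top, middle, and bottom layout, each band takes one third of the output height and starts one band height below the previous one. One pitfall is worth spelling out: in C and Objective-C, the expression (2/3) uses integer division and evaluates to 0, so multiplying a height by it yields 0. The plain C sketch below computes the bands with floating-point division instead. The struct and function names are hypothetical.

```c
/* Hypothetical floating-point mirror of the (left, top, width, height)
   layout, comparable to a CGRect. */
typedef struct {
    double left;
    double top;
    double width;
    double height;
} LayoutRectF;

/* Rectangle of band `i` (0 is the top band) when a canvas_w x canvas_h
   output picture is split into `n` equal full-width bands stacked
   vertically. */
LayoutRectF vertical_band(double canvas_w, double canvas_h, int n, int i) {
    /* canvas_h is a double, so this is floating-point division. */
    double band_h = canvas_h / n;
    LayoutRectF r;
    r.left = 0.0;
    r.top = band_h * i;
    r.width = canvas_w;
    r.height = band_h;
    return r;
}
```

For a 360×600 output split into three bands, the middle band is (0, 200, 360, 200) and the bottom band is (0, 400, 360, 200).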

5 Stop stream mixing

// Pass in the taskID that is currently mixing to stop this stream mixing task
[self.engine stopMixerTask:self.mixerTask.taskID];

Automatic Stream Mixing Usage Steps

1 Initialize and log in to the room

Please refer to "Create Engine" and "Login Room" in Quick Start - Implementation Flow to complete initialization and room login.

Warning

The prerequisite for automatic stream mixing is that the target room exists.
The user who initiates automatic stream mixing can mix streams published by other users in the room (only audio streams can be mixed), without having to log in to the room or publish a stream in it.

2 Set stream mixing configuration

ZegoAutoMixerTask is an automatic stream mixing task configuration object defined in the SDK. By configuring this object, you can customize automatic stream mixing tasks.

Create automatic stream mixing task object

Create a new automatic stream mixing task object, then set input, output and other parameters respectively.

taskID: the automatic stream mixing task ID. Only one automatic stream mixing task ID can exist in a room, so the task ID must be unique. It is recommended to associate the task ID with the room ID; you can use the room ID directly as the task ID.
roomID: the ID of the room that needs automatic stream mixing. If the room does not exist, automatic stream mixing cannot be performed.

ZegoAutoMixerTask *task = [[ZegoAutoMixerTask alloc] init];

task.taskID = @"taskID1";
task.roomID = @"roomID1";

(Optional) Set automatic stream mixing audio configuration

Set automatic stream mixing audio related configurations through ZegoMixerAudioConfig, mainly including audio bitrate, number of channels, encoding ID, and multi-audio stream mixing mode.

ZegoMixerAudioConfig *audioConfig = [ZegoMixerAudioConfig defaultConfig];

// Audio bitrate, unit is kbps, default is 48 kbps, cannot be modified after starting the stream mixing task
audioConfig.bitrate = 48;

// Audio channel, default is ZegoAudioChannelMono
audioConfig.channel = ZegoAudioChannelMono;

// Encoding ID, default is ZegoAudioCodecIDDefault
audioConfig.codecID = ZegoAudioCodecIDNormal;

// Multi-audio stream mixing mode, default is ZegoAudioMixModeRaw
audioConfig.mixMode = ZegoAudioMixModeRaw;

[task setAudioConfig:audioConfig];

Through the channel parameter, you can modify the audio channel. Currently the following audio channels are supported:

Enumeration	Description	Applicable Scenarios
ZegoAudioChannelUnknown	Unknown.	-
ZegoAudioChannelMono	Mono.	Scenarios with only mono.
ZegoAudioChannelStereo	Stereo.	Scenarios with stereo.

Through the codecID parameter, you can modify the encoding ID. Currently the following encoding IDs are supported:

Enumeration	Description	Applicable Scenarios
ZegoAudioCodecIDDefault	Default value.	Determined by the [scenario] when calling createEngineWithProfile.
ZegoAudioCodecIDNormal	Bitrate range 10 kbps ~ 128 kbps; supports stereo; latency around 500ms. Requires server transcoding when interoperating with Web SDK; does not require server cloud transcoding when relaying to CDN.	Can be used for RTC and CDN publishing.
ZegoAudioCodecIDNormal2	Good compatibility; bitrate range 16 kbps ~ 192 kbps; supports stereo; latency around 350 ms; at the same (lower) bitrate, audio quality is weaker than [Normal]. Requires server transcoding when interoperating with Web SDK; does not require server cloud transcoding when relaying to CDN.	Can be used for RTC and CDN publishing.
ZegoAudioCodecIDNormal3	Not recommended.	Can only be used for RTC publishing.
ZegoAudioCodecIDLow	Not recommended.	Can only be used for RTC publishing.
ZegoAudioCodecIDLow2	Not recommended; maximum bitrate is 16 kbps.	Can only be used for RTC publishing.
ZegoAudioCodecIDLow3	Bitrate range 6 kbps ~ 192 kbps; supports stereo; latency around 200 ms; at the same (lower) bitrate, audio quality is significantly better than [Normal] and [Normal2]; lower CPU overhead. Does not require server cloud transcoding when interoperating with Web SDK; requires server transcoding when relaying to CDN.	Only for RTC publishing.
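
Because the bitrate cannot be modified after the task starts, it may help to check the chosen value against the ranges listed in the table before starting the task. The following is a minimal sketch in plain C (which also compiles in an Objective-C file). The enum and function names are hypothetical stand-ins and are not part of the SDK; only the three codecs with a full range in the table are covered.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical local stand-ins for the codec IDs in the table; not SDK types. */
typedef enum {
    CODEC_NORMAL,   /* table row ZegoAudioCodecIDNormal:  10 ~ 128 kbps */
    CODEC_NORMAL2,  /* table row ZegoAudioCodecIDNormal2: 16 ~ 192 kbps */
    CODEC_LOW3      /* table row ZegoAudioCodecIDLow3:     6 ~ 192 kbps */
} CodecChoice;

/* Returns true when bitrate_kbps lies inside the range listed for the codec. */
bool bitrate_in_range(CodecChoice codec, int bitrate_kbps)
{
    /* A non-positive bitrate is treated as a programming error. */
    assert(bitrate_kbps > 0);

    int min_kbps;
    int max_kbps;
    switch (codec) {
    case CODEC_NORMAL:  min_kbps = 10; max_kbps = 128; break;
    case CODEC_NORMAL2: min_kbps = 16; max_kbps = 192; break;
    case CODEC_LOW3:    min_kbps = 6;  max_kbps = 192; break;
    default:            return false;  /* codec without a listed range */
    }
    return bitrate_kbps >= min_kbps && bitrate_kbps <= max_kbps;
}
```

With these ranges, the default of 48 kbps is accepted for all three codecs, while 192 kbps is rejected for the [Normal] row.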

Through the mixMode parameter, you can modify the multi-audio stream mixing mode. Currently the following multi-audio stream mixing modes are supported:

Enumeration	Description	Applicable Scenarios
ZegoAudioMixModeRaw	Default mode, no special behavior.	Scenarios with no special audio requirements.
ZegoAudioMixModeFocused	Audio focus mode, can highlight the sound of a certain stream among multiple audio streams.	Scenarios that need to highlight the sound of a certain stream.

Set automatic stream mixing output list

Set automatic stream mixing output list through ZegoMixerOutput. Users can pull mixed streams from the output targets in the list.

// Stream mixing output target: a URL or a stream ID
NSArray<ZegoMixerOutput *> *outputArray = @[[[ZegoMixerOutput alloc] initWithTarget:@"output-stream"]];
[task setOutputList:outputArray];

(Optional) Set automatic stream mixing sound level callback

Warning

In video scenarios, enabling the sound level switch is not recommended; otherwise, players that pull the stream over HLS may encounter compatibility issues.

Use the enableSoundLevel parameter to choose whether to enable sound level callback notifications for automatic stream mixing. When it is enabled (set to "YES"), users pulling the mixed stream receive the sound level of each individual stream through the onAutoMixerSoundLevelUpdate callback.

task.enableSoundLevel = YES;

3 Start automatic stream mixing task

After configuring the ZegoAutoMixerTask object, call the startAutoMixerTask interface to start the automatic stream mixing task. The result is returned in the ZegoMixerStartCallback callback.

[[ZegoExpressEngine sharedEngine] startAutoMixerTask:task callback:^(int errorCode, NSDictionary * _Nullable extendedData) {
    if (errorCode == 0) {
        // The automatic stream mixing task started successfully
    }
}];

4 Stop automatic stream mixing

Call the stopAutoMixerTask interface to stop automatic stream mixing.

Warning

Before starting the next automatic stream mixing task in the same room, call the stopAutoMixerTask interface to end the previous one. Otherwise, the audience may still be pulling the output stream of the previous task after a host has already started a new task with other hosts. If the current task is not stopped actively, it ends automatically after the room is closed.

// Pass in the previously created stream mixing task object
[[ZegoExpressEngine sharedEngine] stopAutoMixerTask:self.autoMixerTask callback:^(int errorCode) {
    if (errorCode == 0) {
        // The automatic stream mixing task stopped successfully
    }
}];

Fully Automatic Stream Mixing Usage Steps

Implement automatic audio stream mixing for each room through ZEGO server configuration. For details, please contact ZEGOCLOUD Technical Support.

FAQ

Can the mixed stream be pushed to a third-party CDN? How to relay to multiple CDNs?

If you need to push the mixed stream to a third-party CDN, set the "target" parameter of ZegoMixerOutput to the CDN's URL.

The URL must be an RTMP address in the form "rtmp://xxxxxxxx".

To relay to multiple CDNs, create one ZegoMixerOutput object per CDN and add them all to the "outputList" of ZegoMixerTask.

How to set the layout of each stream in stream mixing?

Usage example of the "layout" parameter of ZegoMixerInput:
- Assume that the upper-left corner of a stream is at (50, 300) and the lower-right corner is at (200, 450), that is, the "layout" parameter is "CGRectMake(50, 300, 200, 450);".
- Assume that the "resolution" in the "videoConfig" parameter of ZegoMixerTask is "CGSizeMake(375, 667)".
Then, in the 375 × 667 output picture, this stream occupies the region from (50, 300) to (200, 450).

When the aspect ratio of the "CGRect" layout of the stream mixing input object "ZegoMixerInput" does not match the resolution of the stream itself, how is the picture cropped?

The SDK scales the picture proportionally. Assume an input stream has a resolution of "720 × 1280", that is, a ratio of "9:16", and the "layout" parameter of this stream's ZegoMixerInput is "CGRectMake(0, 0, 100, 100);", that is, a ratio of "1:1". The output then shows the middle part of the stream, and the upper and lower parts are cropped off.
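
The cropping arithmetic can be sketched as follows. This is an illustration in plain C (which also compiles in an Objective-C file), not SDK code: it assumes the mixer scales the input proportionally until it covers the layout rectangle and keeps the centered part, as the answer above describes. The type CropRect and the function visible_region are hypothetical names introduced for this example.

```c
#include <assert.h>

/* Part of the source picture that stays visible, in source pixels. */
typedef struct {
    int x;       /* left edge of the visible region */
    int y;       /* top edge of the visible region */
    int width;
    int height;
} CropRect;

/*
 * Assumption: proportional scale-to-cover followed by a centered crop.
 * src_w x src_h is the stream resolution, layout_w x layout_h the layout size.
 */
CropRect visible_region(int src_w, int src_h, int layout_w, int layout_h)
{
    assert(src_w > 0 && src_h > 0 && layout_w > 0 && layout_h > 0);

    CropRect r = {0, 0, src_w, src_h};

    /* Compare the two aspect ratios by cross-multiplying (integers only). */
    long long src_cross = (long long)src_w * layout_h;
    long long dst_cross = (long long)src_h * layout_w;

    if (src_cross > dst_cross) {
        /* Source is wider than the layout: left and right are cropped. */
        r.width = (int)(dst_cross / layout_h);
        r.x = (src_w - r.width) / 2;
    } else if (src_cross < dst_cross) {
        /* Source is taller than the layout: top and bottom are cropped. */
        r.height = (int)(src_cross / layout_w);
        r.y = (src_h - r.height) / 2;
    }
    /* Equal ratios: nothing is cropped. */
    return r;
}
```

For the 720 × 1280 stream and the 100 × 100 layout above, visible_region(720, 1280, 100, 100) yields a 720 × 720 region starting at y = 280, so 280 source pixels are cropped from the top and 280 from the bottom.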

Co-hosting hosts each want their own audience to see their video in the large window of the mixed stream layout. How to mix streams?

Each host sets its own layout and initiates its own stream mixing task.

For example, host A gives its own published stream A a larger layout (width and height) than host B's stream B, which it pulls, and then initiates a stream mixing task that outputs the stream "A_Mix". Host B gives its own published stream B a larger layout than host A's stream A, and then initiates a stream mixing task that outputs the stream "B_Mix".

That is, a total of two stream mixing tasks need to be initiated.

What is the difference between the two ways of stream mixing: "start mixing immediately after the single host starts live streaming" and "start mixing only when the second host joins co-hosting"? What are the pros and cons?

Starting stream mixing as soon as the single host goes live is simple to implement, but mixing a single stream adds some CDN cost.

Alternatively, only publish the stream while a single host is live, and start stream mixing when the second host joins co-hosting. This saves cost but is more complex to implement: the audience side first pulls the single-host stream, and once the hosts start co-hosting and stream mixing is enabled, it must stop pulling the single-host stream and switch to the mixed stream. Mixing from the beginning does not require this switch.

Does stream mixing support circular or square screens?

Circular pictures are not supported. A square picture can be achieved through layout.

When publishing an audio-only mixed stream with a background image set, the background image is not displayed normally. How to handle this?

In this case, set the width and height of the output layout correctly according to your business needs, and contact ZEGOCLOUD Technical Support to configure and enable black frame filling.
