Video Encoding and Decoding

Feature Overview

When developers publish and play video streams, they can configure detailed encoding and decoding settings, including enabling layered encoding, using hardware encoding/decoding, and setting encoding methods.

Layered Video Encoding

Layered video encoding divides the bitstream into a base layer and an enhancement layer, which gives users a better experience under different network conditions. The base layer ensures basic video quality, and the enhancement layer supplements it. Users with good network conditions can fetch both layers for higher quality, while users with poor network conditions can fetch only the base layer to maintain basic video quality.

Layered video encoding is recommended in co-hosting or stream mixing scenarios when you need to:

Display video streams of different quality on different terminals.
Maintain smooth co-hosting in poor network environments.
Adapt the fetched video quality to network conditions.

For the advantages and disadvantages of layered video encoding, see "What are the pros and cons of layered video encoding?" in the FAQ.

Hardware Encoding and Decoding

Developers can choose to enable hardware encoding and hardware decoding. When hardware encoding/decoding is enabled, the GPU will be used for encoding and decoding, reducing CPU usage. If certain devices experience severe heating when publishing or playing high-resolution audio and video streams, you can enable hardware encoding/decoding.

Video Encoding Method

Developers can set the video encoding method so that all platforms use the same codec, which enables multi-platform interoperability.

Usage scenarios:

Generally, the default encoding can be used.
If you need to reduce the bitrate at the same resolution and frame rate, you can use H.265.
If you need to interoperate with mini programs, you need to use H.264.
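
The scenarios above can be read as a small decision rule. The following is a minimal sketch, not SDK code: `CodecChoice` and `chooseCodec` are hypothetical names introduced here for illustration, and the priority order (mini program interoperability first, because it is a hard requirement) is an assumption.

```cpp
#include <cassert>

// Hypothetical local enum that mirrors the encoding methods named in this
// document. It is NOT the SDK's own type; map it to the SDK codec ID yourself.
enum class CodecChoice { Default, Svc, Vp8, H265 };

// Applies the usage scenarios in an assumed priority order:
// 1. Mini program interoperability requires H.264, which is the default.
// 2. Otherwise, H.265 is chosen when a lower bitrate is needed at the same
//    resolution and frame rate.
// 3. Otherwise, the default encoding is kept.
inline CodecChoice chooseCodec(bool needsMiniProgramInterop, bool needsLowerBitrate) {
    if (needsMiniProgramInterop) return CodecChoice::Default;
    if (needsLowerBitrate) return CodecChoice::H265;
    return CodecChoice::Default;
}

// Usage example: interoperability wins over the bitrate preference.
inline void chooseCodecExample() {
    assert(chooseCodec(true, true) == CodecChoice::Default);
    assert(chooseCodec(false, true) == CodecChoice::H265);
}
```

In a real project the returned value would be translated to the matching "codecID" before calling setVideoConfig.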

Download Example Source Code

Please refer to Download Example Source Code to get the source code.

For related source code, please check the files in the "/ZegoExpressExample/Examples/AdvancedVideoProcessing/EncodingAndDecoding" directory.

Prerequisites

Before implementing video encoding and decoding features, please ensure:

You have integrated ZEGO Express SDK in your project and implemented basic real-time audio and video functionality. For details, please refer to Quick Start - Integration and Quick Start - Implementation Flow.
You have created a project in the ZEGOCLOUD Console and applied for a valid AppID and AppSign. For details, please refer to Console - Project Management.

Implementation Steps

Layered Video Encoding

Using layered video encoding requires the following two steps:

Enable layered video encoding by specifying a specific encoder before publishing the stream.
Specify the layered video to fetch when playing the stream.

Enable Layered Video Encoding

Call setVideoConfig before calling startPublishingStream and set the "codecID" parameter of the ZegoVideoConfig class to enable or disable layered video encoding.

Setting "codecID" to "ZEGO_VIDEO_CODEC_ID_SVC" enables this feature.
Setting "codecID" to "ZEGO_VIDEO_CODEC_ID_DEFAULT", "ZEGO_VIDEO_CODEC_ID_VP8", or "ZEGO_VIDEO_CODEC_ID_H265" disables this feature.

ZegoVideoConfig videoConfig;
videoConfig.codecID = ZEGO_VIDEO_CODEC_ID_SVC;
engine->setVideoConfig(videoConfig);

std::string streamID = "MultiLayer-1";
engine->startPublishingStream(streamID);

Specify the Layered Video to Fetch

After the publishing side enables layered video encoding, the playing side can call setPlayStreamVideoType to choose which layer to fetch. By default, the playing side fetches the appropriate video layer based on network conditions, for example only the base layer on a weak network. Developers can also pass a specific value to fetch a specific video layer. This interface can be called before or after playing the stream.

The following three video stream types are supported:

Enum Value	Description
ZEGO_VIDEO_STREAM_TYPE_DEFAULT	Select the layer based on network status
ZEGO_VIDEO_STREAM_TYPE_SMALL	Fetch the base layer (small resolution)
ZEGO_VIDEO_STREAM_TYPE_BIG	Fetch the enhancement layer (large resolution)

Taking fetching the enhancement layer as an example:

engine->setPlayStreamVideoType(playStreamID, ZEGO_VIDEO_STREAM_TYPE_BIG);
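
When different terminals show the same stream at different sizes, an application may want to pick the layer itself rather than rely on the default. The sketch below is one possible rule, not SDK behavior: `StreamType` and `pickStreamTypeForView` are hypothetical names, and the rule assumes the base layer is half the encoding width, as described in the FAQ of this document.

```cpp
#include <cassert>

// Hypothetical stand-in for the three video stream types listed above.
enum class StreamType { Default, Small, Big };

// Picks a layer for a view that is viewWidth pixels wide.
// encodeWidth is the publisher's encoding width; the base layer is assumed
// to be half of it. A viewWidth of 0 means "unknown" and leaves the choice
// to the network-based default.
inline StreamType pickStreamTypeForView(int viewWidth, int encodeWidth) {
    assert(encodeWidth > 0 && viewWidth >= 0);
    if (viewWidth == 0) return StreamType::Default;
    const int baseWidth = encodeWidth / 2;
    // The base layer already covers views that are no wider than it.
    return viewWidth <= baseWidth ? StreamType::Small : StreamType::Big;
}

// Usage example: a 320-pixel thumbnail of an 800-pixel-wide stream only
// needs the base layer, while a 720-pixel view needs the enhancement layer.
inline void pickStreamTypeExample() {
    assert(pickStreamTypeForView(320, 800) == StreamType::Small);
    assert(pickStreamTypeForView(720, 800) == StreamType::Big);
}
```

The result would then be mapped to the corresponding enum value and passed to setPlayStreamVideoType.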

Hardware Encoding and Decoding

Because a small number of devices have poor support for hardware encoding/decoding, the SDK uses software encoding and software decoding by default. Developers who need hardware encoding or decoding can enable it as described in this section.

Enable Hardware Encoding

If developers need to enable hardware encoding, they can call enableHardwareEncoder.

Warning

This feature must be set before publishing the stream to take effect. If it is set while a stream is being published, it takes effect only after publishing is stopped and started again.

// Enable hardware encoding
engine->enableHardwareEncoder(true);

Enable Hardware Decoding

If developers need to enable hardware decoding, they can call enableHardwareDecoder.

Warning

This feature must be set before playing the stream to take effect. If it is set while a stream is being played, it takes effect only after playing is stopped and started again.

// Enable hardware decoding
engine->enableHardwareDecoder(true);
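
The ordering rule in the two warnings above can be modeled as a setting that is read only when a stream starts. The class below is a hypothetical illustration of that rule and is not part of the SDK; `DeferredSwitch` and its members are names introduced here.

```cpp
#include <cassert>

// Hypothetical model (not an SDK class) of a switch whose value is latched
// when a stream starts, as the warnings above describe.
class DeferredSwitch {
public:
    // Records the requested value. A stream that is already running keeps
    // the value it started with.
    void request(bool enabled) { requested_ = enabled; }

    // Starting a stream latches the most recent request.
    void start() { active_ = requested_; running_ = true; }

    // Stopping a stream clears the running state but keeps the request.
    void stop() { running_ = false; }

    // True only when a stream is running with the switch enabled.
    bool inEffect() const { return running_ && active_; }

private:
    bool requested_ = false;  // software codec by default, as stated above
    bool active_ = false;
    bool running_ = false;
};

// Usage example: a request made mid-stream applies after a restart.
inline void deferredSwitchExample() {
    DeferredSwitch hardwareCodec;
    hardwareCodec.start();
    hardwareCodec.request(true);
    assert(!hardwareCodec.inEffect());  // running stream is unchanged
    hardwareCodec.stop();
    hardwareCodec.start();
    assert(hardwareCodec.inEffect());   // restart picks up the request
}
```

In practice this means calling enableHardwareEncoder or enableHardwareDecoder before the first publish or play call is the simplest way to avoid a restart.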

Set Video Encoding Method

Call setVideoConfig before calling startPublishingStream and set the "codecID" parameter of the ZegoVideoConfig class to choose the video encoding method. The following encoding methods are currently supported:

Enum Value	Encoding Method	Usage Scenario
ZEGO_VIDEO_CODEC_ID_DEFAULT	Default encoding (H.264)	H.264 is a widely used high-precision video recording, compression, and publishing format with good compatibility.
ZEGO_VIDEO_CODEC_ID_SVC	Layered encoding (H.264 SVC)	Scenarios that require layered encoding.
ZEGO_VIDEO_CODEC_ID_VP8	VP8	Commonly used for Web video.
ZEGO_VIDEO_CODEC_ID_H265	H.265	Has better compression ratio, but compatibility needs to be considered.

Taking setting the encoding method to H.265 as an example:

ZegoVideoConfig videoConfig;
videoConfig.codecID = ZEGO_VIDEO_CODEC_ID_H265;
engine->setVideoConfig(videoConfig);

std::string streamID = "MultiLayer-1";
engine->startPublishingStream(streamID);

API Reference List

Method	Description
setVideoConfig	Set video parameters
enableHardwareEncoder	Enable/Disable hardware encoding
enableHardwareDecoder	Enable/Disable hardware decoding

FAQ

Are there differences in bitrate, resolution, and other parameters between fetching the base layer and enhancement layer in layered video encoding?

Yes. The base layer has half the width and half the height of the enhancement layer. The bitrate when fetching the base layer is approximately 25% of the bitrate when fetching the enhancement layer. Other parameters are the same.

For example: If the user sets the encoding resolution to "800 × 600", the enhancement layer resolution will be "800 × 600", and the base layer resolution will be "400 × 300".
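
The ratios in this answer can be written out as a short calculation. This is a minimal sketch: `LayerParams` and `baseLayerFrom` are hypothetical names, the 1200 kbps figure is an illustrative value that does not come from the SDK, and the bitrate result is only approximate because the document gives the 25% ratio as an approximation.

```cpp
#include <cassert>

// Hypothetical container for the parameters discussed in this answer.
struct LayerParams {
    int width;
    int height;
    int bitrateKbps;
};

// Derives the base layer from the enhancement layer using the documented
// ratios: half the width, half the height, roughly a quarter of the bitrate.
inline LayerParams baseLayerFrom(const LayerParams& enhancement) {
    assert(enhancement.width > 0 && enhancement.height > 0);
    assert(enhancement.bitrateKbps >= 0);
    return LayerParams{enhancement.width / 2,
                       enhancement.height / 2,
                       enhancement.bitrateKbps / 4};
}

// Usage example: the 800 x 600 case from the text, with an assumed bitrate.
inline void baseLayerExample() {
    const LayerParams base = baseLayerFrom(LayerParams{800, 600, 1200});
    assert(base.width == 400);
    assert(base.height == 300);
    assert(base.bitrateKbps == 300);
}
```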

If relaying or directly publishing to CDN, and the audience fetches the stream from CDN, is layered encoding effective? What are the bitrate and resolution of the stream fetched from CDN?

Layered video encoding uses ZEGO's private protocol, so the playing side can fetch the different video layers only from the ZEGO server.
In the CDN relaying scenario, the stream published by the publishing side to the ZEGO server can use layered encoding, and layered encoded streams can also be fetched from the ZEGO server. However, the stream relayed by the ZEGO server to the CDN server cannot use layered encoding and will be a high-quality stream. The stream fetched from CDN has the same bitrate and resolution as the enhancement layer in layered encoding.
In the direct CDN publishing scenario, since it does not go through the ZEGO server, layered encoding is not effective. The resolution and bitrate of the stream fetched from CDN are consistent with the resolution and bitrate set by the publishing user.

What are the pros and cons of layered video encoding?

Pros:

Layered video encoding can generate different bitstreams or extract different bitstreams as needed. Using layered video encoding to encode once is more efficient than using ordinary encoding methods to encode multiple times.
Layered video encoding is more flexible in application.
Layered video encoding has stronger network adaptability.

Cons:

Slightly lower compression efficiency: Under the same conditions, the compression efficiency of layered video encoding is about 20% lower than that of ordinary encoding methods. That is, to achieve the same video quality as ordinary encoding methods, the bitrate of layered video encoding needs to be 20% higher than that of ordinary encoding methods. The more layers, the more the efficiency decreases. (Currently, the SDK only supports 1 base layer and 1 enhancement layer)
Lower encoding efficiency: Under the same conditions, layered video encoding has higher encoding computational complexity than ordinary encoding methods, so the encoding efficiency is about 10% lower than ordinary encoding methods.
No hardware encoding support: Layered video encoding does not support hardware encoding, so encoding places a heavier load on the CPU. Hardware decoding is supported.
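
The first and last drawbacks can be turned into quick planning checks. The sketch below uses the approximate 20% figure quoted above. Both function names are hypothetical, and the second function only restates the rule that layered encoding does not use hardware encoding; it does not query the SDK.

```cpp
#include <cassert>

// Bitrate (kbps) that layered encoding would need in order to match the
// quality of ordinary encoding at ordinaryKbps, using the ~20% overhead.
inline int layeredBitrateForSameQuality(int ordinaryKbps) {
    assert(ordinaryKbps >= 0);
    return ordinaryKbps * 120 / 100;
}

// Whether a hardware encoder request can apply to the published stream.
// Layered video encoding does not support hardware encoding.
inline bool hardwareEncodingApplies(bool layeredEncodingEnabled,
                                    bool hardwareEncoderRequested) {
    return hardwareEncoderRequested && !layeredEncodingEnabled;
}

// Usage example: a 1000 kbps ordinary stream needs about 1200 kbps when
// layered, and a hardware encoder request does not apply to it.
inline void layeredCostExample() {
    assert(layeredBitrateForSameQuality(1000) == 1200);
    assert(!hardwareEncodingApplies(true, true));
}
```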