Previously, we discussed how the SMOOTH Live streaming solution achieves manageable cost efficiency. This article continues with how it achieves a manageable user experience.
1. How to Allow Efficient Cost Management
The SMOOTH Live solution lets a platform assign network resources to labeled users and manage networks to optimize its cost structure, which keeps cost efficiency manageable.
Please refer to the previous article, “How to Achieve Cost-Efficiency In The SMOOTH Live Video Solution”, for more details.
If you want to know what the SMOOTH Live solution is about, please refer to the solution introduction article, titled “ZEGOCLOUD SMOOTH Live, A Live Streaming Solution with RTC quality and CDN pricing”.
2. How to Upgrade User Experiences
1) How to implement instant-opening for the first video frame
We use a few tactics to implement instant-opening.
a) The resource allocation tactic
A traditional CDN uses domain name resolution as its scheduling system to assign access servers. This is the conventional approach to nearby access and load balancing, but it cannot assign higher-quality resources to specific priority users.
ZEGOCLOUD built a proprietary scheduling system that supports user-level network resource scheduling. Its resource allocation strategy considers several dimensions, including whether the user and the server share the same telecommunication operator, the same city or province, the timing, and the user's priority level.
For example, suppose a platform wants to guarantee the best user experience for a superstar host's show at its annual ceremony. The platform can label the host and her fans. During the show, the host and the fans in her live streaming room are assigned the best access servers available at that time in their respective cities.
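To make the idea of user-level scheduling concrete, here is a minimal sketch of how a scheduler might rank access servers for a given user. It is not ZEGOCLOUD's actual scheduling logic: the `Server` and `User` types, the weights, the `quality` tier, and the server names are all hypothetical, and the timing dimension is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    operator: str
    city: str
    quality: int  # hypothetical tier: higher means better resources


@dataclass(frozen=True)
class User:
    operator: str
    city: str
    priority: int  # hypothetical label: 0 = regular, 1 = prioritized


def score(server: Server, user: User) -> int:
    """Hypothetical scoring: locality first, then quality tier by user label."""
    s = 0
    if server.operator == user.operator:
        s += 10  # same telecommunication operator
    if server.city == user.city:
        s += 5   # same city
    # Labeled users are steered toward higher tiers; regular users are
    # steered away so that premium capacity stays reserved.
    s += server.quality if user.priority > 0 else -server.quality
    return s


def pick_server(servers, user):
    """Return the highest-scoring access server for this user."""
    return max(servers, key=lambda srv: score(srv, user))


servers = [
    Server("bkk-standard", "OperatorA", "Bangkok", quality=1),
    Server("bkk-premium", "OperatorA", "Bangkok", quality=3),
    Server("cnx-premium", "OperatorB", "Chiang Mai", quality=3),
]
fan = User("OperatorA", "Bangkok", priority=1)
regular = User("OperatorA", "Bangkok", priority=0)

print(pick_server(servers, fan).name)
print(pick_server(servers, regular).name)
```

In this toy setup, two users in the same city and on the same operator end up on different servers purely because of their label, which is something DNS-based scheduling cannot express.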
b) 0-RTT to establish a connection
We adopt a 0-RTT tactic to establish connections, reducing the time to connect to less than 1 RTT. Normally, TCP needs a three-way handshake to establish a connection before any data can be sent. We instead use a private UDP-based protocol with 0-RTT connection setup: signaling travels in the same datagram as the payload, so data transmission starts right away without waiting for a handshake.
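The idea of carrying signaling together with the payload can be sketched as a single datagram that holds both. The wire layout below is purely illustrative and is not ZEGOCLOUD's private protocol: the header fields, the `MSG_CONNECT_WITH_DATA` constant, and both function names are assumptions made for this example.

```python
import struct

# Hypothetical layout (network byte order, no padding):
# 1-byte message type, 4-byte session id, 2-byte signal length,
# followed by the signal bytes and then the media payload.
HEADER = struct.Struct("!BIH")
MSG_CONNECT_WITH_DATA = 1


def build_first_datagram(session_id: int, signal: bytes, payload: bytes) -> bytes:
    """Pack connection signaling and the first payload into one datagram."""
    return HEADER.pack(MSG_CONNECT_WITH_DATA, session_id, len(signal)) + signal + payload


def parse_first_datagram(datagram: bytes):
    """Split a datagram back into (type, session id, signal, payload)."""
    msg_type, session_id, signal_len = HEADER.unpack_from(datagram)
    body = datagram[HEADER.size:]
    return msg_type, session_id, body[:signal_len], body[signal_len:]


datagram = build_first_datagram(42, b"play:room-1", b"\x00\x01\x02")
print(parse_first_datagram(datagram))
```

Because the receiver gets the request and the first data in the same packet, no round trip is spent on connection setup before useful bytes flow.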
c) 1-hop time back-sourcing
A traditional CDN back-sources stream content hop by hop in sequence: if the stream is not found on the access server, the server queries the scheduling center and looks for the stream on the next-hop server. This repeats over several hops until the source stream is found.
ZEGOCLOUD instead back-sources on every hop in parallel: once the stream is not found on the access server, the scheduling center notifies all the servers on the planned route, and each server starts back-sourcing at the same time. The route is ready for streaming within one hop time.
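A rough timing model shows why the parallel approach finishes in about one hop time. It assumes each hop needs a known time to connect to its upstream; the server names and millisecond figures are made up for illustration and are not measurements.

```python
def sequential_setup_ms(hop_ms):
    """Hop-by-hop back-sourcing: each hop starts only after the previous one."""
    return sum(hop_ms)


def parallel_setup_ms(hop_ms):
    """All hops start at once, so the route is ready when the slowest hop is."""
    return max(hop_ms)


# Hypothetical route from the access server back toward the source.
route_ms = {"access": 40, "regional": 55, "origin-relay": 50}

print(sequential_setup_ms(route_ms.values()))  # 145
print(parallel_setup_ms(route_ms.values()))    # 55
```

Under this model the sequential cost grows with the number of hops, while the parallel cost stays bounded by the slowest single hop.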
d) Adaptive playback algorithm
A traditional media player maintains a fixed playback buffer of 3-5 seconds and starts rendering only once the buffer is full. The disadvantage is obvious: latency is at least 3-5 seconds, and instant opening is impossible.
ZEGOCLOUD created an adaptive playback algorithm in which the data level in the playback buffer adjusts adaptively. Once the first key frame is ready, it is rendered right away, so users see the video picture instantly.
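The difference between the two start-up policies can be sketched with a toy frame timeline. The `Frame` type, the arrival times, and both function names are assumptions for illustration; a real player also has to handle decoding, audio sync, and packet loss.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    arrival_ms: int   # when the frame reaches the player
    duration_ms: int  # how much playback time the frame covers
    is_key: bool


def first_render_fixed_buffer(frames, buffer_ms):
    """Start rendering only after buffer_ms of media has been buffered."""
    buffered = 0
    for f in frames:
        buffered += f.duration_ms
        if buffered >= buffer_ms:
            return f.arrival_ms
    return None  # the buffer never filled


def first_render_on_keyframe(frames):
    """Start rendering as soon as the first key frame has arrived."""
    for f in frames:
        if f.is_key:
            return f.arrival_ms
    return None  # no key frame received yet


# Hypothetical stream: first frame arrives at 100 ms, then one every 50 ms.
frames = [Frame(100 + 50 * i, 50, is_key=(i == 0)) for i in range(100)]

print(first_render_fixed_buffer(frames, buffer_ms=3000))  # 3050
print(first_render_on_keyframe(frames))                   # 100
```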
With the instant-opening enhancement, according to data analytics, 99% of users can open and view the first video frame within a second; 85.07% of users can do it within 500 milliseconds, and 96.8% of users can do it within 1 second in regions with good networks like Thailand.
2) How to implement high definition for video frames
The quality of video frames is determined largely by bit rate and resolution. We will discuss three methods here.
a) Use H.265 to encode
H.265 is the successor to the currently prevalent H.264 video coding standard. Compared to H.264, it can reduce bit rate by 30%, saving transmission costs significantly. However, not all smartphones support H.265, so we maintain a whitelist and adaptive transcoders to ensure compatibility with smartphones that don’t support it.
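As a minimal sketch of the whitelist idea, the snippet below picks a codec per device and estimates the bit rate saving. It assumes the whitelist is keyed by device model, which the article does not specify; the model names, function names, and the 1000 kbps starting figure are hypothetical, while the 30% saving is the figure quoted above.

```python
# Hypothetical device models assumed to decode H.265 reliably.
H265_WHITELIST = {"model-a", "model-b"}


def choose_codec(device_model: str, whitelist=H265_WHITELIST) -> str:
    """Serve H.265 to whitelisted devices, otherwise fall back to H.264."""
    return "H.265" if device_model in whitelist else "H.264"


def h265_bitrate_kbps(h264_kbps: int, saving_percent: int = 30) -> int:
    """Estimate the H.265 bit rate for comparable quality (integer math)."""
    return h264_kbps * (100 - saving_percent) // 100


print(choose_codec("model-a"))   # H.265
print(choose_codec("model-z"))   # H.264
print(h265_bitrate_kbps(1000))   # 700
```

Viewers whose devices fall outside the whitelist would receive a transcoded H.264 stream instead, at the cost of the bandwidth saving.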
b) ROI (Regions Of Interest) encoding
ROI encoding is a newer technique that assigns more bit rate to the regions of interest in a video image. As a result, the regions of interest are rendered in high definition while the overall bit rate is reduced to save cost.
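One common way to express ROI encoding is a per-block quantization parameter (QP) map, where a lower QP means finer quantization and therefore more bits. The sketch below builds such a map; the function name, the block grid, and the delta values are assumptions for illustration, and how the map is handed to an encoder depends on the encoder in use.

```python
def roi_qp_map(rows, cols, roi, base_qp=32, roi_delta=-6, bg_delta=4):
    """Build a per-block QP map.

    roi is (top, left, bottom, right) in block units, with bottom and right
    exclusive. Blocks inside the ROI get a lower QP (more bits); background
    blocks get a higher QP (fewer bits).
    """
    top, left, bottom, right = roi
    qp_map = []
    for r in range(rows):
        row = []
        for c in range(cols):
            inside = top <= r < bottom and left <= c < right
            delta = roi_delta if inside else bg_delta
            # H.264 and H.265 use QP values from 0 to 51 for 8-bit video.
            row.append(max(0, min(51, base_qp + delta)))
        qp_map.append(row)
    return qp_map


# Hypothetical 4x6 block grid with the host's face in the middle.
for row in roi_qp_map(4, 6, roi=(1, 2, 3, 4)):
    print(row)
```

Raising the background QP is what lets the total bit rate drop even though the region of interest receives more bits.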
c) Super-resolution
It is an AI-based technology that works on the receiver to enhance video image quality. It can reduce image noise or eliminate color blocks to make the video image look clearer. Also, it can increase the resolution of the original video images, and supplement them with more pixel details to improve their quality.
3) How to implement smooth playback for video streaming
Three factors have an impact on smoothness: data transmission, frame rate, and playback tactic.
a) Data transmission
The MSDN and the UDP-based protocol jointly guarantee the quality of data transmission, which is the foundation of playback smoothness on the receiver side. This was discussed in detail previously, so we won’t repeat it here.
b) Frame rate
The standard playback frame rate is 24 FPS for movies, TV broadcasts, and even smartphones. In general, the higher the frame rate, the smoother the video looks to the human eye. A common playback frame rate for live streaming is 15 FPS, since most hosts don’t move much during shows.
ZEGOCLOUD created an AI-based enhancement that uses deep learning to generate supplementary video frames with reference to the preceding and following frames. The frame rate can be increased to around 30 FPS, which improves playback smoothness significantly without consuming extra bandwidth.
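To show where the extra frames come from, here is a deliberately simple stand-in: it inserts the pixel-wise average of each pair of neighboring frames. This is plain midpoint blending, not the deep learning model described above, and the frame representation (nested lists of pixel values) and function names are assumptions for this example.

```python
def midpoint_frame(prev, nxt):
    """Average two frames pixel by pixel (a crude stand-in for a learned model)."""
    return [[(a + b) // 2 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(prev, nxt)]


def interpolate(frames):
    """Insert one synthesized frame between every pair of received frames."""
    if not frames:
        return []
    out = []
    for prev, nxt in zip(frames, frames[1:]):
        out.append(prev)
        out.append(midpoint_frame(prev, nxt))
    out.append(frames[-1])
    return out


# Three tiny 1x2 "frames"; N received frames become 2N - 1 displayed frames.
received = [[[0, 0]], [[10, 20]], [[20, 40]]]
print(interpolate(received))
```

Since the inserted frames are generated on the receiver, a 15 FPS stream can be displayed at roughly 30 FPS with no additional data on the wire.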
c) Playback tactic
ZEGOCLOUD created an adaptive playback algorithm that maintains a dynamic playback buffer level. The size of the buffer is fixed, but a threshold called the water level marks how much data should be buffered, and it changes adaptively according to network conditions.
If the network condition is good, the water level is minimized to, say, 10 milliseconds. If the network condition worsens, the water level is raised to, say, 100 milliseconds.
The solution adopts a fast-and-slow playback tactic: if the actual data level is higher than the water level, the player plays slightly faster than normal to drain the buffer back to the water level. Likewise, if the data level is lower than the water level, the player plays slightly slower than normal to bring the data level back up to the water level.
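The water level and the fast-and-slow tactic can be sketched as two small functions. The 10 ms and 100 ms bounds are the example values from the text; everything else is an assumption for illustration, including the use of measured jitter as the network signal, the 5% speed adjustment, and the tolerance band.

```python
def target_water_level_ms(jitter_ms, low=10, high=100):
    """Hypothetical mapping: track measured jitter, clamped to [low, high]."""
    return max(low, min(high, jitter_ms))


def playback_rate(buffered_ms, water_level_ms, max_adjust=0.05, tolerance_ms=5):
    """Return a playback speed multiplier that nudges the buffer to the water level."""
    if buffered_ms > water_level_ms + tolerance_ms:
        return 1.0 + max_adjust  # too much data buffered: play slightly faster
    if buffered_ms < water_level_ms - tolerance_ms:
        return 1.0 - max_adjust  # too little data buffered: play slightly slower
    return 1.0                   # close enough: play at normal speed


good_network = target_water_level_ms(jitter_ms=3)    # clamps to 10
bad_network = target_water_level_ms(jitter_ms=250)   # clamps to 100

print(playback_rate(buffered_ms=80, water_level_ms=good_network))
print(playback_rate(buffered_ms=40, water_level_ms=bad_network))
print(playback_rate(buffered_ms=12, water_level_ms=good_network))
```

Keeping the speed adjustment small is what makes the correction graceful: viewers should not notice the change in tempo while the buffer drifts back to the target.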
3. Conclusion
SMOOTH Live represents the latest advancement in live streaming solutions. If you are still using a traditional CDN for live streaming and want to upgrade for better cost efficiency and user experience, please contact us to speak with our solution experts.