Image matting refers to separating an image into foreground and background. Usually, the foreground is the part that people pay most attention to, while the background can be ignored under most circumstances. Image matting demands fine detail along the edges of the subject, especially for hair or translucent tulle clothing.
Introduction to Image Matting Technology
Let’s dive into a more technical explanation of what this technology is. Image matting can be expressed by the following formula:
$$R_i = A_i F_i + (1 - A_i) B_i$$
$R_i$ represents the final result, $A_i$ represents the transparency mask required for matting, and $B_i$ represents the new background that replaces the old one. $F_i$ represents the foreground, which generally refers to people and related objects. In $A_i$, foreground positions have a value greater than 0, while background positions have a value of 0. Matting is essentially the task of obtaining a high-quality transparency mask (alpha map).
Most traditional algorithms generate the alpha map from a manually drawn trimap.
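The relationship described above, in which the transparency mask weights the foreground against the new background, can be sketched in NumPy. This is a minimal illustration that assumes the standard compositing form (mask times foreground plus one minus mask times background); the function name `composite` and the toy pixel values are hypothetical and not part of any ZEGOCLOUD API.

```python
import numpy as np

def composite(foreground, background, alpha):
    """Blend per pixel, assuming R = A * F + (1 - A) * B."""
    fg = np.asarray(foreground, dtype=np.float64)
    bg = np.asarray(background, dtype=np.float64)
    a = np.asarray(alpha, dtype=np.float64)
    if a.ndim == fg.ndim - 1:
        a = a[..., np.newaxis]  # let one alpha value cover all color channels
    return a * fg + (1.0 - a) * bg

# Toy 1x3 RGB image: opaque, half-transparent, and fully transparent pixels.
fg = np.full((1, 3, 3), 200.0)   # light-grey foreground
bg = np.zeros((1, 3, 3))         # black replacement background
alpha = np.array([[1.0, 0.5, 0.0]])
result = composite(fg, bg, alpha)
print(result[0, :, 0])           # values 200, 100 and 0 for the three pixels
```

The half-transparent pixel shows why a soft mask matters: fine structures such as hair get a blend of both layers instead of a hard cut.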
The rendering of Trimap
A trimap contains three different pixel values. A pixel value of 0 marks definite background; a pixel value of 1 marks definite foreground; a pixel value of 0.5 marks the undetermined area, where a pixel can be either foreground or background. Drawing a trimap requires specific professional knowledge, so not many people can do it. It also requires human interaction with the computer, so it cannot be done automatically in real time.
In the undetermined region, the matting algorithm decides whether each pixel belongs to the foreground or the background, using methods such as random walk, KNN, and closed-form matting.
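To make the three trimap values concrete, here is a small sketch that builds a trimap array from a rough binary mask by marking a band around the mask border as undetermined. Deriving the trimap this way is an assumption made for illustration only (the article describes trimaps as drawn by hand), and the helper names `dilate` and `make_trimap` are hypothetical.

```python
import numpy as np

def dilate(mask, radius):
    """Binary dilation with a square window, built from shifted copies."""
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros_like(mask, dtype=bool)
    h, w = mask.shape
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def make_trimap(mask, radius=1):
    """Return 1.0 for sure foreground, 0.0 for sure background, 0.5 otherwise."""
    mask = np.asarray(mask, dtype=bool)
    grown = dilate(mask, radius)       # mask expanded outward
    shrunk = ~dilate(~mask, radius)    # erosion: complement of the grown background
    trimap = np.full(mask.shape, 0.5)  # start with everything undetermined
    trimap[shrunk] = 1.0
    trimap[~grown] = 0.0
    return trimap

# A 5x5 square of foreground inside a 9x9 image.
mask = np.zeros((9, 9), dtype=bool)
mask[2:7, 2:7] = True
trimap = make_trimap(mask, radius=1)
```

In this toy case the 0.5 band is two pixels wide around the edge of the square, which is exactly the region a matting algorithm would then have to resolve.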
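As a rough intuition for how the undetermined region can be resolved, the sketch below assigns each unknown pixel the average label of its nearest known pixels in a combined color and position feature space. This is a heavily simplified nearest-neighbor vote, not the published KNN matting algorithm (which solves a global linear system), and the function name `knn_fill_alpha` and its parameters are hypothetical.

```python
import numpy as np

def knn_fill_alpha(image, trimap, k=3, spatial_weight=0.5):
    """Fill 0.5 pixels of a trimap with the mean label of the k nearest known pixels."""
    trimap = np.asarray(trimap, dtype=np.float64)
    h, w = trimap.shape
    ys, xs = np.mgrid[0:h, 0:w]
    coords = np.stack([ys / h, xs / w], axis=-1) * spatial_weight
    colors = np.asarray(image, dtype=np.float64) / 255.0
    feats = np.concatenate([colors, coords], axis=-1).reshape(h * w, -1)

    labels = trimap.reshape(-1)
    known = labels != 0.5
    unknown = ~known
    alpha = labels.copy()
    if unknown.any() and known.any():
        # Squared distances between every unknown pixel and every known pixel.
        diff = feats[unknown][:, None, :] - feats[known][None, :, :]
        dist = (diff ** 2).sum(axis=-1)
        kk = min(k, int(known.sum()))
        nearest = np.argsort(dist, axis=1)[:, :kk]
        alpha[unknown] = labels[known][nearest].mean(axis=1)
    return alpha.reshape(h, w)

# One row of six grey pixels: bright on the left, dark on the right.
grey = np.array([255, 255, 250, 5, 0, 0], dtype=np.float64)
image = np.repeat(grey[None, :, None], 3, axis=2)    # shape (1, 6, 3)
trimap = np.array([[1.0, 1.0, 0.5, 0.5, 0.0, 0.0]])
alpha = knn_fill_alpha(image, trimap, k=2)
```

Here the bright unknown pixel ends up with the foreground label and the dark one with the background label, because color similarity dominates the distance.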
ZEGOCLOUD Image Matting Technology
ZEGOCLOUD uses deep learning in its proprietary image matting technology. The model has an encoder-decoder structure: it needs only an image as input and produces the final alpha map.
The encoder compresses the input image and extracts its deep features; the decoder then fits them to ground-truth alpha map samples. The ZEGOCLOUD encoder uses the lightweight MobileNetV3-Small architecture, which enables real-time computing on edge devices.
Use Cases of Image Matting
Background replacement of live video streaming
Live-streaming sessions most often take place at home. Usually there is no dedicated studio, and the background is simply a corner of the host’s home. In this scenario, protecting one’s privacy becomes the main concern.
ZEGOCLOUD matting technology quickly extracts the portrait of the host and replaces the background with an appropriate image.
Background replacement of credential photo
We often need different types of photos depending on the documents or certificates we have to prepare. For example, the photo background may have to be a specific color, such as blue or white. Going to a professional photo studio for credential photos costs both time and money.
ZEGOCLOUD smart credential photo matting technology is straightforward and easy to use without professional knowledge. It works as follows:
- It detects and pinpoints the key points of the face.
- It automatically crops the original image to retain the head and shoulders.
- Finally, it uses the image matting algorithm to replace the background with the desired color.
The ZEGOCLOUD matting algorithm is lightweight. The whole deep learning model file is only 6 MB, and it takes only 100 milliseconds to process a credential photo on an Intel i5 CPU.
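The crop and background-replacement steps listed above can be sketched as follows. This is only an illustration under stated assumptions: the face box and the alpha map are taken as given inputs (in practice they would come from a face detector and a matting model), and the function names, the `margin` parameter, and the choice of extra room below the face are all hypothetical.

```python
import numpy as np

def crop_head_and_shoulders(image, alpha, face_box, margin=0.5):
    """Crop around a face box (top, left, bottom, right), expanded by a margin."""
    top, left, bottom, right = face_box
    box_h, box_w = bottom - top, right - left
    img_h, img_w = alpha.shape
    y0 = max(0, int(top - margin * box_h))
    y1 = min(img_h, int(bottom + 2 * margin * box_h))  # extra room for shoulders
    x0 = max(0, int(left - margin * box_w))
    x1 = min(img_w, int(right + margin * box_w))
    return image[y0:y1, x0:x1], alpha[y0:y1, x0:x1]

def replace_background(image, alpha, color):
    """Composite the portrait over a solid RGB color using the alpha map."""
    a = alpha[..., np.newaxis].astype(np.float64)
    bg = np.broadcast_to(np.asarray(color, dtype=np.float64), image.shape)
    out = a * image + (1.0 - a) * bg
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

# Toy input: a grey 10x10 photo with a narrow "person" in the middle.
image = np.full((10, 10, 3), 100, dtype=np.uint8)
alpha = np.zeros((10, 10))
alpha[3:8, 4:6] = 1.0
face_box = (4, 4, 6, 6)  # assumed to come from a face detector

cropped, cropped_alpha = crop_head_and_shoulders(image, alpha, face_box)
photo = replace_background(cropped, cropped_alpha, color=(0, 0, 255))
```

Pixels covered by the portrait keep their original values, while everything else takes the requested solid color.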
Image matting technology in Art
In art examination scenarios, including dance, music, and advertising, some elements in the background need to be removed. However, objects related to the performance, such as the instruments and props used during a musical performance, should be retained.
ZEGOCLOUD matting technology in this scenario can finely extract the portrait, replace the background, and retain objects related to the performance.
To maintain the integrity and sense of depth of the scene, ZEGOCLOUD proposes a bokeh background approach. The algorithm uses the temporal continuity between video frames to suppress picture flicker. It blurs the background while retaining the portrait. The algorithm’s whole deep learning model file is only 3 MB, which is exceptionally lightweight. It takes only 20 milliseconds to process a frame on a MacBook Pro with an M1 chip.
This is all you need to know about the features and workings of image matting technology. We will cover image matting technology in depth in the next article in this series. Make sure to follow our blog to stay up to date on real-time technologies.
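One simple way to picture the bokeh idea is to blur the whole frame, then blend the sharp frame back in wherever the alpha map marks the portrait, while smoothing the alpha map over time to reduce flicker. The sketch below assumes an exponential moving average for the temporal smoothing and a box blur for the background; both are stand-ins chosen for illustration, not the method ZEGOCLOUD describes, and the class name `BokehFilter` is hypothetical.

```python
import numpy as np

def box_blur(image, radius):
    """Blur an H x W x C image by averaging a (2r+1) x (2r+1) window."""
    padded = np.pad(image.astype(np.float64),
                    ((radius, radius), (radius, radius), (0, 0)), mode="edge")
    h, w = image.shape[:2]
    size = 2 * radius + 1
    total = np.zeros(image.shape, dtype=np.float64)
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + h, dx:dx + w]
    return total / (size * size)

class BokehFilter:
    """Keeps a running alpha map so that per-frame jitter is damped."""

    def __init__(self, momentum=0.7, radius=2):
        self.momentum = momentum
        self.radius = radius
        self.smoothed_alpha = None

    def process(self, frame, alpha):
        alpha = np.asarray(alpha, dtype=np.float64)
        if self.smoothed_alpha is None:
            self.smoothed_alpha = alpha
        else:
            # Exponential moving average across frames (an assumed choice).
            self.smoothed_alpha = (self.momentum * self.smoothed_alpha
                                   + (1.0 - self.momentum) * alpha)
        a = self.smoothed_alpha[..., np.newaxis]
        return a * frame + (1.0 - a) * box_blur(frame, self.radius)

rng = np.random.default_rng(0)
frame = rng.uniform(0, 255, size=(8, 8, 3))
bokeh = BokehFilter(momentum=0.7, radius=1)
bokeh.process(frame, np.zeros((8, 8)))        # first frame: no portrait detected
out = bokeh.process(frame, np.ones((8, 8)))   # second frame: mask jumps to all ones
```

With a momentum of 0.7, an abrupt jump in the per-frame mask moves the smoothed mask only part of the way, which is what damps frame-to-frame flicker.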
If you work on a product that fits into related use cases or scenarios, please don’t hesitate to contact us to discuss it.