Audio Stream Moderation Callback

Description

After a developer starts a task with Start Audio Stream Moderation, the ZEGO server sends POST requests to the configured callback URL to deliver the audio stream moderation results. If the ReturnFinishInfo parameter of the Start Audio Stream Moderation request is 1, a moderation task status callback is also sent after the moderation task completes.

Callback instructions

Request method: POST.

Note

The callback data is in JSON format and must be URL-decoded (UrlDecode) before use.

Request URL: Please refer to Console - Cloud Market - Shumei Content Moderation and follow the page instructions to complete the configuration of relevant callback URLs.
Transfer protocol: HTTPS/HTTP. HTTPS is recommended.
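
The decoding step mentioned in the note can be sketched as follows. This is a minimal sketch, not a definitive implementation: the function name `decode_callback_body` is hypothetical, and it assumes the body arrives either as plain JSON or as JSON that was URL-encoded as a whole.

```python
import json
from urllib.parse import unquote


def decode_callback_body(raw: bytes) -> dict:
    """Parse a callback body into a dict.

    Assumption (not confirmed by this page): the body is either plain
    JSON or a URL-encoded JSON string.
    """
    text = raw.decode("utf-8")
    try:
        # Plain JSON: parse as-is so percent-escapes inside URL fields
        # such as AudioUrl are left untouched.
        return json.loads(text)
    except json.JSONDecodeError:
        # Otherwise treat the whole body as URL-encoded JSON.
        return json.loads(unquote(text))
```

Parsing first and decoding only on failure avoids rewriting percent-escapes such as `%2F` inside signed URLs when the body is already plain JSON.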

Callback parameters

Moderation result callback

Common parameters	Type	Description
Event	String	Callback event. This callback returns `censor_audio_v2_result`.
AppId	Number	AppID.
Timestamp	Number	Server current time, Unix timestamp (seconds).
Nonce	String	Random number.
Signature	String	Signature. For details, see Verification instructions.

Business parameters	Type	Description
Code	Number	Return code. 0 indicates success.
Message	String	Description of the operation result.
TaskId	String	TaskId of the moderation task. Corresponds to the TaskId returned in the response parameters of Start Audio Stream Moderation.
ResultTaskId	String	ID of this callback information.
Detail	Object	Moderation result details of the 10s segment.
└ RiskLevel	String	Risk level (present when Code is 0). Use this value to decide how to handle the audio stream and the user associated with the violating segment. Note: This field reflects the "risk type list" with the highest risk degree in the returned 10s audio (for example, "politics" usually has a higher risk degree). The priority among "risk type lists" is determined internally by Shumei moderation. PASS: Normal content, recommended to pass directly. REVIEW: Suspicious content, recommended for manual review. REJECT: Violating content, recommended to block directly.
└ AudioText	String	Semantic text corresponding to the 10s audio segment.
└ AudioUrl	String	URL of the 10s audio segment.
└ PreAudioUrl	String	URL of the 20s audio clip made up of the current violating segment plus the preceding segment. Present only when ReturnPreAudio=1 in the start moderation task request.
└ RiskDescription	String	The "risk type list" and detailed categories of the violating content with the highest risk degree in the current 10s segment, broken down into level-1, level-2, and level-3 tags. Note: This field is for human reference when investigating the cause of a risk. Do not rely on its value for logical processing. When RiskLevel is PASS, returns "Normal". When a custom list is hit, returns the custom list name. In other cases, the format is: level-1 tag: level-2 tag: level-3 tag.
└ RiskLabel1	String	Level-1 tag. When RiskLevel is PASS, returns "normal".
└ RiskLabel2	String	Level-2 tag, a subcategory of the level-1 tag. Empty when RiskLevel is PASS.
└ RiskLabel3	String	Level-3 tag, a subcategory of the level-2 tag. Empty when RiskLevel is PASS.
└ VadStatus	Number	Silence status of the audio segment: 0: Silent segment. For a silent segment, some other fields may be empty, subject to the actual callback content. 1: Non-silent segment.
└ RiskDetail	Object	Risk details for the risk with the highest risk degree in the 10s segment. For example, if the current 10s segment hits both the "pornography" and "politics" "risk type lists", and the backend determines that "politics" has the higher risk degree, the risk details returned are those for "politics". For details, see RiskDetail.
└ RiskInfoList	Array of Object	List of all risk information in the 10s audio segment, sorted in descending order of the risk priority defined internally by Shumei. For details, see RiskInfoList.
└ BusinessInfoList	Array of Object	List of all business information. If the following fields cannot meet your needs, you can contact technical support for adjustments. For details, see BusinessInfoList.
AuxInfo	Object	Auxiliary information.
└ RoomId	String	Room ID of the moderation.
└ ProcessBeginTime	Number	Time when the 10s audio starts moderation (13-digit Unix timestamp).
└ ProcessFinishTime	Number	Time when the 10s audio finishes moderation (13-digit Unix timestamp).
└ UserId	String	User ID.
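
The recommendations attached to RiskLevel can be turned into a small routing function. This is a sketch under stated assumptions: the action names (`allow`, `manual_review`, `block`, `skip`, `error`) and the function `action_for_result` are hypothetical, and treating an unknown level as `manual_review` is a conservative choice made for this example, not documented behavior.

```python
from typing import Any, Mapping

# Hypothetical action names mapped from the documented RiskLevel values.
ACTIONS = {"PASS": "allow", "REVIEW": "manual_review", "REJECT": "block"}


def action_for_result(payload: Mapping[str, Any]) -> str:
    """Choose an action for a censor_audio_v2_result payload."""
    if payload.get("Code") != 0:
        # RiskLevel is only present when Code is 0.
        return "error"
    detail = payload.get("Detail") or {}
    level = detail.get("RiskLevel")
    if not level and detail.get("VadStatus") == 0:
        # Silent segments may leave other fields empty (assumed to
        # include RiskLevel), so there is nothing to act on.
        return "skip"
    # Unknown or missing levels fall back to manual review (assumption).
    return ACTIONS.get(level, "manual_review")
```

Only RiskLevel drives the decision here, since RiskDescription is documented as being for human reference only.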

RiskDetail

Common parameters	Type	Description
RiskSource	Number	Risk source: 1000: No risk. 1001: Audio semantics. 1003: Audio features (e.g., pitch, timbre, voiceprint, melody, etc.).
AudioText	String	Text transcribed from the 10s audio segment.
MatchedLists	Array of Object	Custom lists hit (configured by developers through technical support).
└ Name	String	Custom list name.

RiskInfoList

Common parameters	Type	Description
RiskLevel	String	Risk level (present when Code is 0). Use this value to decide how to handle the audio stream and the user associated with the violating segment. PASS: Normal content, recommended to pass directly. REVIEW: Suspicious content, recommended for manual review. REJECT: Violating content, recommended to block directly.
RiskDescription	String	The "risk type list" and detailed categories of the violating content with the highest risk degree in the current 10s segment, broken down into level-1, level-2, and level-3 tags. Note: This field is for human reference when investigating the cause of a risk. Do not rely on its value for logical processing. When RiskLevel is PASS, returns "Normal". When a custom list is hit, returns the custom list name. In other cases, the format is: level-1 tag: level-2 tag: level-3 tag.
RiskLabel1	String	Level-1 tag. When RiskLevel is PASS, returns "normal".
RiskLabel2	String	Level-2 tag, a subcategory of the level-1 tag. Empty when RiskLevel is PASS.
RiskLabel3	String	Level-3 tag, a subcategory of the level-2 tag. Empty when RiskLevel is PASS.
RiskDetail	Object	Risk details for the risk with the highest risk degree in the 10s segment. For example, if the current 10s segment hits both the "pornography" and "politics" "risk type lists", and the backend determines that "politics" has the higher risk degree, the risk details returned are those for "politics".
└ RiskSource	Number	Risk source: 1000: No risk. 1001: Audio semantics. 1003: Audio features (e.g., pitch, timbre, voiceprint, melody, etc.).
└ AudioText	String	Text transcribed from the 10s audio segment. Only the entry with the highest risk degree has a value in this field.
└ MatchedLists	Array of Object	Custom lists hit (configured by developers through technical support).
  └ Name	String	Custom list name.

BusinessInfoList

Common parameters	Type	Description
BusinessDescription	String	Chinese description of the business tag, formatted as the Chinese names of "level-1 business tag: level-2 business tag: level-3 business tag", for example "portrait: portrait posture: sitting posture".
BusinessLabel1	String	Level-1 business tag.
BusinessLabel2	String	Level-2 business tag.
BusinessLabel3	String	Level-3 business tag.

Moderation task status callback

When the ReturnFinishInfo parameter in the Start Audio Stream Moderation task is 1, after the audio stream moderation task is completed, a moderation task status callback will be initiated.

Common parameters	Type	Description
Event	String	Callback event. This callback returns `censor_audio_v2_status`.
AppId	Number	AppID.
Timestamp	Number	Server current time, Unix timestamp (seconds).
Nonce	String	Random number.
Signature	String	Signature. For details, see Verification instructions.

Business parameters	Type	Description
Code	Number	Return code. 0 indicates success.
Message	String	Description of the operation result.
TaskId	String	TaskId of the moderation task, corresponding to the TaskId returned by Start Audio Stream Moderation.
Status	Number	Moderation status. 0: Moderation completed.
AuxInfo	Object	Auxiliary information.
└ RoomId	String	Room ID.
└ CensorStreamTime	Number	Total duration of the moderation stream of this task (unit: seconds).
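
Because both callbacks share the same common parameters, a single endpoint can tell them apart by the Event field. The sketch below illustrates one way to do that; the handler names `on_result` and `on_status`, the `dispatch` function, and the returned strings are hypothetical and exist only for illustration.

```python
from typing import Any, Callable, Dict, Mapping

Payload = Mapping[str, Any]


def on_result(payload: Payload) -> str:
    """Handle a moderation result callback (one per 10s segment)."""
    level = (payload.get("Detail") or {}).get("RiskLevel")
    return f"task {payload.get('TaskId')}: segment rated {level}"


def on_status(payload: Payload) -> str:
    """Handle a moderation task status callback."""
    seconds = (payload.get("AuxInfo") or {}).get("CensorStreamTime")
    return f"task {payload.get('TaskId')}: finished after {seconds}s"


# Event values are taken from the parameter tables above.
HANDLERS: Dict[str, Callable[[Payload], str]] = {
    "censor_audio_v2_result": on_result,
    "censor_audio_v2_status": on_status,
}


def dispatch(payload: Payload) -> str:
    """Route a decoded callback payload by its Event field."""
    handler = HANDLERS.get(payload.get("Event"))
    if handler is None:
        # Unknown events are ignored rather than treated as errors.
        return "ignored"
    return handler(payload)
```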

Data examples

Moderation result callback example

{
    //Common parameters
    "Event": "censor_audio_v2_result",
    "AppId": 1,
    "Timestamp": 1724743250,
    "Nonce": "7407715855877898783",
    "Signature": "5cc9e67af0ba0c95f99bd73f79a36485f574ad11",
    //Business parameters
    "Code": 0,
    "Message": "success",
    "TaskId": "384a8a77aeb352d3ec8144ab4640cc52",
    "ResultTaskId": "384a8a77aeb352d3ec8144ab4640cc52_2",
    "Detail": {
        "RiskLevel": "REJECT",
        "AudioText": "Let's be friends on Facebook, my ID is facebook.johndoe",
        "AudioUrl": "http://xxxx/POST_AUDIOSTREAM%2FMP3%2F20240606%2F384a8a77aeb352d3ec8144ab4640cc52_2.mp3?Expires=1720269317&OSSAccessKeyId=LTAI5tLsVBxJ8nhyy5gQVW3K&Signature=8JZymbV%2F6Psm72k6S2Xq3Dcrg14%3D",
        "PreAudioUrl": "http://xxxx/POST_AUDIOSTREAM%2FMP3%2F20240606%2F384a8a77aeb352d3ec8144ab4640cc52_2_pre.mp3?Expires=1720269317&OSSAccessKeyId=LTAI5tLsVBxJ8nhyy5gQVW3K&Signature=BKUDNNWPklQldaEMSFOvSts6O84%3D",
        "RiskDescription": "ad:contact:contact",
        "RiskLabel1": "ad",
        "RiskLabel2": "lianxifangshi",
        "RiskLabel3": "lianxifangshi",
        "VadStatus": 1,
        "RiskDetail": {
            "RiskSource": 1001,
            "AudioText":  "Let's be friends on Facebook, my ID is facebook.johndoe",
            "MatchedLists": null
        },
        "RiskInfoList": [
            {
                "RiskLevel": "REJECT",
                "RiskDescription": "ad:contact:contact",
                "RiskLabel1": "ad",
                "RiskLabel2": "lianxifangshi",
                "RiskLabel3": "lianxifangshi",
                "RiskDetail": {
                    "RiskSource": 1001,
                    "AudioText": "Let's be friends on Facebook, my ID is facebook.johndoe",
                    "MatchedLists": null
                }
            },
            {
                "RiskLevel": "REJECT",
                "RiskDescription": "ad:contact:contact",
                "RiskLabel1": "ad",
                "RiskLabel2": "lianxifangshi",
                "RiskLabel3": "lianxifangshi",
                "RiskDetail": {
                    "RiskSource": 1001,
                    "AudioText": "Let's be friends on Facebook, my ID is facebook.johndoe",
                    "MatchedLists": null
                }
            }
        ],
        "BusinessInfoList": [
            {
                "BusinessDescription": "language:English:English",
                "BusinessLabel1": "language",
                "BusinessLabel2": "English",
                "BusinessLabel3": "English"
            }
        ]
    },
    "AuxInfo": {
        "RoomId": "room_1",
        "ProcessBeginTime": 1717677317155,
        "ProcessFinishTime": 1717677317554,
        "UserId":"user_1"
    }
}
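
Note that Timestamp is in seconds while ProcessBeginTime and ProcessFinishTime are 13-digit millisecond timestamps. The sketch below converts the millisecond values from the example above and computes how long the segment took to moderate; the helper names `from_millis` and `processing_latency_ms` are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Mapping

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def from_millis(ms: int) -> datetime:
    """Convert a 13-digit Unix timestamp (milliseconds) to UTC."""
    # Integer timedelta arithmetic avoids float rounding of the fraction.
    return EPOCH + timedelta(milliseconds=ms)


def processing_latency_ms(aux_info: Mapping[str, Any]) -> int:
    """Milliseconds between the start and finish of moderation."""
    return aux_info["ProcessFinishTime"] - aux_info["ProcessBeginTime"]


aux = {"ProcessBeginTime": 1717677317155, "ProcessFinishTime": 1717677317554}
print(from_millis(aux["ProcessBeginTime"]).isoformat())
print(processing_latency_ms(aux))  # 399
```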

Moderation task status callback example

{
    //Common parameters
    "Event": "censor_audio_v2_status",
    "AppId": 1,
    "Timestamp": 1724743250,
    "Nonce": "7407715855877898783",
    "Signature": "5cc9e67af0ba0c95f99bd73f79a36485f574ad11",
    //Business parameters
    "Code": 0,
    "Message": "success",
    "TaskId": "384a8a77aeb352d3ec8144ab4640cc52",
    "Status": 0,
    "AuxInfo": {
        "RoomId": "room_1",
        "CensorStreamTime": 31
    }
}

Return response

Returning an HTTP status code of 2XX (e.g., 200) indicates success; any other response indicates failure.
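
A minimal receiver built on Python's standard `http.server` module might acknowledge callbacks as sketched below. This is an illustration only: `status_for`, `CallbackHandler`, and `make_server` are hypothetical names, the body is assumed to be plain JSON by the time it is parsed, signature verification is omitted, and answering 400 to an unparseable body is a choice made for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def status_for(raw: bytes) -> int:
    """Pick the HTTP status to answer a callback body with."""
    try:
        payload = json.loads(raw)
    except ValueError:
        # Covers malformed JSON and undecodable bytes.
        return 400
    if isinstance(payload, dict) and "Event" in payload:
        return 200  # any 2XX acknowledges the callback
    return 400


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length)
        self.send_response(status_for(raw))
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, fmt: str, *args) -> None:
        # Silence the default per-request logging to stderr.
        pass


def make_server(port: int = 0) -> HTTPServer:
    """Create a server bound to localhost; port 0 picks a free port."""
    return HTTPServer(("127.0.0.1", port), CallbackHandler)
```

Calling `make_server(8080).serve_forever()` would start the receiver. Since any non-2XX answer counts as a failure, it is usually safer to acknowledge quickly and do slow processing afterwards.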

Callback retry strategy

If the ZEGO server does not receive a response, or receives an HTTP status code other than 2XX (e.g., 200), it retries up to 5 times. The intervals between each retry and the previous request are 2s, 4s, 8s, 16s, and 32s respectively. If the 5th retry also fails, no further retries are made and the callback is lost.
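
To see how long a receiver can be unavailable before a callback is lost, the retry intervals can be summed into offsets from the first attempt. This is a worked example under the assumption that each interval is measured from the moment the previous request was sent, ignoring time spent waiting for a response; `retry_offsets` is a hypothetical helper name.

```python
from itertools import accumulate
from typing import List, Sequence

# Intervals between consecutive attempts, as stated above (seconds).
RETRY_INTERVALS_S = (2, 4, 8, 16, 32)


def retry_offsets(intervals: Sequence[int] = RETRY_INTERVALS_S) -> List[int]:
    """Seconds after the first attempt at which each retry is sent."""
    return list(accumulate(intervals))


print(retry_offsets())  # [2, 6, 14, 30, 62]
```

Under that assumption, the last retry is sent about 62 seconds after the original request, so an outage longer than roughly a minute can cause callbacks to be lost.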

FAQ

What happens if I don't call the stop audio/video stream moderation API after calling the start audio/video stream moderation API?

The task ends automatically after about 5 minutes of pulling an empty stream. It is recommended to call the stop API explicitly when moderation is no longer needed.