Audio Stream Moderation Callback


Description

When you need to moderate audio stream content, start a moderation task with Start Audio Stream Moderation. The ZEGO server then sends POST requests to your callback URL to deliver the audio stream moderation results. If the ReturnFinishInfo parameter of the Start Audio Stream Moderation request is set to 1, a moderation task status callback is also sent after the moderation task completes.

Callback instructions

  • Request method: POST.

Note: The callback data is in JSON format. You need to URL-decode (UrlDecode) it before use.
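
As a minimal sketch of the note above, the following Python function turns a raw callback body into a dictionary. It assumes the body is either plain JSON or percent-encoded JSON, and it URL-decodes only when direct parsing fails, so that percent-escapes inside URL fields such as AudioUrl are left untouched when the body is already plain JSON. The name parse_callback_body is hypothetical and not part of the API.

```python
import json
from urllib.parse import unquote


def parse_callback_body(raw: bytes) -> dict:
    """Decode a callback request body into a dict.

    Assumption: the body is either plain JSON or percent-encoded JSON.
    """
    text = raw.decode("utf-8")
    try:
        # Plain JSON: parse as-is so escapes inside URL fields are preserved.
        return json.loads(text)
    except json.JSONDecodeError:
        # Percent-encoded JSON: URL-decode once, then parse.
        return json.loads(unquote(text))
```

Check against real callback traffic which of the two cases applies to your deployment before relying on the fallback.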

Callback parameters

Moderation result callback

| Common parameters | Type | Description |
| --- | --- | --- |
| Event | String | Callback event. This callback returns censor_audio_v2_result. |
| AppId | Number | AppID. |
| Timestamp | Number | Current server time, as a Unix timestamp in seconds. |
| Nonce | String | Random number. |
| Signature | String | Signature. For details, see Verification instructions. |

| Business parameters | Type | Description |
| --- | --- | --- |
| Code | Number | Return code. 0 indicates success. |
| Message | String | Description of the operation result. |
| TaskId | String | TaskId of the moderation task. Corresponds to the TaskId returned in the response parameters of Start Audio Stream Moderation. |
| ResultTaskId | String | ID of this callback information. |
| Detail | Object | Moderation result details of the 10s segment. |
| └ RiskLevel | String | Risk level (present when Code is 0). Developers can act on the audio stream and the user associated with the violating segment based on this value. PASS: normal content, recommended to pass directly. REVIEW: suspicious content, recommended for manual review. REJECT: violating content, recommended to block directly. Note: this field reflects the "risk type list" with the highest risk degree in the returned 10s audio (for example, "politics" usually has a higher risk degree). The priority of the risk degree of each "risk type list" is determined internally by Shumei moderation. |
| └ AudioText | String | Semantic text corresponding to the 10s audio segment. |
| └ AudioUrl | String | URL of the 10s audio segment. |
| └ PreAudioUrl | String | Has a value only when ReturnPreAudio=1 in the start moderation task request. URL of the 20s audio clip made up of the current violating audio segment plus the previous audio segment. |
| └ RiskDescription | String | The specific "risk type list" and detailed categories of the violating content with the highest risk degree in the current 10s segment, divided into level-1, level-2, and level-3 tags by level of detail. When RiskLevel is PASS, returns "Normal". When a custom list is hit, returns the custom list name. In other cases, the format is level-1 tag: level-2 tag: level-3 tag. Note: this value is only for human reference when understanding the risk cause. Do not rely on it for logical processing. |
| └ RiskLabel1 | String | Level-1 tag. When RiskLevel is PASS, returns "normal". |
| └ RiskLabel2 | String | Level-2 tag, which belongs to the level-1 tag. Empty when RiskLevel is PASS. |
| └ RiskLabel3 | String | Level-3 tag, which belongs to the level-2 tag. Empty when RiskLevel is PASS. |
| └ VadStatus | Number | Silence status of the audio segment. 0: silent segment; some other fields may be returned empty, subject to the actual callback content. 1: non-silent segment. |
| └ RiskDetail | Object | Risk details. Shows the risk information with the highest risk degree in the 10s segment. For example, if the current 10s segment hits the two "risk type lists" "pornography" and "politics", and the backend determines that "politics" has the higher risk degree, the risk details returned relate to "politics". For details, see RiskDetail. |
| └ RiskInfoList | Array of Object | List of all risk information in the 10s audio segment, sorted from highest to lowest by the risk degree priority defined internally by Shumei. For details, see RiskInfoList. |
| └ BusinessInfoList | Array of Object | List of all business information. If the following fields cannot meet your needs, contact technical support for adjustments. For details, see BusinessInfoList. |
| AuxInfo | Object | Auxiliary information. |
| └ RoomId | String | Room ID of the moderation. |
| └ ProcessBeginTime | Number | Time when moderation of the 10s audio starts (13-digit Unix timestamp, in milliseconds). |
| └ ProcessFinishTime | Number | Time when moderation of the 10s audio finishes (13-digit Unix timestamp, in milliseconds). |
| └ UserId | String | User ID. |
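
The RiskLevel recommendations above can be sketched as a small decision function. This is only an illustration: the function name decide_action and the action names ("allow", "manual_review", "block", "ignore", "log_error") are hypothetical, and skipping silent segments and sending unknown levels to manual review are assumptions rather than documented behavior.

```python
def decide_action(payload: dict) -> str:
    """Map a moderation result callback to a hypothetical action name."""
    if payload.get("Code") != 0:
        # RiskLevel is only present when Code is 0.
        return "log_error"
    detail = payload.get("Detail") or {}
    if detail.get("VadStatus") == 0:
        # Silent segment: other fields may be empty, so skip it (assumption).
        return "ignore"
    actions = {"PASS": "allow", "REVIEW": "manual_review", "REJECT": "block"}
    # Unknown or missing levels go to manual review as a cautious default.
    return actions.get(detail.get("RiskLevel"), "manual_review")
```

The function branches on RiskLevel and VadStatus only, in line with the guidance not to use RiskDescription for logical processing.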

RiskDetail

| Parameter | Type | Description |
| --- | --- | --- |
| RiskSource | Number | Risk source. 1000: no risk. 1001: audio semantics. 1003: audio features (for example, pitch, timbre, voiceprint, melody). |
| AudioText | String | Text transcribed from the 10s audio semantics. |
| MatchedLists | Array of Object | Custom lists hit (configured by developers through technical support). |
| └ Name | String | Custom list name. |

RiskInfoList

| Parameter | Type | Description |
| --- | --- | --- |
| RiskLevel | String | Risk level (present when Code is 0). Developers can act on the audio stream and the user associated with the violating segment based on this value. PASS: normal content, recommended to pass directly. REVIEW: suspicious content, recommended for manual review. REJECT: violating content, recommended to block directly. |
| RiskDescription | String | The specific "risk type list" and detailed categories of the violating content with the highest risk degree in the current 10s segment, divided into level-1, level-2, and level-3 tags by level of detail. When RiskLevel is PASS, returns "Normal". When a custom list is hit, returns the custom list name. In other cases, the format is level-1 tag: level-2 tag: level-3 tag. Note: this value is only for human reference when understanding the risk cause. Do not rely on it for logical processing. |
| RiskLabel1 | String | Level-1 tag. When RiskLevel is PASS, returns "normal". |
| RiskLabel2 | String | Level-2 tag, which belongs to the level-1 tag. Empty when RiskLevel is PASS. |
| RiskLabel3 | String | Level-3 tag, which belongs to the level-2 tag. Empty when RiskLevel is PASS. |
| RiskDetail | Object | Risk details, showing the risk information with the highest risk degree in the 10s segment. For example, if the current 10s segment hits the two "risk type lists" "pornography" and "politics", and the backend determines that "politics" has the higher risk degree, the risk details returned relate to "politics". |
| └ RiskSource | Number | Risk source. 1000: no risk. 1001: audio semantics. 1003: audio features (for example, pitch, timbre, voiceprint, melody). |
| └ AudioText | String | Text transcribed from the 10s audio semantics. Only the risk information with the highest risk degree has a value in this field. |
| └ MatchedLists | Array of Object | Custom lists hit (configured by developers through technical support). |
|   └ Name | String | Custom list name. |

BusinessInfoList

| Parameter | Type | Description |
| --- | --- | --- |
| BusinessDescription | String | Chinese description of the business tag. The format is the Chinese name of "level-1 business tag: level-2 business tag: level-3 business tag", such as portrait: portrait posture: sitting posture. |
| BusinessLabel1 | String | Level-1 business tag. |
| BusinessLabel2 | String | Level-2 business tag. |
| BusinessLabel3 | String | Level-3 business tag. |

Moderation task status callback

When the ReturnFinishInfo parameter of the Start Audio Stream Moderation request is set to 1, a moderation task status callback is sent after the audio stream moderation task completes.

| Common parameters | Type | Description |
| --- | --- | --- |
| Event | String | Callback event. This callback returns censor_audio_v2_status. |
| AppId | Number | AppID. |
| Timestamp | Number | Current server time, as a Unix timestamp in seconds. |
| Nonce | String | Random number. |
| Signature | String | Signature. For details, see Verification instructions. |

| Business parameters | Type | Description |
| --- | --- | --- |
| Code | Number | Return code. 0 indicates success. |
| Message | String | Description of the operation result. |
| TaskId | String | TaskId of the moderation task, corresponding to the TaskId returned by Start Audio Stream Moderation. |
| Status | Number | Moderation status. 0: moderation completed. |
| AuxInfo | Object | Auxiliary information. |
| └ RoomId | String | Room ID. |
| └ CensorStreamTime | Number | Total duration of the moderated stream for this task, in seconds. |

Data examples

Moderation result callback example

{
    //Common parameters
    "Event": "censor_audio_v2_result",
    "AppId": 1,
    "Timestamp": 1724743250,
    "Nonce": "7407715855877898783",
    "Signature": "5cc9e67af0ba0c95f99bd73f79a36485f574ad11",
    //Business parameters
    "Code": 0,
    "Message": "success",
    "TaskId": "384a8a77aeb352d3ec8144ab4640cc52",
    "ResultTaskId": "384a8a77aeb352d3ec8144ab4640cc52_2",
    "Detail": {
        "RiskLevel": "REJECT",
        "AudioText": "Let's be friends on Facebook, my ID is facebook.johndoe",
        "AudioUrl": "http://xxxx/POST_AUDIOSTREAM%2FMP3%2F20240606%2F384a8a77aeb352d3ec8144ab4640cc52_2.mp3?Expires=1720269317&OSSAccessKeyId=LTAI5tLsVBxJ8nhyy5gQVW3K&Signature=8JZymbV%2F6Psm72k6S2Xq3Dcrg14%3D",
        "PreAudioUrl": "http://xxxx/POST_AUDIOSTREAM%2FMP3%2F20240606%2F384a8a77aeb352d3ec8144ab4640cc52_2_pre.mp3?Expires=1720269317&OSSAccessKeyId=LTAI5tLsVBxJ8nhyy5gQVW3K&Signature=BKUDNNWPklQldaEMSFOvSts6O84%3D",
        "RiskDescription": "ad:contact:contact",
        "RiskLabel1": "ad",
        "RiskLabel2": "lianxifangshi",
        "RiskLabel3": "lianxifangshi",
        "VadStatus": 1,
        "RiskDetail": {
            "RiskSource": 1001,
            "AudioText":  "Let's be friends on Facebook, my ID is facebook.johndoe",
            "MatchedLists": null
        },
        "RiskInfoList": [
            {
                "RiskLevel": "REJECT",
                "RiskDescription": "ad:contact:contact",
                "RiskLabel1": "ad",
                "RiskLabel2": "lianxifangshi",
                "RiskLabel3": "lianxifangshi",
                "RiskDetail": {
                    "RiskSource": 1001,
                    "AudioText": "Let's be friends on Facebook, my ID is facebook.johndoe",
                    "MatchedLists": null
                }
            },
            {
                "RiskLevel": "REJECT",
                "RiskDescription": "ad:contact:contact",
                "RiskLabel1": "ad",
                "RiskLabel2": "lianxifangshi",
                "RiskLabel3": "lianxifangshi",
                "RiskDetail": {
                    "RiskSource": 1001,
                    "AudioText": "Let's be friends on Facebook, my ID is facebook.johndoe",
                    "MatchedLists": null
                }
            }
        ],
        "BusinessInfoList": [
            {
                "BusinessDescription": "language:English:English",
                "BusinessLabel1": "language",
                "BusinessLabel2": "English",
                "BusinessLabel3": "English"
            }
        ]
    },
    "AuxInfo": {
        "RoomId": "room_1",
        "ProcessBeginTime": 1717677317155,
        "ProcessFinishTime": 1717677317554,
        "UserId":"user_1"
    }
}
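
To illustrate how the fields in a payload like the one above fit together, here is a minimal Python sketch that extracts a compact summary from an already parsed result callback. The function name summarize_result and the keys of the returned dictionary are hypothetical. The latency is computed from the two 13-digit timestamps in AuxInfo and is therefore in milliseconds.

```python
def summarize_result(payload: dict) -> dict:
    """Build a compact summary of a censor_audio_v2_result payload."""
    detail = payload.get("Detail") or {}
    aux = payload.get("AuxInfo") or {}
    # One (level-1, level-2, level-3) tuple per entry in RiskInfoList.
    labels = [
        (item.get("RiskLabel1"), item.get("RiskLabel2"), item.get("RiskLabel3"))
        for item in detail.get("RiskInfoList") or []
    ]
    begin = aux.get("ProcessBeginTime")
    finish = aux.get("ProcessFinishTime")
    latency_ms = finish - begin if begin is not None and finish is not None else None
    return {
        "task_id": payload.get("TaskId"),
        "room_id": aux.get("RoomId"),
        "user_id": aux.get("UserId"),
        "risk_level": detail.get("RiskLevel"),
        "labels": labels,
        "moderation_latency_ms": latency_ms,
    }
```

For the example payload, the timestamps 1717677317155 and 1717677317554 give a moderation latency of 399 ms.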

Moderation task status callback example

{
    //Common parameters
    "Event": "censor_audio_v2_status",
    "AppId": 1,
    "Timestamp": 1724743250,
    "Nonce": "7407715855877898783",
    "Signature": "5cc9e67af0ba0c95f99bd73f79a36485f574ad11",
    //Business parameters
    "Code": 0,
    "Message": "success",
    "TaskId": "384a8a77aeb352d3ec8144ab4640cc52",
    "Status": 0,
    "AuxInfo": {
        "RoomId": "room_1",
        "CensorStreamTime": 31
    }
}
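
Because both callback types arrive at the same callback URL, a receiver typically needs to branch on the Event field. The following Python sketch shows one possible way to do that. The handler names, the HANDLERS table, and the returned strings are hypothetical and only illustrate the routing.

```python
def on_result(payload: dict) -> str:
    """Handle a moderation result callback (placeholder logic)."""
    return "result:" + str(payload.get("ResultTaskId"))


def on_status(payload: dict) -> str:
    """Handle a moderation task status callback (placeholder logic)."""
    aux = payload.get("AuxInfo") or {}
    if payload.get("Status") == 0:
        return "finished:" + str(aux.get("CensorStreamTime")) + "s"
    return "status:" + str(payload.get("Status"))


# Event values as documented for the two callbacks.
HANDLERS = {
    "censor_audio_v2_result": on_result,
    "censor_audio_v2_status": on_status,
}


def dispatch(payload: dict) -> str:
    """Route a parsed callback payload to the handler for its Event."""
    handler = HANDLERS.get(payload.get("Event"))
    if handler is None:
        return "ignored"  # unknown event types are skipped (assumption)
    return handler(payload)
```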

Return response

Respond with an HTTP 2XX status code (for example, 200) to indicate that the callback was received successfully. Any other response is treated as a failure.
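
A minimal receiver that follows this rule could look like the sketch below, written with Python's standard http.server module. The names status_for_body, CallbackHandler, and serve, the port 8080, and the choice to answer 400 for bodies that are not JSON objects with an Event field are all assumptions made for illustration. Keep in mind that any non-2XX answer is treated as a failure by the sender.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def status_for_body(body: bytes) -> int:
    """Return the HTTP status code to send back for a callback body."""
    try:
        payload = json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return 400
    # Accept only JSON objects that carry an Event field (assumption).
    if isinstance(payload, dict) and "Event" in payload:
        return 200
    return 400


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        self.send_response(status_for_body(body))
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the receiver until interrupted (hypothetical port)."""
    HTTPServer(("0.0.0.0", port), CallbackHandler).serve_forever()
```

In practice it is usually safer to acknowledge quickly and do slow work (storage, notifications) after responding, so the sender does not see a missing response.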

Callback retry strategy

If the ZEGO server does not receive a response, or receives an HTTP status code other than 2XX (for example, 200), it retries the callback up to 5 times. The intervals between each retry and the preceding request are 2s, 4s, 8s, 16s, and 32s respectively. If the 5th retry also fails, no further retries are made and the callback is lost.
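
The retry schedule above can be worked out with a few lines of Python. The sketch also includes a small in-memory deduplication helper, on the assumption that a retry may deliver a callback your server has already processed (for example, when your response did not reach the sender). The names RETRY_INTERVALS_S, retry_offsets, and Deduplicator are hypothetical, and using ResultTaskId (or TaskId for status callbacks) as the deduplication key is an assumption.

```python
from itertools import accumulate
from typing import List

# Intervals between consecutive attempts, in seconds, as documented.
RETRY_INTERVALS_S = [2, 4, 8, 16, 32]


def retry_offsets() -> List[int]:
    """Seconds after the first attempt at which each retry is sent."""
    return list(accumulate(RETRY_INTERVALS_S))  # [2, 6, 14, 30, 62]


class Deduplicator:
    """Remember callbacks already handled, keyed by event and ID."""

    def __init__(self) -> None:
        self._seen = set()

    def first_time(self, payload: dict) -> bool:
        key = (
            payload.get("Event"),
            payload.get("ResultTaskId") or payload.get("TaskId"),
        )
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

Under this schedule the last retry is sent about 62 seconds after the first attempt, so a receiver that is unavailable for longer than that can miss callbacks.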

FAQ

  1. What happens if I don't call the stop audio/video stream moderation API after calling the start audio/video stream moderation API?

    The moderation task ends automatically after pulling an empty stream for about 5 minutes. It is still recommended to call the stop API explicitly when moderation should end.
