Audio Stream Moderation Callback
Description
If you need to moderate audio stream content, then after you start a task through Start Audio Stream Moderation, the ZEGO server sends POST requests to your callback URL to deliver the audio stream moderation results. If the ReturnFinishInfo parameter of the Start Audio Stream Moderation request is 1, a moderation task status callback is also sent after the moderation task completes.
Callback instructions
- Request method: POST.
- Callback data format: JSON. The data must be URL-decoded before use.
- Request URL: Configure the callback URLs in the Console under Cloud Market - Shumei Content Moderation, following the instructions on that page.
- Transfer protocol: HTTPS/HTTP. HTTPS is recommended.
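
The decoding step described above can be sketched in Python as follows. This is a minimal sketch, assuming the whole request body is UTF-8 JSON that has been percent-encoded exactly once; `parse_callback_body` is a hypothetical helper name, not part of any SDK. If the body reaches your service as plain JSON, skip the `unquote` call, because decoding a second time would corrupt percent-encoded values such as the signed audio URLs.

```python
import json
from urllib.parse import unquote


def parse_callback_body(raw_body: bytes) -> dict:
    """Decode a percent-encoded JSON callback body into a dict."""
    text = raw_body.decode("utf-8")
    # Assumption: the body was URL-encoded exactly once by the sender.
    decoded = unquote(text)
    return json.loads(decoded)
```

For example, the body `%7B%22Code%22%3A%200%7D` would decode to a dict whose `Code` entry is `0`.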
Callback parameters
Moderation result callback
| Common parameters | Type | Description |
|---|---|---|
| Event | String | Callback event. This callback returns censor_audio_v2_result. |
| AppId | Number | AppID. |
| Timestamp | Number | Server current time, Unix timestamp (seconds). |
| Nonce | String | Random number. |
| Signature | String | Signature. For details, see Verification instructions. |
| Business parameters | Type | Description |
|---|---|---|
| Code | Number | Return code. 0 indicates success. |
| Message | String | Description of the operation result. |
| TaskId | String | TaskId of the moderation task. Corresponds to the TaskId returned in the response parameters of Start Audio Stream Moderation. |
| ResultTaskId | String | ID of this callback information. |
| Detail | Object | Moderation result details of the 10s segment. |
| └ RiskLevel | String | Risk level (present only when Code is 0). Use this value to decide how to handle the audio stream and the user associated with the violating segment. Note: This field reflects the "risk type list" with the highest risk degree in the returned 10s audio (for example, "politics" usually has a higher risk degree). The priority among the "risk type lists" is determined internally by Shumei moderation. |
| └ AudioText | String | Semantic text corresponding to the 10s audio segment. |
| └ AudioUrl | String | URL of the 10s audio segment. |
| └ PreAudioUrl | String | Has a value only when ReturnPreAudio is 1 in the start moderation task request. URL of the 20s audio clip that combines the current violating audio segment with the previous audio segment. |
| └ RiskDescription | String | The specific "risk type list" and detailed categories of the violating content with the highest risk degree in the current 10s segment, expressed as level-1, level-2, and level-3 tags in increasing detail. Note: This value is only a human-readable explanation of the risk cause. Do not build processing logic on it. |
| └ RiskLabel1 | String | Level-1 tag. When RiskLevel is PASS, the value is "normal". |
| └ RiskLabel2 | String | Level-2 tag, a subcategory of the level-1 tag. Empty when RiskLevel is PASS. |
| └ RiskLabel3 | String | Level-3 tag, a subcategory of the level-2 tag. Empty when RiskLevel is PASS. |
| └ VadStatus | Number | Silence status of the audio segment. |
| └ RiskDetail | Object | Risk details. It will display the risk information with the highest risk degree in the 10s segment. For example, if the current 10s segment hits two "risk type lists" of "pornography" and "politics", and the backend determines that the risk degree of "politics" is higher than "pornography", the current return will be risk details related to "politics". For details, see RiskDetail. |
| └ RiskInfoList | Array of Object | List of all risk information in the 10s audio segment, sorted in descending order of the risk degree priority defined internally by Shumei. For details, see RiskInfoList. |
| └ BusinessInfoList | Array of Object | List of all business information. If the following fields cannot meet your needs, you can contact technical support for adjustments. For details, see BusinessInfoList. |
| AuxInfo | Object | Auxiliary information. |
| └ RoomId | String | Room ID of the moderation. |
| └ ProcessBeginTime | Number | Time when the 10s audio starts moderation (13-digit Unix timestamp). |
| └ ProcessFinishTime | Number | Time when the 10s audio finishes moderation (13-digit Unix timestamp). |
| └ UserId | String | User ID. |
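
Note that the common `Timestamp` parameter is in seconds, while `ProcessBeginTime` and `ProcessFinishTime` are 13-digit values, that is, milliseconds. The following Python sketch shows one way to convert them and to compute how long a segment took to moderate; `processing_info` is a hypothetical helper name.

```python
from datetime import datetime, timezone


def processing_info(aux_info: dict) -> tuple:
    """Return (UTC start time, processing duration in ms) for one segment."""
    begin_ms = aux_info["ProcessBeginTime"]
    finish_ms = aux_info["ProcessFinishTime"]
    # 13-digit timestamps are milliseconds; datetime expects seconds.
    started_at = datetime.fromtimestamp(begin_ms / 1000, tz=timezone.utc)
    return started_at, finish_ms - begin_ms
```
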
RiskDetail
| Common parameters | Type | Description |
|---|---|---|
| RiskSource | Number | Risk source. |
| AudioText | String | Text transcribed from the 10s audio segment. |
| MatchedLists | Array of Object | Custom lists hit (configured by developers contacting technical support). |
| └ Name | String | Custom list name. |
RiskInfoList
| Common parameters | Type | Description |
|---|---|---|
| RiskLevel | String | Risk level (present only when Code is 0). Use this value to decide how to handle the audio stream and the user associated with the violating segment. |
| RiskDescription | String | The specific "risk type list" and detailed categories of the violating content with the highest risk degree in the current 10s segment, expressed as level-1, level-2, and level-3 tags in increasing detail. Note: This value is only a human-readable explanation of the risk cause. Do not build processing logic on it. |
| RiskLabel1 | String | Level-1 tag. When RiskLevel is PASS, the value is "normal". |
| RiskLabel2 | String | Level-2 tag, a subcategory of the level-1 tag. Empty when RiskLevel is PASS. |
| RiskLabel3 | String | Level-3 tag, a subcategory of the level-2 tag. Empty when RiskLevel is PASS. |
| RiskDetail | Object | Risk details, displaying the risk information with the highest risk degree in the 10s segment. For example, if the current 10s segment hits two "risk type lists" of "pornography" and "politics", and the backend determines that the risk degree of "politics" is higher than "pornography", the current return will be risk details related to "politics". |
| └ RiskSource | Number | Risk source. |
| └ AudioText | String | Text transcribed from the 10s audio segment. Only the risk information with the highest risk degree has a value in this field. |
| └ MatchedLists | Array of Object | Custom lists hit (configured by developers contacting technical support). |
| └ Name | String | Custom list name (a field of each MatchedLists object). |
BusinessInfoList
| Common parameters | Type | Description |
|---|---|---|
| BusinessDescription | String | Chinese description of the business tag, formatted as the Chinese names of "level-1 business tag:level-2 business tag:level-3 business tag", for example, "portrait:portrait posture:sitting posture". |
| BusinessLabel1 | String | Level-1 business tag. |
| BusinessLabel2 | String | Level-2 business tag. |
| BusinessLabel3 | String | Level-3 business tag. |
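
As the parameter tables suggest, a result callback is best handled by branching on `Detail.RiskLevel` rather than on `RiskDescription`. The Python sketch below illustrates this under stated assumptions: `decide_action` and the action names (`allow`, `block`, `manual_review`, `error`) are hypothetical, and only the `PASS` and `REJECT` values are taken from this page. Any other value is sent to manual review as a conservative default.

```python
def decide_action(callback: dict) -> str:
    """Map one moderation result callback to an application-level action."""
    if callback.get("Code") != 0:
        # RiskLevel is present only when Code is 0.
        return "error"
    risk_level = callback["Detail"]["RiskLevel"]
    if risk_level == "PASS":
        return "allow"
    if risk_level == "REJECT":
        return "block"
    # Assumption: any value not documented here gets a human look.
    return "manual_review"
```
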
Moderation task status callback
If the ReturnFinishInfo parameter of the Start Audio Stream Moderation request is 1, a moderation task status callback is sent after the audio stream moderation task completes.
| Common parameters | Type | Description |
|---|---|---|
| Event | String | Callback event. This callback returns censor_audio_v2_status. |
| AppId | Number | AppID. |
| Timestamp | Number | Server current time, Unix timestamp (seconds). |
| Nonce | String | Random number. |
| Signature | String | Signature. For details, see Verification instructions. |
| Business parameters | Type | Description |
|---|---|---|
| Code | Number | Return code. 0 indicates success. |
| Message | String | Description of the operation result. |
| TaskId | String | TaskId of the moderation task, corresponding to the TaskId returned by Start Audio Stream Moderation. |
| Status | Number | Moderation status. 0: Moderation completed. |
| AuxInfo | Object | Auxiliary information. |
| └ RoomId | String | Room ID. |
| └ CensorStreamTime | Number | Total duration of the moderation stream of this task (unit: seconds). |
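
Because both callbacks can arrive at the same URL, a receiver can tell them apart by the `Event` value. The following Python sketch shows one possible way to route on that field and to summarize a status callback; `route_callback` and `summarize_status` are hypothetical helper names.

```python
def route_callback(callback: dict) -> str:
    """Return which handler should process this callback, based on Event."""
    event = callback.get("Event")
    if event == "censor_audio_v2_result":
        return "result"
    if event == "censor_audio_v2_status":
        return "status"
    return "unknown"


def summarize_status(callback: dict) -> str:
    """Build a log line for a moderation task status callback."""
    aux = callback.get("AuxInfo", {})
    return (
        f"task {callback['TaskId']} in room {aux.get('RoomId')} "
        f"finished with status {callback['Status']} "
        f"after {aux.get('CensorStreamTime')}s"
    )
```
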
Data examples
Moderation result callback example
{
//Common parameters
"Event": "censor_audio_v2_result",
"AppId": 1,
"Timestamp": 1724743250,
"Nonce": "7407715855877898783",
"Signature": "5cc9e67af0ba0c95f99bd73f79a36485f574ad11",
//Business parameters
"Code": 0,
"Message": "success",
"TaskId": "384a8a77aeb352d3ec8144ab4640cc52",
"ResultTaskId": "384a8a77aeb352d3ec8144ab4640cc52_2",
"Detail": {
"RiskLevel": "REJECT",
"AudioText": "Let's be friends on Facebook, my ID is facebook.johndoe",
"AudioUrl": "http://xxxx/POST_AUDIOSTREAM%2FMP3%2F20240606%2F384a8a77aeb352d3ec8144ab4640cc52_2.mp3?Expires=1720269317&OSSAccessKeyId=LTAI5tLsVBxJ8nhyy5gQVW3K&Signature=8JZymbV%2F6Psm72k6S2Xq3Dcrg14%3D",
"PreAudioUrl": "http://xxxx/POST_AUDIOSTREAM%2FMP3%2F20240606%2F384a8a77aeb352d3ec8144ab4640cc52_2_pre.mp3?Expires=1720269317&OSSAccessKeyId=LTAI5tLsVBxJ8nhyy5gQVW3K&Signature=BKUDNNWPklQldaEMSFOvSts6O84%3D",
"RiskDescription": "ad:contact:contact",
"RiskLabel1": "ad",
"RiskLabel2": "lianxifangshi",
"RiskLabel3": "lianxifangshi",
"VadStatus": 1,
"RiskDetail": {
"RiskSource": 1001,
"AudioText": "Let's be friends on Facebook, my ID is facebook.johndoe",
"MatchedLists": null
},
"RiskInfoList": [
{
"RiskLevel": "REJECT",
"RiskDescription": "ad:contact:contact",
"RiskLabel1": "ad",
"RiskLabel2": "lianxifangshi",
"RiskLabel3": "lianxifangshi",
"RiskDetail": {
"RiskSource": 1001,
"AudioText": "Let's be friends on Facebook, my ID is facebook.johndoe",
"MatchedLists": null
}
},
{
"RiskLevel": "REJECT",
"RiskDescription": "ad:contact:contact",
"RiskLabel1": "ad",
"RiskLabel2": "lianxifangshi",
"RiskLabel3": "lianxifangshi",
"RiskDetail": {
"RiskSource": 1001,
"AudioText": "Let's be friends on Facebook, my ID is facebook.johndoe",
"MatchedLists": null
}
}
],
"BusinessInfoList": [
{
"BusinessDescription": "language:English:English",
"BusinessLabel1": "language",
"BusinessLabel2": "English",
"BusinessLabel3": "English"
}
]
},
"AuxInfo": {
"RoomId": "room_1",
"ProcessBeginTime": 1717677317155,
"ProcessFinishTime": 1717677317554,
"UserId":"user_1"
}
}
Moderation task status callback example
{
//Common parameters
"Event": "censor_audio_v2_status",
"AppId": 1,
"Timestamp": 1724743250,
"Nonce": "7407715855877898783",
"Signature": "5cc9e67af0ba0c95f99bd73f79a36485f574ad11",
//Business parameters
"Code": 0,
"Message": "success",
"TaskId": "384a8a77aeb352d3ec8144ab4640cc52",
"Status": 0,
"AuxInfo": {
"RoomId": "room_1",
"CensorStreamTime": 31
}
}
Return response
Returning an HTTP status code of 2XX (for example, 200) indicates that the callback was received successfully. Any other response indicates failure.
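
A minimal receiver that acknowledges each callback with 200 could look like the Python sketch below, which uses only the standard library. It is an illustration rather than a production server: `CallbackHandler`, `make_server`, and the `pending` queue are hypothetical names, and signature verification is omitted. The handler only stores the raw body and replies at once, on the assumption that a quick acknowledgement is preferable and that decoding and processing can happen afterwards.

```python
import queue
from http.server import BaseHTTPRequestHandler, HTTPServer

pending = queue.Queue()  # raw callback bodies waiting to be processed


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        pending.put(self.rfile.read(length))
        # Any 2XX status tells the sender the callback was received.
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def make_server(port: int = 0) -> HTTPServer:
    """Bind to localhost; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), CallbackHandler)
```

Calling `make_server(8080).serve_forever()` would start the receiver on port 8080. A separate worker can then take bodies from `pending` and process them.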
Callback retry strategy
If the ZEGO server receives no response, or receives an HTTP status code other than 2XX (for example, 200), it retries the callback up to 5 times. The intervals between each retry and the previous request are 2s, 4s, 8s, 16s, and 32s respectively. If the fifth retry also fails, no further retries are made and the callback is lost.
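
Two practical consequences of this schedule can be sketched in Python. First, adding up the intervals gives the time of each retry relative to the first attempt (ignoring request timeouts). Second, a callback that was processed but not acknowledged in time may be delivered again, so a receiver may want to deduplicate. The `Deduplicator` class is a hypothetical helper, and it assumes that a retried result callback carries the same `ResultTaskId` as the original delivery.

```python
from itertools import accumulate

RETRY_INTERVALS_S = [2, 4, 8, 16, 32]  # intervals from the retry strategy


def retry_offsets() -> list:
    """Seconds after the first attempt at which each retry is sent."""
    return list(accumulate(RETRY_INTERVALS_S))


class Deduplicator:
    """Remember handled callbacks so a retried delivery is processed once."""

    def __init__(self):
        self._seen = set()

    def is_new(self, callback: dict) -> bool:
        # Status callbacks have no ResultTaskId, so fall back to TaskId.
        ident = callback.get("ResultTaskId") or callback.get("TaskId")
        key = (callback.get("Event"), ident)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

Under these assumptions the last retry is sent 62 seconds after the first attempt, so a receiver that is unavailable for longer than that loses the callback.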
FAQ
- What happens if I call the start audio/video stream moderation API but never call the stop audio/video stream moderation API?
  The task ends automatically after pulling an empty stream for about 5 minutes. It is recommended to call the stop API explicitly when you finish moderation.
