Sound Level and Audio Spectrum

2024-01-31

Feature Overview

Concept	Description	Application Scenarios
Sound Level	The volume of a specific stream.	While publishing and playing streams, determine which users on the mic are speaking and show this in the UI.
Audio Spectrum	The energy of the digital audio signal at each frequency point.	In KTV singing scenarios, while publishing or playing streams, show the host or audience animations of pitch and volume changes.

Sample Source Code Download

Please refer to Download Sample Source Code to obtain the source code.

For related source code, please check the files in the "/ZegoExpressExample/Examples/AdvancedAudioProcessing/SoundLevelAndAudioSpectrum" directory.

Prerequisites

Before implementing sound level and audio spectrum features, please ensure:

A project has been created in the ZEGOCLOUD Console, and valid AppID and AppSign have been obtained. For details, please refer to Console - Project Information.
ZEGO Express SDK has been integrated into the project, and basic audio/video publishing and playing functionality has been implemented. For details, please refer to Quick Start - Integration and Quick Start - Implementation.

Usage Steps for Non-Mixed Stream Scenarios

1 Enable Sound Level and Audio Spectrum Monitoring

Sound level and audio spectrum monitoring are disabled by default. Call the corresponding interfaces to enable them. Sound level monitoring and audio spectrum monitoring can be enabled independently of each other.

Interface prototypes:

// Start sound level monitoring
virtual void startSoundLevelMonitor() = 0;

// Use this interface to pass in the corresponding config when VAD functionality is needed
virtual void startSoundLevelMonitor(ZegoSoundLevelConfig config) = 0;

// Start audio spectrum monitoring
virtual void startAudioSpectrumMonitor() = 0;

Call examples:

Call the startSoundLevelMonitor interface to start listening for sound level:

// Start sound level monitoring
engine->startSoundLevelMonitor();
// Use this interface when VAD functionality is needed
engine->startSoundLevelMonitor(config);

Call the startAudioSpectrumMonitor interface to start listening for audio spectrum:

// Start audio spectrum monitoring
engine->startAudioSpectrumMonitor();

2 Listen to Sound Level and Audio Spectrum Callbacks

After monitoring is enabled, the SDK periodically reports the current sound level and audio spectrum data through the corresponding callbacks (onCapturedSoundLevelUpdate, onRemoteSoundLevelUpdate, onCapturedAudioSpectrumUpdate, onRemoteAudioSpectrumUpdate). Override the callbacks you need and update the UI from them.

Interface prototypes:

/**
* Local captured audio sound level callback
* @param soundLevel Local captured sound level value, ranging from 0.0 to 100.0
*/
virtual void onCapturedSoundLevelUpdate(double soundLevel) {

}
/**
* Local captured audio sound level callback, supports voice activity detection
*
* @note To trigger this callback, you must call the [startSoundLevelMonitor] function to start the sound level monitor.
* @note Callback notification period is the parameter value set when calling [startSoundLevelMonitor].
*
* @param soundLevelInfo Local captured sound level value, ranging from 0.0 to 100.0
*/
virtual void onCapturedSoundLevelInfoUpdate(const ZegoSoundLevelInfo& soundLevelInfo) {

}

/**
* Remote audio sound level callback
* @param soundLevels Remote sound level key-value pairs, key is stream ID, value is the corresponding stream's sound level value
*/
virtual void onRemoteSoundLevelUpdate(const std::map<std::string, double>& soundLevels) {

}
/**
* Remote playing audio sound level callback, supports voice activity detection
*
* @note To trigger this callback, you must call the [startSoundLevelMonitor] function to start the sound level monitor, and be in the state of playing streams.
* @note Callback notification period is the parameter value set when calling [startSoundLevelMonitor].
*
* @param soundLevelInfos Remote sound level key-value pairs, key is stream ID, value is the corresponding stream's sound level value, value ranges from 0.0 to 100.0
*/
virtual void onRemoteSoundLevelInfoUpdate(const std::unordered_map<std::string, ZegoSoundLevelInfo>& soundLevelInfos) {

}

/**
* Local captured audio spectrum callback
* @param audioSpectrum Local captured audio spectrum value array, spectrum value range is [0-2^30]
*
*/
virtual void onCapturedAudioSpectrumUpdate(const ZegoAudioSpectrum& audioSpectrum) {

}

/**
* Remote playing audio spectrum callback
* @param audioSpectrums Remote audio spectrum key-value pairs, key is stream ID, value is the corresponding stream's audio spectrum value array, spectrum value range is [0-2^30]
*
*/
virtual void onRemoteAudioSpectrumUpdate(const std::map<std::string, ZegoAudioSpectrum>& audioSpectrums) {

}
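
The spectrum range documented above, [0, 2^30], spans many orders of magnitude, so drawing raw values directly tends to leave most bars invisible. The following is a minimal sketch of one way to compress the values onto a logarithmic scale before drawing. It is not SDK code: `spectrumToBars` is a hypothetical helper, and it assumes the spectrum data is available as a `std::vector<float>`.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the SDK.
// Assumption: the spectrum is available as a sequence of float values in
// the documented range [0, 2^30].
// Returns barCount bar heights, each in [0.0, 1.0].
std::vector<float> spectrumToBars(const std::vector<float>& spectrum,
                                  std::size_t barCount) {
    assert(barCount > 0);
    const double maxValue = 1073741824.0;             // 2^30
    const double maxLog = std::log2(1.0 + maxValue);  // top of the log scale
    std::vector<float> bars(barCount, 0.0f);
    if (spectrum.empty()) {
        return bars;  // nothing to draw yet
    }
    for (std::size_t i = 0; i < barCount; ++i) {
        // Each bar covers a contiguous slice of frequency points.
        std::size_t begin = i * spectrum.size() / barCount;
        std::size_t end = (i + 1) * spectrum.size() / barCount;
        if (end == begin) {
            end = begin + 1;  // more bars than points: reuse one point
        }
        // Use the peak of the slice; negative input is treated as 0.
        double peak = 0.0;
        for (std::size_t j = begin; j < end; ++j) {
            peak = std::max(peak, static_cast<double>(spectrum[j]));
        }
        peak = std::min(peak, maxValue);  // clamp out-of-range input
        bars[i] = static_cast<float>(std::log2(1.0 + peak) / maxLog);
    }
    return bars;
}
```

A UI could call such a helper from the audio spectrum callbacks and multiply each returned height by the maximum bar height in pixels. Taking the peak of each slice is a design choice; averaging the slice is an equally reasonable alternative.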

Call examples:

class MyEventHandler: public IZegoEventHandler
{
    virtual void onCapturedSoundLevelUpdate(double soundLevel) {
        printf("onCapturedSoundLevelUpdate\n");
        // ... operate ui
    }
    virtual void onCapturedSoundLevelInfoUpdate(const ZegoSoundLevelInfo& soundLevelInfo) {
        printf("onCapturedSoundLevelInfoUpdate\n");
        // ... operate ui
    }

    virtual void onRemoteSoundLevelUpdate(const std::map<std::string, double>& soundLevels) {
        printf("onRemoteSoundLevelUpdate\n");
        // ... operate ui
    }
    virtual void onRemoteSoundLevelInfoUpdate(const std::unordered_map<std::string, ZegoSoundLevelInfo>& soundLevelInfos) {
        printf("onRemoteSoundLevelInfoUpdate\n");
        // ... operate ui
    }

    virtual void onCapturedAudioSpectrumUpdate(const ZegoAudioSpectrum& audioSpectrum) {
        printf("onCapturedAudioSpectrumUpdate\n");
        // ... operate ui
    }

    virtual void onRemoteAudioSpectrumUpdate(const std::map<std::string, ZegoAudioSpectrum>& audioSpectrums) {
        printf("onRemoteAudioSpectrumUpdate\n");
        // ... operate ui
    }
};
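
As an illustration of the "operate ui" step, the sketch below shows one possible way to decide which remote users are currently speaking from the stream ID to sound level map delivered by onRemoteSoundLevelUpdate. The helper name `activeSpeakers` and the threshold are hypothetical. The SDK does not define a speaking threshold, so a suitable value would have to be chosen per application.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of the SDK.
// Returns the stream IDs whose sound level (0.0 to 100.0) is at or above
// the threshold, loudest first.
std::vector<std::string> activeSpeakers(
        const std::map<std::string, double>& soundLevels, double threshold) {
    assert(threshold >= 0.0 && threshold <= 100.0);
    // Collect (level, streamID) pairs that pass the threshold.
    std::vector<std::pair<double, std::string> > loud;
    for (const auto& entry : soundLevels) {
        if (entry.second >= threshold) {
            loud.emplace_back(entry.second, entry.first);
        }
    }
    // Sort by level, descending; equal levels keep map (alphabetical) order.
    std::stable_sort(loud.begin(), loud.end(),
                     [](const std::pair<double, std::string>& a,
                        const std::pair<double, std::string>& b) {
                         return a.first > b.first;
                     });
    std::vector<std::string> ids;
    for (const auto& item : loud) {
        ids.push_back(item.second);
    }
    return ids;
}
```

Inside the callback, the returned list could drive a "speaking" indicator next to each user's avatar. Sorting loudest first makes it easy to highlight only the top one or two speakers.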

3 Stop Sound Level and Audio Spectrum Monitoring

Sound level monitoring and audio spectrum monitoring can be stopped independently of each other.

Interface prototypes:

// Stop sound level monitoring
virtual void stopSoundLevelMonitor() = 0;
// Stop audio spectrum monitoring
virtual void stopAudioSpectrumMonitor() = 0;

Call examples:

Call the stopSoundLevelMonitor interface to stop listening for sound level:

// Stop sound level monitoring
engine->stopSoundLevelMonitor();

Call the stopAudioSpectrumMonitor interface to stop listening for audio spectrum:

// Stop audio spectrum monitoring
engine->stopAudioSpectrumMonitor();

Usage Steps for Mixed Stream Scenarios

Stream mixing combines multiple streams into a single stream. Because the mixed output is one stream, its own sound level cannot show the sound level of each input stream. To display per-input sound levels, the sound level of each input stream is carried in the stream information during mixing and parsed back out when the mixed output stream is played.
The parsed result is a dictionary whose key identifies an input stream and whose value is that stream's sound level. Because the size of the stream information is limited, the key cannot be the stream ID. A numeric ID (soundLevelID) identifies the stream instead.
In manual stream mixing, developers must maintain the mapping between numeric IDs (soundLevelID) and stream IDs themselves. The callback delivers numeric IDs (soundLevelID) with their sound levels.
In room automatic stream mixing, the stream mixing server and the SDK handle the mapping between numeric IDs and stream IDs automatically. The callback delivers sound levels keyed by stream ID.
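
For manual stream mixing, the mapping between soundLevelID and stream ID described above could be kept in a small lookup table. The sketch below is one possible shape for it, not SDK code: `SoundLevelIdRegistry` and its methods are hypothetical names. Only the map type passed to `resolve` mirrors the data delivered by onMixerSoundLevelUpdate.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical helper, not part of the SDK: assigns a numeric soundLevelID
// to each stream ID and translates callback data back to stream IDs.
class SoundLevelIdRegistry {
public:
    // Returns the numeric ID for a stream, creating one on first use.
    // The same stream ID always maps to the same numeric ID.
    unsigned int registerStream(const std::string& streamID) {
        assert(!streamID.empty());
        auto found = idByStream_.find(streamID);
        if (found != idByStream_.end()) {
            return found->second;
        }
        unsigned int id = nextID_++;
        idByStream_[streamID] = id;
        streamByID_[id] = streamID;
        return id;
    }

    // Converts soundLevelID-keyed levels into streamID-keyed levels.
    // IDs that were never registered are skipped.
    std::unordered_map<std::string, float> resolve(
            const std::unordered_map<unsigned int, float>& levels) const {
        std::unordered_map<std::string, float> byStream;
        for (const auto& entry : levels) {
            auto found = streamByID_.find(entry.first);
            if (found != streamByID_.end()) {
                byStream[found->second] = entry.second;
            }
        }
        return byStream;
    }

private:
    unsigned int nextID_ = 1;
    std::unordered_map<std::string, unsigned int> idByStream_;
    std::unordered_map<unsigned int, std::string> streamByID_;
};
```

The value returned by `registerStream` would be assigned to the soundLevelID of the matching mixer input when the task is configured, and `resolve` would be called on the map received in the manual mixing callback.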

1 Listen to Mixed Stream Sound Level Callback Interfaces

Interface prototypes

Sound level update callback interface onMixerSoundLevelUpdate for each single stream in manual stream mixing:

/**
 * Sound level update callback for each single stream in stream mixing
 *
 * Callback notification period is 100 ms.
 * @param soundLevels Sound level key-value pairs for each single stream in stream mixing, key is the soundLevelID of each single stream, value is the sound level value of the corresponding single stream, value ranges from 0.0 to 100.0
 */
virtual void onMixerSoundLevelUpdate(const std::unordered_map<unsigned int, float> &soundLevels){

}

Sound level update callback interface onAutoMixerSoundLevelUpdate for each single stream in automatic stream mixing:

Warning

The callback returns stream IDs only if you have logged in to the room where the automatic stream mixing task runs and are playing that room's mixed stream.

/**
 * Sound level update callback for each single stream in automatic stream mixing
 *
 * Callback notification period is 100 ms.
 * @param soundLevels Sound level key-value pairs for each single stream in stream mixing, key is the streamID of each single stream, value is the sound level value of the corresponding single stream, value ranges from 0.0 to 100.0
 */
virtual void onAutoMixerSoundLevelUpdate(const std::unordered_map<std::string, float> &soundLevels){

}
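
With a 100 ms notification period, drawing the raw values can make a level meter flicker. The sketch below shows one common smoothing approach (rise immediately, fall gradually) applied to a map of the same type as the one delivered by onAutoMixerSoundLevelUpdate. `LevelSmoother` and its release factor are hypothetical and not part of the SDK.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical helper, not part of the SDK: keeps a displayed level per
// stream that jumps up at once and decays by a fixed factor per update.
class LevelSmoother {
public:
    // release is the fraction of the displayed level kept on each update,
    // for example 0.5f halves the displayed level every callback period.
    explicit LevelSmoother(float release) : release_(release) {
        assert(release >= 0.0f && release < 1.0f);
    }

    // Feeds the latest callback data and returns the levels to display.
    // Streams missing from the latest data simply keep decaying.
    const std::unordered_map<std::string, float>& update(
            const std::unordered_map<std::string, float>& latest) {
        for (auto& shown : shown_) {
            shown.second *= release_;  // let every displayed level fall
        }
        for (const auto& entry : latest) {
            float& level = shown_[entry.first];        // 0 for new streams
            level = std::max(level, entry.second);     // rise immediately
        }
        return shown_;
    }

private:
    float release_;
    std::unordered_map<std::string, float> shown_;
};
```

One smoother instance would typically live alongside the event handler and be updated once per callback. Entries for streams that have left are never erased in this sketch, so a real implementation would probably remove levels once they decay to near zero.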

2 Start Listening to Sound Level Callback Switch

Sound level callbacks can be enabled when starting or updating a stream mixing task.

Manual stream mixing scenario

When calling the startMixerTask interface to initiate a manual stream mixing task, set the enableSoundLevel parameter to true to start listening to sound level, and specify a unique soundLevelID for each input stream:

ZegoMixerTask task;
task.taskID = "task123";
// Enable mixed stream sound level
task.enableSoundLevel = true;

ZegoMixerInput input;
// Assign a soundLevelID to the input stream
input.soundLevelID = 123;

// Other configurations

engine->startMixerTask(task, nullptr);

Automatic stream mixing scenario

When calling the startAutoMixerTask interface to initiate an automatic stream mixing task, set the enableSoundLevel parameter to true to start listening to sound level:

ZegoAutoMixerTask task;
task.taskID = "autotask123";
// Enable mixed stream sound level
task.enableSoundLevel = true;
// Other configurations

engine->startAutoMixerTask(task, nullptr);

3 Stop Listening to Sound Level Callback Switch

Sound level callbacks can be turned off by updating the stream mixing task.

Manual stream mixing scenario

When calling the startMixerTask client interface to update a stream mixing task, set the enableSoundLevel parameter to false to stop listening to sound level:

ZegoMixerTask task;
// taskID must remain consistent with the previous one
task.taskID = "task123";
// Stop listening to mixed stream sound level
task.enableSoundLevel = false;

engine->startMixerTask(task, nullptr);

Automatic stream mixing scenario

When calling the startAutoMixerTask client interface to update an automatic stream mixing task, set the enableSoundLevel parameter to false to stop listening to sound level:

ZegoAutoMixerTask task;
// taskID must remain consistent with the previous one
task.taskID = "autotask123";
// Stop listening to mixed stream sound level
task.enableSoundLevel = false;

engine->startAutoMixerTask(task, nullptr);

FAQ

Why didn't I receive relevant callbacks after enabling the sound level and spectrum monitoring switches?

Local capture callbacks are triggered immediately and report a value of 0 while no stream is being published. Remote playing callbacks are triggered only after a stream has been played successfully by calling startPlayingStream.
