I am working on an algorithm for dominant speaker selection in a close-proximity cluster of smartphones (iOS and Android) placed at arbitrary positions. The smartphones act as clients, and one of them also serves as the server. The server selects the dominant speaker from the RMS of each audio packet and applies a moving average to smooth out momentary surges and dips.

The issue I am encountering is that even when a person speaks right next to one smartphone, another smartphone's microphone picks up the sound and is incorrectly identified as the dominant speaker. This seems to be related to microphone sensitivity varying between smartphone models, and manufacturers do not provide direct access to mic sensitivity information.

Given that I cannot know the microphone sensitivity of each smartphone, I am seeking advice on alternative factors or methods that could improve the selection process. I want to achieve accurate dominant speaker identification regardless of mic sensitivity.
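For reference, the current server-side logic can be sketched roughly as follows. This is a minimal illustration only, not the production code: the class name, the window length of 10 packets, and the client IDs are assumptions made for the example.

```python
import math
from collections import deque


class DominantSpeakerSelector:
    """Picks the client whose moving-average packet RMS is highest."""

    def __init__(self, window=10):
        # window: number of recent packets averaged per client (assumed value)
        self.window = window
        self.history = {}  # client_id -> deque of recent packet RMS values

    @staticmethod
    def rms(samples):
        # Root mean square of one packet of PCM samples (floats in [-1, 1])
        if not samples:
            return 0.0
        return math.sqrt(sum(s * s for s in samples) / len(samples))

    def push(self, client_id, samples):
        # Store the RMS of the newest packet; old values fall out of the window
        buf = self.history.setdefault(client_id, deque(maxlen=self.window))
        buf.append(self.rms(samples))

    def smoothed(self, client_id):
        # Moving average of the stored RMS values for one client
        buf = self.history.get(client_id)
        return sum(buf) / len(buf) if buf else 0.0

    def dominant(self):
        # Client with the largest smoothed RMS, or None before any packet
        if not self.history:
            return None
        return max(self.history, key=self.smoothed)


# Hypothetical scenario: phone "A" is next to the talker but has a less
# sensitive mic (gain 0.5); phone "B" is farther away but more sensitive
# (gain 2.0). The amplitudes and gains are made-up numbers for illustration.
selector = DominantSpeakerSelector(window=10)
for _ in range(10):
    selector.push("A", [0.4 * 0.5] * 160)   # near talker, low sensitivity
    selector.push("B", [0.15 * 2.0] * 160)  # far from talker, high sensitivity

print(selector.dominant())
```

With these made-up gains, the farther but more sensitive phone "B" ends up with the higher smoothed RMS and is selected, which is the failure I am describing.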

Specifically, I would like to know:

  • Are there any alternative or indirect ways to estimate the microphone sensitivity of smartphones, or is there any public database that provides such information?
  • What other factors, apart from microphone sensitivity, can significantly influence the dominant speaker selection process?
  • What additional signal-processing techniques or algorithms can be applied to enhance the accuracy of dominant speaker identification?
  • Should I consider using multiple features from the audio signal, such as frequency characteristics, to improve the robustness of the algorithm?

I would appreciate any insights, suggestions, or research papers that could help me address this challenge and achieve reliable dominant speaker selection in the described proximity cluster of smartphones. Thank you!
