Systems and methods are described that perform audio source localization in a manner that provides increased robustness and responsiveness in the presence of acoustic echo. The systems and methods calculate a difference between a signal level associated with one or more of the audio signals generated by a microphone array and an estimated level of acoustic echo associated with one or more of the audio signals. This information is then used to determine whether and/or how to perform audio source localization. For example, a controller may use the difference to determine whether or not to freeze an audio source localization module that operates on the audio signals. As another example, the audio source localization module may incorporate the difference (or the estimated level of acoustic echo used to calculate the difference) into the logic that is used to determine the location of a desired audio source.
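The controller example above can be illustrated with a minimal sketch. It is not the described implementation: the function names, the use of mean-square power in decibels as the "level", the availability of an echo estimate as a frame of samples (for instance from an echo canceller), and the 6 dB threshold are all assumptions made for illustration.

```python
import math
from typing import Sequence


def level_db(samples: Sequence[float], floor: float = 1e-12) -> float:
    """Return the mean-square power of a frame in dB (hypothetical level measure)."""
    if len(samples) == 0:
        raise ValueError("frame must contain at least one sample")
    power = sum(s * s for s in samples) / len(samples)
    # Clamp to a small floor so an all-zero frame does not produce log10(0).
    return 10.0 * math.log10(max(power, floor))


def signal_to_echo_difference_db(
    mic_frame: Sequence[float], echo_estimate_frame: Sequence[float]
) -> float:
    """Difference between the microphone signal level and the estimated echo level."""
    return level_db(mic_frame) - level_db(echo_estimate_frame)


def should_freeze_localization(difference_db: float, threshold_db: float = 6.0) -> bool:
    """Freeze localization when the signal does not exceed the echo estimate by the
    threshold. The 6 dB default is an assumed value, not one taken from the source."""
    return difference_db < threshold_db


# Frame dominated by echo: the levels are close, so localization would be frozen.
echo_dominated = signal_to_echo_difference_db([0.5] * 4, [0.4] * 4)
# Frame dominated by a desired talker: the signal is well above the echo estimate.
talker_dominated = signal_to_echo_difference_db([0.5] * 4, [0.05] * 4)

freeze_when_echo_dominated = should_freeze_localization(echo_dominated)
freeze_when_talker_dominated = should_freeze_localization(talker_dominated)
```

In this sketch the difference acts only as a gate on whether localization runs. The alternative noted above, in which the localization logic itself incorporates the difference, could instead use the same value as a per-frame weight.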