Microsoft Teams now uses AI to improve echo, interruptions, and acoustics

Microsoft has spent the last two years adding flashy new productivity features to Teams, and now the company is using AI to overhaul how the basics work. We’ve all been on a call where someone’s bad room acoustics make them hard to hear, or where two people try to talk at the same time, creating an awkward “no, go ahead” moment. Microsoft’s new AI-powered voice quality improvements should reduce or even eliminate these day-to-day annoyances.

Microsoft now uses machine learning models to improve room acoustics, so you no longer sound like you’re hiding in a cave. “While we’ve done our best with digital signal processing to do a great job in Teams, we’ve now started using machine learning for the first time to build echo cancellation where you can really reduce the echo from all the different devices,” explains Robert Aichner, a principal program manager for intelligent conversation and communications cloud at Microsoft, in an interview with The Verge.

Microsoft has been testing the feature for months, measuring its models in the real world to make sure that users actually notice the echo reduction and the improvements in call quality. The software maker used 30,000 hours of speech to help train its models and captured recordings from thousands of devices through crowdsourcing, where participants are paid to record their voice and play back audio from their device.

“We also simulate about 100,000 different rooms … the acoustics of the room play an important role in the cancellation of the echo,” says Aichner. The result is a big improvement in call audio quality, along with echo cancellation that allows several people to speak at the same time. You can see all the improvements in action in the video above.
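To see why room acoustics matter so much here, consider how a room changes a signal: the microphone picks up the direct sound plus delayed, weaker copies reflected off the walls. The sketch below is only an illustration of that idea, not Microsoft’s pipeline. It builds a toy room response and applies it to a short signal by convolution, which is one common way simulated rooms can be used to produce echoey audio for training. The function names, parameters, and sample values are all hypothetical.

```python
import random


def synthetic_room_response(length, decay, num_reflections, seed=0):
    """Build a toy impulse response: a direct path plus decaying reflections.

    Hypothetical helper for illustration; real room simulators model
    geometry and wall materials rather than picking random delays.
    """
    rng = random.Random(seed)
    response = [0.0] * length
    response[0] = 1.0  # direct path from the talker to the microphone
    for _ in range(num_reflections):
        delay = rng.randrange(1, length)  # reflections always arrive later
        # Amplitude falls off exponentially with delay, so late echoes are weaker.
        response[delay] += rng.choice([-1.0, 1.0]) * decay ** delay
    return response


def convolve(signal, response):
    """Apply a room response to a signal by direct (time-domain) convolution."""
    out = [0.0] * (len(signal) + len(response) - 1)
    for i, sample in enumerate(signal):
        if sample == 0.0:
            continue  # silent samples contribute nothing
        for j, tap in enumerate(response):
            out[i + j] += sample * tap
    return out


# A tiny stand-in for a "dry" (echo-free) recording.
dry = [1.0, 0.5, -0.25, 0.0]

# Two assumed rooms: a small, quickly decaying one and a larger, more reverberant one.
small_room = synthetic_room_response(length=8, decay=0.5, num_reflections=3, seed=1)
large_room = synthetic_room_response(length=32, decay=0.9, num_reflections=12, seed=1)

echo_small = convolve(dry, small_room)
echo_large = convolve(dry, large_room)

# The reverberant room smears the same four samples over a much longer tail.
print(len(dry), len(echo_small), len(echo_large))
```

Varying the length, decay, and number of reflections gives many different “rooms” from the same dry recording, which hints at why a large set of simulated rooms is useful when training a model to undo the effect.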

If Teams detects sound bouncing or reverberating in a room, resulting in hollow-sounding audio, the model will also process the captured audio so that it sounds as if Teams participants are speaking into a close-range microphone instead of an echoey mess.

The most impressive part is that people can now interrupt each other on Teams calls without the awkward overlap where the other person can’t be heard because of the echo. Microsoft is now rolling out all this work to Teams, alongside the AI-based noise suppression improvements it made previously. All processing is done locally on client devices rather than in the cloud.

“We said we wanted to do it on the client, because the cloud is still expensive if you want to process all the calls in the cloud … and obviously we would then have to pass that cost on to the customer,” says Aichner. That would potentially have meant restricting these major improvements to paying customers, whereas running them on the device means that features such as noise suppression are available on 90 percent of devices using Teams.

All of these new Microsoft Teams enhancements are available now, along with some real-time screen optimizations for text in video and AI-based improvements for bandwidth constraints during video calls or screen sharing.
