Google's Latest Party Trick: AI That Isolates Voices in a Crowd

Google's latest practical machine learning exercise attempts to replicate the cocktail party effect: the human brain's ability to focus on one source of audio while filtering out others. It uses a combined audio-visual approach to isolate the voice of a single speaker in videos with multiple overlapping speakers, which could have uses ranging from more accurate closed captioning to better-functioning voice-controlled devices.

The author bemoans the potential privacy implications of creating better surveillance technology, but we'd argue that ship mostly sailed when the recording was made and shared in the first place.

