Originally published at: MIT Develops AI That Handles Speech and Object Recognition All at Once | NVIDIA Technical Blog
MIT researchers have developed a deep learning system that can identify objects within an image in real time, based on a spoken description of the picture. “We wanted to do speech recognition in a way that’s more natural, leveraging additional signals and information that humans have the benefit of using, but that machine learning algorithms…