Tiny AI models could supercharge autocorrect and voice assistants on your phone


Researchers have successfully shrunk giant language models down to a size practical for commercial applications.

You're likely familiar with the impressive results of the language models released in the past year, as well as their massive parameter counts—I've covered these developments here pretty extensively. This article summarizes two papers, one from Huawei and one from Google, that shrank these behemoth models many times over while retaining nearly identical performance. The technique they used is interesting:

Both papers use variations of a common compression technique known as knowledge distillation. It involves using the large AI model that you want to shrink (the “teacher”) to train a much smaller model (the “student”) in its image. To do so, you feed the same inputs into both and then tweak the student until its outputs match the teacher’s.
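To make the idea concrete, here is a minimal sketch of knowledge distillation in NumPy. The "teacher" is a fixed linear classifier, and the "student" is a smaller model that only sees half of the input features; all model shapes, temperatures, and learning rates here are illustrative assumptions, not details from either paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z, T=1.0):
    # Temperature-softened softmax: higher T gives softer targets,
    # a common choice in distillation setups.
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# "Teacher": a fixed, larger linear model (a stand-in for the big network).
W_teacher = rng.normal(size=(8, 3))

# "Student": a smaller model that sees only 4 of the 8 input features.
W_student = np.zeros((4, 3))

# Feed the same inputs into both models; the teacher's soft outputs
# become the training targets for the student.
X = rng.normal(size=(256, 8))
teacher_probs = softmax(X @ W_teacher, T=2.0)

lr = 0.5
for _ in range(500):
    student_probs = softmax(X[:, :4] @ W_student, T=2.0)
    # Gradient of cross-entropy(teacher_probs, student_probs) w.r.t. the
    # student's logits (the 1/T factor is absorbed into the learning rate).
    grad_logits = (student_probs - teacher_probs) / len(X)
    W_student -= lr * X[:, :4].T @ grad_logits

# Fraction of inputs where the student's top prediction matches the teacher's.
agreement = np.mean(
    (X[:, :4] @ W_student).argmax(axis=1) == (X @ W_teacher).argmax(axis=1)
)
```

The student can't match the teacher perfectly—it has fewer parameters and less information—but distillation pushes its predictions well above chance toward the teacher's, which is exactly the trade-off both papers exploit at far larger scale.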
