How to Serve Models

bugra.github.io

There are many ways to serve ML (machine learning) models, but these are the three most common patterns I have observed over the years:
1) Materialize/compute predictions offline and serve them through a database,
2) Use the model within the main application, so that model serving/deployment happens as part of the main application's deployment,
3) Use the model separately in a microservice architecture, where you send input and get output.
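
To make the three patterns concrete, here is a minimal Python sketch. It is not taken from the linked post: the `predict` function is a hypothetical stand-in for a trained model, an in-memory SQLite table stands in for the serving database, and the "microservice" is reduced to a function that accepts and returns JSON bytes, with the network transport (for example HTTP) left out entirely.

```python
import json
import sqlite3


def predict(x: float) -> float:
    """Hypothetical stand-in for a trained model (a fixed linear function)."""
    return 2.0 * x + 1.0


# Pattern 1: compute predictions offline, then serve them from a database.
def materialize(inputs, conn):
    """Batch job: score every known input and store the results."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS predictions (x REAL PRIMARY KEY, y REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO predictions VALUES (?, ?)",
        [(x, predict(x)) for x in inputs],
    )
    conn.commit()


def lookup(x, conn):
    """Serving path: a plain database read, no model call at request time."""
    row = conn.execute(
        "SELECT y FROM predictions WHERE x = ?", (x,)
    ).fetchone()
    return None if row is None else row[0]


# Pattern 2: the model lives inside the main application.
def handle_request_in_app(x):
    """Application handler that calls the model in-process."""
    return {"input": x, "prediction": predict(x)}


# Pattern 3: the model sits behind a service boundary.
def model_service(request_body: bytes) -> bytes:
    """Assumed service entry point: JSON bytes in, JSON bytes out.
    A real deployment would expose this over a network protocol."""
    payload = json.loads(request_body)
    return json.dumps({"prediction": predict(payload["x"])}).encode()


def call_model_service(x):
    """Client side: serialize the input, call the service, parse the reply."""
    response = model_service(json.dumps({"x": x}).encode())
    return json.loads(response)["prediction"]
```

The trade-offs show up even at this scale: pattern 1 can only answer for inputs that were scored ahead of time (`lookup` returns `None` otherwise), pattern 2 ties model updates to application releases, and pattern 3 adds serialization and a separate service to operate.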

Yep yep yep. This is the clearest post I've read on this topic, and it is extremely helpful if you're thinking about how to design a production ML system right now. The author runs the search engineering team at Jet.com, and his recommendations are those of an experienced practitioner: he doesn't push the reader straight to the most architecturally "pure" approach (#3), and he recognizes the overhead of running the microservices architecture it requires.
