Publishing with Apache Kafka at The New York Times

The biggest architectural shift happening in modern distributed systems is moving to log-based communication between large numbers of microservices.

If that sentence made your eyes glaze over, you are not alone. Data scientists and analysts typically don't spend much time thinking about how their data arrives at their doorstep, but the adoption of log-based communication will significantly change your work in the coming years. You should get ahead of it.

Imagine that, instead of querying a customers table, you query a customers event stream that contains every update ever made to a customer. You'll find that some queries get more complicated: if you want the current state of a customer, you'll have to reconstruct it yourself. But you'll also have more power: you'll be able to reconstruct that state as of any point in time. You'll also likely get access to data much closer to real time.
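To make that trade-off concrete, here is a minimal Python sketch of rebuilding a customer's state from an event stream. Everything in it is an assumption made for illustration: the event shape (customer_id, ts, changes), the sample data, and the state_as_of helper are hypothetical and do not come from the Times' system or from any Kafka API.

```python
from datetime import datetime

# Hypothetical event stream: each event records one update to one customer.
# In a real log-based system these would be read from a topic, not a list.
events = [
    {"customer_id": 1, "ts": datetime(2017, 1, 5), "changes": {"name": "Ada", "plan": "free"}},
    {"customer_id": 2, "ts": datetime(2017, 2, 1), "changes": {"name": "Grace", "plan": "free"}},
    {"customer_id": 1, "ts": datetime(2017, 3, 10), "changes": {"plan": "paid"}},
    {"customer_id": 1, "ts": datetime(2017, 6, 2), "changes": {"email": "ada@example.com"}},
]


def state_as_of(events, customer_id, as_of=None):
    """Fold one customer's events, in time order, into a single state dict.

    With as_of=None every event is applied, which yields the current state.
    With a timestamp, events after that moment are ignored.
    """
    state = {}
    for event in sorted(events, key=lambda e: e["ts"]):
        if event["customer_id"] != customer_id:
            continue
        if as_of is not None and event["ts"] > as_of:
            break  # events are sorted, so everything after this is later still
        state.update(event["changes"])
    return state


current = state_as_of(events, 1)
in_february = state_as_of(events, 1, as_of=datetime(2017, 2, 15))
```

Here current holds the customer's latest name, plan, and email, while in_february shows the same customer still on the free plan with no email recorded. A customers table would answer only the first question. The event stream answers both, at the cost of doing the fold yourself.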

This article is an excellent intro to log-based architectures, as seen through The New York Times' experience with its own migration. If this isn't a topic you're familiar with, take the time to digest it.

