Running Apache Airflow At Lyft

Sticking with Airflow for a second, this is a stellar post where the Lyft data eng team talks about their production Airflow deployment (500+ DAGs!). They discuss:

  • overall architecture
  • monitoring & SLAs
  • customizations they've made
  • production performance and reliability

Even as the post outlines a very sophisticated Airflow environment, it's hard to miss the areas of the product where duct tape needed to be applied. The monitoring system, in particular, felt somewhat rudimentary relative to its criticality; there is clearly a lot of scope for a managed service to add value here.

