We’re All Using Airflow Wrong and How to Fix It


Tl;dr: only use Kubernetes Operators

Running jobs with heterogeneous dependencies as part of a single DAG feels like it shouldn't be that hard, but as soon as you have two requirements.txt files, things can go bad quickly.

The engineering team at Bluecore didn't love their original Airflow experience and developed an opinionated solution involving Docker and Kubernetes. They haven't looked back—the results have been nothing but positive.
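To make the idea concrete, here is a minimal sketch of what "one container per task" looks like. It is not Bluecore's implementation and it deliberately avoids the Airflow API; it only builds the kind of Kubernetes Pod manifest that a Kubernetes-based operator would end up submitting. The task names, image names, and the pod_manifest helper are all hypothetical.

```python
import json


def pod_manifest(task_id, image, command):
    """Build a minimal Kubernetes Pod manifest for a single pipeline task.

    Hypothetical helper for illustration: each task gets its own image,
    so each task can ship its own requirements.txt without conflicts.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": task_id, "labels": {"task": task_id}},
        "spec": {
            # Batch-style task: run once, do not restart on exit.
            "restartPolicy": "Never",
            "containers": [
                {"name": "main", "image": image, "command": list(command)}
            ],
        },
    }


# Assumed example tasks; the registry and image tags are made up.
TASKS = [
    ("extract-events", "example.registry/extract:1.0", ["python", "extract.py"]),
    ("train-model", "example.registry/train:2.3", ["python", "train.py"]),
]

manifests = [pod_manifest(task_id, image, cmd) for task_id, image, cmd in TASKS]

if __name__ == "__main__":
    print(json.dumps(manifests, indent=2))
```

The scheduler only needs to know which image to launch and in what order; the dependencies live inside each image rather than in the Airflow workers' shared Python environment.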

Is this the right approach for everyone? I actually don't know. Airflow came to market prior to the rise of Docker and Kubernetes, but at this point I have a hard time imagining wanting to run a huge Airflow installation without the infrastructure they provide.

