Reproducible research: Stripe’s approach to data science

The data team at Stripe has invested heavily in reproducibility, with great results. In this post, they share how the team publishes internal research that any member of the team, current or future, can reproduce from scratch. Git, Jupyter, and internally built tools are at the heart of this workflow.

This is a must-read. Data teams need to think of their outputs as research and focus on building high-quality mechanisms for producing and maintaining that research.

