Opinionated analysis development


This might feel a little obvious to you if you're in the trenches doing real data work, but what's fascinating to me is that the point even needs to be made! Hilary Parker of Stitch Fix is telling academic data programs that they need to actually teach students how to construct an analysis:

Traditionally, statistical training has focused primarily on mathematical derivations and proofs of statistical tests. The process of developing the technical artifact—that is, the paper, dashboard, or other deliverable—is much less frequently taught, presumably because of an aversion to cookbookery or prescribing specific software choices. In this paper I argue that it’s critical to teach analysts how to go about developing an analysis in order to maximize the probability that their analysis is reproducible, accurate, and collaborative. A critical component of this is adopting a blameless postmortem culture. By encouraging the use of and fluency in tooling that implements these opinions, as well as a blameless way of correcting course as analysts encounter errors, we as a community can foster the growth of processes that fail the practitioners as infrequently as possible.

Teaching someone how to derive the mean of the binomial distribution but not how to conduct a reproducible analysis and check it into source control feels to me like teaching Newtonian physics without ever covering the scientific method. Process turns out to be pretty damn important.

