Dynamic Data Testing: Tests that Learn with Data


I love love love this post.

Dynamic testing strategies such as predicted ranges or unsupervised detection have some significant advantages. Compared with static assertions, they are easier to set up and easier to maintain over time. They can also be used to test any data for any condition, regardless of the current quality of the data.

Most people don't even have static assertions in their data pipelines. Those who do often find that they are brittle (✋), which makes it hard to maintain a commitment to them over time. This post walks through an example of a dynamic data test that can learn from data to minimize maintenance burden and false positives while still providing great test coverage.
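To make the idea concrete, here is a minimal sketch of a predicted-range test. It is not the linked post's implementation: the function names, the choice of mean plus or minus `k` standard deviations, and the sample row counts are all assumptions made for illustration. The point is that the bounds come from the data's own history rather than from a hard-coded threshold.

```python
from statistics import mean, stdev


def predicted_range(history, k=3.0):
    """Learn (low, high) bounds from past values as mean +/- k * sample stdev.

    Hypothetical helper: `k` is an assumed tolerance, not a value from the post.
    """
    if len(history) < 2:
        raise ValueError("need at least two historical values")
    center = mean(history)
    spread = stdev(history)
    return center - k * spread, center + k * spread


def dynamic_test(history, new_value, k=3.0):
    """Pass when new_value falls inside the range predicted from history."""
    low, high = predicted_range(history, k)
    return low <= new_value <= high


# Made-up daily row counts for a table, used only for illustration.
daily_row_counts = [1020, 980, 1005, 995, 1010, 990, 1000]

print(dynamic_test(daily_row_counts, 1015))  # a typical day
print(dynamic_test(daily_row_counts, 400))   # a sharp drop in volume
```

Because the range is recomputed from recent history on each run, the test adapts as the data drifts, which is what keeps the maintenance burden low compared with a static assertion like "row count must exceed 900".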

