How to Track Data Changes with dbt Snapshots

Most modern data warehouse stacks don't track changes in mutable data. If you load a table in Snowflake via Stitch or Fivetran and a record in that table changes in the source, the corresponding change happens in Snowflake. Any record of the historical state of that record is destroyed completely.

Sometimes, this is 🤷, but sometimes this is a big problem. Critical tables that are used to build financial metrics, in particular, are sensitive to this problem, because finance teams tend to frown upon historicals changing once they've been reported to the board.

This post goes into how a Kimball modeling technique, Type-2 Slowly Changing Dimensions, solves this problem, and how to easily implement Type-2 SCDs in dbt.

This is an important and often-overlooked topic, and the post is a perfect entry point if you want to get yourself up to speed.


Want to receive more content like this in your inbox?