Wide Tables or Star Schemas?


Michael Kaminsky of Gradient Metrics takes us through a benchmark comparison of data architectures.

I'm so glad this post exists. I've witnessed (and been a part of) so many conversations about data modeling practices that have gotten almost religious: there are Kimball advocates who will clutch their star schema ERDs to the end. My belief is that there is plenty to recommend traditional star-schema-style modeling, but that modern data tech allows us much more flexibility in our design choices. Often, performance considerations outweigh the need for the tidiness of a good star schema.

And that's where this post comes in. It does a solid job of benchmarking the performance of the two common design patterns for data models, and it does so on the three leading data warehouse platforms. It turns out that denormalizing the data into a single large table was the faster design on each platform.

This doesn't necessarily mean that you should only make completely denormalized tables, but it should weigh into your design thinking.
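For anyone who wants the two patterns side by side, here is a minimal sketch of what "star schema" versus "one wide table" looks like in practice. Everything in it is an assumption made for illustration: the table and column names (`dim_customer`, `fact_orders`, `wide_orders`) are hypothetical, and in-memory SQLite is only a stand-in, not one of the warehouses from the benchmark. It shows that both designs answer the same question, not how fast either one is.

```python
import sqlite3

# In-memory SQLite as a stand-in; the benchmarked warehouses are not used here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Star schema: a narrow fact table plus a dimension table (hypothetical names).
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE fact_orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);

    INSERT INTO dim_customer VALUES (1, 'East'), (2, 'West');
    INSERT INTO fact_orders VALUES (10, 1, 20.0), (11, 1, 5.0), (12, 2, 7.5);

    -- Wide table: the same data with the dimension attributes joined in once, up front.
    CREATE TABLE wide_orders AS
    SELECT o.order_id, o.customer_id, c.region, o.amount
    FROM fact_orders AS o
    JOIN dim_customer AS c ON c.customer_id = o.customer_id;
""")

# Star schema query: the join happens every time the question is asked.
star_result = conn.execute("""
    SELECT c.region, SUM(o.amount)
    FROM fact_orders AS o
    JOIN dim_customer AS c ON c.customer_id = o.customer_id
    GROUP BY c.region
    ORDER BY c.region
""").fetchall()

# Wide table query: no join, because the region column is already on each row.
wide_result = conn.execute("""
    SELECT region, SUM(amount)
    FROM wide_orders
    GROUP BY region
    ORDER BY region
""").fetchall()

print(star_result)  # [('East', 25.0), ('West', 7.5)]
print(wide_result)  # [('East', 25.0), ('West', 7.5)]
```

The trade-off is visible even at this size: the wide table repeats `region` on every order row, which costs storage and makes updates to a customer's region more awkward, in exchange for skipping the join at query time.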

