Moving Thumbtack’s Data Infrastructure to GCP

This is an amazing walkthrough of a significant data engineering effort. I like the post so much because it puts scalability and maintainability at the very center. Thumbtack had pain around managing compute instances, so they switched to a managed architecture (Dataproc + BigQuery + GCS). As a result, they get better performance and more control. But most importantly, they changed how they spend their time.

We’ve seen tremendous productivity gains across the organization by our move to managed services (...). Going forward, our infrastructure investments will be focused on further empowering our engineering, analytics and data science teams to leverage our large-scale data in new ways.

Data teams should be seriously considering the impact of their tech choices on how they spend their time. Forcing yourself to maintain servers will prevent you from focusing on what matters: analyzing data.


