How We Improved Data Discovery for Data Scientists at Spotify

Wow. There has been so much activity in the data discovery / knowledge management / data catalog space within Big Tech in the recent past. LinkedIn, Airbnb, Lyft, and WeWork have all made meaningful contributions here, and there are many different internal tools (some open source and some not) floating around.

I continue to care a lot about this because I think it's the Next Big Problem in data. Data warehouses and data ingestion tooling are mature, data transformation in the warehouse environment is increasingly mature, and now users are beginning to create massive numbers of datasets using this new environment they've been given. In any organization of sufficient size, curation and discovery become an issue very quickly.

I'm looking forward to spending more time on this problem in the coming months.


