Nemo: Data discovery at Facebook

engineering.fb.com

Facebook finally joins Airbnb, Lyft, Netflix, and Uber (and many others) in creating its own in-house data catalog. If you're following the space (as I very much am), this isn't a revolutionary release: it hits the same themes as similar in-house products. And because it's built on top of Unicorn, Facebook's proprietary social graph search utility, it's unlikely to be open sourced at any point.

There are a lot of nice touches though. Here's my favorite paragraph:

Nemo indexing is generally aware of our data ecosystem. For example, if a data pipeline duplicates a column into a downstream table, the original column’s description and the upstream table’s name are also stored for the downstream artifact. Presto queries of data artifacts are noted, so if an engineer performs a Presto query, that will increase the Nemo score both generally, for that table, and for the specific engineer who performed the search.
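
To make the two ideas in that paragraph concrete, here is a toy sketch in Python. It is not Nemo's implementation (the post doesn't publish one); every name here (`ToyIndex`, `record_copy`, `record_query`, the additive scoring) is a hypothetical stand-in, and the real ranking is surely more involved. It only shows the shape of the behavior: a copied column inherits the original description and upstream table name, and a query bumps a table's score both globally and for the engineer who ran it.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    description: str = ""
    # Populated when a pipeline copies this column from an upstream table.
    inherited_description: str = ""
    upstream_table: str = ""


@dataclass
class Table:
    name: str
    columns: dict = field(default_factory=dict)  # column name -> Column


class ToyIndex:
    """Hypothetical, heavily simplified stand-in for a catalog index."""

    def __init__(self):
        self.tables = {}
        self.global_score = defaultdict(float)  # table -> score
        self.user_score = defaultdict(float)    # (user, table) -> score

    def add_table(self, table):
        self.tables[table.name] = table

    def record_copy(self, src_table, src_col, dst_table, dst_col):
        # Lineage: store the original description and the upstream
        # table name on the downstream column.
        src = self.tables[src_table].columns[src_col]
        dst = self.tables[dst_table].columns[dst_col]
        dst.inherited_description = src.description
        dst.upstream_table = src_table

    def record_query(self, user, table, weight=1.0):
        # Usage: one query raises the table's score for everyone
        # and, separately, for the user who ran it.
        self.global_score[table] += weight
        self.user_score[(user, table)] += weight

    def score(self, user, table):
        # Assumed combination rule: plain sum of the two signals.
        return (self.global_score.get(table, 0.0)
                + self.user_score.get((user, table), 0.0))


idx = ToyIndex()
idx.add_table(Table("events_raw", {"user_id": Column("user_id", "Logged-in user ID")}))
idx.add_table(Table("events_daily", {"uid": Column("uid")}))
idx.record_copy("events_raw", "user_id", "events_daily", "uid")
idx.record_query("alice", "events_daily")
```

After this runs, the `uid` column in `events_daily` carries the description written for `events_raw.user_id`, and `events_daily` ranks higher for alice than for anyone who has not queried it.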
