Lazydata: Scalable Data Dependencies for Python Projects

Very cool project!

Problem: Keeping all data files in git (e.g. via git-lfs) results in a bloated repository copy that takes ages to pull. Keeping code and data out of sync is a disaster waiting to happen.
Solution: lazydata only stores references to data files in git, and syncs data files on-demand when they are needed.
Why: The semantics of code and data are different - code needs to be versioned so that changes can be merged, while data just needs to be kept in sync. lazydata achieves exactly this in a minimal way.
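
To make the "references in git, files on demand" idea concrete, here is a minimal sketch in plain Python. It is not lazydata's implementation or API: the function names (`publish`, `fetch_if_needed`), the use of a SHA-256 digest as the reference, and the local directory standing in for remote storage are all assumptions made for illustration.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def publish(path: Path, remote: Path) -> str:
    """Copy a data file into the (hypothetical) remote store, keyed by digest.

    The returned digest is the small "reference" that would be committed
    to git in place of the data file itself.
    """
    digest = sha256_of(path)
    remote.mkdir(parents=True, exist_ok=True)
    shutil.copy(path, remote / digest)
    return digest


def fetch_if_needed(path: Path, digest: str, remote: Path) -> Path:
    """Ensure the local file matches the reference, downloading only if needed."""
    if path.exists() and sha256_of(path) == digest:
        return path  # already in sync, nothing to transfer
    path.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy(remote / digest, path)
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        data = root / "project" / "data.csv"
        data.parent.mkdir(parents=True)
        data.write_text("a,b\n1,2\n")

        ref = publish(data, root / "remote")  # only `ref` would live in git
        data.unlink()                         # simulate a fresh clone without data

        restored = fetch_if_needed(data, ref, root / "remote")
        print(restored.read_text())
```

In this sketch a collaborator who clones the repository gets only the short digest, and the data file is copied in the first time the code asks for it. A file that is already present and matches the digest is left alone, which is what keeps pulls fast.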

