The Cramer Distance as a Solution to Biased Wasserstein Gradients

arxiv.org

In this paper, the authors describe three natural properties of probability divergences that reflect requirements from machine learning: sum invariance, scale sensitivity, and unbiased sample gradients. The Wasserstein metric possesses the first two properties but, unlike the Kullback-Leibler divergence, does not possess the third, and the authors provide empirical evidence suggesting that this is a serious issue in practice. As the solution promised in the title, they propose the Cramér distance, which they show possesses all three properties.
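To make the gradient-bias claim concrete, here is a minimal NumPy sketch (not the authors' code) comparing the two losses in the simplest setting: a Bernoulli(p) target estimated from m samples and a Bernoulli(theta) model, where both the one-dimensional Wasserstein-1 distance and the squared Cramér distance have closed forms. The values of p, theta, m, and the trial count are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
p, theta, m, trials = 0.6, 0.5, 1, 100_000

# Closed forms for Bernoulli(a) vs Bernoulli(theta) in one dimension:
#   Wasserstein-1:          W = |a - theta|     -> dW/dtheta = -sign(a - theta)
#   squared Cramer (l_2^2): C = (a - theta)^2   -> dC/dtheta = -2 * (a - theta)
grad_w_true = -np.sign(p - theta)   # gradient of the true Wasserstein loss
grad_c_true = -2.0 * (p - theta)    # gradient of the true Cramer loss

# Replace the target p with an m-sample empirical estimate p_hat, then
# average the resulting sample gradients over many independent draws.
p_hat = rng.binomial(m, p, size=trials) / m
grad_w_sample = (-np.sign(p_hat - theta)).mean()
grad_c_sample = (-2.0 * (p_hat - theta)).mean()

print(f"Wasserstein: true grad {grad_w_true:+.3f}, mean sample grad {grad_w_sample:+.3f}")
print(f"Cramer:      true grad {grad_c_true:+.3f}, mean sample grad {grad_c_sample:+.3f}")
```

With these settings the mean sample Wasserstein gradient comes out near -0.2 while the true gradient is -1, so stochastic gradient descent on the sample loss minimizes a different objective; the Cramér sample gradient matches the true gradient in expectation.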
