The intersection of two current events has borne fruit:
- I’m reading Christopher Bishop’s legendary book “Pattern Recognition and Machine Learning” (freely available as a PDF from the book’s website!)
- I recently discovered Shiny for R, which allows building interactive web apps in R.
To try out Shiny, I created an interactive visualization for Kullback-Leibler divergence (or KL Divergence). Right now, it only supports two univariate Gaussians, which should be sufficient to build some intuition.
If you like it, let me know! If it turns out to be popular, I might add more features, or create similar visualizations for other concepts!
What is KL Divergence? What am I seeing?
Consider an unknown probability distribution $p(x)$, which we’re trying to approximate with a probability distribution $q(x)$. Then

$$KL(p \,\|\, q) = -\int p(x) \ln \frac{q(x)}{p(x)} \, dx$$

can informally be interpreted as the amount of information being lost by approximating $p$ with $q$. As you might imagine, this has several applications in Machine Learning. A recurring pattern is to fit parameters to a model $q$ by minimizing an approximation of $KL(p \,\|\, q)$ (i.e., making $q$ “as similar” to $p$ as possible). This blog post elaborates in a fun and informative way. If you have never heard about KL divergence before, Bishop provides a more formal (but still easy to understand) introduction in Section 1.6 of PRML.
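For the case the visualization covers, two univariate Gaussians, the divergence even has a closed form. Here is a minimal sketch in Python (the function name `kl_gauss` is my own, not something from the app):

```python
import math

def kl_gauss(mu1, s1, mu2, s2):
    """Closed-form KL(N(mu1, s1^2) || N(mu2, s2^2)) for univariate Gaussians."""
    return math.log(s2 / s1) + (s1 ** 2 + (mu1 - mu2) ** 2) / (2 * s2 ** 2) - 0.5

# Approximating a distribution with itself loses no information:
print(kl_gauss(0.0, 1.0, 0.0, 1.0))  # 0.0
```

Playing with the means and standard deviations here mirrors what the sliders in the interactive plot do.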
Suggested exercises with the interactive plot
Using the visualization tool, find out (or verify) the answers to the following questions:
- Is $KL(p \,\|\, q) = KL(q \,\|\, p)$? Always? Never?
- When is $KL(p \,\|\, q) = 0$?
- Let $p$ and $q$ be two different Gaussians. Which is larger: $KL(p \,\|\, q)$ or $KL(q \,\|\, p)$? Why?
- Is $KL(p \,\|\, q)$ ever negative? When, or why not?
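If you want to double-check what the plot suggests, one option is to approximate the KL integral numerically. The sketch below (names are my own, not part of the app) uses a plain Riemann sum over the density ratio of two univariate Gaussians:

```python
import math

def norm_pdf(x, mu, s):
    """Density of the univariate Gaussian N(mu, s^2) at x."""
    return math.exp(-0.5 * ((x - mu) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

def kl_numeric(mu1, s1, mu2, s2, lo=-20.0, hi=20.0, n=40000):
    """Riemann-sum approximation of KL(p || q) = integral p(x) ln(p(x)/q(x)) dx."""
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx  # midpoint of the i-th slice
        p = norm_pdf(x, mu1, s1)
        q = norm_pdf(x, mu2, s2)
        if p > 0.0:
            total += p * math.log(p / q) * dx
    return total

# The two directions generally disagree: KL divergence is not symmetric.
print(kl_numeric(0, 1, 0, 2), kl_numeric(0, 2, 0, 1))
```

The truncation to $[-20, 20]$ is safe here because both Gaussian tails carry negligible mass beyond that range; for distributions placed far from the origin you would widen the interval.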