The Kullback-Leibler divergence, also called relative entropy, measures how much one probability distribution differs from a second, reference distribution. It can be interpreted as the expected excess surprise incurred by using the reference distribution as a model when the data actually follow the first distribution. It is not a metric: it is not symmetric in its two arguments and does not satisfy the triangle inequality, although it does qualify as a statistical divergence. The KL divergence is central to information theory and is widely applied in statistics, coding theory, and machine learning, where it is closely related to information gain and to Bayesian inference. It is always nonnegative, equaling zero only when the two distributions coincide, and it satisfies further properties such as additivity over independent distributions and joint convexity in its arguments.
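For discrete distributions P and Q over the same outcomes, the divergence is D_KL(P || Q) = sum over x of P(x) log(P(x) / Q(x)). The following is a minimal Python sketch of that definition; the particular distributions p and q and the use of natural logarithms (giving the result in nats) are illustrative assumptions, not anything prescribed by the source.

import numpy as np

def kl_divergence(p, q):
    # Discrete KL divergence D_KL(p || q) in nats.
    # Assumes p and q are probability vectors over the same outcomes,
    # with q[i] > 0 wherever p[i] > 0 (otherwise the divergence is infinite).
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0  # terms with p(x) = 0 contribute nothing to the sum
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Hypothetical distributions over three outcomes, chosen only for illustration.
p = [0.5, 0.3, 0.2]
q = [0.25, 0.25, 0.5]

print(kl_divergence(p, q))  # roughly 0.218 nats
print(kl_divergence(q, p))  # roughly 0.239 nats, a different value

Swapping the arguments changes the result, which is the asymmetry noted above; both values are nonnegative, as Gibbs' inequality guarantees.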
https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence