2021-08-18

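Entropy, KL Divergence, Cross-Entropy, and JS Divergence

GANs rely on KL divergence and JS divergence, so this post warms up with the underlying concepts first.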

1. Entropy

The information content (self-information) of an event $x$ with probability $p(x)$ is:

$\begin{align} I(x) &= - \log(p(x)) \tag{1} \end{align}$

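Entropy is the expected value of the information content, that is, its average weighted by probability rather than a plain arithmetic mean. For a discrete random variable $X$ with distribution $P$ over outcomes $x_1,\dots,x_n$ (the entropy is also written $H(P)$):

$H(X) = - \sum_{i=1}^{n}p(x_i)\log p(x_i) \tag{2}$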
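To make equations (1) and (2) concrete, here is a minimal Python sketch. The function names `information` and `entropy` are illustrative choices, not from any library, and the natural logarithm is assumed, so results are in nats.

```python
import math
from typing import Sequence


def information(p: float) -> float:
    """Self-information -log(p) of an event with probability p, equation (1)."""
    return -math.log(p)


def entropy(p: Sequence[float]) -> float:
    """Entropy of a discrete distribution, equation (2).

    Outcomes with probability 0 are skipped, following the usual
    convention that 0 * log(0) is treated as 0.
    """
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


fair = [0.5, 0.5]      # fair coin: maximally uncertain for two outcomes
biased = [0.9, 0.1]    # biased coin: more predictable, so lower entropy

print(information(0.5))  # ln 2
print(entropy(fair))     # ln 2
print(entropy(biased))   # smaller than ln 2
```

Rare events carry more information, and the fair coin has the higher entropy because its outcome is harder to predict.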

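2. Cross-Entropy

The cross-entropy of a distribution $Q$ relative to a distribution $P$, both defined over the same outcomes, is

$H(P,Q) = -\sum_{i=1}^{n}p(x_i)\log q(x_i)\tag{3}$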
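Equation (3) can be sketched in a few lines of Python. The name `cross_entropy` and the example distributions are assumptions made for illustration; the natural logarithm is used, and $q(x_i) > 0$ is assumed wherever $p(x_i) > 0$.

```python
import math
from typing import Sequence


def cross_entropy(p: Sequence[float], q: Sequence[float]) -> float:
    """Cross-entropy H(P, Q) of equation (3), in nats.

    Terms with p_i == 0 are skipped; q_i must be positive where p_i > 0.
    """
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.7, 0.2, 0.1]   # illustrative "true" distribution
q = [0.6, 0.3, 0.1]   # illustrative model prediction

print(cross_entropy(p, q))  # cross-entropy of the prediction
print(cross_entropy(p, p))  # equals the entropy of p, the lowest achievable value

# With a one-hot label, only the predicted probability of the true class matters.
one_hot = [1.0, 0.0, 0.0]
print(cross_entropy(one_hot, q))  # -log(0.6)
```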

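3. KL Divergence

When the same random variable has two separate probability distributions, we can use the KL divergence (Kullback-Leibler divergence) to measure how much they differ. When computing a loss function in machine learning, we can take $P$ to be the true distribution of the samples and $Q$ to be the distribution predicted by the model, and use the KL divergence to measure the gap between the two. The KL divergence equals the cross-entropy minus the entropy: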

$\begin{align} D_{KL}(P||Q) &= \sum_{i=1}^{n}p(x_i)\log\left(\frac{p(x_i)}{q(x_i)}\right) \notag\\ &=\sum_{i=1}^{n}p(x_i)\left(\log p(x_i)-\log q(x_i)\right) \notag\\ &=\sum_{i=1}^{n}\left[p(x_i)\log p(x_i)-p(x_i)\log q(x_i)\right] \notag\\ &=\sum_{i=1}^{n}p(x_i)\log p(x_i)-\sum_{i=1}^{n}p(x_i)\log q(x_i) \notag\\ &=-H(P)+H(P,Q)\tag{4} \end{align}$

The closer the distributions $P$ and $Q$ are, the smaller $D_{KL}(P||Q)$ becomes; it is exactly 0 when $P = Q$.
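The identity in equation (4) can be checked numerically. The sketch below uses illustrative helper names (`entropy`, `cross_entropy`, `kl_divergence`) and made-up distributions, with natural logarithms and the assumption that $q(x_i) > 0$ wherever $p(x_i) > 0$.

```python
import math
from typing import Sequence


def entropy(p: Sequence[float]) -> float:
    """Entropy H(P), equation (2)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def cross_entropy(p: Sequence[float], q: Sequence[float]) -> float:
    """Cross-entropy H(P, Q), equation (3)."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL divergence D_KL(P || Q), first line of equation (4)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.7, 0.2, 0.1]   # illustrative true distribution
q = [0.6, 0.3, 0.1]   # illustrative model distribution

lhs = kl_divergence(p, q)
rhs = cross_entropy(p, q) - entropy(p)
print(lhs, rhs)              # the two values agree up to rounding
print(kl_divergence(p, p))   # 0.0 when the distributions coincide
```

Because $H(P)$ does not depend on the model, minimizing the cross-entropy with respect to $Q$ minimizes the KL divergence as well.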

For more on the differences and connections between KL divergence and cross-entropy, see:

https://blog.csdn.net/Dby_freedom/article/details/83374650

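KL divergence has two main properties:

(1) Asymmetry

Although KL divergence intuitively behaves like a distance function, it is not a true metric because it is not symmetric: in general $D_{KL}(P||Q)\neq D_{KL}(Q||P)$.

(2) Non-negativity

That is, $D_{KL}(P||Q) \geq 0$, with equality if and only if $P = Q$.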
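Both properties are easy to observe numerically. The following sketch uses a hypothetical helper `kl_divergence` and hand-picked distributions; it illustrates the properties on a few examples and is not a proof.

```python
import math
from typing import Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """D_KL(P || Q) in nats; assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.9, 0.1]
q = [0.5, 0.5]

kl_pq = kl_divergence(p, q)
kl_qp = kl_divergence(q, p)
print(kl_pq)  # about 0.368
print(kl_qp)  # about 0.511, so swapping the arguments changes the value

# Non-negativity on a few illustrative pairs of distributions.
pairs = [
    ([0.9, 0.1], [0.5, 0.5]),
    ([0.2, 0.3, 0.5], [0.4, 0.4, 0.2]),
    ([0.25, 0.25, 0.5], [0.25, 0.25, 0.5]),
]
print(all(kl_divergence(a, b) >= 0 for a, b in pairs))  # True
```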

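4. JS Divergence

JS divergence (Jensen-Shannon divergence) also measures the similarity between two probability distributions, and it removes the asymmetry of KL divergence: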

$JS(P||Q) = \frac{1}{2}D_{KL}\left(P||\frac{P+Q}{2}\right)+\frac{1}{2}D_{KL}\left(Q||\frac{P+Q}{2}\right) \tag{5}$

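It differs from KL divergence in two main ways:

(1) Range

With base-2 logarithms, JS divergence lies in $[0,1]$ (with natural logarithms, in $[0,\ln 2]$). It is 0 when the two distributions are identical and reaches its maximum when they do not overlap at all.

(2) Symmetry

That is, $JS(P||Q)=JS(Q||P)$. This follows directly from the formula: swapping $P$ and $Q$ only swaps the two terms of the sum.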
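A short sketch of equation (5) shows both properties. The names `kl_divergence` and `js_divergence` and the `log` parameter are illustrative assumptions; passing `math.log2` gives values in bits (range $[0,1]$), while `math.log` gives nats (range $[0,\ln 2]$).

```python
import math
from typing import Callable, Sequence


def kl_divergence(p: Sequence[float], q: Sequence[float],
                  log: Callable[[float], float] = math.log) -> float:
    """D_KL(P || Q); assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p: Sequence[float], q: Sequence[float],
                  log: Callable[[float], float] = math.log2) -> float:
    """JS divergence of equation (5), using the mixture M = (P + Q) / 2.

    M is positive wherever P or Q is, so both KL terms are always finite.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m, log) + 0.5 * kl_divergence(q, m, log)


p = [0.9, 0.1]
q = [0.5, 0.5]

print(js_divergence(p, q), js_divergence(q, p))       # same value: symmetric
print(js_divergence(p, p))                            # 0.0 for identical distributions
print(js_divergence([1.0, 0.0], [0.0, 1.0]))          # 1.0 for disjoint supports (base 2)
print(js_divergence([1.0, 0.0], [0.0, 1.0], math.log))  # ln 2 in nats
```

Unlike KL divergence, the value stays finite even when the two distributions share no support, since each is compared against the mixture rather than against the other directly.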

References

https://www.cnblogs.com/Mrfanl/p/11938139.html

https://zhuanlan.zhihu.com/p/346518942

https://www.w3cschool.cn/article/83016451.html

http://example.com/2021/08/18/entropy/

Author

Lavine Hu

Posted on

2021-08-18

Updated on

2021-12-20
