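The core idea is:

Given two lists of scores (one produced by the model, one from human judgments),
we can first calculate two permutation probability distributions from them (simplified in practice to top-one probabilities),
and then calculate the distance between the two distributions as the listwise loss function (cross entropy is used as the distance).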

4. Probability Models

4.1. Permutation Probability

$\pi=(2,3,1)$ means that object 2 is ranked first, object 3 second, and object 1 third.

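In the full (top-n) form, the product runs over all $n$ positions:

$P_s(\pi)=\prod \limits_{j=1}^n\frac{\phi(S_{\pi(j)})}{\sum_{k=j}^n\phi(S_{\pi(k)})}$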

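Because there are $n!$ possible permutations in total, computing the full permutation distribution quickly becomes intractable as $n$ grows.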
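To make the permutation probability concrete, here is a minimal sketch that enumerates all $n!$ permutations of a small list. It assumes $\phi=\exp$ and uses made-up scores; the function name `permutation_probability` is illustrative and not taken from the paper. Objects are 0-based in the code, so the tuple `(1, 2, 0)` corresponds to $\pi=(2,3,1)$ in the notation of these notes.

```python
import itertools
import math


def permutation_probability(scores, perm, phi=math.exp):
    """Probability of `perm`, a tuple of 0-based object indices, best first.

    Each factor is phi(score at position j) divided by the sum of phi over
    the objects at positions j..n, as in the permutation probability formula.
    """
    weights = [phi(scores[obj]) for obj in perm]
    prob = 1.0
    for j in range(len(weights)):
        prob *= weights[j] / sum(weights[j:])
    return prob


# Hypothetical scores for objects 0, 1, 2 (assumption, for illustration only).
scores = [1.0, 3.0, 2.0]

perms = list(itertools.permutations(range(len(scores))))
probs = {perm: permutation_probability(scores, perm) for perm in perms}

total = sum(probs.values())          # the n! probabilities form a distribution
best = max(probs, key=probs.get)     # the most likely permutation

print(len(perms), total, best)
```

The most likely permutation is the one that sorts the objects by descending score, and the number of terms to enumerate grows as $n!$, which is why the notes move on to cheaper approximations.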

4.2. Top One Probability

Top-K:

$P_s(\pi)=\prod \limits_{j=1}^K\frac{\phi(S_{\pi(j)})}{\sum_{k=j}^n\phi(S_{\pi(k)})}$

There are $n!/(n-K)!$ distinct top-$K$ arrangements in total, far fewer than $n!$, which greatly reduces the computational cost.
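The count and the top-K formula can be checked numerically on a small case. The sketch below assumes $\phi=\exp$, hypothetical scores, and an illustrative helper name `top_k_probability`; it is one way to read the formula, not the paper's implementation.

```python
import itertools
import math


def top_k_probability(scores, prefix, phi=math.exp):
    """Probability that the objects in `prefix` fill the first K positions, in order."""
    remaining = sum(phi(s) for s in scores)  # denominator for position 1
    prob = 1.0
    for obj in prefix:
        weight = phi(scores[obj])
        prob *= weight / remaining
        remaining -= weight  # objects already placed leave the denominator
    return prob


n, K = 5, 2
scores = [0.5, 2.0, 1.0, -1.0, 0.0]  # hypothetical scores (assumption)

prefixes = list(itertools.permutations(range(n), K))
count = len(prefixes)
expected_count = math.factorial(n) // math.factorial(n - K)
total = sum(top_k_probability(scores, p) for p in prefixes)

print(count, expected_count, total)
```

With $n=5$ and $K=2$ there are 20 ordered prefixes instead of $5!=120$ full permutations, and their probabilities still sum to 1.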

Top-1:

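In this case there are only $n$ distinct cases, one for each object that can be ranked first.

Meaning of the probability distribution: for each object $j$, it gives the probability that $j$ is ranked first, $P_s(j)=\frac{\phi(S_j)}{\sum_{k=1}^n\phi(S_k)}$.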
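As a sanity check on this interpretation, the sketch below compares two ways of getting the top-one probability: the closed form (a softmax when $\phi=\exp$) and brute-force summation of the permutation probabilities of every permutation that puts object $j$ first. The scores and function names are assumptions made for illustration.

```python
import itertools
import math


def permutation_probability(scores, perm, phi=math.exp):
    """Permutation probability of `perm` (0-based object indices, best first)."""
    weights = [phi(scores[obj]) for obj in perm]
    prob = 1.0
    for j in range(len(weights)):
        prob *= weights[j] / sum(weights[j:])
    return prob


def top_one_closed_form(scores, phi=math.exp):
    """phi(s_j) / sum_k phi(s_k) for every object j: n numbers, no enumeration."""
    weights = [phi(s) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def top_one_by_enumeration(scores):
    """Sum the probabilities of all permutations that rank object j first."""
    marginals = [0.0] * len(scores)
    for perm in itertools.permutations(range(len(scores))):
        marginals[perm[0]] += permutation_probability(scores, perm)
    return marginals


scores = [0.2, 1.5, -0.3, 0.9]  # hypothetical scores (assumption)
closed = top_one_closed_form(scores)
brute = top_one_by_enumeration(scores)
max_gap = max(abs(a - b) for a, b in zip(closed, brute))

print(closed, max_gap)
```

The two results agree up to floating-point error, so the $n$ top-one probabilities can be computed directly without touching the $n!$ permutations.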

5. Learning Method: ListNet

We employ a new learning method for optimizing the listwise loss function based on top one probability, with Neural Network as model and Gradient Descent as optimization algorithm. We refer to the method as ListNet.
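The training loop can be sketched for a single query as follows. To keep the example short, a linear scoring function $f(x)=w\cdot x$ stands in for the neural network; that substitution, the toy features, the ground-truth scores, and the learning rate are all assumptions, not details from the paper. The loss is the cross entropy between the top-one distributions of the true and predicted scores, and with $\phi=\exp$ its gradient with respect to each predicted score is $P_z(j)-P_y(j)$.

```python
import math


def softmax(scores):
    """Top-one probabilities with phi = exp (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(w, features):
    """Linear scoring function; a stand-in for the neural network (assumption)."""
    return [sum(wi * xi for wi, xi in zip(w, x)) for x in features]


def listnet_loss(true_scores, pred_scores):
    """Cross entropy between the two top-one distributions."""
    p_true = softmax(true_scores)
    p_pred = softmax(pred_scores)
    return -sum(p * math.log(q) for p, q in zip(p_true, p_pred))


def gradient_step(w, features, true_scores, lr=0.1):
    """One gradient descent update: dL/dw = sum_j (P_pred(j) - P_true(j)) * x_j."""
    p_true = softmax(true_scores)
    p_pred = softmax(predict(w, features))
    grad = [0.0] * len(w)
    for x, pt, pp in zip(features, p_true, p_pred):
        for d, xd in enumerate(x):
            grad[d] += (pp - pt) * xd
    return [wi - lr * g for wi, g in zip(w, grad)]


# Toy query with three documents (hypothetical data).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
true_scores = [2.0, 0.0, 1.0]

w = [0.0, 0.0]
losses = []
for _ in range(200):
    losses.append(listnet_loss(true_scores, predict(w, features)))
    w = gradient_step(w, features, true_scores)

final_scores = predict(w, features)
print(losses[0], losses[-1], final_scores)
```

In a full implementation the gradient would be backpropagated through the network and accumulated over many queries; the per-score gradient $P_z(j)-P_y(j)$ stays the same.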