Transformer survey
Transformer-XL
https://arxiv.org/abs/1901.02860v3
RoFormer
https://arxiv.org/pdf/2104.09864.pdf
Google's 2020 Transformer survey:
https://www.zhihu.com/question/387899184
Extreme Multi-Label Classification (XML) can offer some inspiration: https://zhuanlan.zhihu.com/p/131584886
Original paper: https://storage.googleapis.com/pub-tools-public-publication-data/pdf/45530.pdf
A few good blog posts:
https://zhuanlan.zhihu.com/p/52169807
https://zhuanlan.zhihu.com/p/52504407
https://zhuanlan.zhihu.com/p/61827629
https://zhuanlan.zhihu.com/p/46247835
The following is my own summary.
Recast the recommendation problem as extreme multiclass classification: predict the specific video watch $w_t$ at time $t$ among the millions of videos in the corpus $V$,
where $u \in \mathbb{R}^{N}$ represents a high-dimensional "embedding" of the (user, context) pair and the $v_j \in \mathbb{R}^{N}$ represent embeddings of each candidate video.
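For reference, the softmax from the paper that turns this into a classification over the corpus $V$ (written here with an explicit dot product; $w_t$, $u$, and $v_j$ are as defined above):

$$P(w_t = i \mid U, C) = \frac{e^{v_i^{\top} u}}{\sum_{j \in V} e^{v_j^{\top} u}}$$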
Training:
To efficiently train such a model with millions of classes, the paper considers:
1. hierarchical softmax, which did not work well;
2. candidate sampling, correcting for this sampling via importance weighting.
At serving time, scoring reduces to a nearest-neighbor search over the video embeddings in dot-product space.
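A minimal brute-force sketch of that serving-time lookup, assuming the user/context and video embeddings are plain NumPy arrays; `topk_candidates` and the random data are illustrative only, and a production system would swap in an approximate nearest-neighbor index (e.g. Faiss or ScaNN):

```python
import numpy as np

def topk_candidates(user_emb, video_embs, k=10):
    """Score every candidate video by dot product with the user embedding
    and return the indices of the top-k, sorted by score descending."""
    scores = video_embs @ user_emb           # (num_videos,)
    top = np.argpartition(-scores, k)[:k]    # unordered top-k indices
    return top[np.argsort(-scores[top])]     # sort the top-k by score

# Illustrative usage with random embeddings.
rng = np.random.default_rng(0)
u = rng.normal(size=256)              # user/context embedding u
V = rng.normal(size=(100_000, 256))   # candidate video embeddings v_j
print(topk_candidates(u, V, k=5))
```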
https://zhuanlan.zhihu.com/p/111296130
Mean imputation (for missing values).
Detecting outliers (see the sketch after this list):
value-range checks
the 3-sigma rule
kNN
boxplot (IQR)
Handling outliers:
drop them
mean imputation
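A small NumPy sketch of the detection and imputation steps listed above; the helper names (`sigma_outliers`, `iqr_outliers`, `mean_impute`) and the toy array are mine, not from the linked post:

```python
import numpy as np

def sigma_outliers(x, k=3):
    """Sigma rule: flag points more than k standard deviations from the mean."""
    mu, sigma = np.nanmean(x), np.nanstd(x)
    return np.abs(x - mu) > k * sigma

def iqr_outliers(x, k=1.5):
    """Boxplot rule: flag points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, q3 = np.nanpercentile(x, [25, 75])
    iqr = q3 - q1
    with np.errstate(invalid="ignore"):   # NaN simply compares as False
        return (x < q1 - k * iqr) | (x > q3 + k * iqr)

def mean_impute(x, mask):
    """Replace flagged (outlier or missing) entries with the mean of the rest."""
    x = np.asarray(x, dtype=float).copy()
    x[mask] = np.mean(x[~mask])
    return x

x = np.array([1.2, 0.9, 1.1, 25.0, 1.0, np.nan])
mask = iqr_outliers(x) | np.isnan(x)
print(mean_impute(x, mask))   # 25.0 and NaN replaced by the mean of the rest
```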
Feature types: numerical features, text features, categorical features.
Numerical features:
1. use the value directly
2. discretization
bucketing / binning
Categorical features (see the sketch after this list):
1. one-hot
2. embedding
3. others
catboost
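An illustrative pandas/PyTorch sketch of the numerical and categorical handling above; the toy DataFrame, column names, bucket edges, and embedding size are all made up for the example:

```python
import pandas as pd
import torch
import torch.nn as nn

df = pd.DataFrame({
    "age": [23, 35, 58, 41],            # numerical feature
    "city": ["bj", "sh", "bj", "sz"],   # categorical feature
})

# Numerical: use directly, or discretize into buckets.
df["age_bucket"] = pd.cut(df["age"], bins=[0, 30, 45, 100], labels=False)

# Categorical: one-hot encoding.
one_hot = pd.get_dummies(df["city"], prefix="city")

# Categorical: embedding lookup (e.g. as the input layer of a neural model).
city_to_id = {c: i for i, c in enumerate(df["city"].unique())}
ids = torch.tensor(df["city"].map(city_to_id).values)
embedding = nn.Embedding(num_embeddings=len(city_to_id), embedding_dim=4)
city_vectors = embedding(ids)           # shape: (4, 4)

print(df, one_hot, city_vectors.shape, sep="\n")
```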
https://blog.csdn.net/Datawhale/article/details/120582526
Feature selection methods roughly fall into three kinds: filter, wrapper, and embedded (see the sketch below).
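A scikit-learn sketch of the three families on a built-in dataset; the particular estimators and the `k` / `n_features_to_select` values are just examples, not recommendations from the linked article:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, SelectFromModel, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# Filter: score each feature independently of any downstream model.
filter_sel = SelectKBest(score_func=f_classif, k=10).fit(X, y)

# Wrapper: search feature subsets by repeatedly fitting a model (here RFE).
wrapper_sel = RFE(LogisticRegression(max_iter=5000), n_features_to_select=10).fit(X, y)

# Embedded: selection comes out of model training itself (feature importances).
embedded_sel = SelectFromModel(
    RandomForestClassifier(n_estimators=100, random_state=0)
).fit(X, y)

for name, sel in [("filter", filter_sel), ("wrapper", wrapper_sel), ("embedded", embedded_sel)]:
    print(name, int(sel.get_support().sum()), "features kept")
```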
https://arxiv.org/pdf/2104.08821.pdf
1 Training objective
For $D=\{(x_i, x_i^{+})\}_{i=1}^{m}$, where $x_i$ and $x_i^{+}$ are semantically related, while $x_i$ and $x_j^{+}$ ($i \neq j$) are not semantically related.
Encode each sentence into a representation: $x \to h$.
Contrastive learning aims to learn effective representation by pulling semantically close neighbors together and pushing apart non-neighbors
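For a positive pair $(h_i, h_i^{+})$ with in-batch negatives, the training objective used in the SimCSE paper is the InfoNCE-style loss below, where $\tau$ is a temperature and $\mathrm{sim}(\cdot,\cdot)$ is cosine similarity:

$$\ell_i = -\log \frac{e^{\mathrm{sim}(h_i, h_i^{+})/\tau}}{\sum_{j=1}^{N} e^{\mathrm{sim}(h_i, h_j^{+})/\tau}}$$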
Here $N$ is the mini-batch size; the numerator is the positive pair, and the denominator sums over the negatives (it also contains the one positive term, which seems negligible).
Does the denominator include the numerator's term? Judging from the code, yes.
Loss implementation:
https://www.jianshu.com/p/d73e499ec859
def loss(self, y_pred, y_true, lamda=0.05):
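A self-contained sketch of how this unsupervised SimCSE loss is typically implemented (following the snippet above and the linked post). It assumes `y_pred` stacks the two dropout views of each sentence in adjacent rows and that `lamda` is the temperature; the standalone function name is mine (the original is a class method, hence `self`), and it derives the targets from the row layout instead of taking `y_true`:

```python
import torch
import torch.nn.functional as F

def simcse_unsup_loss(y_pred, lamda=0.05):
    """Unsupervised SimCSE loss.

    y_pred: [2N, dim]; rows 2i and 2i+1 are the two dropout-augmented
    encodings of the same sentence.
    """
    n = y_pred.shape[0]
    device = y_pred.device
    # For row 2i the positive is row 2i+1, and vice versa.
    idxs = torch.arange(n, device=device)
    y_true = idxs + 1 - idxs % 2 * 2
    # Pairwise cosine similarities, shape [2N, 2N].
    sim = F.cosine_similarity(y_pred.unsqueeze(1), y_pred.unsqueeze(0), dim=-1)
    # Mask the diagonal so a view is never compared with itself;
    # the denominator still contains the positive term (see the note above).
    sim = sim - torch.eye(n, device=device) * 1e12
    sim = sim / lamda  # temperature scaling
    return F.cross_entropy(sim, y_true)

# Illustrative usage: a batch of 8 sentences -> 16 rows after two dropout passes.
print(simcse_unsup_loss(torch.randn(16, 768)))
```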
2 Evaluation metrics for the representations
Alignment: the expected distance between embeddings of paired instances (paired instances are the positive pairs).
Uniformity: measures how well the embeddings are uniformly distributed.
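For reference, the two quantities as defined in the SimCSE paper (following Wang and Isola), where $p_{\text{pos}}$ is the distribution of positive pairs, $p_{\text{data}}$ the data distribution, and $f$ the (normalized) encoder:

$$\ell_{\text{align}} \triangleq \mathbb{E}_{(x, x^{+}) \sim p_{\text{pos}}} \left\| f(x) - f(x^{+}) \right\|^{2}$$

$$\ell_{\text{uniform}} \triangleq \log \; \mathbb{E}_{x, y \overset{\text{i.i.d.}}{\sim} p_{\text{data}}} \, e^{-2 \| f(x) - f(y) \|^{2}}$$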
Unsupervised SimCSE feeds each sentence to the encoder twice: $x_i \to h_i^{z_i}$ and $x_i \to h_i^{z_i'}$,
where $z$ is a random dropout mask; the loss is given below.
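The other dropout view of the same sentence serves as the positive and the other sentences in the batch as negatives, giving the unsupervised objective from the paper:

$$\ell_i = -\log \frac{e^{\mathrm{sim}\left(h_i^{z_i},\, h_i^{z_i'}\right)/\tau}}{\sum_{j=1}^{N} e^{\mathrm{sim}\left(h_i^{z_i},\, h_j^{z_j'}\right)/\tau}}$$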
Supervised SimCSE brings in labeled data from a non-target task, e.g. NLI, as triples $(x_i, x_i^{+}, x_i^{-})$, where $x_i$ is the premise and $x_i^{+}$, $x_i^{-}$ are the entailment and contradiction hypotheses.
$(h_i, h_j^{+})$ are the normal in-batch negatives, while $(h_i, h_j^{-})$ are hard negatives.
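The supervised objective then extends the denominator with the hard negatives (as in the SimCSE paper):

$$\ell_i = -\log \frac{e^{\mathrm{sim}(h_i, h_i^{+})/\tau}}{\sum_{j=1}^{N} \left( e^{\mathrm{sim}(h_i, h_j^{+})/\tau} + e^{\mathrm{sim}(h_i, h_j^{-})/\tau} \right)}$$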