2024-04-16 5e931702777b6d9c993e937937ddc6c3 99+ fast 0.0 k

deepmatch

A handy tool

https://hub.fastgit.org/shenweichen/DeepMatch

Recommender systems, recall

2024-04-16 576c42d018af3d01bc13ebf83718bfd2 99+ fast 0.0 k

Ranking

iQIYI

https://www.6aiq.com/article/1614422394959

Tencent Music

https://cloud.tencent.com/developer/article/2226972

Kuaishou

https://zhuanlan.zhihu.com/p/520181137

Recommender systems, ranking

2022-06-09 e1f3854977813b636790b0f8b81256ae 99+ 2 m 0.2 k

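Tianchi News Recommendation

Goal: recommend the top 5 news articles (out of about 360,000 in total) to each user (50,000 users in the test set).

Labels: the news articles each user clicked, together with the click times.

Features:

The overall framework is again multi-channel recall followed by ranking.

Recall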

# Dictionary for multi-channel recall; the results of every recall channel are stored here
user_multi_recall_dict = {'itemcf_sim_itemcf_recall': {},
                          'embedding_sim_item_recall': {},
                          'youtubednn_recall': {},
                          'youtubednn_usercf_recall': {},
                          'cold_start_recall': {}}

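Recall based on the item-to-item similarity (sim) computed by ItemCF
Recall based on the item-to-item similarity obtained from embedding search
YoutubeDNN recall
Recall based on the user-to-user similarity obtained from YoutubeDNN
Recall based on a cold-start strategy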
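The sketch below shows one way the per-channel results stored in `user_multi_recall_dict` might be merged into a single candidate list per user. It is an illustration only: the function name `merge_recall_channels`, the `(item_id, score)` list layout, the min-max normalization, and the `channel_weights` parameter are all assumptions, not the competition baseline's actual code.

```python
from collections import defaultdict


def merge_recall_channels(user_multi_recall_dict, channel_weights=None, topk=5):
    """Merge per-channel recall lists into one ranked candidate list per user.

    Assumed layout: {channel: {user_id: [(item_id, score), ...]}}.
    channel_weights is an optional {channel: weight}; missing channels use 1.0.
    """
    channel_weights = channel_weights or {}
    merged = defaultdict(lambda: defaultdict(float))

    for channel, user_recall in user_multi_recall_dict.items():
        weight = channel_weights.get(channel, 1.0)
        for user_id, items in user_recall.items():
            if not items:
                continue
            scores = [score for _, score in items]
            lo, hi = min(scores), max(scores)
            for item_id, score in items:
                # Min-max normalize within the channel so that scores from
                # different channels live on a comparable [0, 1] scale.
                norm = (score - lo) / (hi - lo) if hi > lo else 1.0
                merged[user_id][item_id] += weight * norm

    # Keep the topk highest-scoring items for each user.
    return {
        user_id: sorted(item_scores.items(), key=lambda kv: kv[1], reverse=True)[:topk]
        for user_id, item_scores in merged.items()
    }


# Toy data with hypothetical users, items, and scores.
demo = {
    "itemcf_sim_itemcf_recall": {"u1": [("a", 1.0), ("b", 0.5), ("c", 0.0)]},
    "youtubednn_recall": {"u1": [("b", 4.0), ("d", 2.0)]},
    "cold_start_recall": {},
}
merged_demo = merge_recall_channels(demo, topk=2)
print(merged_demo)
```

Item `b` is recalled by two channels, so its normalized scores add up and it moves ahead of `a`, which only one channel returned.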

Ranking

LGB ranking model
LGB classification model
DIN, a deep learning classification model

Model ensembling

Weighted fusion of the model outputs
Stacking
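As a rough illustration of weighted fusion of model outputs, the sketch below blends per-candidate scores from several models using fixed weights. The model names, weights, and scores are hypothetical, and the scores are assumed to be already normalized to a common range (a ranker and a classifier usually produce scores on different scales).

```python
def weighted_blend(model_scores, weights):
    """Blend scores from several models into one score per candidate.

    model_scores: {model_name: {candidate: score}}, scores assumed comparable.
    weights: {model_name: weight}; weights are renormalized to sum to 1.
    """
    total = sum(weights[name] for name in model_scores)
    blended = {}
    for name, scores in model_scores.items():
        for candidate, score in scores.items():
            blended[candidate] = blended.get(candidate, 0.0) + weights[name] * score / total
    return blended


# Hypothetical outputs for two (user, article) candidates.
model_scores = {
    "lgb_ranker": {("u1", "a"): 0.8, ("u1", "b"): 0.2},
    "lgb_classifier": {("u1", "a"): 0.4, ("u1", "b"): 0.6},
    "din": {("u1", "a"): 0.4, ("u1", "b"): 0.6},
}
weights = {"lgb_ranker": 0.5, "lgb_classifier": 0.25, "din": 0.25}
blended = weighted_blend(model_scores, weights)
print(blended)
```

Stacking differs from this fixed-weight blend in that a second-level model is trained on the first-level models' out-of-fold predictions, so the combination is learned instead of set by hand.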

References

https://tianchi.aliyun.com/notebook-ai/detail

Recommender systems

Tianchi News Recommendation

2021-10-25 a7ac42a42c2a413e0be9c11a44e5b1df 99+ fast 0.0 k

Cold start

Recommender systems, cold start

mark

https://zhuanlan.zhihu.com/p/79950668

Recommender systems

2021-10-25 35251cb0b902ed2f1e7014bb4351e813 99+ fast 0.0 k

Recall in recommendation

Summary

https://blog.csdn.net/luanfenlian0992/article/details/107416438

https://zhuanlan.zhihu.com/p/364053939

Recommender systems, recall

2021-10-25 732083e6a31d307d239fc14309e422f1 99+ a minute 0.2 k

youtubednn

Original paper: https://storage.googleapis.com/pub-tools-public-publication-data/pdf/45530.pdf

Some good blog posts:

https://zhuanlan.zhihu.com/p/52169807

https://zhuanlan.zhihu.com/p/52504407

https://zhuanlan.zhihu.com/p/61827629

https://zhuanlan.zhihu.com/p/46247835

My own summary follows.

2.SYSTEM OVERVIEW

3.CANDIDATE GENERATION

3.1 Recommendation as Classification

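Recast the recommendation problem as a multiclass classification problem: predict which video $i$ from the corpus $V$ is watched at time $t$, given user $U$ and context $C$:

$$P(w_t = i \mid U, C) = \frac{e^{v_i u}}{\sum_{j \in V} e^{v_j u}}$$

where $u \in \mathbb{R}^{N}$ represents a high-dimensional "embedding" of the user, context pair and the $v_j \in \mathbb{R}^{N}$ represent embeddings of each candidate video.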
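To make the classification view concrete, here is a minimal sketch of a softmax over the dot products between the user embedding $u$ and each candidate video embedding $v_j$. The function name `watch_probabilities` and the toy 2-dimensional embeddings are assumptions for illustration; the real model learns these embeddings with a deep network.

```python
import math


def watch_probabilities(u, video_embeddings):
    """Softmax over dot products v_j . u, one probability per candidate video."""
    logits = [sum(ui * vi for ui, vi in zip(u, v)) for v in video_embeddings]
    # Subtract the max logit before exponentiating for numerical stability.
    max_logit = max(logits)
    exps = [math.exp(z - max_logit) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy embeddings (hypothetical values).
u = [1.0, 0.0]
videos = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]]
probs = watch_probabilities(u, videos)
print(probs)
```

Because the softmax is monotonic in the dot product, the video with the largest $v_j u$ also has the highest probability.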

Training:

To efficiently train such a model with millions of classes:

1.Hierarchical softmax: tried, but it did not achieve comparable accuracy.

2.Candidate sampling: sample negative classes, then correct for this sampling via importance weighting.

At serving time

3.2 Model Architecture

3.3 Heterogeneous Signals

3.4 Label and Context Selection

3.5 Experiments with Features and Depth

4.RANKING

Recommender systems

2021-10-21 e5475f492ffac2531ba4e3bf95db9af3 99+ fast 0.0 k

Collaborative filtering

https://www.jianshu.com/p/5463ab162a58

https://www.jianshu.com/p/20041e72e9ec

https://www.cnblogs.com/pinard/p/6349233.html

Recommender systems, recall

2021-10-21 87b4ccd542f06fdb7fef16e47024311f 99+ fast 0.0 k

Evaluation metrics for recommender systems

https://zhuanlan.zhihu.com/p/67287992

http://sofasofa.io/forum_main_post.php?postid=1000292

Recommender systems

Evaluation metrics for recommender systems

2021-10-11 be95c2216e34fa6768bcd52cfeff2447 99+ fast 0.0 k

Deep Learning Recommendation Model for Personalization and Recommendation Systems

https://arxiv.org/pdf/1906.00091.pdf

DLRM (Deep Learning Recommendation Model) architecture

References

https://zhuanlan.zhihu.com/p/82839874

Recommender systems, ranking

Facebook recommender system

2021-10-08 af7ba7a3a0850ec56ee25a8c906fa702 99+ 2 m 0.3 k

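Multi-channel recall

1.Definition

A "multi-channel recall" strategy uses different strategies, features, or simple models to each recall part of the candidate set, then merges these candidates and hands them to the downstream ranking model. Note that each recall channel should be as independent of, and as non-overlapping with, the others as possible, so that the channels can run in parallel and the merged candidates are more diverse.

2.Fusion strategies for multi-channel recall (this can be regarded as coarse ranking)

Averaging: the score of item C is (0.7 + 0.5 + 0.3) / 3 = 0.5.

Weighted averaging: suppose the three strategies are given weights 0.4, 0.3, and 0.2 (set by hand or fitted by an algorithm); the score of item B is then (0.4 * 0.8 + 0.3 * 0.6 + 0.2 * 0) / (0.4 + 0.3 + 0.2) ≈ 0.56.

Dynamic weighting: compute the CTR of each of the three recall strategies and use it as a dynamic weight that is updated daily. This only takes click-through rate into account, however, so it is not comprehensive.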
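The fusion rules above can be sketched in a few lines of Python. The per-channel scores are the ones quoted in the text for items C and B; the function names and the click and impression counts used for the dynamic weights are hypothetical.

```python
def average_score(scores):
    """Plain average of one item's scores across recall channels."""
    return sum(scores) / len(scores)


def weighted_average_score(scores, weights):
    """Weighted average; a channel that did not recall the item contributes 0."""
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def ctr_weights(clicks, impressions):
    """Dynamic weights: one click-through rate per channel, recomputed daily."""
    return [c / i if i else 0.0 for c, i in zip(clicks, impressions)]


score_c = average_score([0.7, 0.5, 0.3])
score_b = weighted_average_score([0.8, 0.6, 0.0], [0.4, 0.3, 0.2])

# Hypothetical daily click and impression counts for the three channels.
daily_weights = ctr_weights([30, 10, 5], [1000, 500, 500])
score_b_dynamic = weighted_average_score([0.8, 0.6, 0.0], daily_weights)

print(round(score_c, 4), round(score_b, 4), round(score_b_dynamic, 4))
```

With dynamic weighting only the weights change from day to day; the fusion formula itself stays the same as in the weighted average.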

3.Example

https://tianchi.aliyun.com/notebook-ai/detail?postId=144452

References

https://zhuanlan.zhihu.com/p/388601198

Recommender systems, recall