2021-12-16 fc22eb346de888ff8528afcec9f6f88a 99+ fast 0.0 k

Graph neural network tools

PyG vs. DGL comparison

https://www.zhihu.com/question/399802947

GNN helpers

PyG,DGL

2021-12-15 437480ac4b8fefab890d449bd56c1834 99+ a minute 0.1 k

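Function parameters

1 Passing by reference

Arguments are passed by object reference. If the argument is a mutable object, in-place changes made inside the function modify the original.

If the argument is an immutable object, the original is not changed.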
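A minimal sketch of the two cases, using a list as the mutable object and an int as the immutable one. The function names append_item and increment are illustrative only.

```python
def append_item(items):
    # Mutating the list in place is visible to the caller.
    items.append(4)


def increment(n):
    # "n += 1" rebinds the local name to a new int;
    # the caller's variable still refers to the old value.
    n += 1
    return n


nums = [1, 2, 3]
append_item(nums)

count = 10
result = increment(count)

print(nums)    # [1, 2, 3, 4]
print(count)   # 10
print(result)  # 11
```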

2 Default parameters

https://blog.csdn.net/weixin_41972881/article/details/81562731

https://blog.csdn.net/weixin_45775963/article/details/103696945

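def fun(va1, va2=[]):
    print(va2)
    va2.append(va1)
    return va2

te1 = fun(10)  # prints []
te1 = fun(20)  # prints [10]

If va2 is not passed, the default is used, but the default list is created once when the function is defined and keeps changing across calls; it is not [] every time.

If va2 is passed from outside, the passed value takes precedence and overrides the default.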
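A common way to avoid the shared default is to use None as a sentinel and create the list inside the function. The sketch below contrasts the two behaviours; fun_shared and fun_fresh are illustrative names, not from the original notes.

```python
def fun_shared(va1, va2=[]):
    # The same default list object is reused on every call.
    va2.append(va1)
    return va2


def fun_fresh(va1, va2=None):
    # A new list is created on each call when no argument is given.
    if va2 is None:
        va2 = []
    va2.append(va1)
    return va2


# Copy the results so later calls cannot change what we recorded.
shared_first = list(fun_shared(10))
shared_second = list(fun_shared(20))

fresh_first = fun_fresh(10)
fresh_second = fun_fresh(20)

print(shared_first, shared_second)  # [10] [10, 20]
print(fresh_first, fresh_second)    # [10] [20]
```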

3 Variable-length arguments

1 *args

def test(*args):
    # args collects the positional arguments into a tuple
    print(args)

test(1, 2, 3, 4)
test(*(1, 2, 3, 4))

Both calls print (1, 2, 3, 4)

2 **kwargs

def test(**kwargs):
    # kwargs collects the keyword arguments into a dict
    print(kwargs)

test(x=1, y=2, z=3)
test(**{'x': 1, 'y': 2, 'z': 3})

Both calls print {'x': 1, 'y': 2, 'z': 3}
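The two forms are often combined, for example to forward whatever a wrapper receives to another function. A small sketch, with describe and wrapper as made-up names:

```python
def describe(first, *args, **kwargs):
    # first takes the first positional argument, args the remaining
    # positional ones, kwargs all keyword arguments.
    return first, args, kwargs


def wrapper(*args, **kwargs):
    # Unpack again when forwarding, otherwise describe would receive
    # one tuple and one dict as two positional arguments.
    return describe(*args, **kwargs)


result = wrapper(1, 2, 3, x=4, y=5)
print(result)  # (1, (2, 3), {'x': 4, 'y': 5})
```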

python

Function parameters

2021-12-14 ebdcdc87ea1e77ac7d009f72b7eb9ef1 99+ 2 m 0.3 k

Enhanced-RCNN An Efficient Method for Learning Sentence Similarity

Features: no pre-training, few parameters

1 input encoding

This step produces two encodings: the RNN encoding and the RCNN encoding.

1 BiGRU

$\textbf{a}=\{a_1,a_2,\dots,a_{l_a}\}$, where $\textbf{a}$ is sentence 1 and $l_a$ is its length

This gives the RNN encoding; $\overline{\textbf{p}}_i$ stands for either $\overline{\textbf{a}}_i$ or $\overline{\textbf{b}}_i$.

2 CNN

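On top of the BiGRU encoding, a CNN is used for a second round of encoding.

The structure follows "network in network"; k is the kernel size of the convolution, e.g. with k=1 the kernel is $1 \times 1$.

For each CNN unit, the computation is as follows:

This gives the RCNN encoding $\widetilde{\textbf{p}}_i$.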

2 Interactive Sentence Representation

1 Soft-attention Alignment

Attention:

RNN encoding with attention applied:
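A sketch of the alignment step in LaTeX, assuming the standard ESIM-style soft attention (dot-product scores between the two RNN encodings, softmax-normalized in each direction). The paper's exact notation may differ; $l_b$ is assumed to be the length of sentence 2.

```latex
e_{ij} = \overline{\textbf{a}}_i^{\top}\,\overline{\textbf{b}}_j

\hat{\textbf{a}}_i = \sum_{j=1}^{l_b}
  \frac{\exp(e_{ij})}{\sum_{k=1}^{l_b}\exp(e_{ik})}\,\overline{\textbf{b}}_j,
\qquad
\hat{\textbf{b}}_j = \sum_{i=1}^{l_a}
  \frac{\exp(e_{ij})}{\sum_{k=1}^{l_a}\exp(e_{kj})}\,\overline{\textbf{a}}_i
```

Under this assumption, each position of one sentence is re-expressed as a weighted sum of the other sentence's RNN encodings.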

2 Interaction Modeling

$\overline{\textbf{p}}$ is the RNN encoding

$\hat{\textbf{p}}$ is the RNN encoding with attention applied

$\widetilde{\textbf{p}}$ is the RCNN encoding

The final Interactive Sentence Representation is $\textbf{o}_a,\textbf{o}_b$.

3 Similarity Modeling

1 Fusion Layer

g is the gating function

2 Label Prediction

Fully connected layer

4 loss

Cross-entropy

References

https://sci-hub.st/10.1145/3366423.3379998

https://zhuanlan.zhihu.com/p/138061003

NLP text matching

Enhanced-RCNN

2021-12-12 d933a4d315059012c62d01afce3156e2 99+ fast 0.0 k

Knowledge graphs

Course from Southeast University:

https://github.com/npubird/KnowledgeGraphCourse

Knowledge graphs

Knowledge graph survey

2021-12-12 1a6cb8f58eadb9f20eae53a070af5922 99+ 5 m 0.8 k

Pre-train, Prompt, and Predict A Systematic Survey of Prompting Methods in Natural Language Processing

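0 Difference from pre-train and fine-tune

Prompting looks like a special kind of fine-tuning: you still pre-train first, then do prompt tuning.

Goal: prompting narrows the gap between pre-training and fine-tuning.

1 How it works

Three steps

1 Prompt Addition

$x'=f_{\text{prompt}}(x)$, where $x$ is the input text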

Apply a template, which is a textual string that has two slots: an input slot [X] for input x and an answer slot
[Z] for an intermediate generated answer text z that will later be mapped into y.
Fill slot [X] with the input text x.

2 Answer Search

f: fills in the location [Z] in prompt $x'$ with the potential answer z

Z: a set of permissible values for z

3 Answer Mapping

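The $\hat{z}$ above is not yet $\hat{y}$. In sentiment analysis, for example, "excellent", "fabulous", "wonderful" -> positive.

go from the highest-scoring answer $\hat{z}$ to the highest-scoring output $\hat{y}$

4 An example: text sentiment classification

Before

"I love this movie." -> positive

Now

1 $x=$ "I love this movie." -> the template is "[x] Overall, it was a [z] movie." -> $x'$ is "I love this movie. Overall, it was a [z] movie."

2 Next comes answer search: the LM looks for the text $\hat{z}$ that scores highest when filled in at [z] (e.g. "excellent", "great", "wonderful").

3 Finally, answer mapping. The text filled in by the LM is sometimes not in the final form the task needs (the final output is positive, while the LM gives "excellent", "great", "wonderful"), so it has to be mapped to the final output $\hat{y}$.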
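The three steps can be sketched in a few lines of Python. This is only a toy: toy_lm_score and its co-occurrence table stand in for a real language model, and the answer space and label mapping are made up for illustration.

```python
TEMPLATE = "[X] Overall, it was a [Z] movie."

# Hypothetical answer space Z, with the mapping from answer z to label y.
ANSWER_TO_LABEL = {
    "excellent": "positive",
    "great": "positive",
    "wonderful": "positive",
    "terrible": "negative",
    "boring": "negative",
}

# Made-up association scores; a real system would query an LM instead.
TOY_COOCCURRENCE = {
    ("love", "excellent"): 0.9,
    ("love", "great"): 0.8,
    ("love", "wonderful"): 0.7,
    ("hate", "terrible"): 0.9,
    ("hate", "boring"): 0.6,
}


def add_prompt(x, template=TEMPLATE):
    """Step 1, prompt addition: fill slot [X] with the input text."""
    return template.replace("[X]", x)


def toy_lm_score(prompt, z):
    """Stand-in for the LM score of filling [Z] with candidate z."""
    words = prompt.lower().replace(".", " ").replace(",", " ").split()
    return sum(TOY_COOCCURRENCE.get((w, z), 0.0) for w in words)


def search_answer(prompt, candidates):
    """Step 2, answer search: pick the highest-scoring z."""
    return max(candidates, key=lambda z: toy_lm_score(prompt, z))


def map_answer(z):
    """Step 3, answer mapping: go from z to the task output y."""
    return ANSWER_TO_LABEL[z]


x = "I love this movie."
x_prime = add_prompt(x)
z_hat = search_answer(x_prime, list(ANSWER_TO_LABEL))
y_hat = map_answer(z_hat)

print(x_prime)
print(z_hat, "->", y_hat)
```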

2 Categories of prompting methods

3 Prompt Engineering

1 one must first consider the prompt shape,

2 then decide whether to take a manual or automated approach to create prompts of the desired shape

1 Prompt Shape

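Prompt shape mainly refers to the position and number of the [X] and [Z] slots.

If the slot to fill is in the middle of the text, the prompt is usually called a cloze prompt; if it is at the end, it is called a prefix prompt.

Which one to choose in practice depends mainly on the form of the task and the type of model. Cloze prompts closely resemble how a Masked Language Model is trained, so they suit tasks solved with an MLM; for generation tasks, or tasks solved with an autoregressive LM, prefix prompts are a better fit; full text reconstruction models are more general, so both prompt types apply. In addition, for classifying text pairs, the prompt template usually needs two input slots, [x1] and [x2].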
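To make the distinction concrete, here is a small helper that labels a template by where its answer slot sits. The function prompt_shape and the second template are illustrative assumptions, and the rule (slot at the very end means prefix, otherwise cloze) is a simplification of the definition above.

```python
def prompt_shape(template, answer_slot="[Z]"):
    """Return "prefix" if the answer slot ends the template, else "cloze"."""
    if answer_slot not in template:
        raise ValueError("template has no answer slot")
    # Ignore trailing spaces and sentence punctuation after the slot.
    stripped = template.rstrip(" .!?")
    if stripped.endswith(answer_slot):
        return "prefix"
    return "cloze"


cloze_shape = prompt_shape("[X] Overall, it was a [Z] movie.")
prefix_shape = prompt_shape("[X] TL;DR: [Z]")

print(cloze_shape)   # cloze
print(prefix_shape)  # prefix
```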

2 create prompts

1 Manual Template Engineering

2 Automated Template Learning

1 Discrete Prompts

Here the prompt operates on the text itself.

D1: Prompt Mining

D2: Prompt Paraphrasing

D3: Gradient-based Search

D4: Prompt Generation

D5: Prompt Scoring

2 Continuous Prompts

Here the prompt operates directly in the model's embedding space.

C1: Prefix Tuning

C2: Tuning Initialized with Discrete Prompts

C3: Hard-Soft Prompt Hybrid Tuning

4 Answer Engineering

Two dimensions must be considered when performing answer engineering: 1 deciding the answer shape and 2 choosing an answer design method.

1 Answer Shape

How does this differ from Prompt Shape?

2 Answer Space Design Methods

1 Manual Design

2 Automatic

1 Discrete Answer Search

2 Continuous Answer Search

5 Multi-Prompt Learning

The discussion so far covered a single prompt; this part introduces multiple prompts.

6 Training Strategies for Prompting Methods

1 Training Settings

full-data

few-shot /zero-shot

2 Parameter Update Methods

References

https://arxiv.org/abs/2107.13586

Dr. Pengfei Liu: https://zhuanlan.zhihu.com/p/395115779

https://zhuanlan.zhihu.com/p/399295895

https://zhuanlan.zhihu.com/p/440169921

NLP Prompt

Prompt

2021-12-12 27305810d8db04e9529bfb0657ccb1e8 99+ 14 m 2.2 k

Generalizing from a Few Examples A Survey on Few-Shot Learning

paper： https://arxiv.org/abs/1904.05046

git: https://github.com/tata1661/FSL-Mate/tree/master/FewShotPapers#Applications

The source groups FSL work by application; the NLP-related entries are:

High-risk learning: Acquiring new word vectors from tiny data, in EMNLP, 2017. A. Herbelot and M. Baroni. paper
MetaEXP: Interactive explanation and exploration of large knowledge graphs, in TheWebConf, 2018. F. Behrens, S. Bischoff, P. Ladenburger, J. Rückin, L. Seidel, F. Stolp, M. Vaichenker, A. Ziegler, D. Mottin, F. Aghaei, E. Müller, M. Preusse, N. Müller, and M. Hunger. paper code
Few-shot representation learning for out-of-vocabulary words, in ACL, 2019. Z. Hu, T. Chen, K.-W. Chang, and Y. Sun. paper
Learning to customize model structures for few-shot dialogue generation tasks, in ACL, 2020. Y. Song, Z. Liu, W. Bi, R. Yan, and M. Zhang. paper
Few-shot slot tagging with collapsed dependency transfer and label-enhanced task-adaptive projection network, in ACL, 2020. Y. Hou, W. Che, Y. Lai, Z. Zhou, Y. Liu, H. Liu, and T. Liu. paper
Meta-reinforced multi-domain state generator for dialogue systems, in ACL, 2020. Y. Huang, J. Feng, M. Hu, X. Wu, X. Du, and S. Ma. paper
Few-shot knowledge graph completion, in AAAI, 2020. C. Zhang, H. Yao, C. Huang, M. Jiang, Z. Li, and N. V. Chawla. paper
Universal natural language processing with limited annotations: Try few-shot textual entailment as a start, in EMNLP, 2020. W. Yin, N. F. Rajani, D. Radev, R. Socher, and C. Xiong. paper code
Simple and effective few-shot named entity recognition with structured nearest neighbor learning, in EMNLP, 2020. Y. Yang, and A. Katiyar. paper code
Discriminative nearest neighbor few-shot intent detection by transferring natural language inference, in EMNLP, 2020. J. Zhang, K. Hashimoto, W. Liu, C. Wu, Y. Wan, P. Yu, R. Socher, and C. Xiong. paper code
Few-shot learning for opinion summarization, in EMNLP, 2020. A. Bražinskas, M. Lapata, and I. Titov. paper code
Adaptive attentional network for few-shot knowledge graph completion, in EMNLP, 2020. J. Sheng, S. Guo, Z. Chen, J. Yue, L. Wang, T. Liu, and H. Xu. paper code
Few-shot complex knowledge base question answering via meta reinforcement learning, in EMNLP, 2020. Y. Hua, Y. Li, G. Haffari, G. Qi, and T. Wu. paper code
Self-supervised meta-learning for few-shot natural language classification tasks, in EMNLP, 2020. T. Bansal, R. Jha, T. Munkhdalai, and A. McCallum. paper code
Uncertainty-aware self-training for few-shot text classification, in NeurIPS, 2020. S. Mukherjee, and A. Awadallah. paper code
Learning to extrapolate knowledge: Transductive few-shot out-of-graph link prediction, in NeurIPS, 2020. J. Baek, D. B. Lee, and S. J. Hwang. paper code
MetaNER: Named entity recognition with meta-learning, in TheWebConf, 2020. J. Li, S. Shang, and L. Shao. paper
Conditionally adaptive multi-task learning: Improving transfer learning in NLP using fewer parameters & less data, in ICLR, 2021. J. Pilault, A. E. hattami, and C. Pal. paper code
Revisiting few-sample BERT fine-tuning, in ICLR, 2021. T. Zhang, F. Wu, A. Katiyar, K. Q. Weinberger, and Y. Artzi. paper code
Few-shot conversational dense retrieval, in SIGIR, 2021. S. Yu, Z. Liu, C. Xiong, T. Feng, and Z. Liu. paper code
Relational learning with gated and attentive neighbor aggregator for few-shot knowledge graph completion, in SIGIR, 2021. G. Niu, Y. Li, C. Tang, R. Geng, J. Dai, Q. Liu, H. Wang, J. Sun, F. Huang, and L. Si. paper
Few-shot language coordination by modeling theory of mind, in ICML, 2021. H. Zhu, G. Neubig, and Y. Bisk. paper code
Graph-evolving meta-learning for low-resource medical dialogue generation, in AAAI, 2021. S. Lin, P. Zhou, X. Liang, J. Tang, R. Zhao, Z. Chen, and L. Lin. paper
KEML: A knowledge-enriched meta-learning framework for lexical relation classification, in AAAI, 2021. C. Wang, M. Qiu, J. Huang, and X. He. paper
Few-shot learning for multi-label intent detection, in AAAI, 2021. Y. Hou, Y. Lai, Y. Wu, W. Che, and T. Liu. paper code
SALNet: Semi-supervised few-shot text classification with attention-based lexicon construction, in AAAI, 2021. J.-H. Lee, S.-K. Ko, and Y.-S. Han. paper
Learning from my friends: Few-shot personalized conversation systems via social networks, in AAAI, 2021. Z. Tian, W. Bi, Z. Zhang, D. Lee, Y. Song, and N. L. Zhang. paper code
Relative and absolute location embedding for few-shot node classification on graph, in AAAI, 2021. Z. Liu, Y. Fang, C. Liu, and S. C.H. Hoi. paper
Few-shot question answering by pretraining span selection, in ACL-IJCNLP, 2021. O. Ram, Y. Kirstain, J. Berant, A. Globerson, and O. Levy. paper code
A closer look at few-shot crosslingual transfer: The choice of shots matters, in ACL-IJCNLP, 2021. M. Zhao, Y. Zhu, E. Shareghi, I. Vulic, R. Reichart, A. Korhonen, and H. Schütze. paper code
Learning from miscellaneous other-classwords for few-shot named entity recognition, in ACL-IJCNLP, 2021. M. Tong, S. Wang, B. Xu, Y. Cao, M. Liu, L. Hou, and J. Li. paper code
Distinct label representations for few-shot text classification, in ACL-IJCNLP, 2021. S. Ohashi, J. Takayama, T. Kajiwara, and Y. Arase. paper code
Entity concept-enhanced few-shot relation extraction, in ACL-IJCNLP, 2021. S. Yang, Y. Zhang, G. Niu, Q. Zhao, and S. Pu. paper code
On training instance selection for few-shot neural text generation, in ACL-IJCNLP, 2021. E. Chang, X. Shen, H.-S. Yeh, and V. Demberg. paper code
Unsupervised neural machine translation for low-resource domains via meta-learning, in ACL-IJCNLP, 2021. C. Park, Y. Tae, T. Kim, S. Yang, M. A. Khan, L. Park, and J. Choo. paper code
Meta-learning with variational semantic memory for word sense disambiguation, in ACL-IJCNLP, 2021. Y. Du, N. Holla, X. Zhen, C. Snoek, and E. Shutova. paper code
Multi-label few-shot learning for aspect category detection, in ACL-IJCNLP, 2021. M. Hu, S. Z. H. Guo, C. Xue, H. Gao, T. Gao, R. Cheng, and Z. Su. paper
TextSETTR: Few-shot text style extraction and tunable targeted restyling, in ACL-IJCNLP, 2021. P. Riley, N. Constant, M. Guo, G. Kumar, D. Uthus, and Z. Parekh. paper
Few-shot text ranking with meta adapted synthetic weak supervision, in ACL-IJCNLP, 2021. S. Sun, Y. Qian, Z. Liu, C. Xiong, K. Zhang, J. Bao, Z. Liu, and P. Bennett. paper code
PROTAUGMENT: Intent detection meta-learning through unsupervised diverse paraphrasing, in ACL-IJCNLP, 2021. T. Dopierre, C. Gravier, and W. Logerais. paper code
AUGNLG: Few-shot natural language generation using self-trained data augmentation, in ACL-IJCNLP, 2021. X. Xu, G. Wang, Y.-B. Kim, and S. Lee. paper code
Meta self-training for few-shot neural sequence labeling, in KDD, 2021. Y. Wang, S. Mukherjee, H. Chu, Y. Tu, M. Wu, J. Gao, and A. H. Awadallah. paper code
Knowledge-enhanced domain adaptation in few-shot relation classification, in KDD, 2021. J. Zhang, J. Zhu, Y. Yang, W. Shi, C. Zhang, and H. Wang. paper code
Few-shot text classification with triplet networks, data augmentation, and curriculum learning, in NAACL-HLT, 2021. J. Wei, C. Huang, S. Vosoughi, Y. Cheng, and S. Xu. paper code
Few-shot intent classification and slot filling with retrieved examples, in NAACL-HLT, 2021. D. Yu, L. He, Y. Zhang, X. Du, P. Pasupat, and Q. Li. paper
Non-parametric few-shot learning for word sense disambiguation, in NAACL-HLT, 2021. H. Chen, M. Xia, and D. Chen. paper code
Towards few-shot fact-checking via perplexity, in NAACL-HLT, 2021. N. Lee, Y. Bang, A. Madotto, and P. Fung. paper
ConVEx: Data-efficient and few-shot slot labeling, in NAACL-HLT, 2021. M. Henderson, and I. Vulic. paper
Few-shot text generation with natural language instructions, in EMNLP, 2021. T. Schick, and H. Schütze. paper
Towards realistic few-shot relation extraction, in EMNLP, 2021. S. Brody, S. Wu, and A. Benton. paper code
Few-shot emotion recognition in conversation with sequential prototypical networks, in EMNLP, 2021. G. Guibon, M. Labeau, H. Flamein, L. Lefeuvre, and C. Clavel. paper code
Learning prototype representations across few-shot tasks for event detection, in EMNLP, 2021. V. Lai, F. Dernoncourt, and T. H. Nguyen. paper
Exploring task difficulty for few-shot relation extraction, in EMNLP, 2021. J. Han, B. Cheng, and W. Lu. paper code
Honey or poison? Solving the trigger curse in few-shot event detection via causal intervention, in EMNLP, 2021. J. Chen, H. Lin, X. Han, and L. Sun. paper code
Nearest neighbour few-shot learning for cross-lingual classification, in EMNLP, 2021. M. S. Bari, B. Haider, and S. Mansour. paper
Knowledge-aware meta-learning for low-resource text classification, in EMNLP, 2021. H. Yao, Y. Wu, M. Al-Shedivat, and E. P. Xing. paper code
Few-shot named entity recognition: An empirical baseline study, in EMNLP, 2021. J. Huang, C. Li, K. Subudhi, D. Jose, S. Balakrishnan, W. Chen, B. Peng, J. Gao, and J. Han. paper
MetaTS: Meta teacher-student network for multilingual sequence labeling with minimal supervision, in EMNLP, 2021. Z. Li, D. Zhang, T. Cao, Y. Wei, Y. Song, and B. Yin. paper
Meta-LMTC: Meta-learning for large-scale multi-label text classification, in EMNLP, 2021. R. Wang, X. Su, S. Long, X. Dai, S. Huang, and J. Chen. paper

NLP few-shot

Few-shot

2021-12-09 d811ccb886d4564a5d237abcea635e61 99+ fast 0.0 k

Difference between AutoTokenizer and BertTokenizer

https://github.com/huggingface/transformers/issues/5587

NLP helpers

AutoTokenizer

2021-12-09 41f67037ea30c74d9a64aaffd980fe99 99+ a minute 0.2 k

Inheritance

1 Inheritance

If a subclass does not override an attribute, it inherits the parent's.

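class A:
    x = 1          # class attribute defined on A

class B(A):
    pass

class C(A):
    pass

B.x = 2            # B now has its own x; A and C are unaffected
print(A.x, B.x, C.x)
A.x = 3            # C still looks x up on A, so it sees the new value
print(A.x, B.x, C.x)

1 2 1
3 2 3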
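The output follows from where each class stores x. A short sketch that inspects the class namespaces with vars(); it reuses the classes from the example above and adds nothing beyond standard attribute lookup.

```python
class A:
    x = 1


class B(A):
    pass


class C(A):
    pass


B.x = 2  # adds an entry to B's own namespace; A.x is untouched

b_owns_x = "x" in vars(B)  # B overrides x
c_owns_x = "x" in vars(C)  # C has no x of its own and falls back to A

A.x = 3
values = (A.x, B.x, C.x)

print(b_owns_x, c_owns_x)  # True False
print(values)              # (3, 2, 3)
```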

2 super

https://blog.csdn.net/weixin_40734030/article/details/122861895

Purpose: make the subclass call the parent class's __init__ when it is initialized

Example:

class test1:
    def __init__(self):
        self.a=1

class test2(test1):
    def __init__(self):
        super(test2, self).__init__()
        self.b=2

tt=test2()
# print(tt.a)
print(tt.b)
print(tt.a)


2
1
############################
class test1:
    def __init__(self):
        self.a=1

class test2(test1):
    def __init__(self):
        # super(test2, self).__init__()
        self.b=2

tt=test2()
# print(tt.a)
print(tt.b)
print(tt.a)

2
AttributeError: 'test2' object has no attribute 'a'


class pointwise_hybird_contrasive(hybird):
    def __init__(self, config_roberta, path, num):
        super(pointwise_hybird_contrasive, self).__init__(config_roberta, path, num)

Here super(pointwise_hybird_contrasive, self).__init__(config_roberta, path, num) initializes the attributes defined by the parent class hybird.
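In Python 3 the arguments to super can be left out inside a method, which avoids repeating the class name. A minimal sketch with made-up classes Base and Child:

```python
class Base:
    def __init__(self, a):
        self.a = a


class Child(Base):
    def __init__(self, a, b):
        # Equivalent to super(Child, self).__init__(a) in Python 3.
        super().__init__(a)
        self.b = b


child = Child(1, 2)
print(child.a, child.b)  # 1 2
```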

python

Inheritance

2021-12-09 8b5797ab7deb9d62dbfec54f2659ac6b 99+ fast 0.1 k

Common PyTorch operations

1 Tensor operations in PyTorch

https://blog.csdn.net/HailinPan/article/details/109818774

2 Loading models

1 model.load_state_dict(torch.load(path))

2 model=BertModel.from_pretrained

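Under the hood, the latter relies on the same state-dict loading as the former.

They are used differently: with the former, model is an object and load_state_dict loads weights into it; with the latter, BertModel is a class and from_pretrained creates the object and loads the weights.

Machine learning and deep learning frameworks: pytorch

Common PyTorch operations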
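The relationship between the two styles can be shown with a toy class in plain Python. ToyModel below is hypothetical and is not the PyTorch or transformers implementation; it only mirrors the pattern of an instance method that loads weights and a class-level factory built on top of it.

```python
import json
import os
import tempfile


class ToyModel:
    def __init__(self):
        self.weights = {"w": 0.0, "b": 0.0}

    def load_state_dict(self, state_dict):
        # Called on an existing object: overwrite its weights.
        self.weights.update(state_dict)

    @classmethod
    def from_pretrained(cls, path):
        # Called on the class: build the object, then reuse load_state_dict.
        model = cls()
        with open(path) as f:
            model.load_state_dict(json.load(f))
        return model


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "weights.json")
    with open(path, "w") as f:
        json.dump({"w": 1.5, "b": -0.5}, f)

    # Style 1: create the object first, then load weights into it.
    model_a = ToyModel()
    with open(path) as f:
        model_a.load_state_dict(json.load(f))

    # Style 2: let the class create the object and load the weights.
    model_b = ToyModel.from_pretrained(path)

print(model_a.weights == model_b.weights)  # True
```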

2021-12-07 9432b49a79086904a621b6c0f4f85c94 99+ 2 m 0.3 k

huggingface

An NLP helper: Hugging Face's transformers

git： https://github.com/huggingface/transformers

paper： https://arxiv.org/abs/1910.03771v5

Overall structure

A simple tutorial:

https://blog.csdn.net/weixin_44614687/article/details/106800244

from_pretrained

Under the hood it relies on load_state_dict-style loading

Some weights of the model checkpoint at ../../../../test/data/chinese-roberta-wwm-ext were not used when initializing listnet_bert: ['cls.predictions.transform.dense.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.weight', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight']
- This IS expected if you are initializing listnet_bert from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing listnet_bert from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of listnet_bert were not initialized from the model checkpoint at ../../../../test/data/chinese-roberta-wwm-ext and are newly initialized: ['Linear2.weight', 'Linear1.weight', 'Linear1.bias', 'Linear2.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


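The message has two parts: 1 some parameters in the loaded pre-trained checkpoint are not used; 2 some parameters of your own model are not initialized from the checkpoint.
Seeing this when fine-tuning is normal.
It should not appear at prediction time, since the fine-tuned checkpoint should already contain all of the model's weights.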
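Both parts of the message come down to comparing parameter names. A plain-Python sketch of that comparison, using a few key names from the log above plus one hypothetical shared key; it does not call the transformers library.

```python
# Names stored in the pre-trained checkpoint (subset, for illustration).
checkpoint_keys = {
    "bert.embeddings.word_embeddings.weight",  # hypothetical shared key
    "cls.predictions.bias",
    "cls.seq_relationship.weight",
}

# Names expected by the custom model (subset, for illustration).
model_keys = {
    "bert.embeddings.word_embeddings.weight",
    "Linear1.weight",
    "Linear1.bias",
}

# Part 1: in the checkpoint but not used by the model.
unexpected = sorted(checkpoint_keys - model_keys)

# Part 2: in the model but absent from the checkpoint, so newly initialized.
missing = sorted(model_keys - checkpoint_keys)

print(unexpected)
print(missing)
```

After fine-tuning and saving the whole model, both sets would be empty when that checkpoint is loaded again.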

About the model

BertModel -> our model

1 Load a model from transformers

from transformers import BertPreTrainedModel, BertModel, AutoTokenizer, AutoConfig

2 Build your own architecture on top of the model from step 1