prompt trick

Purpose

Use a template to align the prediction task with the pre-trained model's training objective, narrowing the gap between the pre-training objective and the downstream fine-tuning objective.

Difference from fine-tuning

fine-tune: the PTM adapts downward to the specific task

prompt: the specific task adapts upward to the PTM

Application scenarios

Because the prediction task is aligned with the pre-trained model's training task, we can complete the task with little or even no training data. In short, prompting is well suited to:

  1. zero-shot
  2. few-shot
  3. cold start

References

https://zhuanlan.zhihu.com/p/424888379

https://zhuanlan.zhihu.com/p/440169921

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing

0 Difference from pre-train + fine-tune

Prompting feels like a special form of fine-tuning: still pre-train first, then do prompt tuning.

Goal: prompting narrows the gap between pre-training and fine-tuning.

1 How to do it

3 steps

1 Prompt Addition

$x' = f_{prompt}(x)$, where $x$ is the input text

  1. Apply a template, which is a textual string that has two slots: an input slot [X] for input x and an answer slot
    [Z] for an intermediate generated answer text z that will later be mapped into y.
  2. Fill slot [X] with the input text x.
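The two steps above can be sketched as follows; the template string and helper name are illustrative, not from the survey:

```python
# Prompt addition: turn an input x into a prompted string x' = f_prompt(x).
# The template has an input slot [X] and an answer slot [Z] that stays
# open for the later answer-search step.
TEMPLATE = "[X] Overall, it was a [Z] movie."

def f_prompt(x: str, template: str = TEMPLATE) -> str:
    """Fill slot [X] with the input text x, leaving [Z] open."""
    return template.replace("[X]", x)

x_prime = f_prompt("I love this movie.")
print(x_prime)  # I love this movie. Overall, it was a [Z] movie.
```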

2 Answer Search

$f_{fill}$: fills the location [Z] in prompt $x'$ with a potential answer z

$Z$: a set of permissible values for z
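A minimal sketch of answer search over the set $Z$; the `score` function is a toy stand-in for a real LM's (log-)probability of the filled prompt:

```python
# Answer search: fill [Z] in x' with each candidate z from the answer set Z
# and keep the candidate whose filled prompt scores highest.
Z = ["excellent", "terrible", "okay"]

def f_fill(x_prime: str, z: str) -> str:
    """Fill the location [Z] in prompt x' with a potential answer z."""
    return x_prime.replace("[Z]", z)

def score(text: str) -> float:
    # Toy heuristic standing in for an LM score (assumption for this sketch).
    return 1.0 if "excellent" in text else 0.0

x_prime = "I love this movie. Overall, it was a [Z] movie."
z_hat = max(Z, key=lambda z: score(f_fill(x_prime, z)))
print(z_hat)  # excellent (under this toy scorer)
```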

3 Answer Mapping

Because the $\hat{z}$ above is not yet $\hat{y}$; e.g., in sentiment analysis, "excellent", "fabulous", "wonderful" → positive

go from the highest-scoring answer $\hat{z}$ to the highest-scoring output $\hat{y}$
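This mapping is typically a small lookup table (a verbalizer); the sentiment entries below come from the example words in the text:

```python
# Answer mapping: map the highest-scoring answer z_hat to the final task
# output y_hat via an answer -> label table, here for sentiment analysis.
VERBALIZER = {
    "excellent": "positive",
    "fabulous": "positive",
    "wonderful": "positive",
    "terrible": "negative",
    "awful": "negative",
}

def answer_to_label(z_hat: str) -> str:
    """Go from the answer text z_hat to the task label y_hat."""
    return VERBALIZER[z_hat]

print(answer_to_label("excellent"))  # positive
```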

4 An example: text sentiment classification

Before:

"I love this movie." → positive

Now:

1 $x=$ "I love this movie." → template: "[X] Overall, it was a [Z] movie." → $x'$ is "I love this movie. Overall, it was a [Z] movie."

2 Next comes answer search: the LM looks for the text $\hat{z}$ that, filled in at [Z], yields the highest score (e.g. "excellent", "great", "wonderful").

3 Finally, answer mapping. The text the LM fills in is sometimes not the final form the task needs (the final output is positive; above we had "excellent", "great", "wonderful"), so that text is mapped to the final output $\hat{y}$.

2 Taxonomy of prompt methods

3 Prompt Engineering

1 one must first consider the prompt shape,

2 then decide whether to take a manual or automated approach to create prompts of the desired shape

1 Prompt Shape

The prompt shape mainly refers to the position and number of [X] and [Z].

If [Z] sits mid-sentence, the prompt is usually called a cloze prompt; if it comes at the end, a prefix prompt.

Which one to choose in practice depends mainly on the form of the task and the type of model. Cloze prompts closely match the training objective of a Masked Language Model, so they fit tasks solved with MLMs; for generation tasks, or tasks solved with autoregressive LMs, prefix prompts fit better. Full text reconstruction models are more general, so both kinds of prompt work. Also, for text-pair classification, the prompt template usually reserves two input slots, [X1] and [X2].
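The three shapes can be illustrated with toy template strings (the exact wording of each template is an assumption, not from the survey):

```python
# Cloze prompt: the answer slot [Z] sits mid-sentence, matching how a
# masked LM predicts a masked token.
cloze = "[X] Overall, it was a [Z] movie."

# Prefix prompt: [Z] comes at the very end, matching left-to-right
# generation by an autoregressive LM.
prefix = "[X] The sentiment of this review is [Z]"

# Text-pair classification: the template reserves two input slots.
pair = "[X1] ? [Z] , [X2]"

print(cloze.endswith("[Z]"), prefix.endswith("[Z]"))  # False True
```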

2 Creating prompts

1 Manual Template Engineering

2 Automated Template Learning

1 Discrete Prompts

the prompt operates on the actual text (discrete tokens)

D1: Prompt Mining

D2: Prompt Paraphrasing

D3: Gradient-based Search

D4: Prompt Generation

D5: Prompt Scoring

2 Continuous Prompts

the prompt acts directly in the model's embedding space

C1: Prefix Tuning

C2: Tuning Initialized with Discrete Prompts

C3: Hard-Soft Prompt Hybrid Tuning
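A minimal sketch of the idea behind continuous prompts, in the spirit of prefix tuning: a few trainable vectors are prepended to the input in the model's embedding space instead of discrete template tokens. Dimensions and the helper name are illustrative, not from the survey:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 8      # embedding dimension (toy value)
prefix_len = 3   # number of continuous prompt vectors

# In prefix tuning these vectors are the trainable parameters, optimized by
# gradient descent while the LM's own weights stay frozen.
prefix = rng.normal(size=(prefix_len, d_model))

def prepend_prefix(token_embeddings: np.ndarray) -> np.ndarray:
    """Concatenate the continuous prompt before the input token embeddings."""
    return np.concatenate([prefix, token_embeddings], axis=0)

tokens = rng.normal(size=(5, d_model))  # embeddings of a 5-token input
augmented = prepend_prefix(tokens)
print(augmented.shape)  # (8, 8)
```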

4 Answer Engineering

Two dimensions must be considered when performing answer engineering: 1 deciding the answer shape and 2 choosing an answer design method.

1 Answer Shape

How does this differ from prompt shape? Answer shape is about the granularity of the answer z (a token, a span, or a whole sentence), while prompt shape is about where the slots [X] and [Z] sit in the template.

2 Answer Space Design Methods

1 Manual Design
2 Automatic Search

5 Multi-Prompt Learning

The discussion so far covered a single prompt; this section introduces multiple prompts.

6 Training Strategies for Prompting Methods

1 Training Settings

full-data

few-shot /zero-shot

2 Parameter Update Methods

References

https://arxiv.org/abs/2107.13586

Dr. Pengfei Liu https://zhuanlan.zhihu.com/p/395115779

https://zhuanlan.zhihu.com/p/399295895

https://zhuanlan.zhihu.com/p/440169921
