
Felix: Flexible Text Editing Through Tagging and Insertion

Another text-editing paper from Google, following LaserTagger.

In contrast to conventional sequence-to-sequence (seq2seq) models, FELIX is efficient in low-resource settings and fast at inference time, while being capable of modeling flexible input-output transformations. We achieve this by decomposing the text-editing task into two sub-tasks: tagging to decide on the subset of input tokens and their order in the output text and insertion to in-fill the missing tokens in the output not present in the input.

1 Introduction

In particular, we have designed FELIX with the following requirements in mind: sample efficiency, fast inference time, and flexible text editing.

2 Model description

FELIX decomposes the conditional probability of generating an output sequence y from an input
x as follows:

$$p(y \mid x) = p_\text{ins}(y \mid y^m)\, p_\text{tag}(y^t, \pi \mid x)$$
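As a toy illustration of this factorization, the tagging stage produces tags $y^t$ and a permutation $\pi$, which together determine the intermediate sequence $y^m$ that the insertion stage then in-fills. A minimal sketch (function and variable names are hypothetical stand-ins, not the paper's code):

```python
# Toy sketch of the tagging half of the factorization: tags y^t plus a
# permutation pi determine the insertion model's input y^m. Names are
# illustrative, not from the FELIX codebase.

def apply_tags_and_pointing(tokens, tags, pi):
    """Visit source positions in the order given by pi, keeping, deleting,
    or masking according to the predicted tags."""
    y_m = []
    for i in pi:
        if tags[i] == "DELETE":
            continue
        y_m.append(tokens[i])
        if tags[i] == "INS":          # infilling variant: a generic mask
            y_m.append("[MASK]")      # to be in-filled by the second model
    return y_m

# Reordering example: pi reverses the kept tokens.
tokens = ["turn", "it", "off"]
tags = ["KEEP", "KEEP", "KEEP"]
pi = [2, 1, 0]
print(apply_tags_and_pointing(tokens, tags, pi))  # ['off', 'it', 'turn']
```

The second stage would then replace every `[MASK]` with predicted tokens, which is what the insertion model in Section 2.2 does.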

2.1 Tagging Model

The tagging model is trained to optimize both the tagging and the pointing loss:

$$\mathcal{L} = \mathcal{L}_\text{pointing} + \lambda \mathcal{L}_\text{tagging}$$

Tagging:

The tag sequence $y^t$ consists of three tags: KEEP, DELETE, and INSERT (INS).

Tags are predicted by applying a single feedforward layer $f$ to the output of the encoder $h^L$ (the source sentence is first encoded using a 12-layer BERT-base model):

$$y^t_i = \operatorname{argmax}(f(h^L_i))$$
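A minimal numerical sketch of this tagging head, with random weights standing in for a trained layer and an illustrative three-tag inventory:

```python
# Sketch of the tagging head: a single feedforward layer f applied to the
# encoder output h^L, followed by a per-position argmax. The weights here
# are random stand-ins for a trained layer.
import numpy as np

TAGS = ["KEEP", "DELETE", "INS"]

rng = np.random.default_rng(0)
seq_len, hidden = 5, 768                  # 768 = BERT-base hidden size
h_L = rng.normal(size=(seq_len, hidden))  # encoder output for 5 tokens
W = rng.normal(size=(hidden, len(TAGS)))  # weights of f
b = np.zeros(len(TAGS))

logits = h_L @ W + b                      # f(h^L_i) for every position i
y_t = [TAGS[j] for j in logits.argmax(axis=-1)]
print(y_t)                                # one tag per source token
```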

Pointing:

Given a sequence $x$ and the predicted tags $y^t$, the re-ordering model generates a permutation $\pi$, so that from $\pi$ and $y^t$ we can reconstruct the insertion model input $y^m$. Thus we have:

$$p(y^m \mid x) = \prod_i p(\pi(i) \mid x, y^t, i)\, p(y^t_i \mid x)$$

Our implementation is based on a pointer network. The output of this model is a series of predicted pointers (source token → next target token).

The input to the pointer layer at position $i$ is:

$$h^{L+1}_i = f([h^L_i; e(y^t_i); e(p_i)])$$

where $e(y^t_i)$ is the embedding of the predicted tag and $e(p_i)$ is the positional embedding.

The pointer network attends over all hidden states, as such:

$$p(\pi(i) \mid h^{L+1}_i) = \operatorname{attention}(h^{L+1}_i, h^{L+1}_{\pi(i)})$$

where $h^{L+1}_i$ acts as the query (Q) and $h^{L+1}_{\pi(i)}$ as the key (K).
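A small numerical sketch of this pointer attention, using scaled dot-product attention of the hidden states against themselves (shapes and names are illustrative; random vectors stand in for $h^{L+1}$):

```python
# Sketch of the pointing step as scaled dot-product attention: row i of
# the resulting matrix is p(pi(i) | .), a distribution over which token
# follows token i. Random vectors stand in for the hidden states h^{L+1}.
import numpy as np

def pointer_attention(H):
    """H: (n, d) hidden states. Returns an (n, n) row-stochastic matrix."""
    scores = H @ H.T / np.sqrt(H.shape[1])        # query-key dot products
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(scores)
    return probs / probs.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(1)
H = rng.normal(size=(4, 16))
P = pointer_attention(H)
greedy_next = P.argmax(axis=-1)  # greedy pointer: most likely next token
print(P.shape, greedy_next)
```

Greedily taking the argmax of each row can produce an invalid ordering (e.g. two tokens pointing to the same successor), which is why decoding is constrained, as noted below.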

When realizing the pointers, we use a constrained beam search, so that each kept token appears exactly once in the output ordering.

2.2 Insertion Model

To represent masked token spans we consider two options: masking and infilling. In the former case, the tagging model predicts how many tokens need to be inserted by specializing the INSERT tag into INS_k, which translates the span into k MASK tokens. In the infilling case, the tagging model predicts a generic INS tag, leaving the length of the inserted span to the insertion model.

Note that we preserve the deleted span in the input to the insertion model by enclosing it between [REPL] and [/REPL] tags.
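Under the masking variant, the insertion model's input might be assembled as follows (a simplified sketch; the INS_k tag spelling and the exact placement of the [REPL] span relative to the kept tokens are illustrative assumptions):

```python
# Sketch of assembling the insertion input under the masking variant:
# INS_k expands to k [MASK] tokens, and deleted spans are preserved
# inside [REPL] ... [/REPL]. Tag spellings are illustrative.

def build_insertion_input(tokens, tags):
    out, deleted = [], []
    for tok, tag in zip(tokens, tags):
        if tag == "DELETE":
            deleted.append(tok)
            continue
        if deleted:                       # close a finished deleted span
            out += ["[REPL]", *deleted, "[/REPL]"]
            deleted = []
        out.append(tok)
        if tag.startswith("INS_"):        # e.g. "INS_2" -> two masks
            k = int(tag.split("_")[1])
            out += ["[MASK]"] * k
    if deleted:                           # span deleted at the very end
        out += ["[REPL]", *deleted, "[/REPL]"]
    return out

tokens = ["the", "highest", "peak", "is", "very", "tall"]
tags = ["KEEP", "KEEP", "KEEP", "INS_1", "DELETE", "DELETE"]
print(build_insertion_input(tokens, tags))
# ['the', 'highest', 'peak', 'is', '[MASK]', '[REPL]', 'very', 'tall', '[/REPL]']
```

The insertion model then predicts the token behind each `[MASK]`, with the [REPL] span available as context for what was removed.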

Our insertion model is also based on a 12-layer BERT-base, so we can directly take advantage of BERT-style pretrained checkpoints.

References

https://aclanthology.org/2020.findings-emnlp.111.pdf

http://example.com/2021/09/30/felix/

Author: Lavine Hu

Posted on: 2021-09-30

Updated on: 2021-11-26
