Pre-Training with Whole Word Masking for Chinese BERT

BERT-wwm-ext

wwm: Whole Word Masking

ext: extended training data is also used (marked with "ext" in the model name)

Pre-training

1 Change the masking strategy

Whole Word Masking (wwm): instead of masking individual characters, mask all characters belonging to a word produced by Chinese Word Segmentation (cws).

Comparison of four masking strategies
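To make the wwm idea concrete, here is a minimal sketch (not the authors' code). It assumes the sentence has already been split into words by a CWS tool, and that the tokenizer splits Chinese words into single characters, as Chinese BERT does; whole words are then selected and every character inside them is masked together.

```python
import random

MASK = "[MASK]"

def whole_word_mask(words, mask_ratio=0.4, rng=None):
    """Hypothetical wwm sketch: sample whole segmented words and
    replace every character token inside them with [MASK]."""
    rng = rng or random.Random(0)
    n_to_mask = max(1, int(len(words) * mask_ratio))
    masked_ids = set(rng.sample(range(len(words)), n_to_mask))
    out = []
    for i, word in enumerate(words):
        chars = list(word)  # Chinese BERT tokenizes words into characters
        out.extend([MASK] * len(chars) if i in masked_ids else chars)
    return out

# Example: output of a CWS tool for 使用语言模型预测
words = ["使用", "语言", "模型", "预测"]
print(whole_word_mask(words))
```

Under the original character-level masking, the two characters of a word could be masked independently, so half of a word may stay visible and leak the answer; wwm removes the whole word, forcing the model to recover it from the surrounding context.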

参考

Pre-Training with Whole Word Masking for Chinese BERT

https://arxiv.org/abs/1906.08101v3

Revisiting Pre-trained Models for Chinese Natural Language Processing

https://arxiv.org/abs/2004.13922

GitHub: https://hub.fastgit.org/ymcui/Chinese-BERT-wwm


Author: Lavine Hu

Posted on 2021-11-04, updated on 2022-05-28
