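bertviz: an attention visualization tool

View the attention of different layers and different heads.

Note: it matters which package BertModel and BertTokenizer are imported from.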

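from bertviz.neuron_view import show
# Use bertviz's own BERT classes, not the ones from transformers
from bertviz.transformers_neuron_view import BertModel, BertTokenizer

# path, sentence_a and sentence_b are assumed to be defined earlier
model1 = BertModel.from_pretrained(path)
tokenizer = BertTokenizer.from_pretrained(path)
model_type = 'bert'

show(model1, model_type, tokenizer, sentence_a, sentence_b, layer=4, head=3)
This works.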
###########################

from bertviz.neuron_view import show
from transformers import BertTokenizer, BertModel
model1=BertModel.from_pretrained(path)
model_type = 'bert'

show(model1, model_type, tokenizer, sentence_a, sentence_b, layer=4, head=3)
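This raises an error. The neuron view needs the model's query and key vectors, which the stock transformers BertModel does not return, so the classes from bertviz.transformers_neuron_view have to be used instead.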

Reference

https://zhuanlan.zhihu.com/p/457043243

huggingface

A handy NLP helper: Hugging Face's transformers library

git: https://github.com/huggingface/transformers

paper: https://arxiv.org/abs/1910.03771v5

Overall structure

A simple tutorial:

https://blog.csdn.net/weixin_44614687/article/details/106800244

from_pretrained

Under the hood, the weights are loaded with load_state_dict.

Some weights of the model checkpoint at ../../../../test/data/chinese-roberta-wwm-ext were not used when initializing listnet_bert: ['cls.predictions.transform.dense.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.dense.bias', 'cls.predictions.decoder.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.weight', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight']
- This IS expected if you are initializing listnet_bert from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing listnet_bert from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of listnet_bert were not initialized from the model checkpoint at ../../../../test/data/chinese-roberta-wwm-ext and are newly initialized: ['Linear2.weight', 'Linear1.weight', 'Linear1.bias', 'Linear2.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


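The warning has two parts: (1) some parameters in the loaded pretrained checkpoint were not used; (2) some parameters of our own model were not initialized from the checkpoint.
Seeing this when fine-tuning is normal.
It should not appear at prediction time, when the fine-tuned checkpoint is loaded.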
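
The two lists in the warning can be thought of as a set difference between the parameter names in the checkpoint and the parameter names in our model. The sketch below is a simplified illustration in plain Python, not the library's actual code; `diff_keys` is a hypothetical helper, and the key lists are a small illustrative subset rather than the full log above.

```python
def diff_keys(checkpoint_keys, model_keys):
    """Return (unused, newly_initialized) parameter names, each sorted."""
    checkpoint_keys, model_keys = set(checkpoint_keys), set(model_keys)
    # In the checkpoint but not in the model: "were not used"
    unused = sorted(checkpoint_keys - model_keys)
    # In the model but not in the checkpoint: "newly initialized"
    newly_initialized = sorted(model_keys - checkpoint_keys)
    return unused, newly_initialized


# Illustrative subset of parameter names (assumption, not the full lists)
checkpoint_keys = [
    "bert.pooler.dense.weight",
    "cls.predictions.bias",
    "cls.seq_relationship.weight",
]
model_keys = [
    "bert.pooler.dense.weight",
    "Linear1.weight",
    "Linear2.weight",
]

unused, newly_initialized = diff_keys(checkpoint_keys, model_keys)
print("not used:", unused)
print("newly initialized:", newly_initialized)
```

Once the model has been fine-tuned and saved, its own checkpoint contains the Linear1 and Linear2 weights and no cls.* weights, so both differences should be empty when that checkpoint is loaded.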

About the model

BertModel -> our model

1 Load a model from transformers

from transformers import BertPreTrainedModel, BertModel, AutoTokenizer, AutoConfig

2 Build your own structure on top of the model from step 1
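
A minimal sketch of step 2, under stated assumptions: the class name `ListNetBert` and the layer names `Linear1` / `Linear2` mirror the names in the warning log above, but the layer sizes, the activation, and the use of the pooled output are guesses for illustration, not the original model. A tiny randomly initialized `BertConfig` is used so the example runs without downloading a checkpoint.

```python
import torch
from torch import nn
from transformers import BertConfig, BertModel, BertPreTrainedModel


class ListNetBert(BertPreTrainedModel):
    """BERT encoder followed by a two-layer scoring head (illustrative)."""

    def __init__(self, config):
        super().__init__(config)
        # The attribute is named `bert` so that from_pretrained can match
        # the encoder weights stored in a BERT checkpoint.
        self.bert = BertModel(config)
        # Head sizes are assumptions; only the names come from the log.
        self.Linear1 = nn.Linear(config.hidden_size, 128)
        self.Linear2 = nn.Linear(128, 1)
        # Initialize weights (older releases use self.init_weights()).
        self.post_init()

    def forward(self, input_ids, attention_mask=None, token_type_ids=None):
        outputs = self.bert(
            input_ids,
            attention_mask=attention_mask,
            token_type_ids=token_type_ids,
        )
        pooled = outputs.pooler_output            # (batch, hidden_size)
        hidden = torch.relu(self.Linear1(pooled))
        return self.Linear2(hidden).squeeze(-1)   # (batch,)


# Tiny config for a quick smoke test; values are arbitrary.
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
)
model = ListNetBert(config)
model.eval()

input_ids = torch.randint(0, config.vocab_size, (3, 8))
with torch.no_grad():
    scores = model(input_ids)
print(scores.shape)
```

With a real checkpoint, the model would be created with `ListNetBert.from_pretrained(path)` instead. The encoder weights are then loaded from the checkpoint, while Linear1 and Linear2 are newly initialized, which is what the second part of the warning reports.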

