Common PyTorch Operations

1 Tensor operations in PyTorch

https://blog.csdn.net/HailinPan/article/details/109818774

2 Model loading

1 model.load_state_dict(torch.load(path))

2 model = BertModel.from_pretrained(path)

The latter uses the former under the hood.

The usage differs: in the former, model is an already constructed object and load_state_dict loads the weights into it; in the latter, BertModel is a class and from_pretrained both creates the object and loads the weights.
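A minimal sketch of both styles, assuming the transformers library and a hypothetical weight file model.bin:

import torch
from transformers import BertModel, BertConfig

# Style 1: construct the object first, then load weights into it
config = BertConfig()
model = BertModel(config)
model.load_state_dict(torch.load("model.bin"))  # "model.bin" is a hypothetical path

# Style 2: the class method builds the object and loads the weights in one call
model = BertModel.from_pretrained("bert-base-chinese")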

Weight initialization

Why not initialize all weights to zero or to the same value?

If the weights of a neural network are all initialized to zero (or to the same value), every neuron in a given layer produces the same output and receives the same gradient during backpropagation. The neurons in that layer therefore remain identical, which is equivalent to having only one neuron.

Three commonly used weight initialization methods

Random initialization, Xavier initialization, He initialization
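A minimal sketch of applying these initializations with torch.nn.init (the layer sizes are arbitrary):

import torch.nn as nn

linear = nn.Linear(128, 64)

# Random initialization: small values drawn from a normal distribution
nn.init.normal_(linear.weight, mean=0.0, std=0.01)

# Xavier initialization: variance scaled by fan_in + fan_out, suited to tanh/sigmoid
nn.init.xavier_uniform_(linear.weight)

# He (Kaiming) initialization: variance scaled by fan_in, suited to ReLU
nn.init.kaiming_normal_(linear.weight, nonlinearity='relu')

# Biases are commonly set to zero
nn.init.zeros_(linear.bias)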

References

https://mdnice.com/writing/6fe7dfe1954945d180d6b36562658af8

https://m.ofweek.com/ai/2021-06/ART-201700-11000-30502442.html

https://blog.csdn.net/qq_15505637/article/details/79362970

Building a neural network with PyTorch

0. Prepare and preprocess the data

1. Build the network structure

https://www.cnblogs.com/tian777/p/15341522.html

# Example from the author's project: the parent class `hybird` defines
# ch_matching_model, en_matching_model, FFN1, FFN2, dropout and relu.
class pointwise_hybird_contrasive(hybird):
    def __init__(self, config_roberta, path, num):
        super(pointwise_hybird_contrasive, self).__init__(config_roberta, path, num)

    def forward(self, input_ids, input_mask, segment_ids, all_en_query, all_en_ans):
        # Encode the Chinese and English query-answer pairs separately
        ch_match_embedding = self.ch_matching_model(input_ids, input_mask, segment_ids)
        en_match_embedding = self.en_matching_model(all_en_query, all_en_ans)

        # Concatenate the two representations and feed them through a two-layer FFN
        hybird_represent = torch.cat([ch_match_embedding, en_match_embedding], dim=1)
        output = self.FFN2(self.relu(self.FFN1(self.dropout(hybird_represent))))
        return output

    def loss(self, predict, target):
        # CrossEntropyLoss expects raw logits; softmax is applied internally
        cross_entropy = torch.nn.CrossEntropyLoss()
        return cross_entropy(predict, target)

    def predict(self, output):
        # Convert logits to class probabilities and return the probability of class 1
        softmax = nn.Softmax(dim=1)
        y_pred_prob, y_pred = torch.max(softmax(output.data), 1)
        y_pred_prob = y_pred_prob.cpu().numpy()
        for i in range(len(y_pred_prob)):
            if not y_pred[i]:
                y_pred_prob[i] = 1 - y_pred_prob[i]
        return y_pred_prob

nn.Module

https://www.cnblogs.com/tian777/p/15341522.html

1 __init__

2 forward

3 loss

Summary and usage of the various cross-entropy functions in PyTorch

https://blog.csdn.net/comway_Li/article/details/121490170
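A short sketch of the most common variants (the tensor shapes are illustrative):

import torch
import torch.nn as nn

logits = torch.randn(4, 3)            # batch of 4, 3 classes, raw scores
targets = torch.tensor([0, 2, 1, 0])  # class indices

# CrossEntropyLoss = LogSoftmax + NLLLoss, so it takes raw logits
loss1 = nn.CrossEntropyLoss()(logits, targets)

# Equivalent two-step form
log_probs = nn.LogSoftmax(dim=1)(logits)
loss2 = nn.NLLLoss()(log_probs, targets)

# Binary case: BCEWithLogitsLoss takes raw logits and float 0/1 targets
binary_logits = torch.randn(4)
binary_targets = torch.tensor([1., 0., 1., 1.])
loss3 = nn.BCEWithLogitsLoss()(binary_logits, binary_targets)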

L2 and L1 regularization

https://blog.csdn.net/guyuealian/article/details/88426648

PyTorch optimizers implement L2 regularization through weight decay. From the source-code docstring: weight_decay (float, optional, defaults to 0): Weight decay (L2 penalty)

param_optimizer = list(model.named_parameters())
no_decay = ['bias', 'LayerNorm.bias', 'LayerNorm.weight']
optimizer_grouped_parameters = [
    {'params': [p for n, p in param_optimizer if not any(nd in n for nd in no_decay)], 'weight_decay': 0.01},
    {'params': [p for n, p in param_optimizer if any(nd in n for nd in no_decay)], 'weight_decay': 0.0}
]
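These parameter groups are then handed to the optimizer constructor; a hedged sketch using AdamW (the learning rate is an arbitrary example, and any torch.optim optimizer that accepts parameter groups works the same way):

from torch.optim import AdamW

optimizer = AdamW(optimizer_grouped_parameters, lr=2e-5)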

4 predict

2. Build the training framework

a. Data loader

Dataset/TensorDataset -> Sampler -> DataLoader (see the sketch after the links below)

https://zhuanlan.zhihu.com/p/337850513#

https://blog.csdn.net/ljp1919/article/details/116484330

https://blog.csdn.net/qq_39507748/article/details/105385709
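A minimal sketch of the chain, assuming the tensors were already prepared in step 0 (the shapes here are dummies):

import torch
from torch.utils.data import TensorDataset, RandomSampler, DataLoader

x = torch.randn(100, 16)            # dummy features
y = torch.randint(0, 2, (100,))     # dummy labels

dataset = TensorDataset(x, y)       # Dataset: indexable (x[i], y[i]) pairs
sampler = RandomSampler(dataset)    # Sampler: decides the order of indices
loader = DataLoader(dataset, sampler=sampler, batch_size=8)  # DataLoader: batches them

for batch_x, batch_y in loader:
    pass  # one training step per batch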

b. Optimizer

https://pytorch.org/docs/stable/optim.html

optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
optimizer = optim.SGD([
    {'params': model.base.parameters()},
    {'params': model.classifier.parameters(), 'lr': 1e-3}
], lr=1e-2, momentum=0.9)

c. Training

optimizer.zero_grad() zeros the gradients, loss.backward() backpropagates, optimizer.step() updates the parameters (see the sketch after the link below)

https://blog.csdn.net/PanYHHH/article/details/107361827
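A minimal sketch of one training epoch, assuming model, loader, optimizer and a loss function criterion are already defined:

model.train()
for batch_x, batch_y in loader:
    optimizer.zero_grad()             # zero the gradients from the previous step
    output = model(batch_x)           # forward pass
    loss = criterion(output, batch_y)
    loss.backward()                   # backpropagation: compute gradients
    optimizer.step()                  # update the parameters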

d. Validation

with torch.no_grad()

Used during validation and testing; it significantly reduces GPU memory usage because no computation graph is built (a sketch follows the links below).

https://wstchhwp.blog.csdn.net/article/details/108405102

https://blog.csdn.net/weixin_44134757/article/details/105775027
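A minimal validation sketch, assuming model and val_loader are already defined:

model.eval()                            # switch off dropout, use running BatchNorm stats
correct, total = 0, 0
with torch.no_grad():                   # no graph is built, activations are freed early
    for batch_x, batch_y in val_loader:
        output = model(batch_x)
        pred = output.argmax(dim=1)
        correct += (pred == batch_y).sum().item()
        total += batch_y.size(0)
accuracy = correct / total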

e. Evaluation metrics

f. Model saving

https://blog.csdn.net/m0_37605642/article/details/120325062

https://blog.csdn.net/weixin_41278720/article/details/80759933
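A minimal sketch of the two common styles (the paths are hypothetical examples):

# Recommended: save only the state dict, rebuild the model class when loading
torch.save(model.state_dict(), "model.pt")
model.load_state_dict(torch.load("model.pt"))

# Alternative: save the whole model object (ties the file to the class definition)
torch.save(model, "model_full.pt")
model = torch.load("model_full.pt")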

g. Visualization

https://blog.csdn.net/Wenyuanbo/article/details/118937790
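A minimal TensorBoard sketch, assuming the training loss from each step is to be logged (the log directory and loss values are placeholders):

from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter("runs/exp1")
for step in range(100):
    loss_value = 1.0 / (step + 1)              # placeholder for the real training loss
    writer.add_scalar("train/loss", loss_value, step)
writer.close()
# Then run: tensorboard --logdir runs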

3. Prediction

Load the model, feed in the data, and call the network.
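A minimal inference sketch, assuming the pointwise_hybird_contrasive class from above, its constructor arguments, the input tensors, and a hypothetical weight file model.bin are all available:

model = pointwise_hybird_contrasive(config_roberta, path, num)  # same constructor args as in training
model.load_state_dict(torch.load("model.bin"))
model.eval()
with torch.no_grad():
    output = model(input_ids, input_mask, segment_ids, all_en_query, all_en_ans)
    probs = model.predict(output)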

References

https://blog.csdn.net/qq_45847624/article/details/114885655

