0. Prepare and preprocess the data
1. Build the network structure https://www.cnblogs.com/tian777/p/15341522.html
    import torch
    import torch.nn as nn


    # `hybird` is the base class defined elsewhere in the project. It is expected to
    # provide ch_matching_model, en_matching_model, dropout, FFN1, relu and FFN2.
    class pointwise_hybird_contrasive(hybird):
        def __init__(self, config_roberta, path, num):
            super(pointwise_hybird_contrasive, self).__init__(config_roberta, path, num)

        def forward(self, input_ids, input_mask, segment_ids, all_en_query, all_en_ans):
            # Encode the Chinese pair and the English pair separately.
            ch_match_embedding = self.ch_matching_model(input_ids, input_mask, segment_ids)
            en_match_embedding = self.en_matching_model(all_en_query, all_en_ans)
            # Concatenate both representations along the feature dimension.
            hybird_represent = torch.cat([ch_match_embedding, en_match_embedding], dim=1)
            # Return raw logits; softmax is applied only in predict().
            output = self.FFN2(self.relu(self.FFN1(self.dropout(hybird_represent))))
            return output

        def loss(self, predict, target):
            # CrossEntropyLoss expects raw logits and class-index targets.
            criterion = nn.CrossEntropyLoss()
            return criterion(predict, target)

        def predict(self, output):
            softmax = nn.Softmax(dim=1)
            # Highest class probability and its index for every sample.
            y_pred_prob, y_pred = torch.max(softmax(output.detach()), 1)
            y_pred_prob = y_pred_prob.cpu().numpy()
            # When class 0 wins, convert its probability to that of class 1,
            # so the returned score is always P(class 1) in the two-class case.
            for i in range(len(y_pred_prob)):
                if y_pred[i] == 0:
                    y_pred_prob[i] = 1 - y_pred_prob[i]
            return y_pred_prob
nn.Module
https://www.cnblogs.com/tian777/p/15341522.html
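The class above relies on a base class that is not shown here, so it cannot run on its own. As a rough, self-contained sketch of the same layout, the following uses a hypothetical `TinyClassifier` with made-up layer sizes. For two classes, taking column 1 of the softmax gives the same score as the max-then-flip logic in `predict` above.

```python
import torch
import torch.nn as nn


class TinyClassifier(nn.Module):
    """Hypothetical two-class model with the init/forward/loss/predict layout."""

    def __init__(self, in_dim=8, hidden=16, num_classes=2):
        super().__init__()
        self.dropout = nn.Dropout(0.1)
        self.FFN1 = nn.Linear(in_dim, hidden)
        self.relu = nn.ReLU()
        self.FFN2 = nn.Linear(hidden, num_classes)
        self.criterion = nn.CrossEntropyLoss()

    def forward(self, x):
        # Return raw logits; CrossEntropyLoss applies log-softmax internally.
        return self.FFN2(self.relu(self.FFN1(self.dropout(x))))

    def loss(self, logits, target):
        return self.criterion(logits, target)

    def predict(self, logits):
        # Probability of class 1 for every row, detached from the graph.
        return torch.softmax(logits.detach(), dim=1)[:, 1]


torch.manual_seed(0)
model = TinyClassifier()
x = torch.randn(4, 8)               # 4 samples, 8 features (illustrative data)
y = torch.tensor([0, 1, 1, 0])      # class indices
logits = model(x)
loss_value = model.loss(logits, y)
probs = model.predict(logits)
```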
1 init
2 forward
3 loss: a summary of PyTorch's cross-entropy functions and how to use each one
https://blog.csdn.net/comway_Li/article/details/121490170
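As a quick check of how the common cross-entropy losses relate, the sketch below compares `nn.CrossEntropyLoss` on raw logits with `nn.NLLLoss` on log-softmax output, and `nn.BCEWithLogitsLoss` with `nn.BCELoss` on sigmoid output. The tensors are random illustrative data, not taken from the model above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Multi-class case: raw logits of shape (batch, classes), integer class targets.
logits = torch.randn(3, 4)
target = torch.tensor([0, 2, 3])
ce = nn.CrossEntropyLoss()(logits, target)
nll = nn.NLLLoss()(F.log_softmax(logits, dim=1), target)

# Binary case: one logit per sample, float targets in {0, 1}.
bin_logits = torch.randn(3)
bin_target = torch.tensor([1.0, 0.0, 1.0])
bce_with_logits = nn.BCEWithLogitsLoss()(bin_logits, bin_target)
bce = nn.BCELoss()(torch.sigmoid(bin_logits), bin_target)
```

In both pairs the first form takes logits directly, which is why `forward` above returns logits and leaves softmax to `predict`.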
L2 and L1 regularization
https://blog.csdn.net/guyuealian/article/details/88426648
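When the penalty is not delegated to the optimizer, it can be added to the loss by hand. The following is a minimal sketch under assumed values: `l1_lambda` and `l2_lambda` are hypothetical coefficients, and the model and data are placeholders.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 2)
x = torch.randn(5, 4)
y = torch.tensor([0, 1, 0, 1, 1])

l1_lambda = 1e-3  # hypothetical L1 strength
l2_lambda = 1e-2  # hypothetical L2 strength

data_loss = nn.CrossEntropyLoss()(model(x), y)
# L1 penalty: sum of absolute values; L2 penalty: sum of squares.
l1_penalty = sum(p.abs().sum() for p in model.parameters())
l2_penalty = sum(p.pow(2).sum() for p in model.parameters())

total_loss = data_loss + l1_lambda * l1_penalty + l2_lambda * l2_penalty
total_loss.backward()  # gradients now include both penalty terms
```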
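The optimizer implements L2 regularization itself through its weight_decay argument. From the source docstring: weight_decay (:obj:`float`, optional, defaults to 0): Weight decay (L2 penalty).
Bias and LayerNorm parameters are excluded from the decay by splitting the parameters into two groups: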
    param_optimizer = list(model.named_parameters())
    no_decay = ['bias', 'LayerNorm.bias', 'LayerNorm.weight']
    optimizer_grouped_parameters = [
        {'params': [p for n, p in param_optimizer if not any(nd in n for nd in no_decay)], 'weight_decay': 0.01},
        {'params': [p for n, p in param_optimizer if any(nd in n for nd in no_decay)], 'weight_decay': 0.0}
    ]
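To see which parameters land in which group, here is a small sketch on a hypothetical `Block` module. The filter matches on substrings of the parameter name, so the normalization layer must be stored under an attribute called `LayerNorm` for the last two patterns to match. Passing the groups to `torch.optim.AdamW` with `lr=2e-5` is an assumption for illustration; the notes do not say which optimizer receives them.

```python
import torch
import torch.nn as nn


class Block(nn.Module):
    """Hypothetical module; the attribute name `LayerNorm` is what the filter matches."""

    def __init__(self):
        super().__init__()
        self.dense = nn.Linear(4, 4)
        self.LayerNorm = nn.LayerNorm(4)

    def forward(self, x):
        return self.LayerNorm(self.dense(x))


model = Block()
param_optimizer = list(model.named_parameters())
no_decay = ['bias', 'LayerNorm.bias', 'LayerNorm.weight']

# Names are collected separately only so the split can be inspected.
decay_names = [n for n, _ in param_optimizer if not any(nd in n for nd in no_decay)]
no_decay_names = [n for n, _ in param_optimizer if any(nd in n for nd in no_decay)]

optimizer_grouped_parameters = [
    {'params': [p for n, p in param_optimizer if n in decay_names], 'weight_decay': 0.01},
    {'params': [p for n, p in param_optimizer if n in no_decay_names], 'weight_decay': 0.0},
]
optimizer = torch.optim.AdamW(optimizer_grouped_parameters, lr=2e-5)
```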
4 predict
2. Build the training framework
a. Data loader: Dataset/TensorDataset -> Sampler -> DataLoader
https://zhuanlan.zhihu.com/p/337850513#
https://blog.csdn.net/ljp1919/article/details/116484330
https://blog.csdn.net/qq_39507748/article/details/105385709
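The Dataset -> Sampler -> DataLoader chain can be sketched as follows. `PairDataset` is a hypothetical custom dataset and the tensors are random placeholders; `TensorDataset` covers the same case when the data is already held in tensors.

```python
import torch
from torch.utils.data import (DataLoader, Dataset, RandomSampler,
                              SequentialSampler, TensorDataset)


class PairDataset(Dataset):
    """Hypothetical map-style dataset returning (features, label) pairs."""

    def __init__(self, features, labels):
        self.features = features
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.features[idx], self.labels[idx]


torch.manual_seed(0)
features = torch.randn(10, 3)
labels = torch.randint(0, 2, (10,))

# Training: shuffled order via RandomSampler.
train_dataset = PairDataset(features, labels)
train_loader = DataLoader(train_dataset, sampler=RandomSampler(train_dataset), batch_size=4)

# Evaluation: fixed order via SequentialSampler.
eval_dataset = TensorDataset(features, labels)
eval_loader = DataLoader(eval_dataset, sampler=SequentialSampler(eval_dataset), batch_size=4)

train_batch_sizes = [xb.shape[0] for xb, yb in train_loader]
eval_batch_sizes = [xb.shape[0] for xb, yb in eval_loader]
```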
b. Optimizer https://pytorch.org/docs/stable/optim.html
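    import torch.optim as optim

    # Option 1: one set of hyperparameters for every parameter.
    optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

    # Option 2: per-group options; groups without their own 'lr' use the default lr=1e-2.
    optimizer = optim.SGD([
        {'params': model.base.parameters()},
        {'params': model.classifier.parameters(), 'lr': 1e-3}
    ], lr=1e-2, momentum=0.9)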
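The per-group form needs a model that exposes `base` and `classifier` submodules. A minimal sketch with a hypothetical `Net` shows how the defaults are filled in for each group:

```python
import torch.nn as nn
import torch.optim as optim


class Net(nn.Module):
    """Hypothetical model exposing `base` and `classifier` submodules."""

    def __init__(self):
        super().__init__()
        self.base = nn.Linear(8, 4)
        self.classifier = nn.Linear(4, 2)

    def forward(self, x):
        return self.classifier(self.base(x))


model = Net()
optimizer = optim.SGD([
    {'params': model.base.parameters()},                     # inherits lr=1e-2
    {'params': model.classifier.parameters(), 'lr': 1e-3},   # overrides lr
], lr=1e-2, momentum=0.9)

group_lrs = [group['lr'] for group in optimizer.param_groups]
group_momenta = [group['momentum'] for group in optimizer.param_groups]
```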
c. Training: optimizer.zero_grad() resets the gradients to zero, loss.backward() runs backpropagation, and optimizer.step() updates the parameters.
https://blog.csdn.net/PanYHHH/article/details/107361827
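The three calls fit together as in the sketch below, which trains a placeholder linear model on four hand-made samples. The gradients have to be cleared on every iteration because `backward()` accumulates into `.grad` rather than overwriting it.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
model = nn.Linear(2, 2)
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
criterion = nn.CrossEntropyLoss()

# Toy data: the label equals the first feature.
x = torch.tensor([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = torch.tensor([0, 0, 1, 1])

losses = []
model.train()
for epoch in range(50):
    optimizer.zero_grad()            # clear gradients from the previous step
    loss = criterion(model(x), y)    # forward pass and loss
    loss.backward()                  # backpropagation fills .grad
    optimizer.step()                 # parameter update
    losses.append(loss.item())
```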
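d. Validation: with torch.no_grad()
Use it during validation and testing. Inside the block autograd records no computation graph, which can significantly reduce GPU memory usage.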
https://wstchhwp.blog.csdn.net/article/details/108405102
https://blog.csdn.net/weixin_44134757/article/details/105775027
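A validation pass under these assumptions might look like the sketch below. The model and data are placeholders, and accuracy stands in for whichever metric is actually used. `model.eval()` is called as well, since `torch.no_grad()` alone does not switch off dropout.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(3, 8), nn.ReLU(), nn.Dropout(0.5), nn.Linear(8, 2))
val_data = TensorDataset(torch.randn(10, 3), torch.randint(0, 2, (10,)))
val_loader = DataLoader(val_data, batch_size=4)

model.eval()                          # disable dropout for evaluation
correct, total = 0, 0
with torch.no_grad():                 # no graph is built, so no gradient memory
    for xb, yb in val_loader:
        logits = model(xb)
        correct += (logits.argmax(dim=1) == yb).sum().item()
        total += yb.size(0)
accuracy = correct / total
```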
e. Evaluation metrics
f. Saving the model https://blog.csdn.net/m0_37605642/article/details/120325062
https://blog.csdn.net/weixin_41278720/article/details/80759933
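One common way to save a model is to store only its `state_dict` and load it into a freshly built model of the same architecture. The sketch below does a round trip through a temporary directory; the file name `model.pt` and the linear model are placeholders.

```python
import os
import tempfile

import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(3, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pt")   # hypothetical file name
    torch.save(model.state_dict(), path)       # save the parameters only

    restored = nn.Linear(3, 2)                 # rebuild the same architecture
    restored.load_state_dict(torch.load(path))

x = torch.randn(4, 3)
same_output = torch.equal(model(x), restored(x))
```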
g. Visualization https://blog.csdn.net/Wenyuanbo/article/details/118937790
3. Prediction: load the model, feed in the data, and call the network.
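Those three steps can be sketched as follows. `build_model` is a hypothetical factory that must match the architecture used at save time, and an in-memory buffer stands in for the checkpoint file on disk.

```python
import io

import torch
import torch.nn as nn


def build_model():
    # Hypothetical architecture; it has to match the one that was saved.
    return nn.Sequential(nn.Linear(3, 8), nn.ReLU(), nn.Linear(8, 2))


torch.manual_seed(0)
buffer = io.BytesIO()                    # stands in for a checkpoint file
torch.save(build_model().state_dict(), buffer)
buffer.seek(0)

# 1. Load the model.
model = build_model()
model.load_state_dict(torch.load(buffer))
model.eval()

# 2. Prepare the input data (random placeholder features).
inputs = torch.randn(5, 3)

# 3. Call the network and turn logits into probabilities and labels.
with torch.no_grad():
    probs = torch.softmax(model(inputs), dim=1)
pos_prob = probs[:, 1].tolist()          # probability of class 1 per sample
pred_label = probs.argmax(dim=1).tolist()
```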
Reference: https://blog.csdn.net/qq_45847624/article/details/114885655