2021-09-23 5a2eab6117a810045f8814ed348d2914 99+ 2 m 0.4 k0 visits

gradient boosting

推导过程

Gradient Boosting为boosting算法的一种，采用和AdaBoost同样的加法模型，在第m次迭代中，前m-1个基学习器都是固定的，即

$f_m(x) = f_{m-1}(x) + \rho_m h_m(x) \tag{1}$

核心思想是得到基学习器$h_m(x)$和权重$p_m$

参数空间的梯度下降很常见，即

$\theta = \theta - \alpha \cdot \frac{\partial}{\partial \theta}L(\theta)$

若将$f(x)$当成参数，则同样可以使用函数空间的梯度下降法：

$f_m(x) = f_{m-1}(x) - \rho_m \cdot \frac{\partial}{\partial f_{m-1}(x)}L(y,f_{m-1}(x)) \tag{2}$

对比（1）（2），我们发现$h_m(x) \approx -\frac{\partial L(y,f_{m-1}(x))}{\partial f_{m-1}(x)}$

算法流程:

1.初始化：$f_0(x) = \mathop{\arg\min}\limits_\gamma \sum\limits_{i=1}^N L(y_i, \gamma)$

2.for m=1 to M:
(a). 计算负梯度： $\tilde{y}_i = -\frac{\partial L(y_i,f_{m-1}(x_i))}{\partial f_{m-1}(x_i)}, \quad i = 1,2 \cdots N$
(b). 通过最小化平方误差，用基学习器$h_m(x)$拟合$\tilde{y_i}$，$w_m = \mathop{\arg\min}\limits_w \sum\limits_{i=1}^{N} \left[\tilde{y}_i - h_m(x_i\,;\,w) \right]^2$
(c). 使用line search确定步长$ρ_m$，使$L$最小，$\rho_m = \mathop{\arg\min}\limits_{\rho} \sum\limits_{i=1}^{N} L(y_i,f_{m-1}(x_i) + \rho h_m(x_i\,;\,w_m))$
(d). $f_m(x) = f_{m-1}(x) + \rho_m h_m(x\,;\,w_m)$

3.输出$f_M(x)$

参考

https://www.cnblogs.com/zhubinwang/p/5170087.html

https://www.cnblogs.com/massquantity/p/9174746.html

gradient boosting

http://example.com/2021/09/23/gradient_boosting/

Author

Lavine Hu

Posted on

2021-09-23

Updated on

2024-04-14

gradient boosting

推导过程

算法流程:

参考

Author

Posted on

Updated on

Licensed under

Like this article? Support the author with

Recents

Categories

Archives

Tags

Subscribe for updates