Optimum Linear Regression Model Selection Algorithm with Lagrange Multipliers
Journal of Geodesy and Geodynamics, 2021, Vol. 41, Issue 11: 1111-1117. DOI: 10.14075/j.jgg.2021.11.003

YAN Guangfeng, CEN Minyi. Optimum Linear Regression Model Selection Algorithm with Lagrange Multipliers[J]. Journal of Geodesy and Geodynamics, 2021, 41(11): 1111-1117.

YAN Guangfeng, PhD, lecturer, specializes in measurement adjustment and data processing. E-mail: gf_y1989@163.com.

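1. School of Geography and Resource Science, Neijiang Normal University, 705 Dongtong Road, Neijiang 641100, China;
2. Faculty of Geosciences and Environmental Engineering, Southwest Jiaotong University, 999 Xi'an Road, Chengdu 611756, China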

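1 Linear regression model with constraints

1.1 Linear regression model considering measurement errors of independent and dependent variables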

 ${\boldsymbol{y}} = {\boldsymbol{\bar A\xi }} + {{\boldsymbol{e}}_y} = {\boldsymbol{A\xi }} + {{\boldsymbol{E}}_A}{\boldsymbol{\xi }} + {{\boldsymbol{e}}_y}$ (1)

 $\left[ \begin{array}{c} {{\boldsymbol{e}}_{As}}\\ {{\boldsymbol{e}}_y} \end{array} \right]\sim \left( {\left[ \begin{array}{c} {\boldsymbol{0}}\\ {\boldsymbol{0}} \end{array} \right], \;\sigma _0^2\left[ \begin{array}{cc} {{\boldsymbol{Q}}_{As}} & {\boldsymbol{0}}\\ {\boldsymbol{0}} & {{\boldsymbol{Q}}_y} \end{array} \right]} \right)$ (2)

 $\left\{ \begin{array}{l} {\boldsymbol{a}} = {\boldsymbol{\bar a}} + {{\boldsymbol{e}}_{As}}\\ {\boldsymbol{y}} = {\boldsymbol{A\xi }} + {{\boldsymbol{E}}_A}{\boldsymbol{\xi }} + {{\boldsymbol{e}}_y} \end{array} \right.$ (3)

 $\left\{ \begin{array}{l} {\boldsymbol{a}} = {\boldsymbol{\bar a}} + {{\boldsymbol{e}}_{As}}\\ {\boldsymbol{y}} = {\boldsymbol{A}}{{\boldsymbol{\xi }}_{(i)}} + {{\boldsymbol{E}}_{A(i)}}{{\boldsymbol{\xi }}_{(i)}} + {\boldsymbol{A}}\delta {\boldsymbol{\xi }} + {{\boldsymbol{E}}_{A(i)}}\delta {\boldsymbol{\xi }} + {\boldsymbol{R}}{{\boldsymbol{e}}_{As}} + {{\boldsymbol{e}}_y} \end{array} \right.$ (4)

 $\varphi \left( {{{\boldsymbol{e}}_y}, \;{{\boldsymbol{e}}_{As}}} \right) = {\boldsymbol{e}}_y^{\mathop{\rm T}\nolimits} {\boldsymbol{Q}}_y^{ - 1}{{\boldsymbol{e}}_y} + {\boldsymbol{e}}_{As}^{\mathop{\rm T}\nolimits} {\boldsymbol{Q}}_{As}^{ - 1}{{\boldsymbol{e}}_{As}}$ (5)

 ${{\boldsymbol{\hat \beta }}_{(i + 1)}} = {\left( {{\boldsymbol{B}}_{(i)}^{\mathop{\rm T}\nolimits} {{\boldsymbol{Q}}^{ - 1}}{{\boldsymbol{B}}_{(i)}}} \right)^{ - 1}}{\boldsymbol{B}}_{(i)}^{\mathop{\rm T}\nolimits} {{\boldsymbol{Q}}^{ - 1}}{{\boldsymbol{l}}_{(i)}}$ (6)
 ${{\boldsymbol{v}}_{(i + 1)}} = {{\boldsymbol{B}}_{(i)}}{\left( {{\boldsymbol{B}}_{(i)}^{\mathop{\rm T}\nolimits} {{\boldsymbol{Q}}^{ - 1}}{{\boldsymbol{B}}_{(i)}}} \right)^{ - 1}}{\boldsymbol{B}}_{(i)}^{\mathop{\rm T}\nolimits} {{\boldsymbol{Q}}^{ - 1}}{{\boldsymbol{l}}_{(i)}} - {{\boldsymbol{l}}_{(i)}}$ (7)
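
A single iteration of Eqs. (6) and (7) can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the function name `ls_step` and the small data set are hypothetical, and the construction of $B_{(i)}$, $Q$ and $l_{(i)}$ from the current iterate is not shown because it is not spelled out here.

```python
import numpy as np

def ls_step(B, Q, l):
    """One weighted least-squares step: Eq. (6) for beta, Eq. (7) for v."""
    P = np.linalg.inv(Q)                      # weight matrix Q^{-1}
    N = B.T @ P @ B                           # normal matrix B^T Q^{-1} B
    beta = np.linalg.solve(N, B.T @ P @ l)    # Eq. (6)
    v = B @ beta - l                          # Eq. (7), residual vector
    return beta, v

# Hypothetical data: a straight line y = c0 + c1 * x through three points,
# equal weights (Q is the identity).
B = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
Q = np.eye(3)
l = np.array([1.0, 3.0, 5.0])
beta, v = ls_step(B, Q, l)
```

Because the three hypothetical points lie exactly on a line, the residuals vanish; with real data the step would be repeated until the estimates stop changing.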

1.2 Linear regression model with parameter constraints

 $\left\{ \begin{array}{l} {\boldsymbol{a}} = {\boldsymbol{\bar a}} + {{\boldsymbol{e}}_{As}}\\ {\boldsymbol{y}} = {\boldsymbol{A\xi }} + {{\boldsymbol{E}}_A}{\boldsymbol{\xi }} + {{\boldsymbol{e}}_y}\\ {{\boldsymbol{G}}_j}\left( {\boldsymbol{\xi }} \right) = {{\boldsymbol{b}}_j} \end{array} \right.$ (8)

 $\left\{ \begin{array}{l} {\boldsymbol{a}} = {\boldsymbol{\bar a}} + {{\boldsymbol{e}}_{As}}\\ {\boldsymbol{y}} = {\boldsymbol{A}}{{\boldsymbol{\xi }}_{(m)}} + {{\boldsymbol{E}}_{A(m)}}{{\boldsymbol{\xi }}_{(m)}} + {\boldsymbol{A}}\delta {\boldsymbol{\xi }} + \\ \;\;\;\;\;\;\;\;{{\boldsymbol{E}}_{A(m)}}\delta {\boldsymbol{\xi }} + {\boldsymbol{R}}{{\boldsymbol{e}}_{As}} + {{\boldsymbol{e}}_y}\\ {{\boldsymbol{G}}_j}\left( {{{\boldsymbol{\xi }}_{(m)}}} \right) + {{{\boldsymbol{G'}}}_j}\left( {{{\boldsymbol{\xi }}_{(m)}}} \right)\delta {\boldsymbol{\xi }} = {{\boldsymbol{b}}_j} \end{array} \right.$ (9)

 $\varphi \left( {{{\boldsymbol{e}}_y}, \;{{\boldsymbol{e}}_{As}}} \right) = {\boldsymbol{e}}_y^{\mathop{\rm T}\nolimits} {\boldsymbol{Q}}_y^{ - 1}{{\boldsymbol{e}}_y} + {\boldsymbol{e}}_{As}^{\mathop{\rm T}\nolimits} {\boldsymbol{Q}}_{As}^{ - 1}{{\boldsymbol{e}}_{As}} + 2{{\boldsymbol{K}}^{\rm{T}}}\left( {{{{\boldsymbol{G'}}}_j}\left( {{{\boldsymbol{\xi }}_{(m)}}} \right)\delta {\boldsymbol{\xi }} + {{\boldsymbol{G}}_j}\left( {{{\boldsymbol{\xi }}_{(m)}}} \right) - {{\boldsymbol{b}}_j}} \right)$ (10)

 $\begin{array}{l} \;\;{{{\boldsymbol{\hat \beta }}}_{(m + 1)}} = \left( {{\boldsymbol{N}}_{BB(m)}^{ - 1} - {\boldsymbol{N}}_{BB(m)}^{ - 1}{{{\boldsymbol{G'}}}_j}{{\left( {{{\boldsymbol{\xi }}_{(m)}}} \right)}^{\rm{T}}}} \right.\\ \;\;\;\;\left. {{\boldsymbol{N}}_{GG(m)}^{ - 1}{{{\boldsymbol{G'}}}_j}\left( {{{\boldsymbol{\xi }}_{(m)}}} \right){\boldsymbol{N}}_{BB(m)}^{ - 1}} \right){{\boldsymbol{W}}_{(m)}} - \\ {\boldsymbol{N}}_{BB(m)}^{ - 1}{{{\boldsymbol{G'}}}_j}{\left( {{{\boldsymbol{\xi }}_{(m)}}} \right)^{\rm{T}}}{\boldsymbol{N}}_{GG(m)}^{ - 1}\left( {{{\boldsymbol{G}}_j}\left( {{{\boldsymbol{\xi }}_{(m)}}} \right) - {{\boldsymbol{b}}_j}} \right) \end{array}$ (11)
 ${{\boldsymbol{v}}_{(m + 1)}} = {{\boldsymbol{B}}_{(m)}}{{\boldsymbol{\hat \beta }}_{(m + 1)}} - {{\boldsymbol{l}}_{(m)}}$ (12)

 ${\boldsymbol{K}} = {\boldsymbol{N}}_{GG(m)}^{ - 1}\left( {{{{\boldsymbol{G'}}}_j}\left( {{{\boldsymbol{\xi }}_{(m)}}} \right){\boldsymbol{N}}_{BB(m)}^{ - 1}{{\boldsymbol{W}}_{(m)}} + {{\boldsymbol{G}}_j}\left( {{{\boldsymbol{\xi }}_{(m)}}} \right) - {{\boldsymbol{b}}_j}} \right)$ (13)
 ${{\boldsymbol{D}}_{KK}} = \hat \sigma _0^2{{\boldsymbol{Q}}_{KK}} = \hat \sigma _0^2{\boldsymbol{N}}_{GG(m)}^{ - 1}$ (14)
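
The constrained update of Eqs. (11), (13) and (14) can be sketched as below. The sketch assumes that $N_{GG(m)} = G'_j N_{BB(m)}^{-1} G'^{\rm T}_j$, which is the usual definition but is not stated explicitly in this text; the function name `constrained_step` and the numbers in the example are hypothetical. The estimate is computed as $N_{BB}^{-1}(W - G'^{\rm T}_j K)$, which expands algebraically to Eq. (11).

```python
import numpy as np

def constrained_step(N_BB, W, Gp, g_minus_b):
    """Constrained estimate with Lagrange multipliers.

    N_BB      : normal matrix of the unconstrained problem
    W         : right-hand side of the normal equations
    Gp        : Jacobian G'_j evaluated at the current iterate
    g_minus_b : constraint misclosure G_j(xi_(m)) - b_j
    """
    N_inv = np.linalg.inv(N_BB)
    N_GG = Gp @ N_inv @ Gp.T                      # assumed definition of N_GG
    N_GG_inv = np.linalg.inv(N_GG)
    K = N_GG_inv @ (Gp @ N_inv @ W + g_minus_b)   # Eq. (13)
    beta = N_inv @ (W - Gp.T @ K)                 # algebraically equal to Eq. (11)
    Q_KK = N_GG_inv                               # cofactor matrix of K, Eq. (14)
    return beta, K, Q_KK

# Hypothetical example: two parameters forced to be equal (beta_1 - beta_2 = 0).
N_BB = np.eye(2)
W = np.array([1.0, 2.0])
Gp = np.array([[1.0, -1.0]])
g_minus_b = np.array([0.0])
beta, K, Q_KK = constrained_step(N_BB, W, Gp, g_minus_b)
```

In this example the unconstrained solution would be `W` itself; the constraint pulls both parameters to their mean, and the size of `K` reflects how strongly the data resist the constraint.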

2 Optimum linear regression model selection

2.1 Preliminary model selection

 $\left\{ \begin{array}{l} {H_0}:{F_{(f + 1)}} = 0\;{\rm{is\;the\;primary\;regression\;model}}\\ {H_1}:{F_1} = 0\;{\rm{is\;the\;primary\;regression\;model}}\\ {H_2}:{F_2} = 0\;{\rm{is\;the\;primary\;regression\;model}}\\ \qquad \qquad \vdots \\ {H_f}:{F_f} = 0\;{\rm{is\;the\;primary\;regression\;model}} \end{array} \right.$ (15)

 $T = \mathop {\max }\limits_{s = 1, 2, \ldots, c'} \left| {\frac{{{k_s}}}{{{{\hat \sigma }_{{k_s}{k_s}}}}}} \right|$ (16)

 ${\hat \sigma _0} = \sqrt {{\boldsymbol{v}}_{(m + 1)}^{\rm{T}}{{\boldsymbol{Q}}^{ - 1}}{{\boldsymbol{v}}_{(m + 1)}}/\left( {n - t} \right)}$ (17)

 ${P_1} \approx \prod\limits_{i = 1}^{c'} {\left( {1 - {\alpha _1}} \right)} = {\left( {1 - {\alpha _1}} \right)^{c'}}$ (18)

 $1 - \alpha = {\left( {1 - {\alpha _1}} \right)^{c'}} = 1 + c'\left( { - {\alpha _1}} \right) + \frac{{c'\left( {c' - 1} \right)}}{2}{\left( { - {\alpha _1}} \right)^2} + \frac{{c'\left( {c' - 1} \right)\left( {c' - 2} \right)}}{{3 \times 2}}{\left( { - {\alpha _1}} \right)^3} + \cdots$ (19)

 $1 - \alpha \approx 1 - c'{\alpha _1}$ (20)
 ${\alpha _1} \approx \alpha /c'$ (21)
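
To see how close the first-order approximation of Eq. (21) is to the exact root of Eq. (19), the two values can be compared numerically. The helper name `per_test_alpha` and the values $\alpha = 0.05$, $c' = 4$ are illustrative assumptions only.

```python
def per_test_alpha(alpha, c):
    """Return (exact, approximate) significance level of a single test.

    exact  : solves 1 - alpha = (1 - alpha_1)^c for alpha_1, cf. Eq. (19)
    approx : first-order approximation alpha / c, Eq. (21)
    """
    exact = 1.0 - (1.0 - alpha) ** (1.0 / c)
    approx = alpha / c
    return exact, approx

# Illustrative values: overall level 0.05 shared by four individual tests.
exact, approx = per_test_alpha(0.05, 4)
```

The approximate level is always slightly smaller than the exact one, so using Eq. (21) makes each individual test marginally more conservative.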

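1) Express the candidate linear regression model with the most parameters in the form of Eq. (4). Taking the least-squares solution of the parameters as the initial value, iterate Eqs. (6) and (7) to obtain the coefficient-matrix error ${{\boldsymbol{E}}_{A(m)}}$, the estimated parameters ${{\boldsymbol{\xi }}_{(m)}}$ and the a posteriori standard deviation of unit weight ${\hat \sigma _0}$ of the unconstrained linear regression model.

2) Write the other candidate models in the unified form of Eq. (8) and expand them in a Taylor series at $\left( {{{\boldsymbol{\xi }}_{(m)}}, \;{{\boldsymbol{e}}_{As(m)}}} \right)$ to obtain the form of Eq. (9).

3) Set up the null hypothesis ${H_0}$ and the alternative hypothesis ${H_A}$. ${H_0}$: the candidate model with the fewest parameters is the primary regression model (PRM); ${H_A}$: one of the other models is the PRM.

4) For the model with the most parameter constraints, obtain the Lagrange multiplier vector ${\boldsymbol{K}}$ and its variance-covariance matrix ${{\boldsymbol{D}}_{KK}}$ from Eqs. (13) and (14).

5) Construct the test statistic $T$ from Eqs. (16) and (21) and carry out a t-test. If $T < {t_{{\alpha _1}/2}}$, ${H_0}$ holds, the PRM is obtained and the algorithm ends; otherwise go to step 6).

6) Remove the linear regression model corresponding to the null hypothesis from the set of candidate models to obtain a new candidate set, and repeat steps 3) to 5) until ${H_0}$ holds, at which point the algorithm ends.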
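
The decision in step 5) can be sketched as follows. Two points are assumptions rather than statements of the source: ${\hat \sigma _{{k_s}{k_s}}}$ is taken to be the square root of the $s$-th diagonal element of ${{\boldsymbol{D}}_{KK}}$, and the critical value ${t_{{\alpha _1}/2}}$ is passed in by the caller instead of being computed. The function names and the numbers are hypothetical.

```python
import math

def max_ratio_statistic(K, D_KK_diag):
    """Eq. (16): largest |k_s / sigma_hat_{k_s k_s}| over all multipliers.

    D_KK_diag holds the diagonal of D_KK; its square roots are used as the
    standard deviations of the multipliers (an assumption of this sketch).
    """
    return max(abs(k) / math.sqrt(d) for k, d in zip(K, D_KK_diag))

def accept_h0(K, D_KK_diag, t_crit):
    """Step 5): H0 holds when T is below the critical value t_{alpha_1/2}."""
    return max_ratio_statistic(K, D_KK_diag) < t_crit

# Hypothetical multipliers and variances: the ratios are 1.5 and 2.0.
K = [0.3, -1.2]
D_KK_diag = [0.04, 0.36]
T = max_ratio_statistic(K, D_KK_diag)
keep_model = accept_h0(K, D_KK_diag, t_crit=2.5)    # hypothetical critical value
reject_model = accept_h0(K, D_KK_diag, t_crit=1.9)  # a stricter hypothetical value
```

When `accept_h0` returns `False`, step 6) applies: the model tied to the null hypothesis is dropped and the test is repeated on the reduced candidate set.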

2.2 Optimum model selection

 ${{\boldsymbol{D}}_{{{\hat \beta }_{{\rm{(}}m{\rm{)}}}}{{\hat \beta }_{{\rm{(}}m{\rm{)}}}}}} = \hat \sigma _0^2{{\boldsymbol{Q}}_{{{\hat \beta }_{{\rm{(}}m{\rm{)}}}}{{\hat \beta }_{{\rm{(}}m{\rm{)}}}}}} = \hat \sigma _0^2{\boldsymbol{N}}_{BB{\rm{(}}m{\rm{)}}}^{ - 1}$ (22)

 $\begin{array}{l} {{\boldsymbol{D}}_{{{\hat \beta }_j}{{\hat \beta }_j}}} = \hat \sigma _j^2{{\boldsymbol{Q}}_{{{\hat \beta }_j}{{\hat \beta }_j}}} = \hat \sigma _j^2\left( {{\boldsymbol{N}}_{BB(m)}^{ - 1} - {\boldsymbol{N}}_{BB(m)}^{ - 1}{{{\boldsymbol{G'}}}_j}{{\left( {{{\boldsymbol{\xi }}_{(m)}}} \right)}^{\rm{T}}}} \right.\\ \left. {\;\;\;\;\;\;\;\;\;\;\;\;\;{\boldsymbol{N}}_{GG(m)}^{ - 1}{{{\boldsymbol{G'}}}_j}\left( {{{\boldsymbol{\xi }}_{(m)}}} \right){\boldsymbol{N}}_{BB(m)}^{ - 1}} \right) \end{array}$ (23)

 ${\hat \sigma _j} = \sqrt {{\boldsymbol{v}}_{j(m + 1)}^{\rm{T}}{{\boldsymbol{Q}}^{ - 1}}{{\boldsymbol{v}}_{j(m + 1)}}/\left( {n - t + {{c'}_j}} \right)}$ (24)
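
The effect of the constraints on the degrees of freedom in Eqs. (17) and (24) can be illustrated with a short sketch. It assumes that the quadratic form is weighted by ${{\boldsymbol{Q}}^{ - 1}}$, consistent with the objective function in Eq. (5); the function name `sigma_hat` and the residuals are hypothetical, and the same residual vector is reused for both calls purely to isolate the change in the denominator.

```python
import numpy as np

def sigma_hat(v, Q, n, t, c=0):
    """A posteriori standard deviation of unit weight.

    n : number of observations, t : number of parameters,
    c : number of constraints (c = 0 gives the unconstrained case).
    """
    quad = v @ np.linalg.solve(Q, v)     # v^T Q^{-1} v (assumed weighting)
    return np.sqrt(quad / (n - t + c))   # degrees of freedom n - t + c

# Hypothetical residuals with an identity cofactor matrix.
v = np.array([0.1, -0.2, 0.2])
Q = np.eye(3)
s0 = sigma_hat(v, Q, n=3, t=2)        # unconstrained form, cf. Eq. (17)
sj = sigma_hat(v, Q, n=3, t=2, c=1)   # one constraint, cf. Eq. (24)
```

In practice the constrained residuals differ from the unconstrained ones, so the two estimates have to be computed from their own residual vectors.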

3 Numerical example

1) Affine transformation model:

 $\left\{ \begin{array}{l} {x_{{\rm{I}}{{\rm{I}}_i}}} = {X_0} + {a_1}{x_{{{\rm{I}}_i}}} + {a_2}{y_{{{\rm{I}}_i}}}\\ {y_{{\rm{I}}{{\rm{I}}_i}}} = {Y_0} - {b_1}{x_{{{\rm{I}}_i}}} + {b_2}{y_{{{\rm{I}}_i}}} \end{array} \right.$ (25)

 $\left\{ \begin{array}{l} \frac{{\partial \Delta {x_i}}}{{\partial {x_{{{\rm{I}}_i}}}}} = \frac{{\partial \Delta {y_i}}}{{\partial {y_{{{\rm{I}}_i}}}}}\\ \frac{{\partial \Delta {x_i}}}{{\partial {y_{{{\rm{I}}_i}}}}} = - \frac{{\partial \Delta {y_i}}}{{\partial {x_{{{\rm{I}}_i}}}}} \end{array} \right.$ (26)

 $\left\{ \begin{array}{l} {a_1} - {b_2} = 0\\ {a_2} - {b_1} = 0 \end{array} \right.$ (27)

 $a_1^2 + a_2^2 = 1$ (28)

 ${a_2} = 0$ (29)
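
The constraints of Eqs. (27) to (29) can be cast in the form ${{\boldsymbol{G}}_j}\left( {\boldsymbol{\xi }} \right) = {{\boldsymbol{b}}_j}$ of Eq. (8) as sketched below. The parameter ordering `[X0, Y0, a1, a2, b1, b2]` and the function names are assumptions made for illustration, and each constraint is shown on its own; how the constraints are combined into candidate models is not reproduced here.

```python
import numpy as np

# Assumed parameter ordering (hypothetical): xi = [X0, Y0, a1, a2, b1, b2]

def constraint_27(xi):
    """Eq. (27): a1 - b2 = 0 and a2 - b1 = 0. Returns (G(xi) - b, G'(xi))."""
    _, _, a1, a2, b1, b2 = xi
    misclosure = np.array([a1 - b2, a2 - b1])
    Gp = np.array([[0, 0, 1, 0, 0, -1],
                   [0, 0, 0, 1, -1, 0]], dtype=float)
    return misclosure, Gp

def constraint_28(xi):
    """Eq. (28): a1^2 + a2^2 = 1 (nonlinear, so G' depends on xi)."""
    a1, a2 = xi[2], xi[3]
    misclosure = np.array([a1 ** 2 + a2 ** 2 - 1.0])
    Gp = np.array([[0, 0, 2 * a1, 2 * a2, 0, 0]], dtype=float)
    return misclosure, Gp

def constraint_29(xi):
    """Eq. (29): a2 = 0."""
    misclosure = np.array([xi[3]])
    Gp = np.array([[0, 0, 0, 1, 0, 0]], dtype=float)
    return misclosure, Gp

# Hypothetical parameter values: a rotation by 0.3 rad with a shift.
th = 0.3
xi_demo = np.array([10.0, 20.0, np.cos(th), np.sin(th), np.sin(th), np.cos(th)])
m27, G27 = constraint_27(xi_demo)
m28, G28 = constraint_28(xi_demo)
m29, G29 = constraint_29(xi_demo)
```

For these hypothetical values Eqs. (27) and (28) are satisfied while Eq. (29) is not, so only the last misclosure is non-zero.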

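4 Conclusions

1) The form of the optimum model is determined entirely by the actual observation data. Unifying the many candidate models as linear regression models with parameter constraints provides the basis for obtaining the optimum model in linear regression analysis, coordinate transformation, autoregressive analysis and similar modelling problems. The model selection problem can thereby be converted into a hypothesis testing problem with multiple alternative hypotheses.

2) For reasonable parameter constraints, the linear regression model that takes the constraints into account yields parameter estimates of somewhat higher precision than the unconstrained linear regression model. Optimum model selection is therefore of real importance in practical modelling.

3) The OLRS-LM algorithm can accurately identify the optimum adjustment model, that is, one that agrees with the observation data and also gives high parameter estimation precision. It constructs the test statistic from the Lagrange multipliers, which allows an objective and quantitative diagnosis of the compatibility between the parameter constraints and the observation data. Compared with the linear hypothesis method, whose test statistic is built from the residual sum of squares, it has higher test power and gives more reliable results.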

Optimum Linear Regression Model Selection Algorithm with Lagrange Multipliers
YAN Guangfeng (1), CEN Minyi (2)
1. School of Geography and Resource Science, Neijiang Normal University, 705 Dongtong Road, Neijiang 641100, China;
2. Faculty of Geosciences and Environmental Engineering, Southwest Jiaotong University, 999 Xi'an Road, Chengdu 611756, China
Abstract: On the basis of linear regression model, considering the measurement errors of independent variables and dependent variables at the same time, this paper first unifies many models to be selected into the linear regression model with constraints, adopts the hypothesis testing theory with multiple alternative hypotheses, and then constructs the hypothesis testing statistics with Lagrange multipliers. We propose the optimum selection algorithm of the linear regression model. The experimental results show that the proposed algorithm can obtain the optimum linear regression model, which is in accordance with actual observations and simpler than the improved linear hypothesis method.
Key words: linear regression model; optimum adjustment model; Lagrange multiplier; hypothesis testing