Article information
- CUI Ye-min, SUN Hui-hui (崔冶敏, 孙慧慧)
- Negative Binomial Semiparametric Models of Longitudinal Data
- Journal of Guangxi University for Nationalities (Natural Science Edition), 2017, 23(2): 58-60
Article history
- Received: 2017-03-13
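Longitudinal data are obtained by repeatedly observing the same group of individuals at different time points, and they are widely used in economics, biology, medicine, and other fields. A semiparametric model contains both a parametric component and a nonparametric component, and it can describe the correlation within longitudinal data. Zeger and Diggle [1] discussed estimation in semiparametric models with an iterative algorithm; He et al. [2] estimated the nonparametric part with regression splines; Chai et al. [3] discussed estimation of the model with a two-stage method; Zhao [4] studied the convergence of estimators of the semiparametric regression function; and Tian et al. [5-6] discussed the consistency and asymptotic properties of semiparametric models for longitudinal data. This paper builds a semiparametric negative binomial model for longitudinal data, estimates its parameters, derives properties of the estimators, and finally carries out a random simulation in R.
1 Model introduction
Consider the following semiparametric model for longitudinal data: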
| $ {y_{ij}} = x_{ij}^T{\beta _p} + V\left( {{t_{ij}}} \right) + {\varepsilon _{ij}},i = 1,2, \cdots ,m,j = 1,2, \cdots ,{n_i} $ | (1) |
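where $\beta_p$ is a $p$-dimensional vector of unknown parameters, $x_{ij}$ is the covariate vector, $x_{ij}^T\beta_p$ is the parametric fixed-effect part, the $t_{ij}$ are independent and identically distributed random variables, $\varepsilon_{ij}$ is the model error with $E(\varepsilon_{ij}) = 0$ and $\mathrm{Var}(\varepsilon_{ij}) = \sigma^2 < \infty$, $y_{ij}$ is the response of the $i$-th individual at time $t_{ij}$, and $N = m n_i$.
2 Parameter estimation
Given $m \times n_i$ observations of the variable $x_{ij} = (x_{ij1}, x_{ij2}, \cdots, x_{ijp})$, we use the observed data, as in the ordinary linear model, to estimate the parameter $\beta_p$ and the function $V(t_{ij})$ in the model. Let $\pi(x_{ij}, t_{ij})$ denote the probability that the event occurs when $x = x_{ij}$ and $t = t_{ij}$; then $Y_{ij}$ follows a negative binomial distribution with parameters $n$ and $\pi(x_{ij}, t_{ij})$. $V(t_{ij})$ is a function of the time $t_{ij}$.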
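As a quick illustration of the distribution named above, the sketch below evaluates the negative binomial probability mass function in the form used by the likelihood later in this section, $P(Y = y) = C_{n+y-1}^{n-1}\pi^n(1-\pi)^y$, and checks numerically that it sums to one with mean $n(1-\pi)/\pi$. The function name `nb_pmf` and the values of `n` and `pi` are illustrative assumptions, not part of the paper.

```python
from math import comb

def nb_pmf(y, n, pi):
    """P(Y = y): y failures before the n-th success, success probability pi."""
    return comb(n + y - 1, n - 1) * pi**n * (1.0 - pi)**y

# Illustrative values (assumed, not taken from the paper).
n, pi = 3, 0.4

# Truncate the infinite support; the tail beyond y = 400 is negligible here.
probs = [nb_pmf(y, n, pi) for y in range(400)]
total = sum(probs)
mean = sum(y * p for y, p in enumerate(probs))

print("sum of probabilities:", total)
print("numerical mean:", mean, "closed form n(1-pi)/pi:", n * (1 - pi) / pi)
```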
Let
| $ V\left( {{t_{ij}}} \right) = {\beta _{p + 1}}{T_1}\left( {{t_{ij}}} \right) + {\beta _{p + 2}}{T_2}\left( {{t_{ij}}} \right) + \cdots + {\beta _{p + q}}{T_q}\left( {{t_{ij}}} \right) = T_{ij}^T{\beta _q} $ | (2) |
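where $T_{ij} = (T_1(t_{ij}), T_2(t_{ij}), \cdots, T_q(t_{ij}))^T$ is the vector of basis functions. Write $W_{ij} = (x_{ij}^T, T_{ij}^T)^T$ and $\beta = (\beta_p^T, \beta_q^T)^T$;
then the corresponding semiparametric negative binomial model is:
| $ {y_{ij}} = x_{ij}^T{\beta _p} + T_{ij}^T{\beta _q} + {\varepsilon _{ij}} = W_{ij}^T\beta + {\varepsilon _{ij}},i = 1,2, \cdots ,m,j = 1,2, \cdots ,{n_i} $ |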
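The stacking of covariates and basis functions into $W_{ij}$ can be sketched as follows. The paper does not say which basis functions $T_k$ it uses, so the polynomial basis $T_k(t) = t^k$ below is purely an assumption for illustration, and the helper names `basis`, `design_vector`, and `linear_predictor` are hypothetical.

```python
def basis(t, q):
    """Assumed polynomial basis T_k(t) = t**k for k = 1..q (illustrative choice)."""
    return [t**k for k in range(1, q + 1)]

def design_vector(x, t, q):
    """W_ij = (x_ij^T, T_ij^T)^T: covariates followed by basis values."""
    return list(x) + basis(t, q)

def linear_predictor(w, beta):
    """W_ij^T beta with beta = (beta_p^T, beta_q^T)^T."""
    return sum(wk * bk for wk, bk in zip(w, beta))

# One observation with p = 2 covariates and q = 3 basis functions.
w = design_vector([1.0, 2.0], 0.5, 3)
beta = [1.0, 1.0, 2.0, 4.0, 8.0]   # first p entries: beta_p, last q entries: beta_q
print(w)
print(linear_predictor(w, beta))
```

Once $V(t_{ij})$ is expanded this way, the nonparametric part is handled by the same machinery as the parametric part, which is what allows a single coefficient vector $\beta$ of length $p + q$ to be estimated.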
Then the likelihood function of $(Y_{11} = y_{11}, Y_{12} = y_{12}, \cdots)$ and its logarithm are
| $ \begin{array}{l} L\left( \beta ; y_{11}, y_{12}, \cdots , y_{m n_i} \right) = \prod\limits_{i = 1}^m \prod\limits_{j = 1}^{n_i} C_{n + y_{ij} - 1}^{n - 1} \pi \left( x_{ij}, t_{ij} \right)^n \left( 1 - \pi \left( x_{ij}, t_{ij} \right) \right)^{y_{ij}} \\ \ln L = \ln \left( \prod\limits_{i = 1}^m \prod\limits_{j = 1}^{n_i} C_{n + y_{ij} - 1}^{n - 1} \right) + n\sum\limits_{i = 1}^m \sum\limits_{j = 1}^{n_i} \ln \pi \left( x_{ij}, t_{ij} \right) + \sum\limits_{i = 1}^m \sum\limits_{j = 1}^{n_i} y_{ij} \ln \left( 1 - \pi \left( x_{ij}, t_{ij} \right) \right) \\ = \ln \left( \prod\limits_{i = 1}^m \prod\limits_{j = 1}^{n_i} C_{n + y_{ij} - 1}^{n - 1} \right) + n\sum\limits_{i = 1}^m \sum\limits_{j = 1}^{n_i} \ln \left( \frac{\exp \left( W_{ij}^T\beta + \varepsilon_{ij} \right)}{1 + \exp \left( W_{ij}^T\beta + \varepsilon_{ij} \right)} \right) + \sum\limits_{i = 1}^m \sum\limits_{j = 1}^{n_i} y_{ij} \ln \left( \frac{1}{1 + \exp \left( W_{ij}^T\beta + \varepsilon_{ij} \right)} \right) \\ = \ln \left( \prod\limits_{i = 1}^m \prod\limits_{j = 1}^{n_i} C_{n + y_{ij} - 1}^{n - 1} \right) + n\sum\limits_{i = 1}^m \sum\limits_{j = 1}^{n_i} \left( W_{ij}^T\beta + \varepsilon_{ij} \right) - \sum\limits_{i = 1}^m \sum\limits_{j = 1}^{n_i} \left( n + y_{ij} \right) \ln \left( 1 + \exp \left( W_{ij}^T\beta + \varepsilon_{ij} \right) \right) \end{array} $ | (3) |
Here the third line substitutes $\pi(x_{ij}, t_{ij}) = \exp(W_{ij}^T\beta + \varepsilon_{ij}) / (1 + \exp(W_{ij}^T\beta + \varepsilon_{ij}))$, so that $\ln \pi = (W_{ij}^T\beta + \varepsilon_{ij}) - \ln(1 + \exp(W_{ij}^T\beta + \varepsilon_{ij}))$ and $\ln(1 - \pi) = -\ln(1 + \exp(W_{ij}^T\beta + \varepsilon_{ij}))$.
Differentiating the log-likelihood once with respect to $\beta$ gives:
| $ \begin{array}{l} \frac{\partial \ln L\left( \beta ; y_{11}, y_{12}, \cdots , y_{m n_i} \right)}{\partial \beta }\\ = n\sum\limits_{i = 1}^m \sum\limits_{j = 1}^{n_i} W_{ij} - \sum\limits_{i = 1}^m \sum\limits_{j = 1}^{n_i} \left( n + y_{ij} \right) \frac{\exp \left( W_{ij}^T\beta + \varepsilon_{ij} \right) W_{ij}}{1 + \exp \left( W_{ij}^T\beta + \varepsilon_{ij} \right)} \\ = n\sum\limits_{i = 1}^m \sum\limits_{j = 1}^{n_i} W_{ij} - \sum\limits_{i = 1}^m \sum\limits_{j = 1}^{n_i} \left( n + y_{ij} \right) W_{ij} \left( 1 - \frac{1}{1 + \exp \left( W_{ij}^T\beta + \varepsilon_{ij} \right)} \right) \\ = \sum\limits_{i = 1}^m \sum\limits_{j = 1}^{n_i} \frac{\left( n + y_{ij} \right) W_{ij}}{1 + \exp \left( W_{ij}^T\beta + \varepsilon_{ij} \right)} - \sum\limits_{i = 1}^m \sum\limits_{j = 1}^{n_i} y_{ij} W_{ij} \end{array} $ | (4) |
Setting (4) equal to zero and solving for $\beta$ gives the maximum likelihood estimate $\hat \beta$ of $\beta$.
Next, starting from (3), we take the second partial derivatives of the log-likelihood with respect to the parameters and give the Fisher information matrix.
| $ \begin{array}{l} \frac{\partial ^2 \ln L\left( \beta ; y_{11}, y_{12}, \cdots , y_{m n_i} \right)}{\partial \beta \, \partial \beta ^T}\\ = - \sum\limits_{i = 1}^m \sum\limits_{j = 1}^{n_i} \left( n + y_{ij} \right) \left[ 1 + \exp \left( W_{ij}^T\beta + \varepsilon_{ij} \right) \right]^{ - 2} \exp \left( W_{ij}^T\beta + \varepsilon_{ij} \right) W_{ij} W_{ij}^T \\ = - \sum\limits_{i = 1}^m \sum\limits_{j = 1}^{n_i} \left( n + y_{ij} \right) W_{ij} W_{ij}^T \pi \left( x_{ij}, t_{ij} \right) \left( 1 - \pi \left( x_{ij}, t_{ij} \right) \right) \end{array} $ | (5) |
By the properties of the Fisher information matrix, whose entries are the expectations of the negative second partial derivatives of the log-likelihood, and since $E(Y_{ij}) = n(1 - \pi(x_{ij}, t_{ij})) / \pi(x_{ij}, t_{ij})$, we have
| $ \begin{array}{l} I\left( \beta \right) = E\left( - \frac{\partial ^2 \ln L\left( \beta ; y_{11}, y_{12}, \cdots , y_{m n_i} \right)}{\partial \beta \, \partial \beta ^T} \right)_{\left( p + q \right) \times \left( p + q \right)}\\ = \left[ \sum\limits_{i = 1}^m \sum\limits_{j = 1}^{n_i} \left( n + E\left( Y_{ij} \right) \right) W_{ij} W_{ij}^T \pi \left( x_{ij}, t_{ij} \right) \left( 1 - \pi \left( x_{ij}, t_{ij} \right) \right) \right]_{\left( p + q \right) \times \left( p + q \right)}\\ = \left[ n\sum\limits_{i = 1}^m \sum\limits_{j = 1}^{n_i} W_{ij} W_{ij}^T \left( 1 - \pi \left( x_{ij}, t_{ij} \right) \right) \right]_{\left( p + q \right) \times \left( p + q \right)} \end{array} $ |
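A derivative of this kind is easy to get wrong, so a numerical check can help. The sketch below differentiates the logarithm of the negative binomial mass function $C_{n+y-1}^{n-1}\pi^n(1-\pi)^y$ directly, under the assumptions that $\pi$ is the logistic function of $W_{ij}^T\beta$ and that $\varepsilon_{ij}$ is set to zero. Under those assumptions each observation contributes $W_{ij}\,[n - (n + y_{ij})\pi_{ij}]$ to the score, and the code compares this with a central finite difference. The data values and the function names are made up for illustration.

```python
from math import comb, exp, log

def eta_of(w, beta):
    """Linear predictor W^T beta (the error term is assumed to be zero here)."""
    return sum(wk * bk for wk, bk in zip(w, beta))

def loglik(beta, W, y, n):
    """Log-likelihood: sum of ln C + n*eta - (n + y)*ln(1 + exp(eta))."""
    total = 0.0
    for w, yi in zip(W, y):
        eta = eta_of(w, beta)
        total += log(comb(n + yi - 1, n - 1)) + n * eta - (n + yi) * log(1.0 + exp(eta))
    return total

def score(beta, W, y, n):
    """Analytic gradient: sum of W * (n - (n + y) * pi)."""
    grad = [0.0] * len(beta)
    for w, yi in zip(W, y):
        p = 1.0 / (1.0 + exp(-eta_of(w, beta)))
        for k, wk in enumerate(w):
            grad[k] += (n - (n + yi) * p) * wk
    return grad

def numeric_score(beta, W, y, n, h=1e-6):
    """Central finite-difference approximation of the gradient."""
    grad = []
    for k in range(len(beta)):
        up, dn = beta[:], beta[:]
        up[k] += h
        dn[k] -= h
        grad.append((loglik(up, W, y, n) - loglik(dn, W, y, n)) / (2.0 * h))
    return grad

# Toy data (illustrative only): intercept plus one covariate.
W = [[1.0, 0.2], [1.0, -0.5], [1.0, 0.9]]
y = [2, 0, 5]
n = 3
beta = [0.3, -0.7]

analytic = score(beta, W, y, n)
numeric = numeric_score(beta, W, y, n)
print(analytic)
print(numeric)
```

If the two printed vectors agree to several decimal places, the analytic score matches the log-likelihood it was derived from.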
Take the log-likelihood $\ln L(\beta; y_{11}, y_{12}, \cdots, y_{1n_1}, \cdots, y_{mn_i})$ as the objective function. At $\beta = \beta^{(t)}$, let
| $ {u^{\left( t \right)}} = \left. \frac{\partial \ln L\left( \beta ; y_{11}, y_{12}, \cdots , y_{m n_i} \right)}{\partial \beta } \right|_{\beta = \beta ^{\left( t \right)}}, $ |
| $ {v^{\left( t \right)}} = \left. \frac{\partial ^2 \ln L\left( \beta ; y_{11}, y_{12}, \cdots , y_{m n_i} \right)}{\partial \beta \, \partial \beta ^T} \right|_{\beta = \beta ^{\left( t \right)}}, $ |
| $ {U^{\left( t \right)}} = {\left( {u_0^{\left( t \right)},u_1^{\left( t \right)}, \cdots ,u_{p + q - 1}^{\left( t \right)}} \right)^T},{V^{\left( t \right)}} = {\left( {{v^{\left( t \right)}}} \right)_{\left( {p + q} \right) \times \left( {p + q} \right)}} $ |
Then the Newton-Raphson iteration for the maximum likelihood estimate of the parameter $\beta$ of the negative binomial model is
| $ {\beta ^{\left( {t + 1} \right)}} = {\beta ^{\left( t \right)}} - {\left( {{V^{\left( t \right)}}} \right)^{ - 1}}{U^{\left( t \right)}}. $ |
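The iteration above can be sketched in plain Python. This is a minimal illustration, not the authors' R simulation: it assumes $\pi$ is the logistic function of $W_{ij}^T\beta$ with $\varepsilon_{ij}$ set to zero, uses the score $\sum W_{ij}[n - (n + y_{ij})\pi_{ij}]$ and Hessian $-\sum (n + y_{ij})\pi_{ij}(1 - \pi_{ij})W_{ij}W_{ij}^T$ obtained by differentiating the log of the mass function directly, and draws synthetic data with made-up true coefficients. All function names and settings are hypothetical.

```python
import random
from math import exp

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(m):
        p = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, m):
            f = M[r][c] / M[c][c]
            for k in range(c, m + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * m
    for r in range(m - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, m))
        x[r] = (M[r][m] - s) / M[r][r]
    return x

def score_and_hessian(beta, W, y, n):
    """U = gradient and V = Hessian of the log-likelihood at beta."""
    d = len(beta)
    U = [0.0] * d
    V = [[0.0] * d for _ in range(d)]
    for w, yi in zip(W, y):
        eta = sum(wk * bk for wk, bk in zip(w, beta))
        p = 1.0 / (1.0 + exp(-eta))
        for a in range(d):
            U[a] += (n - (n + yi) * p) * w[a]
            for b in range(d):
                V[a][b] -= (n + yi) * p * (1.0 - p) * w[a] * w[b]
    return U, V

def newton_raphson(W, y, n, tol=1e-10, max_iter=50):
    """Iterate beta <- beta - V^{-1} U from a zero start."""
    beta = [0.0] * len(W[0])
    for _ in range(max_iter):
        U, V = score_and_hessian(beta, W, y, n)
        step = solve(V, U)
        beta = [b - s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

def draw_nb(n, p, rng):
    """Number of failures before the n-th success in Bernoulli(p) trials."""
    failures = successes = 0
    while successes < n:
        if rng.random() < p:
            successes += 1
        else:
            failures += 1
    return failures

# Synthetic data with assumed true coefficients (illustrative only).
rng = random.Random(0)
n, true_beta = 5, [0.5, -1.0]
W = [[1.0, rng.uniform(-1.0, 1.0)] for _ in range(300)]
y = [draw_nb(n, 1.0 / (1.0 + exp(-(w[0] * true_beta[0] + w[1] * true_beta[1]))), rng)
     for w in W]

beta_hat = newton_raphson(W, y, n)
U_hat, _ = score_and_hessian(beta_hat, W, y, n)
print("estimate:", beta_hat)
print("score at estimate:", U_hat)
```

Because the Hessian is negative definite whenever the $W_{ij}$ span the parameter space, the log-likelihood is concave and the iteration usually settles within a handful of steps; a step-halving safeguard would be a sensible addition for less well-behaved data.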
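The maximum likelihood estimate (MLE) of the parameters has large-sample properties, namely consistency and asymptotic normality. Following [7], let
$\theta = (\beta, \sigma^2)^T$, and suppose the maximum likelihood estimate $\hat \theta_N$ of $\theta$ is obtained from
| $ {G_N}\left( {y,\theta } \right) = \frac{1}{N}\sum\limits_{i = 1}^N {g\left( {{y_i},\theta } \right)} $ |
Then the estimating equation for $\theta$ is $G'_N(y, \theta) = 0$.
Suppose $\theta_0$ is …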
| $ \sqrt N \left( {{{\hat \theta }_N} - {\theta _0}} \right) = {\left[ { - {{G''}_N}\left( {y,\tilde \theta } \right)} \right]^{ - 1}}\left( {\sqrt N {{G'}_N}\left( {y,{\theta _0}} \right)} \right). $ |
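The display above is the usual first-order expansion of an estimating equation. A short worked version is given below as a sketch, under the assumptions that $G_N$ is twice continuously differentiable in $\theta$, that $\hat\theta_N$ solves $G'_N(y, \hat\theta_N) = 0$, that $\tilde\theta$ lies between $\hat\theta_N$ and $\theta_0$, and that $G''_N(y, \tilde\theta)$ is invertible. These conditions are assumed here for illustration; the regularity conditions the paper relies on are those of [7] and [8].

```latex
\begin{aligned}
0 &= G'_N(y,\hat\theta_N) \\
  &= G'_N(y,\theta_0) + G''_N(y,\tilde\theta)\,\bigl(\hat\theta_N-\theta_0\bigr), \\
\bigl(\hat\theta_N-\theta_0\bigr)
  &= \bigl[-G''_N(y,\tilde\theta)\bigr]^{-1} G'_N(y,\theta_0), \\
\sqrt{N}\,\bigl(\hat\theta_N-\theta_0\bigr)
  &= \bigl[-G''_N(y,\tilde\theta)\bigr]^{-1}\bigl(\sqrt{N}\,G'_N(y,\theta_0)\bigr).
\end{aligned}
```

The last line separates a matrix factor, which typically converges to a fixed limit, from a normalized score term, which is where a central limit theorem would be applied.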
Theorem 1 Under certain regularity conditions, by [8] we have
1)
2)
where
| [1] | Zeger S L, Diggle P J. Semiparametric models for longitudinal data with application to CD4 cell numbers in HIV seroconverters[J]. Biometrics, 1994, 50: 689–699. DOI:10.2307/2532783. |
| [2] | He X, Zhu Z, Fung W K. Estimation in a semiparametric model for longitudinal data with unspecified dependence structure[J]. Biometrika, 2002, 89: 579–590. DOI:10.1093/biomet/89.3.579. |
| [3] | 柴根象, 孙平, 蒋泽云. Two-stage estimation for semiparametric regression models[J]. 应用数学学报, 1995, 18(3): 353–363. |
| [4] | 赵江. Mean-square convergence of mixed-type estimators of semiparametric regression functions[J]. 应用数学学报, 1995, 8(2): 235–244. |
| [5] | 田萍, 马国锋. Strongly consistent estimation in a class of semiparametric models for longitudinal data[J]. 数理统计与管理, 2008, 27(5): 865–869. |
| [6] | 樊明智, 田萍. Asymptotic properties of estimators in semiparametric regression models for longitudinal data[J]. 统计与决策, 2008(9): 163–165. |
| [7] | Domowitz I, White H. Misspecified models with dependent observations[J]. Journal of Econometrics, 1982, 20: 35–58. |
| [8] | Sinha S K. Robust analysis of generalized linear mixed models[J]. Journal of the American Statistical Association, 2004, 99(466): 451–460. |
