A matrix factorization algorithm for predicting miRNA-disease associations

CAAI Transactions on Intelligent Systems  2018, Vol. 13 Issue (6): 897-904  DOI: 10.11992/tis.201805043

### Cite this article

LIU Xiaoyan, CHEN Xi, GUO Maozu, et al. A matrix factorization method for predicting miRNA-disease association[J]. CAAI Transactions on Intelligent Systems, 2018, 13(6): 897-904. DOI: 10.11992/tis.201805043.


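1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, Heilongjiang, China;
2. School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China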

A matrix factorization method for predicting miRNA-disease association
LIU Xiaoyan1, CHEN Xi1, GUO Maozu1,2, CHE Kai1, WANG Chunyu1
1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China;
2. School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China
Abstract: There is increasing evidence that microRNAs (miRNAs) play an important role in life processes. In recent years, predicting associations between miRNAs and diseases has become an active research topic. However, most existing methods rely on known miRNA-disease associations and perform poorly for miRNAs and diseases that have no known associations. This paper presents a least squares optimization matrix factorization method for miRNA-disease association prediction (LMFMDA). Based on the miRNA similarity matrix, the disease similarity matrix, and the known miRNA-disease associations, LMFMDA uses the iterative least squares method to solve for the representation vectors of miRNAs and diseases, and approximates the existing miRNA-disease associations by the products of these vectors. Unlike the conventional approach, we introduce auxiliary miRNA and disease variables to ensure that these variables converge to the optimal solution during optimization. Experiments show that LMFMDA achieves an AUC of 0.8206 under leave-one-out cross-validation, which is clearly better than the other current methods compared. In particular, for miRNAs and diseases without any association information, LMFMDA significantly outperforms the latest algorithms.
Key words: microRNAs; disease; association prediction; matrix factorization; iterative least squares

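MicroRNAs (miRNAs) are a class of small endogenous non-coding RNAs, about 20 to 24 nucleotides long, that bind through base pairing to the 3' untranslated region of their target mRNAs, causing degradation or translational repression of the target mRNA and thereby regulating gene expression at the post-transcriptional level[1-3]. Increasing evidence shows that miRNAs play an important role in biological processes such as immune response, transcription, proliferation, differentiation, signal transduction, and embryonic development[4-7]. miRNA mutations, defects in miRNA biogenesis, and dysfunction of miRNAs and their target mRNAs may lead to various diseases. Identifying the associations between miRNAs and diseases is therefore essential. Early studies determined the relationship between a miRNA and a specific disease through biological experiments, but such experiments are time-consuming and costly. As a result, using computational biology methods to analyze and predict miRNA-disease associations has become a current research focus.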

1 Related work

2 Experimental data

2.1 miRNA functional similarity network

2.2 Disease semantic similarity network

 ${C_d}(t) = \left\{ \begin{array}{l}1,\;\;\;\;t = d\\\max \{ 0.5 \times {C_d}(t')\,|\,t' \in {\rm{children\ of\ }}t\} ,\;\;\;\;t \ne d\end{array} \right.$ (1)

 ${\rm{DS}}(d_1,d_2) = \displaystyle\frac{{\displaystyle\sum\nolimits_{t \in T(d_1) \cap T(d_2)} {({C_{d_1}}(t) + {C_{d_2}}(t))} }}{{\displaystyle\sum\nolimits_{t \in T(d_1)} {{C_{d_1}}(t) + \sum\nolimits_{t \in T(d_2)} {{C_{d_2}}(t)} } }}$ (2)
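
Equations (1) and (2) can be illustrated on a toy disease DAG. The following is a minimal Python sketch, not the authors' code: the function names, the `parents` dictionary layout, and the four-term DAG are hypothetical, and T(d) is assumed to be d together with all of its ancestors.

```python
def semantic_contributions(disease, parents, decay=0.5):
    """Eq. (1): contribution C_d(t) of every term t in T(d).

    `parents` maps a term to the list of its parent terms (assumed layout).
    """
    contrib = {disease: 1.0}          # C_d(d) = 1
    stack = [disease]
    while stack:
        term = stack.pop()
        for parent in parents.get(term, ()):
            candidate = decay * contrib[term]
            # Keep the maximum over all children of `parent`, as in Eq. (1).
            if candidate > contrib.get(parent, 0.0):
                contrib[parent] = candidate
                stack.append(parent)
    return contrib


def disease_similarity(d1, d2, parents):
    """Eq. (2): semantic similarity DS(d1, d2)."""
    c1 = semantic_contributions(d1, parents)
    c2 = semantic_contributions(d2, parents)
    shared = c1.keys() & c2.keys()    # T(d1) ∩ T(d2)
    numerator = sum(c1[t] + c2[t] for t in shared)
    denominator = sum(c1.values()) + sum(c2.values())
    return numerator / denominator


# Hypothetical toy DAG: A has parents B and C; E has parent C; B and C share a root.
TOY_PARENTS = {"A": ["B", "C"], "B": ["root"], "C": ["root"], "E": ["C"]}

print(disease_similarity("A", "E", TOY_PARENTS))  # 0.375
```

On this toy DAG the two diseases share the terms C and root, so the numerator is (0.5 + 0.5) + (0.25 + 0.25) = 1.5 and the denominator is 2.25 + 1.75 = 4.0, giving DS = 0.375.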
2.3 miRNA-disease association network

2.4 Data fusion

 Fig. 1 Distribution of miRNA-disease associations across diseases

 Fig. 2 Distribution of miRNA-disease associations across miRNAs
3 LMFMDA algorithm model

3.1 Loss function

 ${{{R}}'} = {{{M}}^{\rm{T}}}{{D}}$ (3)

 $\min {\lambda _1}\left\| {{{{M}}^{\rm{T}}}{{M}}- {\bf{MS}}} \right\|_F^2 + {\lambda _2}\left\| {{{{D}}^{\rm{T}}}{{D}} - {\bf{DS}}} \right\|_F^2$ (4)

 $\begin{array}{c}\min {\lambda _1}\left\| {{{{M}}^{\rm{T}}}{{X}} - {\bf{MS}}} \right\|_F^2 + {\mu _1}\left\| {{{M}} - {{X}}} \right\|_F^2 + \\ {\lambda _2}\left\| {{{{D}}^{\rm{T}}}{{Y}} - {\bf{DS}}} \right\|_F^2 + {\mu _2}\left\| {{{D}} - {{Y}}} \right\|_F^2\end{array}$ (5)

 $\begin{array}{c}{ L} = \left\| {{{{M}}^{\rm{T}}}{{D}} - {{R}}} \right\|_F^2 + {\lambda _0}\left( {\left\| {{M}} \right\|_F^2 + \left\| {{D}} \right\|_F^2} \right) + \\ {\lambda _1}\left\| {{{{M}}^{\rm{T}}}{{X}} - {\bf{MS}}} \right\|_F^2 + {\mu _1}\left\| {{{M}} - {{X}}} \right\|_F^2 + \\ {\lambda _2}\left\| {{{{D}}^{\rm{T}}}{{Y}} - {\bf{DS}}} \right\|_F^2 + {\mu _2}\left\| {{{D}} - {{Y}}} \right\|_F^2\end{array}$ (6)
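
The loss in Eq. (6) can be evaluated term by term. The NumPy sketch below is illustrative only: the function name and argument names are hypothetical, and the shapes are an assumption inferred from Eq. (3) and the $I_k$ in Eq. (8), namely M is k × n_m, D is k × n_d, R is n_m × n_d, MS is n_m × n_m, and DS is n_d × n_d.

```python
import numpy as np


def fro2(A):
    """Squared Frobenius norm of a matrix."""
    return float(np.sum(A * A))


def lmfmda_loss(M, D, X, Y, R, MS, DS, lam0, lam1, lam2, mu1, mu2):
    """Value of the loss L in Eq. (6) under the assumed shapes."""
    reconstruction = fro2(M.T @ D - R)                 # ||M^T D - R||_F^2
    regularization = lam0 * (fro2(M) + fro2(D))        # lambda_0 (||M||^2 + ||D||^2)
    mirna_term = lam1 * fro2(M.T @ X - MS) + mu1 * fro2(M - X)
    disease_term = lam2 * fro2(D.T @ Y - DS) + mu2 * fro2(D - Y)
    return reconstruction + regularization + mirna_term + disease_term
```

When X = M and Y = D, the two $\mu$ terms vanish and the remaining similarity terms reduce to Eq. (4).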
3.2 Optimization

 $\begin{array}{c}\displaystyle\frac{{{ \partial }}{{L}}}{{{ \partial }}{{M}}} = 2 \cdot {{D}} \cdot {\left( {{{{M}}^{\rm{T}}}{{D}} - {{R}}} \right)^{\rm{T}}} + 2 \cdot {\lambda _0} \cdot {{M}} + \\ 2 \cdot {\lambda _1} \cdot {{X}} \cdot {\left( {{{{M}}^{\rm{T}}}{{X}} - {\bf{MS}}} \right)^{\rm{T}}} + 2 \cdot {\mu _1}\left( {{{M}} - {{X}}} \right)=\\ 2 \cdot {{D}}{{{D}}^{\rm{T}}}{{M}} - 2 \cdot {{D}}{{{R}}^{\rm{T}}} + 2 \cdot {\lambda _0} \cdot {{M}} + 2 \cdot {\lambda _1} \cdot {{X}}{{{X}}^{\rm{T}}}{{M}} - \\ 2 \cdot {\lambda _1} \cdot {{X}} \cdot {\bf{M}}{{\bf{S}}^{\rm{T}}} + 2 \cdot {\mu _1} \cdot {{M}} - 2 \cdot {\mu _1} \cdot {{X}}\end{array}$ (7)

Setting $\displaystyle\frac{{\partial L}}{{\partial M}}{\rm{ = }}0$ gives:

 $\begin{array}{c}{{M}} = {\left( {{{D}}{{{D}}^{\rm{T}}} + \left( {{\lambda _0} + {\mu _1}} \right) \cdot {{{I}}_k} + {\lambda _1} \cdot {{X}}{{{X}}^{\rm{T}}}} \right)^{ - 1}}\cdot\\\left( {{{D}} \cdot {{{R}}^{\rm{T}}} + {\lambda _1} \cdot {{X}} \cdot {\bf{MS}} + {\mu _1} \cdot {{X}}} \right)\end{array}$ (8)

 $\begin{array}{c}{{D}} = {\left( {{{M}}{{{M}}^{\rm{T}}} + ({\lambda _0} + {\mu _2}) \cdot {{{I}}_k} + {\lambda _2} \cdot {{Y}}{{{Y}}^{\rm{T}}}} \right)^{ - 1}}\\\left( {{{M}} \cdot {{R}} + {\lambda _2} \cdot {{Y}} \cdot {\bf{DS}} + {\mu _2} \cdot {{Y}}} \right)\\{{X}} = {\left( {{\lambda _1} \cdot {{M}}{{{M}}^{\rm{T}}} + {\mu _1}{{{I}}_k}} \right)^{ - 1}}\left( {{\lambda _1} \cdot {{M}} \cdot {\bf{MS}} + {\mu _1}{{M}}} \right)\\{{Y}}= {\left( {{\lambda _2} \cdot {{D}}{{{D}}^{\rm{T}}} + {\mu _2}{{{I}}_k}} \right)^{ - 1}}\left( {{\lambda _2} \cdot {{D}} \cdot {\bf{DS}} + {\mu _2}{{D}}} \right)\end{array}$ (9)
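
The closed-form updates in Eqs. (8) and (9) translate directly into linear solves. The sketch below is a hedged NumPy illustration under several assumptions that are not stated in the source: the function names, the toy sizes, the hyperparameter values, the update order within one sweep, and the fixed number of sweeps are all placeholders. MS and DS are assumed symmetric, which is what makes the $\mathbf{MS}^{\rm T}$ in Eq. (7) agree with the $\mathbf{MS}$ in Eq. (8).

```python
import numpy as np


def update_M(D, X, R, MS, lam0, lam1, mu1):
    """Eq. (8): minimizer of L over M with D and X held fixed."""
    k = D.shape[0]
    lhs = D @ D.T + (lam0 + mu1) * np.eye(k) + lam1 * (X @ X.T)
    rhs = D @ R.T + lam1 * (X @ MS) + mu1 * X
    return np.linalg.solve(lhs, rhs)   # solves lhs @ M = rhs without an explicit inverse


def update_D(M, Y, R, DS, lam0, lam2, mu2):
    """Eq. (9), first row: minimizer of L over D with M and Y held fixed."""
    k = M.shape[0]
    lhs = M @ M.T + (lam0 + mu2) * np.eye(k) + lam2 * (Y @ Y.T)
    rhs = M @ R + lam2 * (Y @ DS) + mu2 * Y
    return np.linalg.solve(lhs, rhs)


def update_aux(F, S, lam, mu):
    """Eq. (9), X and Y rows: F is M (with S = MS) or D (with S = DS)."""
    k = F.shape[0]
    return np.linalg.solve(lam * (F @ F.T) + mu * np.eye(k), lam * (F @ S) + mu * F)


def grad_M(M, D, X, R, MS, lam0, lam1, mu1):
    """Eq. (7): gradient of L with respect to M."""
    return 2 * (D @ (M.T @ D - R).T + lam0 * M
                + lam1 * (X @ (M.T @ X - MS).T) + mu1 * (M - X))


# Toy problem with placeholder sizes and hyperparameters (not the paper's settings).
rng = np.random.default_rng(0)
k, n_m, n_d = 3, 5, 4
R = rng.integers(0, 2, size=(n_m, n_d)).astype(float)
MS = rng.random((n_m, n_m)); MS = (MS + MS.T) / 2   # symmetric similarity
DS = rng.random((n_d, n_d)); DS = (DS + DS.T) / 2
M, D = rng.random((k, n_m)), rng.random((k, n_d))
X, Y = M.copy(), D.copy()
lam0, lam1, lam2, mu1, mu2 = 0.1, 0.5, 0.5, 1.0, 1.0

for _ in range(20):                                  # assumed fixed sweep count
    M = update_M(D, X, R, MS, lam0, lam1, mu1)
    D = update_D(M, Y, R, DS, lam0, lam2, mu2)
    X = update_aux(M, MS, lam1, mu1)
    Y = update_aux(D, DS, lam2, mu2)
```

A quick sanity check on Eq. (8) is that `grad_M` evaluated at the output of `update_M`, with the same D and X, is numerically zero.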
3.3 Association prediction

3.4 Algorithm framework

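1) Initialize the vector matrices M and D of the miRNAs and diseases and the auxiliary matrices X and Y, and construct the loss function;

2) Solve for M and D with the iterative least squares method;

3) Predict the miRNA-disease associations from M and D.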

 ${{{R}}'} = {{{M}}^{\rm{T}}}{{D}}$
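
Once M and D are available, step 3 amounts to reading scores off $R' = M^{\rm T}D$. The sketch below shows one common way to use such a score matrix: rank, for a chosen disease, the miRNAs that have no known association. The ranking step, the function name, and the toy matrices are assumptions for illustration and are not taken from the source.

```python
import numpy as np


def rank_candidates(M, D, R, disease_idx):
    """Rank miRNAs without a known association to one disease by predicted score.

    M is k x n_m, D is k x n_d, R is the n_m x n_d known-association matrix
    (1 = known association, 0 = unknown); these shapes are assumed.
    """
    scores = M.T @ D                                   # R' = M^T D
    column = scores[:, disease_idx]
    unknown = np.flatnonzero(R[:, disease_idx] == 0)   # candidates only
    order = unknown[np.argsort(-column[unknown], kind="stable")]
    return order, column[order]


# Hypothetical toy factors with k = 1, four miRNAs and two diseases.
M_toy = np.array([[1.0, 0.0, 2.0, 3.0]])
D_toy = np.array([[1.0, 2.0]])
R_toy = np.array([[0, 0], [0, 0], [1, 0], [0, 1]])

order, predicted = rank_candidates(M_toy, D_toy, R_toy, disease_idx=0)
print(order.tolist(), predicted.tolist())  # [3, 0, 1] [3.0, 1.0, 0.0]
```

miRNA 2 is skipped for disease 0 because its association is already recorded in `R_toy`.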

 Fig. 3 Flow chart of the LMFMDA algorithm model

3.5 Complexity analysis

4 Experimental results

4.1 Experimental parameters

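The vector matrices M and D of the miRNAs and diseases are initialized with random values in [0, 1], and X and Y are initialized to be equal to M and D, respectively.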

4.2 Evaluation of results

 Fig. 4 AUC results of RWRMDA, CMFMDA, RLSMDA and LMFMDA

4.3 Analysis

5 Discussion

 ${{{R}}'} = {{{M}}^{\rm{T}}}{{D}} - { em} - {{e}}{{{d}}^{\rm{T}}}$

 Fig. 5 Relationship between k and AUC in the model with a constant dimension

6 Conclusion

 [1] WANG Qianghu, SUN Jie, ZHOU Meng, et al. A novel network-based method for measuring the functional relationship between gene sets[J]. Bioinformatics, 2011, 27(11): 1521-1528. DOI:10.1093/bioinformatics/btr154
 [2] LV Sali, LI Yan, WANG Qianghu, et al. A novel method to quantify gene set functional association based on gene ontology[J]. Journal of the royal society interface, 2012, 9(70): 1063-1072. DOI:10.1098/rsif.2011.0551
 [3] HRISTOVSKI D, FRIEDMAN C, RINDFLESCH T C, et al. Exploiting semantic relations for literature-based discovery[J]. AMIA annual symposium proceedings, 2006, 2006: 349-353.
 [4] KARP X, AMBROS V. Encountering microRNAs in cell fate signaling[J]. Science, 2005, 310(5752): 1288-1289. DOI:10.1126/science.1121566
 [5] CHENG A M, BYROM M W, SHELTON J, et al. Antisense inhibition of human miRNAs and indications for an involvement of miRNA in cell growth and apoptosis[J]. Nucleic acids research, 2005, 33(4): 1290-1297. DOI:10.1093/nar/gki200
 [6] MISKA E A. How microRNAs control cell division, differentiation and death[J]. Current opinion in genetics and development, 2005, 15(5): 563-568. DOI:10.1016/j.gde.2005.08.005
 [7] XU Peizhang, GUO Ming, HAY B A. MicroRNAs and the regulation of cell death[J]. Trends in genetics, 2004, 20(12): 617-624. DOI:10.1016/j.tig.2004.09.010
 [8] YOU Zhuhong, HUANG Zhian, ZHU Zexuan, et al. PBMDA: a novel and effective path-based computational model for miRNA-disease association prediction[J]. PLoS computational biology, 2017, 13(3): e1005455. DOI:10.1371/journal.pcbi.1005455
 [9] SHI Hongbo, ZHANG Guangde, ZHOU Meng, et al. Integration of multiple genomic and phenotype data to infer novel miRNA-disease associations[J]. PLoS one, 2016, 11(2): e0148521. DOI:10.1371/journal.pone.0148521
 [10] JIANG Qinghua, HAO Yangyang, WANG Guohua, et al. Prioritization of disease microRNAs through a human phenome-microRNAome network[J]. BMC systems biology, 2010, 4(S1): S2.
 [11] JIANG Qinghua, WANG Guohua, WANG Yadong. An approach for prioritizing disease-related microRNAs based on genomic data integration[C]//Proceedings of the 3rd International Conference on Biomedical Engineering and Informatics. Yantai, China, 2010: 2270–2274.
 [12] CHEN Xing, LIU Mingxi, YAN Guiying. RWRMDA: predicting novel human microRNA–disease associations[J]. Molecular biosystems, 2012, 8(10): 2792-2798. DOI:10.1039/c2mb25180a
 [13] CHEN Hailin, ZHANG Zuping. Similarity-based methods for potential human microRNA-disease association prediction[J]. BMC medical genomics, 2013, 6: 12. DOI:10.1186/1755-8794-6-12
 [14] SHI Hongbo, XU Juan, ZHANG Guangde, et al. Walking the interactome to identify human miRNA-disease associations through the functional link between miRNA targets and disease genes[J]. BMC systems biology, 2013, 7: 101. DOI:10.1186/1752-0509-7-101
 [15] XUAN Ping, HAN Ke, GUO Maozu, et al. Prediction of microRNAs associated with human diseases based on weighted k most similar neighbors[J]. PLoS one, 2013, 8(8): e70204. DOI:10.1371/journal.pone.0070204
 [16] XU Chaohan, PING Yanyan, LI Xiang, et al. Prioritizing candidate disease miRNAs by integrating phenotype associations of multiple diseases with matched miRNA and mRNA expression profiles[J]. Molecular biosystems, 2014, 10(11): 2800-2809. DOI:10.1039/C4MB00353E
 [17] MØRK S, PLETSCHER-FRANKILD S, PALLEJA CARO A, et al. Protein-driven inference of miRNA–disease associations[J]. Bioinformatics, 2014, 30(3): 392-397.
 [18] PASQUIER C, GARDÈS J. Prediction of miRNA-disease associations with a vector space model[J]. Scientific reports, 2016, 6: 27036. DOI:10.1038/srep27036
 [19] SUN Dongdong, LI Ao, FENG Huanqing, et al. NTSMDA: prediction of miRNA–disease associations by integrating network topological similarity[J]. Molecular biosystems, 2016, 12(7): 2224-2232. DOI:10.1039/C6MB00049E
 [20] LI Xia, XU Juan, LI Yongsheng. Prioritizing candidate disease miRNAs by topological features in the miRNA-target dysregulated network[M]//AZMI A S. Systems Biology in Cancer Research and Drug Discovery. Netherlands: Springer, 2012: 289–306.
 [21] JIANG Qinghua, WANG Guohua, JIN Shuilin, et al. Predicting human microRNA-disease associations based on support vector machine[J]. International journal of data mining and bioinformatics, 2013, 8(3): 282-293. DOI:10.1504/IJDMB.2013.056078
 [22] CHEN Xing, YAN Guiying. Semi-supervised learning for potential human microRNA-disease associations inference[J]. Scientific reports, 2014, 4: 5501.
 [23] SHEN Zhen, ZHANG Youhua, HAN K, et al. miRNA-disease association prediction with collaborative matrix factorization[J]. Complexity, 2017, 2017: 2498957.
 [24] WANG Dong, WANG Juan, LU Ming, et al. Inferring the human microRNA functional similarity and functional network based on microRNA-associated diseases[J]. Bioinformatics, 2010, 26(13): 1644-1650. DOI:10.1093/bioinformatics/btq241