Journal of China Medical University  2023, Vol. 52 Issue (2): 126-131

Article information

HOU Zhijuan, WEN Pushuai
Identification of novel genes correlated with overall survival of patients with metastatic colorectal cancer: APOH, KNG1, and FGG
Journal of China Medical University, 2023, 52(2): 126-131

Article history

Received: 2021-11-21
Published online: 2023-01-31 14:20:12
Identification of novel genes correlated with overall survival of patients with metastatic colorectal cancer: APOH, KNG1, and FGG
HOU Zhijuan1, WEN Pushuai1,2
1. Department of Pathophysiology, College of Basic Medical Science, Jinzhou Medical University, Jinzhou 121001, China;
2. Biological Anthropology Institute, Jinzhou Medical University, Jinzhou 121001, China
Abstract: Objective To explore the molecular mechanisms of colorectal cancer metastasis using bioinformatics methods and to provide potential therapeutic targets for metastatic colorectal cancer. Methods The GSE41568 dataset was downloaded from the Gene Expression Omnibus (GEO) database, and differentially expressed genes (DEGs) between metastatic and primary colorectal cancer were screened with the Limma package in R. Hub modules and hub genes associated with different metastatic sites of colorectal cancer were then identified by weighted gene co-expression network analysis (WGCNA). A protein-protein interaction (PPI) network was built using the STRING database and visualized with Cytoscape. Prognostic analysis was performed with the PROGgenesV2 database. Finally, the potential molecular mechanisms of the genes were predicted by gene set enrichment analysis (GSEA). Results A total of 1,159 DEGs were obtained, comprising 703 upregulated and 456 downregulated genes. In the co-expression analysis, gene ontology (GO) functional enrichment analysis of the top 50 hub genes from the turquoise module indicated that these genes were mainly associated with response to wounding and acute inflammatory response. Three hub genes screened through the PPI network, APOH, KNG1, and FGG, had a significant effect on the prognosis of patients with colorectal cancer. Conclusion APOH, KNG1, and FGG were negatively correlated with the prognosis of patients with colorectal cancer, indicating that they could serve as potential diagnostic biomarkers or therapeutic targets in colorectal cancer.

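Colorectal cancer is one of the most common malignant tumors worldwide, ranking third in incidence and second in mortality globally [1]. Approximately 25% of patients with colorectal cancer present with distant metastases at initial diagnosis [2], and without treatment the median survival of patients with advanced metastatic colorectal cancer is only 5-6 months [3]. Studies [4-5] have shown that the site of metastasis is influenced by the location of the primary tumor. Although the primary tumors share the same origin, the prognosis of colorectal cancer differs significantly between metastatic sites. Some molecular markers, such as KRAS/RAS and mismatch repair status, are associated with the metastatic site of colorectal cancer [6-7], but the gene-level changes in colorectal cancer at different metastatic sites remain unclear. In this study, weighted gene co-expression network analysis (WGCNA) was used to identify gene modules and hub genes that may be critical for metastatic colorectal cancer, and the prognostic value of the related hub genes was analyzed, to provide a theoretical basis for finding diagnostic biomarkers for colorectal cancer.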

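1 Materials and methods

1.1 Dataset and data processing

The gene expression dataset GSE41568 [8] was downloaded from the Gene Expression Omnibus (GEO). The dataset comprises 39 primary colorectal cancer samples and 94 metastatic colorectal cancer samples. The samples in this dataset are divided into 6 molecular subgroups of colorectal cancer according to their patterns of pathway activation.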

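1.2 Screening of differentially expressed genes

The raw microarray data were background-corrected and log-transformed using the RMA algorithm, and then normalized using R (version 3.3.2) and Bioconductor packages. The Limma package was used to screen for differentially expressed genes between the colorectal cancer samples. P < 0.05 and |log2FC| > 0.5 were used as the screening criteria.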
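The two screening criteria can be illustrated with a minimal Python sketch. This is not the Limma workflow used in the study: it only shows how a table of per-gene statistics would be filtered once it exists. The function name `split_degs`, the input layout, and the gene names and values are hypothetical.

```python
from typing import Dict, List, Tuple

P_CUTOFF = 0.05        # from the text: P < 0.05
LOG2FC_CUTOFF = 0.5    # from the text: |log2FC| > 0.5


def split_degs(stats: Dict[str, Tuple[float, float]]) -> Tuple[List[str], List[str]]:
    """Return (upregulated, downregulated) gene lists.

    stats maps a gene symbol to (log2 fold change, P value), for example
    as exported from a differential-expression results table.
    """
    up, down = [], []
    for gene, (log2fc, p_value) in stats.items():
        if p_value >= P_CUTOFF or abs(log2fc) <= LOG2FC_CUTOFF:
            continue  # fails at least one of the two criteria
        (up if log2fc > 0 else down).append(gene)
    return sorted(up), sorted(down)


# Made-up values for illustration only; these are not results from GSE41568.
example = {
    "GENE_A": (1.20, 0.001),   # passes both criteria, upregulated
    "GENE_B": (-0.80, 0.010),  # passes both criteria, downregulated
    "GENE_C": (0.30, 0.001),   # fold change too small
    "GENE_D": (2.00, 0.200),   # not significant
}
up_genes, down_genes = split_degs(example)
print("up:", up_genes, "down:", down_genes)
```

Both conditions must hold at once, so a gene with a large fold change but a non-significant P value is excluded.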

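1.3 Construction of the co-expression network and detection of hub genes

A weighted gene co-expression network of the differentially expressed genes was constructed with the WGCNA package in R. Hierarchical clustering and module detection were performed using the topological overlap matrix and the dynamic tree cut algorithm, respectively [9]. Finally, hub genes associated with metastatic colorectal cancer were screened [10].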
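To make the network construction step more concrete, the sketch below computes an unsigned soft-threshold adjacency, a_ij = |cor(i, j)|^β, and the topological overlap derived from it. It is a plain-Python illustration under stated assumptions, not the WGCNA package itself: the source does not say whether a signed or unsigned network was used, and the toy expression values and the power of 2 are invented.

```python
from math import sqrt
from typing import List

Matrix = List[List[float]]


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equally long, non-constant profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def adjacency(expr: Matrix, beta: int) -> Matrix:
    """Unsigned adjacency a_ij = |cor(i, j)|**beta, with a zero diagonal."""
    g = len(expr)
    return [[0.0 if i == j else abs(pearson(expr[i], expr[j])) ** beta
             for j in range(g)] for i in range(g)]


def topological_overlap(adj: Matrix) -> Matrix:
    """TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij)."""
    g = len(adj)
    k = [sum(row) for row in adj]  # connectivity of each gene
    tom = [[1.0] * g for _ in range(g)]
    for i in range(g):
        for j in range(g):
            if i == j:
                continue
            shared = sum(adj[i][u] * adj[u][j] for u in range(g) if u not in (i, j))
            tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom


# Toy expression matrix: rows are genes, columns are samples (invented values).
toy_expr = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [1.0, 3.0, 2.0, 4.0],
]
tom = topological_overlap(adjacency(toy_expr, beta=2))
for row in tom:
    print([round(v, 3) for v in row])
```

Hierarchical clustering would then be run on the dissimilarity 1 - TOM; that step and the dynamic tree cut are omitted here.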

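1.4 Functional annotation of hub genes and construction of the protein-protein interaction (PPI) network

Interactions between genes were retrieved from STRING [11], and interactions with a combined score > 0.4 were selected. Cytoscape was used to construct and visualize the PPI network of these genes. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses of the screened differentially expressed genes were performed with the Database for Annotation, Visualization and Integrated Discovery (DAVID). A false discovery rate (FDR) < 0.05 was used as the cut-off for significance.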
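As an illustration of how an FDR cut-off is applied to a list of enrichment P values, the following sketch uses the Benjamini-Hochberg adjustment. The source does not state which FDR procedure the enrichment tool applied, so the choice of Benjamini-Hochberg is an assumption, and the term names and P values are invented.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so each adjusted value is monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical enrichment terms with made-up raw P values.
raw_p = {"term_1": 0.001, "term_2": 0.008, "term_3": 0.039,
         "term_4": 0.041, "term_5": 0.60}
names = list(raw_p)
adjusted = benjamini_hochberg([raw_p[n] for n in names])
significant_terms = [n for n, q in zip(names, adjusted) if q < 0.05]
print(significant_terms)
```

In this toy input, two terms with raw P values below 0.05 no longer pass once the adjustment is applied, which is the purpose of controlling the FDR across many tested terms.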

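1.5 Survival analysis of hub genes

The PROGgenesV2 prognostic database was used to study the correlation between hub gene expression levels and overall survival of patients with colorectal cancer. Cohorts were divided according to the expression levels of apolipoprotein H (APOH), fibrinogen gamma chain (FGG), and kininogen 1 (KNG1). P < 0.05 was considered statistically significant.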
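For readers unfamiliar with overall survival curves, the sketch below shows a Kaplan-Meier estimate for one expression group. The source does not describe how PROGgenesV2 computes its curves, so the use of Kaplan-Meier here is an assumption, and the follow-up times and event flags are invented.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each time with at least one death.

    events[i] is 1 if the patient died at times[i] and 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Invented follow-up data (months) for a single hypothetical group.
curve = kaplan_meier(times=[5, 8, 8, 12, 20], events=[1, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

Comparing two such curves (for example, high versus low expression) would additionally need a test such as the log-rank test, which is not shown.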

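1.6 Gene set enrichment analysis (GSEA)

For each of the 3 hub genes, samples were divided into a low-expression group and a high-expression group according to the median expression value of that gene, and GSEA was then performed using the Broad Institute platform [12].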
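The median split described above can be sketched as follows. The function name, the sample identifiers, and the expression values are hypothetical, and assigning samples that sit exactly on the median to the low group is an assumption that the source does not specify.

```python
from statistics import median
from typing import Dict, List, Tuple


def split_by_median(expression: Dict[str, float]) -> Tuple[List[str], List[str]]:
    """Split samples into (low, high) groups around the median expression."""
    cutoff = median(expression.values())
    # Assumption: ties with the median are placed in the low group.
    low = sorted(s for s, v in expression.items() if v <= cutoff)
    high = sorted(s for s, v in expression.items() if v > cutoff)
    return low, high


# Invented expression values of one hub gene across four samples.
toy_expression = {"S1": 2.1, "S2": 5.4, "S3": 3.3, "S4": 7.8}
low, high = split_by_median(toy_expression)
print("low:", low, "high:", high)
```

The two resulting sample lists would then serve as the phenotype labels supplied to GSEA.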

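2 Results

2.1 Identification of differentially expressed genes

A total of 1,159 differentially expressed genes were identified between metastatic and primary colorectal cancer, comprising 703 upregulated and 456 downregulated genes (Fig.1).

Fig.1 Volcano plot of differentially expressed genes in the mRNA expression profiling datasets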

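2.2 Construction of the co-expression network and screening of hub modules and genes

Based on the weighted gene co-expression network (Fig.2A), differentially expressed genes with similar expression patterns were grouped into modules. As shown in Fig.2B-2E, a soft threshold of β = 14, the lowest power at which the scale-free topology fit index reached 0.9, was chosen for the subsequent adjacency calculation. Among the co-expression modules of the differentially expressed genes (Fig.2F), the turquoise module was the most strongly correlated with metastatic site (r = 0.51, P = 5e-10; Fig.2G); it was named the metastasis module and carried forward for further analysis. A scatter plot of module membership (MM) against gene significance for the genes in the turquoise module showed that MM was highly correlated with gene significance, indicating that the turquoise module is suitable for studying hub genes associated with the metastatic status of colorectal cancer (Fig.2H). The top 50 hub genes in the turquoise module were then selected on the basis of high MM and a weighted q < 0.001.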
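The rule for choosing the soft threshold, namely the lowest power whose scale-free topology fit index reaches 0.9, can be written as a short sketch. The fit values in the example are invented for illustration and are not the values behind Fig.2B; only the 0.9 target and the resulting power of 14 come from the text.

```python
from typing import Dict, Optional


def pick_soft_threshold(fit_by_power: Dict[int, float],
                        target: float = 0.9) -> Optional[int]:
    """Return the lowest power whose fit index is at least target, else None."""
    passing = [power for power, fit in fit_by_power.items() if fit >= target]
    return min(passing) if passing else None


# Invented fit indices per candidate power; not data from the study.
toy_fit = {8: 0.71, 10: 0.80, 12: 0.87, 14: 0.91, 16: 0.93}
chosen_power = pick_soft_threshold(toy_fit)
print(chosen_power)
```

Taking the lowest qualifying power keeps mean connectivity as high as possible while still approximating a scale-free network.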

A, clustering analysis of differentially expressed genes between metastatic and primary colorectal cancer (red represents metastatic sites and molecular subgroups of colorectal cancer); B, analysis of the scale-free topology fit index for different soft-thresholding powers; C, analysis of the mean connectivity for different soft-thresholding powers; D, histogram of the connectivity distribution when β = 14; E, check of the scale-free topology when β = 14; F, clustering dendrogram and module assignment; G, heatmap of the correlation between module eigengenes and clinical traits of colorectal cancer; H, scatter plot of gene significance in the turquoise module.
Fig.2 Screening of key genes in metastatic colorectal cancer by weighted gene co-expression network analysis

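GO annotation and KEGG pathway enrichment analysis showed that, for biological process, genes in the metastasis module were enriched in "response to wounding", "acute inflammatory response", "protein-lipid complex assembly", "plasma lipoprotein particle assembly", and "inflammatory response". For cellular component, the hub genes were significantly enriched in "extracellular region", "extracellular space", "extracellular region part", "G protein-lipid complex", and "plasma lipoprotein particle". For molecular function, the top 50 hub genes were mainly enriched in "enzyme inhibitor activity", "endopeptidase inhibitor activity", "peptidase inhibitor activity", "serine-type endopeptidase inhibitor activity", and "cell surface binding" (Tab.1). In addition, KEGG pathway enrichment analysis showed that the top 50 hub genes were mainly enriched in "complement and coagulation cascades".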

Tab.1 Gene Ontology analysis of hub genes in the turquoise module
Category | GO term | Genes | FDR
Biological process | Response to wounding | KNG1, TF, C5, CRP, F9, AHSG, C8B, APOA2, FGG, FGA, FGB, F2, AOX1, SERPINA3, APOH, CFH, ITIH4, SERPIND1, IGFBP1, LBP | 1.20e-12
Biological process | Acute inflammatory response | C8B, TF, APOA2, F2, CRP, C5, CFH, SERPINA3, ITIH4, LBP, AHSG | 7.78e-10
Biological process | Protein-lipid complex assembly | APOA2, APOB, APOA1, APOE, APOC3, APOC1 | 3.46e-07
Biological process | Plasma lipoprotein particle assembly | APOA2, APOB, APOA1, APOE, APOC3, APOC1 | 3.46e-07
Biological process | Inflammatory response | KNG1, C8B, TF, APOA2, F2, AOX1, CRP, C5, SERPINA3, ITIH4, CFH, LBP, AHSG | 4.94e-07
Cellular component | Extracellular region | GC, TF, CRP, C5, APOC1, HP, AHSG, CFHR2, TTR, APOA2, APOB, FGG, APOA1, FGA, ITIH1, APOE, FGB, ALB, SERPINA5, APOC3, ITIH4, CFH, APOH, SERPINA3, ITIH2, FGL1, ITIH3, LBP, F2, CP, IGFBP1, SERPIND1, CPB2 | 6.21e-22
Cellular component | Extracellular space | GC, TF, C5, CRP, APOC1, HP, AHSG, TTR, APOB, APOA2, FGG, APOA1, FGA, ALB, FGB, APOE, APOC3, APOH, CFH, FGL1, LBP, LECT2, KNG1, C8B, F2, IGFBP1, CP, CPB2 | 4.98e-20
Cellular component | Extracellular region part | GC, TF, C5, CRP, APOC1, HP, AHSG, TTR, APOB, APOA2, FGG, APOA1, FGA, ALB, FGB, APOE, APOC3, APOH, LECT2, KNG1, F2, IGFBP1, CP, CPB2 | 3.39e-16
Cellular component | G protein-lipid complex | APOA2, APOB, APOA1, APOE, APOC3, APOH, APOC1, HP | 3.35e-08
Cellular component | Plasma lipoprotein particle | APOA2, APOB, APOA1, APOE, APOC3, APOH, APOC1, HP | 3.35e-08
Molecular function | Enzyme inhibitor activity | KNG1, C5, APOC1, AHSG, AMBP, APOA2, ITIH1, SERPINA5, APOC3, SERPIND1, ITIH3 | 3.68e-09
Molecular function | Endopeptidase inhibitor activity | AMBP, KNG1, ITIH1, SERPINA5, C5, SERPINA3, ITIH4, ITIH2, SERPIND1, ITIH3, AHSG | 4.89e-08
Molecular function | Peptidase inhibitor activity | AMBP, KNG1, ITIH1, SERPINA5, C5, SERPINA3, ITIH4, ITIH2, SERPIND1, ITIH3, AHSG | 8.35e-08
Molecular function | Serine-type endopeptidase inhibitor activity | AMBP, ITIH1, SERPINA5, SERPINA3, ITIH4, ITIH2, SERPIND1, ITIH3 | 2.72e-05
Molecular function | Cell surface binding | FGG, FGA, FGB, CRP, APOH, LBP | 4.88e-05

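2.3 Construction of the PPI network of hub genes in the metastasis module

A PPI network was constructed in Cytoscape from interactions with a confidence score > 0.4 (Fig.3). Within the PPI network, genes with a degree of connectivity > 10 were selected as the "real" hub genes.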
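The selection rule above (keep interactions scoring above 0.4, then keep genes with more than 10 connections) can be sketched in a few lines. This does not call STRING or Cytoscape; the edge-list layout, the function name `hub_genes`, and the toy network are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

Edge = Tuple[str, str, float]  # (gene A, gene B, interaction score)


def hub_genes(edges: Iterable[Edge], min_score: float = 0.4,
              min_degree: int = 10) -> List[str]:
    """Return genes whose degree exceeds min_degree after score filtering."""
    neighbours = defaultdict(set)
    for a, b, score in edges:
        if score > min_score and a != b:  # drop weak edges and self-loops
            neighbours[a].add(b)
            neighbours[b].add(a)
    return sorted(g for g, n in neighbours.items() if len(n) > min_degree)


# Toy network: one gene linked to 12 partners, plus one weak edge that is dropped.
toy_edges = [("HUB", f"N{i}", 0.9) for i in range(1, 13)]
toy_edges.append(("N1", "N2", 0.2))
hubs = hub_genes(toy_edges)
print(hubs)
```

Using a set of neighbours per gene means that duplicate rows for the same pair do not inflate the degree.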

Fig.3 Construction of the protein-protein interaction network of the top hub genes

2.4 Survival analysis of hub genes

Low expression of APOH, FGG, and KNG1 was significantly associated with longer overall survival in patients with colorectal cancer (Fig.4).

A, APOH; B, FGG; C, KNG1. HR, hazard ratio.
Fig.4 Overall survival analysis of three hub genes in patients with colorectal cancer

2.5 GSEA of hub genes significantly associated with overall survival of patients with colorectal cancer

Colorectal cancer samples in the high-expression groups of the hub genes were consistently enriched in the "response to extracellular stimulus" gene set (Fig.5).

A, enrichment plots of APOH; B, enrichment plots of FGG; C, enrichment plots of KNG1. ES, enrichment score; NES, normalized enrichment score; FDR, false discovery rate.
Fig.5 Gene set enrichment analysis of hub genes correlated with overall survival of patients with colorectal cancer

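3 Discussion

In this study, WGCNA was used to identify gene co-expression modules and genes associated with different metastatic sites of colorectal cancer. The top 50 hub genes were screened from one significant turquoise module, and the hub genes shared by the co-expression network and the PPI network were then treated as "real" hub genes for further analysis. Among them, APOH, FGG, and KNG1 were closely related to the prognosis of patients with metastatic colorectal cancer, and the key biological processes involving these 3 genes were analyzed.

In the turquoise module associated with metastatic colorectal cancer, hub genes were enriched in the inflammatory response, which is consistent with published findings. One study [13] reported that the systemic inflammatory response has prognostic value in patients with advanced colorectal cancer, particularly those with liver metastases. In addition, recruitment of inflammatory cells is a feature of the "response to wounding", and epidermal growth factor gathers these cells and participates in cancer metastasis [14]. The specific roles of inflammation and the wounding response in colorectal cancer metastasis therefore deserve further study.

APOH, also known as β2-glycoprotein I, is mainly involved in lipid metabolism and cholesterol transport. In this study, APOH expression was upregulated in metastatic compared with primary colorectal cancer and was negatively correlated with patient prognosis, suggesting that APOH may play an oncogenic role in metastatic colorectal cancer.

KNG1 plays important roles in many pathophysiological processes, including fibrinolysis and thrombosis, and also plays a role in tumorigenesis [15]. One study [16] showed that cytoplasmic accumulation of KNG1 was significantly associated with the stage and lymph node metastasis status of colorectal cancer. This evidence suggests that KNG1 may influence colorectal cancer metastasis.

FGG is a component of fibrinogen and plays important roles in blood coagulation, fibrinolysis, and cell-matrix interactions [17]. Recently, ZHANG et al. [18] found that upregulated FGG activates the epithelial-mesenchymal transition pathway by regulating the expression of Slug and ZEB1, promoting the migration and invasion of hepatocellular carcinoma cells; in addition, the FGG expression level in hepatocellular carcinoma tissue was shown to be an independent risk factor for disease-free survival after surgical resection. Although the role of FGG in colorectal cancer metastasis has not yet been studied, this evidence suggests that FGG may be a key factor in the process.

In addition, the present results showed that the high-expression groups of the hub genes were enriched in the "response to extracellular stimulus" gene set. This suggests that APOH, FGG, and KNG1 may promote colorectal cancer metastasis by affecting cell-cell or cell-matrix adhesion. The genes screened in this study provide a more detailed picture of the molecular mechanisms of colorectal cancer metastasis and may serve as potential biomarkers and therapeutic targets.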

References
[1] BRAY F, FERLAY J, SOERJOMATARAM I, et al. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries[J]. CA Cancer J Clin, 2018, 68(6): 394-424. DOI:10.3322/caac.21492
[2] VAN CUTSEM E, OLIVEIRA J, ESMO GUIDELINES WORKING GROUP. Advanced colorectal cancer: ESMO clinical recommendations for diagnosis, treatment and follow-up[J]. Ann Oncol, 2009, 20(Suppl 4): 61-63. DOI:10.1093/annonc/mdp130
[3] VALDERRAMA-TREVIÑO AI, BARRERA-MERA B, CEBALLOS-VILLALVA JC, et al. Hepatic metastasis from colorectal cancer[J]. Euroasian J Hepatogastroenterol, 2017, 7(2): 166-175. DOI:10.5005/jp-journals-10018-1241
[4] RIIHIMÄKI M, HEMMINKI A, SUNDQUIST J, et al. Patterns of metastasis in colon and rectal cancer[J]. Sci Rep, 2016, 6: 29765. DOI:10.1038/srep29765
[5] PRASANNA T, KARAPETIS CS, RODER D, et al. The survival outcome of patients with metastatic colorectal cancer based on the site of metastases and the impact of molecular markers and site of primary cancer on metastatic pattern[J]. Acta Oncol, 2018, 57(11): 1438-1444. DOI:10.1080/0284186X.2018.1487581
[6] YAEGER R, COWELL E, CHOU JF, et al. RAS mutations affect pattern of metastatic spread and increase propensity for brain metastasis in colorectal cancer[J]. Cancer, 2015, 121(8): 1195-1203. DOI:10.1002/cncr.29196
[7] NORDHOLM-CARSTENSEN A, KRARUP PM, MORTON D, et al. Mismatch repair status and synchronous metastases in colorectal cancer: a nationwide cohort study[J]. Int J Cancer, 2015, 137(9): 2139-2148. DOI:10.1002/ijc.29585
[8] LU M, ZESSIN AS, GLOVER W, et al. Activation of the mTOR pathway by oxaliplatin in the treatment of colorectal cancer liver metastasis[J]. PLoS One, 2017, 12(1): e0169439. DOI:10.1371/journal.pone.0169439
[9] LANGFELDER P, HORVATH S. WGCNA: an R package for weighted correlation network analysis[J]. BMC Bioinformatics, 2008, 9: 559. DOI:10.1186/1471-2105-9-559
[10] DONG J, HORVATH S. Understanding network concepts in modules[J]. BMC Syst Biol, 2007, 1: 24. DOI:10.1186/1752-0509-1-24
[11] SZKLARCZYK D, FRANCESCHINI A, WYDER S, et al. STRING v10: protein-protein interaction networks, integrated over the tree of life[J]. Nucleic Acids Res, 2015, 43(Database issue): D447-D452. DOI:10.1093/nar/gku1003
[12] SUBRAMANIAN A, TAMAYO P, MOOTHA VK, et al. Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles[J]. Proc Natl Acad Sci U S A, 2005, 102(43): 15545-15550. DOI:10.1073/pnas.0506580102
[13] ISHIZUKA M, KITA J, SHIMODA M, et al. Systemic inflammatory response predicts postoperative outcome in patients with liver metastases from colorectal cancer[J]. J Surg Oncol, 2009, 100(1): 38-42. DOI:10.1002/jso.21294
[14] SUNDARAM GM, ISMAIL HM, BASHIR M, et al. EGF hijacks miR-198/FSTL1 wound-healing switch and steers a two-pronged pathway toward metastasis[J]. J Exp Med, 2017, 214(10): 2889-2900. DOI:10.1084/jem.20170354
[15] YOUSEF GM, DIAMANDIS EP. The new human tissue kallikrein gene family: structure, function, and association to disease[J]. Endocr Rev, 2001, 22(2): 184-204. DOI:10.1210/edrv.22.2.0424
[16] WANG J, WANG XY, LIN SY, et al. Identification of kininogen-1 as a serum biomarker for the early detection of advanced colorectal adenoma and colorectal cancer[J]. PLoS One, 2013, 8(7): e70519. DOI:10.1371/journal.pone.0070519
[17] AHN JH, YU HK, LEE HJ, et al. Suppression of colorectal cancer liver metastasis by apolipoprotein(a) kringle V in a nude mouse model through the induction of apoptosis in tumor-associated endothelial cells[J]. PLoS One, 2014, 9(4): e93794. DOI:10.1371/journal.pone.0093794
[18] ZHANG X, WANG F, HUANG YB, et al. FGG promotes migration and invasion in hepatocellular carcinoma cells through activating epithelial to mesenchymal transition[J]. Cancer Manag Res, 2019, 11: 1653-1665. DOI:10.2147/CMAR.S188248