An Algorithm of Optimal Subset for Bayes Precipitation Probability Prediction Model

Cite this article

Hu Banghui, Liu Shanliang, Xi Yan, Wang Xuezhong, You Daming, Zhang Huijun. An Algorithm of Optimal Subset for Bayes Precipitation Probability Prediction Model[J]. Journal of Applied Meteorological Science, 2015, 26(2): 185-192.

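Hu Banghui¹, Liu Shanliang², Xi Yan³, Wang Xuezhong¹, You Daming¹, Zhang Huijun³

1. Institute of Meteorology and Oceanography, PLA University of Science and Technology, Nanjing 211101;
2. Unit No. 61741 of PLA, Beijing 100081;
3. Meteorological and Hydrological Center of Nanjing Military Area Command, Nanjing 210016

Received 2014-08-31; revised manuscript received 2015-01-04.

Funding: National Natural Science Foundation of China (41330420, 41275099)

Corresponding author: Hu Banghui, email: hubanghui@126.com.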

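Abstract (Chinese version): An optimal-subset model output statistics (MOS) forecast model removes the systematic error of the numerical model and can thereby raise its prediction skill as far as possible. To build an optimal Naïve Bayes precipitation model, T511 numerical prediction products and single-station observations from 2008 to 2011 are used to study classified Naïve Bayes precipitation probability prediction models at three stations: Jiexiu, Yuncheng and Fengning. By designing an appropriate fitness function, a computational scheme that uses a genetic algorithm to search for the optimal subset of the Naïve Bayes model is proposed, and optimal-subset models are obtained for the three stations. Results show that the optimal subsets fit markedly better than ordinary initial subsets and significantly improve the numerical model's prediction skill at single stations. The optimal-subset models improve single-station rain/no-rain and light-rain forecasts mainly by reducing the false alarm rate of the numerical model, and improve moderate-rain forecasts by slightly increasing the number of correct forecasts and reducing the number of false alarms.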

Keywords: genetic algorithm; Naïve Bayes classifier; single-station precipitation forecast; prediction skill
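For orientation, the textbook Naïve Bayes rule behind a classified probability forecast of this kind can be written as below. The notation is assumed here for illustration and is not taken from the paper: $C_k$ is a precipitation class, $S$ is the selected predictor subset, and $x_i$ is the (discretized) value of predictor $i$.

$$
P(C_k \mid \{x_i\}_{i \in S}) = \frac{P(C_k)\,\prod_{i \in S} P(x_i \mid C_k)}{\sum_{m} P(C_m)\,\prod_{i \in S} P(x_i \mid C_m)}
$$

Choosing the subset $S$ changes which conditional factors enter the product, which is why the subset search matters for the resulting class probabilities.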

Abstract: Based on numerical prediction products, a model output statistics (MOS) scheme for station precipitation forecasting is set up that includes the model output rainfall as one of its predictors. The scheme can remove the systematic error of the numerical precipitation prediction, so it improves precipitation prediction skill to a certain degree. For a given set of predictors, however, a remaining problem, especially in operational weather forecasting, is how to select the optimal subset that improves prediction skill. To construct a Naïve Bayes precipitation probability prediction model that performs best over candidate subsets, classified Naïve Bayes precipitation probability models are developed and evaluated at three stations, namely Jiexiu, Yuncheng and Fengning, using T511 model products and the precipitation observations corresponding to their 13-hour to 24-hour forecasts from 2008 to 2010. Unlike classic optimal subset regression, which enumerates subsets one by one under the couple score criterion (CSC), the proposed Naïve Bayes model uses a genetic algorithm to search for the optimal subset among a large number of subsets, and thus has the characteristics of an artificial-intelligence search. The genetic algorithm is established by constructing gene bit strings with binary encoding and introducing an appropriate fitness function. Considering the elimination of non-event samples for low-probability weather, two models are built on the genetic algorithm and the Naïve Bayes model. The essential difference between the two is the fitness function: one uses the accuracy of precipitation forecasts as its fitness function and is called genetic algorithm-Naïve Bayes forecasting model type 1 (GA-NB1); the other uses the threat score and is accordingly called GA-NB2.
The models are evaluated by prediction tests on a dataset covering July to September 2011. Results indicate that the fitted results of the optimal subsets are far superior to those of ordinary initial subsets. Both GA-NB1 and GA-NB2 improve the T511 model's accuracy for precipitation occurrence by 19%, and threat scores improve by 0.16 and 0.13 for light rain and moderate rain, respectively. The optimal-subset models enhance the prediction of precipitation occurrence and light rain because they effectively reduce the false alarm rate of the numerical model, by more than 19 times during the period. Moderate rain prediction improves for two reasons: a slight increase in the number of correct forecasts and a decrease in false alarms.

Key words: genetic algorithm; Naïve Bayes classifier; station precipitation forecast; prediction skill
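The search described in the abstract (binary-encoded gene bit strings, a fitness function, a Naïve Bayes classifier) can be illustrated with the minimal sketch below. It is not the authors' implementation: the function names, the Laplace smoothing, truncation selection, one-point crossover, elitism, and the choice to score fitness on the fitting sample itself are all assumptions made for illustration. Each bit in a chromosome marks whether the corresponding predictor is included in the subset.

```python
import random
from collections import Counter, defaultdict

def nb_fit(rows, labels, subset):
    """Fit a categorical Naive Bayes model on the selected predictor columns."""
    prior = Counter(labels)
    cond = defaultdict(Counter)   # (column, class) -> counts of predictor values
    values = defaultdict(set)     # column -> distinct values seen
    for row, y in zip(rows, labels):
        for j in subset:
            cond[(j, y)][row[j]] += 1
            values[j].add(row[j])
    return prior, cond, values

def nb_predict(model, row, subset):
    """Return the class with the largest Laplace-smoothed posterior."""
    prior, cond, values = model
    n = sum(prior.values())
    best, best_p = None, -1.0
    for c, nc in prior.items():
        p = nc / n
        for j in subset:
            p *= (cond[(j, c)][row[j]] + 1) / (nc + len(values[j]))
        if p > best_p:
            best, best_p = c, p
    return best

def accuracy(pred, obs):
    """Fraction of cases classified correctly (the GA-NB1 style of fitness)."""
    return sum(p == o for p, o in zip(pred, obs)) / len(obs)

def ga_search(rows, labels, score, n_pred, pop_size=20, generations=30,
              p_mut=0.05, seed=0):
    """Search predictor subsets encoded as bit lists; return (best_bits, fitness)."""
    rng = random.Random(seed)

    def fitness(bits):
        subset = [j for j, b in enumerate(bits) if b]
        model = nb_fit(rows, labels, subset)
        pred = [nb_predict(model, r, subset) for r in rows]
        return score(pred, labels)

    pop = [[rng.randint(0, 1) for _ in range(n_pred)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]          # truncation selection (assumed)
        children = [ranked[0][:]]                 # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_pred)        # one-point crossover (assumed)
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = children
        best = max(pop + [best], key=fitness)
    return best, fitness(best)
```

Passing a threat-score function as `score` instead of `accuracy` would mirror the GA-NB2 variant. In practice the fitness would more sensibly be computed on held-out cases; the sketch recomputes fitness repeatedly for brevity rather than caching it.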

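Columns, in order: Jiexiu Station (T511, GA-NB1, GA-NB2), Yuncheng Station (T511, GA-NB1, GA-NB2), Fengning Station (T511, GA-NB1, GA-NB2). Values are listed in that column order as they survive in the source; rows with fewer than nine values are incomplete, so their column assignment is not certain.

| Forecast category | Statistic | Values |
|---|---|---|
| Rain/no-rain | Accuracy/% | 64.4, 90.0, 57.8, 82.2, 81.1, 78.9, 90.0, 87.8 |
| Rain/no-rain | Number of correct forecasts | |
| Rain/no-rain | Number of misses | |
| Rain/no-rain | Number of false alarms | |
| Light rain | TS score | 0.20, 0.56, 0.57, 0.16, 0.25, 0.17, 0.26, 0.42, 0.36 |
| Light rain | Number of correct forecasts | |
| Light rain | Number of misses | |
| Light rain | Number of false alarms | |
| Moderate rain | TS score | 0.19, 0.45, 0.44, 0.13, 0.38, 0.36, 0.11, 0.5 |
| Moderate rain | Number of correct forecasts | |
| Moderate rain | Number of misses | |
| Moderate rain | Number of false alarms | |
| Heavy rain | TS score | 0.33, 0.50, 0.33 |
| Heavy rain | Number of correct forecasts | |
| Heavy rain | Number of misses | |
| Heavy rain | Number of false alarms | |
| Rainstorm | TS score | |
| Rainstorm | Number of correct forecasts | |
| Rainstorm | Number of misses | |
| Rainstorm | Number of false alarms | |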
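The verification statistics listed above (accuracy, TS score, and the counts of correct forecasts, misses and false alarms) are related by simple formulas. The sketch below uses the common definition TS = hits / (hits + misses + false alarms) and assumes that a forecast counts as correct for a class only when the forecast and observed classes match exactly; the paper's own counting rule may differ, and the function names are hypothetical.

```python
def contingency(pred, obs, event):
    """Count correct forecasts (hits), misses and false alarms for one class."""
    hits = sum(p == event and o == event for p, o in zip(pred, obs))
    misses = sum(p != event and o == event for p, o in zip(pred, obs))
    false_alarms = sum(p == event and o != event for p, o in zip(pred, obs))
    return hits, misses, false_alarms

def threat_score(hits, misses, false_alarms):
    """TS = hits / (hits + misses + false alarms); NaN if the class never appears."""
    total = hits + misses + false_alarms
    return hits / total if total else float("nan")

def rain_accuracy(pred_rain, obs_rain):
    """Percentage of cases in which rain/no-rain is forecast correctly."""
    correct = sum(p == o for p, o in zip(pred_rain, obs_rain))
    return 100.0 * correct / len(obs_rain)
```

Because correct rejections do not enter the TS, reducing false alarms raises the score even when the number of correct forecasts is unchanged, which matches the mechanism the abstract gives for the improvement over T511.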



Fig. 1. Fitness curves of GA-NB1 and GA-NB2 during fitting of the precipitation occurrence (rain/no-rain) forecast at Jiexiu Station


Fig. 2. Observed precipitation classes and the 13-24-hour classified precipitation forecasts of GA-NB1 and T511 at Jiexiu Station from July to September 2011