Application of KNN to Wind Forecast Based on Clustering Synoptic Patterns

Cite this article

Chen Yuying, Liu Huanzhu, Chen Nan, Zeng Xiaoqing, Ma Jinren, Liu Qianqian, Ma Shaiyan. Application of KNN to Wind Forecast Based on Clustering Synoptic Patterns[J]. Journal of Applied Meteorological Science, 2008, 19(5): 564-572.


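Chen Yuying¹, Liu Huanzhu², Chen Nan¹, Zeng Xiaoqing³, Ma Jinren¹, Liu Qianqian¹, Ma Shaiyan¹

1. Key Laboratory for Meteorological Disaster Prevention and Reduction of Ningxia, Yinchuan 750002;
2. National Meteorological Center, Beijing 100081;
3. College of Atmospheric Sciences, Lanzhou University, Lanzhou 730000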

Received 2007-09-11; revised manuscript received 2008-07-28.

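Funding: supported by the China Meteorological Administration track construction project "Refined Meteorological Element Forecast Operational System (Phase I)".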

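Abstract: Building on pattern recognition and analogue forecasting, a K-nearest-neighbor nonparametric estimation simulation model (KNN) is constructed that combines clustering-based synoptic typing by a self-organizing neural network (SOM) with cross-validation. The model first uses the SOM to cluster the upper-air flow and geopotential height fields over Northwest China into synoptic types. Then, for the historical samples under each synoptic situation, cross-validation is used to find the optimal K combination for each type. To test how clustering-based synoptic typing affects the KNN method, T213 numerical forecast products and Ningxia daily maximum wind speed data for the winter half-years of 2003 to 2006 were used to build KNN forecast models of daily maximum wind speed ≥6 m/s in the Ningxia winter half-year, both with and without synoptic typing, and forecast experiments were run for January to May 2007. The forecast evaluation shows that the model with synoptic typing lowers the false alarm rate and raises forecast accuracy overall, with larger gains for certain synoptic types, which opens up an approach for classified analogue forecasting.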

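Keywords: self-organizing neural network; clustering-based synoptic typing; cross-validation; K-nearest neighbor; daily maximum wind speed forecast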
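The SOM typing step described in the abstract can be illustrated with a minimal sketch. It assumes each day's field is flattened into one standardized row vector and uses a one-dimensional map with as many nodes as synoptic types. The function names `train_som` and `assign_type`, the learning-rate and neighborhood schedules, and all default values are illustrative assumptions, not the configuration used in the paper.

```python
import numpy as np

def train_som(fields, n_types=4, n_iter=200, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1-D SOM; each node's weight vector stands for one synoptic type.

    fields: (n_days, n_gridpoints) array, one flattened, standardized field per row.
    n_types, n_iter, lr0 and sigma0 are illustrative defaults (assumptions).
    """
    rng = np.random.default_rng(seed)
    fields = np.asarray(fields, dtype=float)
    # Initialize node weights from randomly chosen training samples.
    weights = fields[rng.choice(len(fields), n_types, replace=False)].copy()
    node_pos = np.arange(n_types)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.1)  # neighborhood width shrinks
        for x in fields[rng.permutation(len(fields))]:
            # Best-matching node: smallest squared Euclidean distance to x.
            winner = np.argmin(np.sum((weights - x) ** 2, axis=1))
            # Gaussian neighborhood along the 1-D node line around the winner.
            h = np.exp(-((node_pos - winner) ** 2) / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
    return weights

def assign_type(weights, fields):
    """Return the index of the best-matching node (synoptic type) for each day."""
    fields = np.asarray(fields, dtype=float)
    d = ((fields[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)
```

In this sketch a new day is typed by calling `assign_type` with the trained weights, which is the step a real-time forecast would need before selecting the KNN model for that type.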

Application of KNN to Wind Forecast Based on Clustering Synoptic Patterns

Chen Yuying¹, Liu Huanzhu², Chen Nan¹, Zeng Xiaoqing³, Ma Jinren¹, Liu Qianqian¹, Ma Shaiyan¹

1. Key Laboratory for Meteorological Disaster Prevention and Reduction of Ningxia, Yinchuan 750002;
2. National Meteorological Center, Beijing 100081;
3. College of Atmospheric Sciences, Lanzhou University, Lanzhou 730000

Abstract: Building on pattern recognition and analogue forecasting, a new approach is constructed that combines a Self-Organizing feature Map (SOM) with cross validation; it is called the K-nearest neighbor nonparametric estimation simulation model (KNN). The 500 hPa geopotential height and 700 hPa u, v wind fields over Northwest China are first clustered by the SOM, and the optimal K combination is then sought by cross validation on the historical samples of each synoptic pattern. A forecast discriminant value for each synoptic pattern is determined from the K values and the historical record. In real-time forecasting, the synoptic pattern of the current case is identified first; the K values for the different forecast times are then used to find the nearest neighbors of the real-time predictors among the historical predictors. Finally, the forecast is obtained by applying the forecast discriminant value as the criterion. To test the effect of clustered synoptic patterns on KNN, T213 NWP products and daily maximum wind speed data in Ningxia for the winter half-years of 2003 to 2006 are used to build prediction models of daily maximum wind speed ≥6 m/s in Ningxia, both with and without synoptic patterning, and data from January to May 2007 are used for forecast experiments. The forecast evaluation shows that, although the probability of the original samples is reduced when the SOM is added to KNN, more false alarms are avoided, so forecast skill improves in general, and for some synoptic patterns it improves markedly compared with the model without patterning. The results indicate that the improved KNN can capture forecast information on strong winds in Ningxia. It is worth pointing out that the number of samples in each synoptic pattern is reduced after patterning, so the forecast is affected to some extent. The method works well for meteorological observing stations with more original samples, but not for those with fewer. Therefore, with more historical data covering a wider range of synoptic system changes, forecast accuracy should improve significantly, which is of great value for operational use. These results open up an approach for classified analogue forecasting.

Key words: Self-Organizing feature Map; clustering synoptic patterns; cross validation; K-nearest neighbor; daily maximum wind speed forecast
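The per-pattern KNN step in the abstract can be sketched as follows. This is only one plausible reading: the abstract does not define the "optimal K combination" or the forecast discriminant value, so the sketch assumes a single K per pattern, a threshold on the fraction of neighbors with the event, leave-one-out cross validation, and selection by threat score. All function names, the grids `k_grid` and `thr_grid`, and the Euclidean distance are hypothetical choices.

```python
import numpy as np

def threat_score(obs, fcst):
    """TS = hits / (hits + misses + false alarms) for boolean arrays."""
    hits = np.sum(fcst & obs)
    misses = np.sum(~fcst & obs)
    false_alarms = np.sum(fcst & ~obs)
    denom = hits + misses + false_alarms
    return float(hits / denom) if denom else 0.0

def loo_event_fraction(X, y, k):
    """Leave-one-out: fraction of each sample's k nearest other samples with the event."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)            # a sample is never its own neighbor
    nn = np.argsort(d, axis=1)[:, :k]
    return y[nn].mean(axis=1)

def fit_per_type(X, y, types, k_grid=(1, 3, 5, 7), thr_grid=(0.3, 0.5, 0.7)):
    """For each synoptic type, pick (k, threshold) with the best leave-one-out TS.

    k_grid and thr_grid are assumed search grids, not values from the paper.
    Types with too few samples for any k in k_grid are skipped.
    """
    models = {}
    for t in np.unique(types):
        Xt, yt = X[types == t], y[types == t].astype(bool)
        best = (-1.0, None, None)
        for k in k_grid:
            if k >= len(Xt):
                continue
            p = loo_event_fraction(Xt, yt, k)
            for thr in thr_grid:
                ts = threat_score(yt, p >= thr)
                if ts > best[0]:
                    best = (ts, k, thr)
        if best[1] is not None:
            models[int(t)] = {"X": Xt, "y": yt, "k": best[1],
                              "thr": best[2], "ts": best[0]}
    return models

def predict(models, x, t):
    """Forecast the event for predictor vector x under synoptic type t."""
    m = models[t]
    d = np.linalg.norm(m["X"] - x, axis=1)
    nn = np.argsort(d)[: m["k"]]
    return bool(m["y"][nn].mean() >= m["thr"])
```

Fitting one model per type means each search uses only that type's samples, which is consistent with the abstract's remark that stations with fewer original samples benefit less.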

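[1]	Chen Yuying, Chen Xiaoguang, Ma Jinren, et al. Study on a refined MOS forecast method for wind. Scientia Meteorologica Sinica, 2006, 26, (2): 210–216.
[2]	Liu Huanzhu, Zhao Shengrong, Zhao Cuiguang, et al. Objective forecasting of meteorological elements at the National Meteorological Center: the MOS system. Journal of Applied Meteorological Science, 2004, 1, (2): 181–191.
[3]	Yang Zhong'en, Chen Shuqin, Huang Hui. Causes and forecasting of disastrous gales over the Zhoushan Islands in the winter half-year. Journal of Applied Meteorological Science, 2007, 18, (2): 80–85.
[4]	Lin Liangxun, Cheng Zhengquan, Zhang Bing, et al. Application of the perfect prognostic method to operational forecasting of strong sea-surface winds off Guangdong in the winter half-year. Journal of Applied Meteorological Science, 2004, 15, (4): 485–490.
[5]	Hu Bo, Du Huiliang. Forecasting the daily extreme wind over the coastal sea surface of Zhejiang Province. Marine Forecasts, 2006, 23, (Suppl.): 64–67.
[6]	Cover T M, Hart P E. Nearest neighbor pattern classification. IEEE Trans on Inf Theory, 1967, (IT-13): 21–27.
[7]	Zhai Yumei, Zhao Ruixing. A K-nearest-neighbor nonparametric estimation simulation model for probabilistic weather forecasting. Journal of System Simulation, 2005, 17, (4): 786–788.
[8]	Shao Mingxuan, Liu Huanzhu, Dou Yiwen. Study on wind forecasting with nonparametric estimation techniques. Journal of Applied Meteorological Science, 2006, 17, (Suppl.): 125–129.
[9]	Zeng Xiaoqing, Shao Mingxuan, Wang Shigong, et al. An experiment with the KNN method based on cross-validation in precipitation forecasting. Journal of Applied Meteorological Science, 2008, 19, (4): 471–478.
[10]	Kohonen T. Self-organizing Maps. Berlin: Springer-Verlag, 1998: 1–6.
[11]	Xu Wenjie, Liu Xiyu. Research on clustering algorithms based on unsupervised neural networks. Information Technology and Informatization, 2006, (6): 85–88.
[12]	Sun Shixia, Yang Jianchi, Qiu Xiaogang, et al. A credibility evaluation method for LSCS simulation based on the BP network. Journal of System Simulation, 2006, 18, (7): 2037–2041.
[13]	Wang Qing, Zhu Shihu, Dong Chaoyang. A self-learning intelligent decision support system. Journal of System Simulation, 2006, 18, (4): 924–926.
[14]	Xia Wenwen, Wang Shitong. A robust double self-organizing feature map network based on Voronoi distance. Journal of Computer Applications, 2007, 27, (5): 1109–1112.
[15]	Liu Huanzhu, Hao Wei, Lin Kongyuan, et al. Multi-model integrated meteorological forecasting based on intelligent computing // Liu Huanzhu, Tang Guisheng. Practical Methods for Forecasting Rainstorm Areas. Beijing: China Meteorological Press, 2000: 30–37.
[16]	Huang Zhuo, Yang Hongmin, Hao Wei, et al. Integrated analogue forecasting based on intelligent clustering // Liu Huanzhu, Tang Guisheng. Practical Methods for Forecasting Rainstorm Areas. Beijing: China Meteorological Press, 2000: 53–59.
[17]	Liao Muxing. Technical research report on sea-surface wind field forecasting. Journal of Qingdao Ocean Shipping Mariners College, 2003, 24, (2): 6–10.
[18]	Yan Mei, Fan Baodong, Man Ke, et al. Objective analogue forecasting of gales over the Yellow Sea and Bohai Sea. Meteorological Science and Technology, 2004, 32, (6): 467–470.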


Fig 1. Structure of the Kohonen Self-Organizing feature Map (SOM) neural network^[8]


Fig 2. Four synoptic patterns from the SOM cluster analysis (bold black lines are 700 hPa geopotential height contours, unit: gpm; thin black lines with arrows are streamlines composed from the 700 hPa u, v wind field)


Fig 3. TS score (a), false alarm rate (b), and probability of detection (c) of 24-hour forecasts of daily maximum wind speed ≥6 m/s at stations in Ningxia, January to May 2007


Fig 4. TS score (a), false alarm rate (b), and probability of detection (c) of 48-hour forecasts of daily maximum wind speed ≥6 m/s at stations in Ningxia, January to May 2007
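The three scores shown in Figs 3 and 4 can be computed from the counts of a 2×2 contingency table. The sketch below uses the commonly used definitions of these scores; the text here does not give the paper's exact formulas, so treating them as identical is an assumption, and the function name `verification_scores` is hypothetical.

```python
def verification_scores(hits, false_alarms, misses):
    """Return TS, false alarm rate and probability of detection.

    hits: event forecast and observed
    false_alarms: event forecast but not observed
    misses: event observed but not forecast
    A score is NaN when its denominator is zero.
    """
    nan = float("nan")
    ts_den = hits + false_alarms + misses
    fcst_yes = hits + false_alarms
    obs_yes = hits + misses
    return {
        "TS": hits / ts_den if ts_den else nan,
        "FAR": false_alarms / fcst_yes if fcst_yes else nan,
        "POD": hits / obs_yes if obs_yes else nan,
    }

# Example with made-up counts: 6 hits, 2 false alarms, 2 misses.
scores = verification_scores(6, 2, 2)
```

With these made-up counts, TS is 6/10, the false alarm rate is 2/8, and the probability of detection is 6/8.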