J. Meteor. Res. 2019, Vol. 33 Issue (5): 989-992
http://dx.doi.org/10.1007/s13351-019-9601-0
The Chinese Meteorological Society

#### Article Information

JI, Lei, Zaiwen WANG, Min CHEN, et al., 2019.
How Much Can AI Techniques Improve Surface Air Temperature Forecast?—A Report from AI Challenger 2018 Global Weather Forecast Contest. 2019.
J. Meteor. Res., 33(5): 989-992
http://dx.doi.org/10.1007/s13351-019-9601-0

### Article History

in final form August 16, 2019
How Much Can AI Techniques Improve Surface Air Temperature Forecast?—A Report from AI Challenger 2018 Global Weather Forecast Contest
Lei JI1, Zaiwen WANG1, Min CHEN1, Shuiyong FAN1, Yingchun WANG1,2, Zhiyuan SHEN3
1. Institute of Urban Meteorology, Beijing 100089;
2. Beijing Meteorological Service, Beijing 100089;
3. Sinovation Ventures AI Institute, Beijing 100080
ABSTRACT: In August 2018, the Institute of Urban Meteorology (IUM) in Beijing co-organized with Sinovation Ventures a Weather Forecasting Contest (WFC), one of the AI (artificial intelligence) Challenger Global Contests. The WFC aims to take advantage of AI techniques to improve the quality of weather forecasts. More than 1000 teams from around the world enrolled in the WFC and about 250 teams completed real-time weather forecasts, of which the top 5 teams received awards in the final contest. The contest results show that the AI-based ensemble models exhibited improved skill for forecasts of surface air temperature and relative humidity at 2 m and wind speed at 10 m height. Compared to the IUM operational analog ensemble weather model forecast, the most notable improvements in forecast accuracy for surface 2-m air temperature, 24.2% and 17.0%, were achieved by two teams using AI techniques such as the time series model, gradient boosting tree, and depth probability prediction. Meanwhile, it is found that reasonable data processing techniques and model composite structure are also important for obtaining better forecasts.
Key words: artificial intelligence (AI)     analog ensemble weather forecast     surface meteorological elements     AI model
1 Application of artificial intelligence in meteorology

The term “artificial intelligence” (AI) was first coined at a conference held at Dartmouth College in the US in 1956. AI is an interdisciplinary field spanning the natural sciences, social sciences, and technical sciences. In 2006, Geoffrey Hinton, a leading authority in machine learning, and his colleagues proposed the concept of deep learning, which greatly improved the capability of neural networks and set off a new wave of AI research (Hinton et al., 2006). In recent years, AI techniques have developed rapidly and have been widely applied in many fields. How to use AI, together with big data and machine learning techniques, to build meteorological forecast models and improve forecast accuracy is an issue of common concern.

Around the world, meteorological agencies in many countries collaborate with research institutes and companies to explore how AI can be applied in meteorology. AccuWeather in the US has been collaborating with Google to produce minute-by-minute, hourly, and daily forecasts at 0–90-day lead times using cloud computing and AI techniques (https://qz.com/535345/ibm-is-going-to-change-how-we-forecast-the-weather-with-watson/). The UK Met Office is collaborating with Amazon to develop data storage and cloud computing techniques and with Microsoft to develop AI techniques (http://www.odbms.org/2017/07/machine-learning-in-weather-forecasting/). EarthRisk Technologies proposed the TempRisk Apollo method for more reliable probability forecasts of temperature, based on ECMWF numerical products and an ensemble framework of multiple AI models (EarthRisk Technologies, 2013).

In collaboration with Tsinghua University, the China Meteorological Administration (CMA) applied a distributed deep learning framework and a spatiotemporal convolutional and recurrent neural network to radar echo extrapolation forecasting (Wang et al., 2018), improving the accuracy by about 40% compared with the cross-correlation method (Bi, 2017). Meanwhile, AI techniques have also been applied to forecasts of severe convective weather such as hail and thunderstorms (Zhou et al., 2019). In Beijing, machine learning methods are employed in forecasts of surface air temperature (Dai et al., 2019), thunderstorms (Yang et al., 2018), and strong convective weather events (Guo et al., 2019). In addition, the Shenzhen Bureau of Meteorology has collaborated with Ali Platform to conduct precipitation nowcasting based on radar observations (Yao and Li, 2017).

In other countries, AI has also been preliminarily applied to operational weather forecast such as nowcasting of thunderstorms (Lagerquist et al., 2018) and precipitation intensity (Mattioli et al., 2018), and warnings of extreme weather events such as blizzards (Burrows and Mooney, 2018) and low visibility (Kneringer et al., 2018). Meanwhile, AI also plays a role in analysis of weather and climate (Collins et al., 2018; Kunkel et al., 2018).

The above results indicate that AI methods based on multi-source observations and global numerical forecast products have had positive effects on the forecast accuracy of meteorological variables on various spatiotemporal scales. In recent years, following advances in high-performance computing and network techniques, high-resolution regional numerical weather forecasts and meteorological monitoring networks with high spatiotemporal resolution have gradually improved. How to take full advantage of these developments to achieve fine-scale weather forecasts in real-time operational mode has become a pressing issue. As an open-source platform for AI talent worldwide, the AI Challenger Global Contest (https://challenger.ai/news/ai_challenger), launched by Sinovation Ventures (http://www.chuangxin.com/) in 2017, provides an opportunity to address this issue.

2 The AI Challenger 2018 Global Contest

The second AI Challenger Global Contest kicked off in Beijing on 29 August 2018. It comprised 10 contests, including “weather forecasting,” “visual perception for autonomous driving,” and “crop disease detection,” and aimed to solve the most challenging issues in these fields by applying AI techniques to real-world problems. As a co-organizer of the contest, the Institute of Urban Meteorology (IUM) planned and hosted the Weather Forecasting Contest (WFC; https://challenger.ai/competition/wf2018), based on the actual needs of operational weather forecasting. The objective of this competition is to explore new interdisciplinary ideas and methods for refined operational weather forecasting. More than 1000 teams enrolled in the WFC, representing Tsinghua University, Peking University, Stanford University, and other universities, as well as some renowned technology companies, from 19 countries including China, the U.S., Japan, and Russia. Eventually, about 250 teams went through the two-week-long interim contest and completed the week-long final contest.

3 The Weather Forecasting Contest

3.1 Settings of the contest

“Weather Forecasting” is one of the AI Challenger 2018 Global Experimental Contests. This contest, namely the WFC, requires players to establish an effective AI model based on the observation data and the RMAPS (the Rapid-refresh Multi-scale Analysis and Prediction System; Fan et al., 2013) model data provided by IUM, and produce in real time 36-h forecasts of 2-m temperature (t2), 2-m relative humidity (rh2), and 10-m wind speed (w10).

The observation and RMAPS data provide hourly time series of meteorological variables at 10 surface automatic weather stations in Beijing from 1 March 2015 to 3 November 2018. During the contest, the latitude, longitude, and station IDs of these stations were hidden from the participants. The observational data contain 9 elements, such as 2-m air temperature, 2-m humidity, 10-m wind, and surface pressure at the 10 stations, while the RMAPS model data contain 29 elements, such as temperature, humidity, wind, and pressure at the surface and at various pressure levels. For the players to train and tune their AI models, 1188 days of training data and 89 days of verification data (both historical) were provided. Moreover, real-time meteorological data from the previous day were fed into the AI models during the two-week interim contest and the one-week final contest. All the data provided to the AI teams amounted to only about 1% of the total meteorological data available for operational weather forecasting. Based on this amount of data, the AI models were run to generate real-time 36-h forecasts of t2, rh2, and w10 at the 10 stations in Beijing.

Because operational weather forecasting has extremely strict timeliness requirements, the contest adopted real-time forecasting. IUM issued daily data updates, and all teams were required to submit their forecasts within 6 h of each update. Compared with other contests, the WFC was especially challenging: the data could not be prepared in advance, the forecasts had to be made ad hoc in real time, and the results had to be submitted within 6 h and were verified over multiple time periods.

3.2 The WFC final contest results

Five teams were selected from about 250 teams to enter the final contest of WFC. The forecasts made by teams AI01–AI05 were compared with the RMAPS analog ensemble forecast (AnEn), which has incorporated the statistical similarity, big data mining, and ensemble forecasting techniques. The AnEn forecast has been verified as the best available product of the IUM operational forecast system (Wang et al., 2019). The pros and cons of the AI methods used by these teams are then evaluated against AnEn. The criteria for evaluation are as follows. The AnEn is used as the benchmark and the root mean square error ($\mathrm{RMSE}_{\mathrm{AnEn}}$) between this forecast and observations is calculated first, and the root mean square errors ($\mathrm{RMSE}_{\mathrm{Team}}$) between the forecasts of individual teams (AI01–AI05) and observations are then calculated. Finally, the percentage change of RMSE of each team relative to $\mathrm{RMSE}_{\mathrm{AnEn}}$ is calculated by

 $\mathrm{RMSE}_P = \dfrac{\mathrm{RMSE}_{\mathrm{AnEn}} - \mathrm{RMSE}_{\mathrm{Team}}}{\mathrm{RMSE}_{\mathrm{AnEn}}} \times 100\%.$

This formula shows that a larger $\mathrm{RMSE}_P$ corresponds to a better forecast relative to the IUM AnEn. Because AnEn is the best available operational forecast, a team with a positive $\mathrm{RMSE}_P$ outperforms the strongest existing benchmark.
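The evaluation criterion can be sketched in a few lines of Python. This is only an illustration of the formula above, not the contest's scoring code: the function names `rmse` and `rmse_p` and the sample temperature values are hypothetical and were chosen so that the arithmetic is easy to follow.

```python
import math

def rmse(forecast, observed):
    """Root mean square error between paired forecast and observed values."""
    if len(forecast) != len(observed) or not forecast:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))

def rmse_p(rmse_anen, rmse_team):
    """Percentage change of a team's RMSE relative to the AnEn benchmark.

    Positive values mean the team's error is smaller than AnEn's.
    """
    return (rmse_anen - rmse_team) / rmse_anen * 100.0

# Hypothetical 2-m temperature values (deg C); these are not contest data.
observed = [10.0, 12.0, 14.0, 13.0]
anen = [11.0, 13.0, 13.0, 12.0]  # every error is 1.0, so RMSE = 1.0
team = [10.5, 12.5, 13.5, 12.5]  # every error is 0.5, so RMSE = 0.5

score = rmse_p(rmse(anen, observed), rmse(team, observed))
print(f"RMSE_P = {score:.1f}%")  # prints "RMSE_P = 50.0%"
```

In this toy case the team halves the benchmark error, so its score is 50%. A team whose error exceeds that of AnEn would receive a negative score.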

The overall results of daily 36-h forecasts in the final contest are shown in Fig. 1. Most of the five teams employed an AI model ensemble approach, but they differed in specific model selection and data processing techniques. The major AI models used include the time series model (Prophet), gradient boosting tree (GBT), depth probability prediction (Seq2Seq), bidirectional long short-term memory neural network (Bi-LSTM), recurrent neural network (RNN), and artificial neural network (ANN). Since the RMAPS provides a 36-h forecast on a daily basis, the forecasts issued on two consecutive days overlap by 12 h. Some teams used the averages over the overlapping period as the final results, and applied sine/cosine encoding to the temporal features. Missing values were filled by linear interpolation or multi-station averaging, or excluded from the final results.
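The data processing steps mentioned above can be illustrated with a minimal Python sketch. The source does not describe how the teams implemented them, so everything here is an assumption made for illustration: the function names, the choice of hour of day with a 24-h period as the encoded feature, the use of `None` for missing values, and the alignment of the overlap as the last 12 h of one run against the first 12 h of the next.

```python
import math

def encode_hour(hour, period=24):
    """Map an hour of day onto the unit circle so 23:00 and 00:00 are neighbours."""
    angle = 2.0 * math.pi * (hour % period) / period
    return math.sin(angle), math.cos(angle)

def fill_linear(series):
    """Fill interior None gaps by linear interpolation between known neighbours.

    Leading or trailing gaps have a neighbour on one side only and stay None.
    """
    filled = list(series)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            weight = (i - left) / span
            filled[i] = filled[left] + weight * (filled[right] - filled[left])
    return filled

def average_overlap(previous_run, current_run, overlap=12, run_length=36):
    """Average the last `overlap` hours of one run with the first `overlap` of the next.

    Assumes hourly values and runs issued 24 h apart (an assumption, see lead-in).
    """
    tail = previous_run[run_length - overlap:run_length]
    head = current_run[:overlap]
    return [(a + b) / 2.0 for a, b in zip(tail, head)]

# Hypothetical usage with made-up values.
print(encode_hour(6))
print(fill_linear([1.0, None, None, 4.0, None]))
print(average_overlap([0.0] * 24 + [10.0] * 12, [20.0] * 36))
```

The sine/cosine pair keeps late-evening and early-morning hours close together in feature space, which a raw hour index from 0 to 23 does not. Leaving edge gaps unfilled is a deliberate choice in this sketch: extrapolating beyond the last known value would invent data.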

 Figure 1 Percentage change of RMSE ($\mathrm{RMSE}_P$) relative to $\mathrm{RMSE}_{\mathrm{AnEn}}$ for (a) 2-m temperature (t2), (b) 2-m relative humidity (rh2), and (c) 10-m wind speed (w10) from the forecasts made by the five teams (AI01–AI05), averaged over 10 surface weather stations in Beijing during the week-long final contest from 28 October to 3 November 2018.

Overall, the forecast skills of AI01 (from Zhejiang University) and AI02 (from Southwest Jiaotong University) are better than that of AnEn, and their daily forecasts are stable. AI03 is slightly less skillful, while AI04 and AI05 are in general worse than AnEn (Table 1). These results indicate that a reasonable construction of the AI model ensemble framework is critical for forecasting t2, rh2, and w10. This could be a valuable clue for applying AI techniques in operational weather forecasting. Taking the t2 forecast as an example, the forecasts by AI01, AI02, and AI03 are all better than that of AnEn. In particular, AI01 and AI02 show the greatest advantage, improving the forecast accuracy by 24.2% and 17.0%, respectively.
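One way to read these percentages is to rearrange the definition of the RMSE percentage change so that it gives the ratio of a team's error to the benchmark error. This ratio is a derived quantity, not one reported by the contest. The worked values use only the weekly mean t2 scores of AI01 (24.2) and AI05 (−29.9) from Table 1.

 $\dfrac{\mathrm{RMSE}_{\mathrm{Team}}}{\mathrm{RMSE}_{\mathrm{AnEn}}} = 1 - \dfrac{\mathrm{RMSE}_P}{100}$

 $\text{AI01: } 1 - 0.242 = 0.758, \qquad \text{AI05: } 1 - (-0.299) = 1.299$

In other words, the t2 error of AI01 is about 76% of the AnEn error, while that of AI05 is about 130% of it.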

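Table 1 Weekly mean $\mathrm{RMSE}_P$ values (%) for t2, rh2, and w10 from forecasts by AI01–AI05

| $\mathrm{RMSE}_P$ | AI01 | AI02 | AI03 | AI04 | AI05 |
|---|---|---|---|---|---|
| t2 | 24.2 | 17.0 | 7.0 | −8.3 | −29.9 |
| rh2 | 12.4 | 9.7 | −6.1 | −13.2 | −24.6 |
| w10 | 6.2 | −3.3 | −4.0 | −4.4 | −6.3 |

4 Future development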

The WFC in 2018 attracted more than 1000 teams and players from higher-education institutions and high-tech companies in many countries. With only a small amount of meteorological data provided for training and tuning, some AI models established during this contest demonstrated high capability for real-time forecasts of t2, rh2, and w10 by using advanced AI algorithms, reasonable model composites, and proper data processing techniques. In particular, teams AI01 and AI02 improved t2 forecast accuracy by 24.2% and 17.0%, respectively, compared with AnEn, the best available forecast.

Some shortcomings remain. For example, precipitation forecasting was not included in the contest. Nonetheless, the results still indicate that AI techniques have great potential in the meteorological field. In the future, IUM will further explore the capability of AI techniques in meteorological applications and improve fine-scale weather forecasts, especially forecasts of weather events that have great societal impacts.

AI techniques can be applied in various fields that rely on big data. At present, in addition to the wide application of AI in medicine, transportation, education, and other fields, huge opportunities lie ahead for AI to be applied more deeply and widely in meteorology, hydrology, and geology.

Acknowledgments. We thank Mr. Kaifu Li and Mr. Yonggang Wang, the CEO and CTO of Sinovation Ventures, respectively, and Mr. Zhuohao Wu and Ms. Jing Dong, as well as all their team members who participated in the WFC, for their great support in making the WFC a success. We also thank all contestants around the world who enthusiastically dedicated their wisdom to the WFC.

References
Bi, B. G., 2017: Progresses and thoughts on weather forecasting using artificial intelligence technology. Proc. National Conference of Weather Forecast Center Directors, Yinchuan, China, 12 October 2017. (in Chinese)

Burrows, W. R., and C. J. Mooney, 2018: Automated products for forecasting arctic blizzard conditions. J36.4 in Proc. Annual Meeting of the Amer. Meteor. Soc., Austin, Texas, 6–11 January 2018. Available at http://ams.confex.com/ams/98Annual/webprogram/Paper336043.html. Accessed on 16 August 2019.

Collins, W., M. Prabhat, E. Racah, et al., 2018: Deep learning for detecting extreme weather and climate patterns. TJ7.1 in Proc. Annual Meeting of the Amer. Meteor. Soc., Austin, Texas, 6–11 January 2018. Available at http://ams.confex.com/ams/98Annual/webprogram/Paper328029.html. Accessed on 16 August 2019.

Dai, Y., N. He, Z. Y. Fu, et al., 2019: Beijing intelligent grid temperature objective prediction method (BJTM) and verification of forecast result. J. Arid Meteor., 37, 339–344. DOI:10.11755/j.isssn.1006-7639(2019)-02-0339

EarthRisk Technologies, 2013: TempRisk Apollo White Paper. Available at http://www.earthrisktech.com/resources/reports/white_papers/TempRiskApollo_WhitePaper_Oct2013.pdf. Accessed on 16 August 2019.

Fan, S. Y., H. L. Wang, M. Chen, et al., 2013: Study of the data assimilation of radar reflectivity with the WRF 3D-Var. Acta Meteor. Sinica, 71, 527–537. DOI:10.11676/qxxb2013.032

Guo, H. Y., M. X. Chen, L. Han, et al., 2019: High resolution nowcasting experiment of severe convection based on deep learning. Acta Meteor. Sinica, 77, 715–727. DOI:10.11676/qxxb2019.036

Hinton, G. E., S. Osindero, and Y. Teh, 2006: A fast learning algorithm for deep belief nets. Neural Computation, 18, 1527–1554. DOI:10.1162/neco.2006.18.7.1527

Kneringer, P., S. J. Dietz, G. J. Mayr, et al., 2018: An ordered hurdle model for probabilistic low-visibility nowcasting to support decisions at airports. J36.6 in Proc. Annual Meeting of the Amer. Meteor. Soc., Austin, Texas, 6–11 January 2018. Available at http://ams.confex.com/ams/98Annual/webprogram/Paper325064.html. Accessed on 16 August 2019.

Kunkel, K. E., J. C. Biard, and E. Racah, 2018: Automated detection of fronts using a deep learning algorithm. TJ7.4 in Proc. Annual Meeting of the Amer. Meteor. Soc., Austin, Texas, 6–11 January 2018. Available at http://ams.confex.com/ams/98Annual/webprogram/Paper333480.html. Accessed on 16 August 2019.

Lagerquist, R., A. McGovern, M. B. Richman, et al., 2018: Using machine learning to forecast severe thunderstorm winds on a CONUS-Wide grid. 3.1 in Proc. Annual Meeting of the Amer. Meteor. Soc., Austin, Texas, 6–11 January 2018. Available at http://ams.confex.com/ams/98Annual/webprogram/Paper335039.html. Accessed on 16 August 2019.

Mattioli, C. J., M. S. Veillette, and H. Iskenderian, 2018: Dual application of convolutional neural networks: Forecasts of radar precipitation intensity and offshore radar-like mosaics. 695 in Proc. Annual Meeting of the Amer. Meteor. Soc., Austin, Texas, 6–11 January 2018. Available at http://ams.confex.com/ams/98Annual/webprogram/Paper323735.html. Accessed on 16 August 2019.

Wang, Y., M. Long, J. Wang, et al., 2018: PredRNN: Recurrent neural networks for predictive learning using spatiotemporal LSTMs. Proc. 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017. Available at http://papers.nips.cc/paper/6689-predrnn-recurrent-neural-networks-for-predictive-learning-using-spatiotemporal-lstms. Accessed on 16 August 2019.

Wang, Z. W., M. Chen, L. D. Monache, et al., 2019: Application of analog ensemble method to surface temperature and wind speed prediction in Beijing area. Acta Meteor. Sinica, 77, 865–884. DOI:10.11676/qxxb2019.044

Yang, L., F. Han, M. X. Chen, et al., 2018: Thunderstorm gale identification method based on support vector machine. J. Appl. Meteor. Sci., 29, 680–689. DOI:10.11898/1001-7313.20180604

Yao, Y. C., and Z. J. Li, 2017: Short-term precipitation forecasting based on radar reflectivity images. Proc. International Conference on Information and Knowledge Management, Singapore, 6–10 November 2017. Available at https://github.com/yaoyichen/CIKM-Cup-2017/blob/master/CIKM_AnalytiCup_2017_Team_Marmot.pdf. Accessed on 3 August 2019.

Zhou, K. H., Y. G. Zheng, B. Li, et al., 2019: Forecasting different types of convective weather: A deep learning approach. J. Meteor. Res., 33, 797–809. DOI:10.1007/s13351-019-8162-6