Volume 6 Number 2 (Feb. 2011)
JCP 2011 Vol.6(2): 162-171 ISSN: 1796-203X
doi: 10.4304/jcp.6.2.162-171

The Effects of Imputing Missing Data on Ensemble Temperature Forecasts

Tyler C. McCandless1, Sue Ellen Haupt2, George S. Young3
1The Pennsylvania State University/Applied Research Laboratory & Meteorology Department State College, PA, USA
2The Pennsylvania State University/Applied Research Laboratory, State College, PA, USA
 Current Address: National Center for Atmospheric Research, Boulder, CO, USA
3The Pennsylvania State University Meteorology Department, State College, PA, USA

Abstract—A major issue for developing post-processing methods for NWP forecasting systems is the need to obtain complete training datasets. Without a complete dataset, it can become difficult, if not impossible, to train and verify statistical post-processing techniques, including ensemble consensus forecasting schemes. In addition, when ensemble forecast data are missing, the real-time use of the consensus forecast weighting scheme becomes difficult and the quality of uncertainty information derived from the ensemble is reduced. To ameliorate these problems, an analysis of the treatment of missing data in ensemble model temperature forecasts is performed to determine which method of replacing the missing data produces the lowest Mean Absolute Error (MAE) of consensus forecasts while preserving the ensemble calibration. This study explores several methods of replacing missing data, including ones based on persistence, a Fourier fit to capture seasonal variability, ensemble member mean substitution, three-day mean deviation, and an Artificial Neural Network (ANN). The analysis is performed on 48-hour temperature forecasts for ten locations in the Pacific Northwest. The methods are evaluated according to their effect on the forecast performance of two ensemble post-processing forecasting methods, specifically an equal-weight consensus forecast and a ten-day performance-weighted window. The methods are also assessed using rank histograms to determine whether they preserve the calibration of the ensembles. For both post-processing techniques, all imputation methods, with the exception of the ensemble mean substitution, produce mean absolute errors not significantly different from the cases when all ensemble members are available. However, the three-day mean deviation and ANN have rank histograms similar to that for the baseline of the non-imputed cases (i.e., the ensembles are appropriately calibrated) for all locations, while persistence, ensemble mean, and Fourier substitution do not consistently produce appropriately calibrated ensembles. The three-day mean deviation has the advantage of being computationally efficient in a real-time forecasting environment.
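
As a rough illustration of three of the imputation methods named in the abstract, the following Python sketch fills one missing ensemble member forecast by persistence, by ensemble mean substitution, and by a three-day mean deviation. The abstract does not give formulas, so every definition here is an assumption: in particular, "three-day mean deviation" is read as the member's average departure from the ensemble mean over the previous three days, added to the current mean of the available members. All function names and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import mean
from typing import List, Optional

# forecasts[day][member]; None marks a missing member forecast (toy layout, assumed).
Forecasts = List[List[Optional[float]]]


def ensemble_mean(fc: Forecasts, day: int) -> float:
    """Mean of the members that are available on the given day."""
    return mean(v for v in fc[day] if v is not None)


def impute_persistence(fc: Forecasts, day: int, member: int) -> float:
    """Reuse the member's most recent available forecast."""
    for d in range(day - 1, -1, -1):
        if fc[d][member] is not None:
            return fc[d][member]
    raise ValueError("no earlier forecast available for this member")


def impute_ensemble_mean(fc: Forecasts, day: int, member: int) -> float:
    """Substitute the mean of the members available on the same day."""
    return ensemble_mean(fc, day)


def impute_mean_deviation(fc: Forecasts, day: int, member: int,
                          window: int = 3) -> float:
    """Assumed reading of 'three-day mean deviation': the current ensemble
    mean plus the member's average deviation from the ensemble mean over
    the previous `window` days."""
    deviations = [
        fc[d][member] - ensemble_mean(fc, d)
        for d in range(max(0, day - window), day)
        if fc[d][member] is not None
    ]
    if not deviations:
        raise ValueError("no earlier forecast available for this member")
    return ensemble_mean(fc, day) + mean(deviations)


def mean_absolute_error(forecasts: List[float], observations: List[float]) -> float:
    """MAE between paired forecasts and observations."""
    return mean(abs(f - o) for f, o in zip(forecasts, observations))


# Toy data: member 2 runs consistently 2 degrees warmer than the ensemble mean
# and is missing on the last day.
toy: Forecasts = [
    [10.0, 12.0, 14.0],
    [11.0, 13.0, 15.0],
    [12.0, 14.0, 16.0],
    [13.0, 15.0, None],
]

persistence_value = impute_persistence(toy, 3, 2)
mean_value = impute_ensemble_mean(toy, 3, 2)
deviation_value = impute_mean_deviation(toy, 3, 2)
print(persistence_value, mean_value, deviation_value)
```

In this toy case the ensemble mean substitution places the missing member at the center of the ensemble, which narrows the spread, whereas the mean-deviation fill keeps the member's usual offset. That is consistent with the abstract's finding that ensemble mean substitution does not consistently preserve calibration, though the sketch itself demonstrates nothing about the paper's data.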

Index Terms—ensemble forecasting; data imputation; Artificial Intelligence (AI); Artificial Neural Network (ANN); missing data; numerical weather prediction


Cite: Tyler C. McCandless, Sue Ellen Haupt, George S. Young, "The Effects of Imputing Missing Data on Ensemble Temperature Forecasts," Journal of Computers, vol. 6, no. 2, pp. 162-171, 2011.

General Information

ISSN: 1796-203X
Abbreviated Title: J.Comput.
Frequency: Bimonthly
Editor-in-Chief: Prof. Liansheng Tan
Executive Editor: Ms. Nina Lee
Abstracting/Indexing: DBLP, EBSCO, ProQuest, INSPEC, Ulrich's Periodicals Directory, WorldCat, etc.
E-mail: jcp@iap.org