Simulated Lake-Effect Precipitation Evaluation

GLISA is evaluating lake-effect precipitation in Global and Regional Climate Models (GCMs and RCMs) to investigate whether a model simulates lake-effect precipitation and, if so, how well. We use a data clustering approach to group similar seasonal precipitation totals into “clusters” that separate intense lake-effect areas from non-lake-effect areas. Our focus is mainly on the winter climatology (1980-1999) for the months of December, January, and February. We compare simulated lake-effect clusters (e.g., location, size, and precipitation totals) to observed lake-effect clusters to assess the quality of the simulation. We are applying our approach to the following climate model data sets:

  • 40 GCMs from the Coupled Model Intercomparison Project Phase 5 (CMIP5)
  • 19 dynamically downscaled simulations from the North American Coordinated Regional Climate Downscaling Experiment (NA-CORDEX)
  • 6 dynamically downscaled simulations from the UW-RegCM4 dataset

All model precipitation biases are being evaluated against the University of Delaware dataset from 1980-1999. All model data were downloaded as monthly data except for UW-RegCM4, which are daily data. More on GLISA’s methodology is available below.


Cluster analysis uses a measure of similarity (or dissimilarity) to classify data into groups called clusters. Each cluster is formed so that its members are more similar to each other, in our case based on precipitation totals, than to members of other clusters. Specifically, our use of k-means clustering partitions the precipitation data into a pre-specified number of groups. We explored up to eight clusters and settled on four (Fig. 1), because with four clusters we could easily identify areas uninfluenced by the lakes (cluster #1), areas where the lakes have minimal influence (cluster #2), traditional lake-effect zones (cluster #3), and intense lake-effect zones (cluster #4).

Fig. 1. Precipitation clusters for the observational (University of Delaware) data using a k-means clustering algorithm where k=4. The results are clusters numbered 1-4, where 4 is the strongest lake-effect precipitation cluster.
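
The k-means step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not GLISA's actual program: it assumes each grid cell is represented by a single winter precipitation total, and the function name `kmeans_1d`, the quantile-based starting centers, and the sample values are all hypothetical choices made for the example. Clusters are relabeled by increasing mean so that label 4 is the wettest group, matching the numbering in Fig. 1.

```python
def kmeans_1d(values, k=4, max_iter=100):
    """Lloyd's k-means on scalar values; returns (labels, centers).

    Labels run from 1 to k in order of increasing cluster mean, so
    label k marks the wettest cluster. Starting centers are taken at
    evenly spaced quantiles (an assumption made for this sketch).
    """
    ordered = sorted(values)
    n = len(ordered)
    centers = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]

    def nearest(v):
        # Index of the center closest to v.
        return min(range(k), key=lambda j: abs(v - centers[j]))

    for _ in range(max_iter):
        # Assignment step: attach every value to its nearest center.
        assign = [nearest(v) for v in values]
        # Update step: move each center to the mean of its members;
        # an empty cluster keeps its previous center.
        new_centers = []
        for j in range(k):
            members = [v for v, a in zip(values, assign) if a == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers

    assign = [nearest(v) for v in values]
    # Relabel so that cluster 1 is the driest and cluster k the wettest.
    order = sorted(range(k), key=lambda j: centers[j])
    rank = {old: new + 1 for new, old in enumerate(order)}
    return [rank[a] for a in assign], [centers[j] for j in order]


# Made-up winter totals (mm) for twelve grid cells, for illustration only.
totals = [60, 200, 91, 130, 61, 205, 90, 128, 62, 198, 92, 131]
labels, centers = kmeans_1d(totals, k=4)
```

Because k must be fixed in advance, trying several values (the text mentions up to eight) simply means calling the function with a different `k` and inspecting how the resulting groups map onto known lake-effect zones.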

University of Delaware (UDEL) precipitation data were used as the historical observational data set because they are available for both the U.S. and Canada and are gridded at a spatial resolution of 0.5° x 0.5°. The period 1980-1999 was used because it is the full period available in the UW-RegCM4 dataset, and we wanted to be able to compare biases across all model data sets. Winter (Dec-Jan-Feb) climatologies (20-year means) were calculated for total precipitation. Those 20-year means were then run through a k-means clustering program, and the models’ clustering results were compared to the observational clustering results. The spatial domain omits data over the Great Lakes (because precipitation observations are not widely available there) and only includes data in the Great Lakes watershed. Initially we included locations farther outside of the watershed, but intense precipitation in the mountainous regions of Pennsylvania and New York affected the way the lake-effect clusters were drawn.
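
As a rough illustration of the climatology step, the sketch below computes a winter (Dec-Jan-Feb) precipitation climatology for one grid cell from monthly totals. It is written under stated assumptions rather than taken from GLISA's code: the input layout (a dictionary keyed by `(year, month)`), the function name `djf_climatology`, and the choice to sum the long-term means of December, January, and February (rather than pairing each December with the following January and February) are all hypothetical.

```python
from collections import defaultdict


def djf_climatology(monthly, start_year=1980, end_year=1999):
    """Winter climatology for one grid cell.

    `monthly` maps (year, month) -> monthly precipitation total.
    Returns mean(Dec) + mean(Jan) + mean(Feb) over the chosen period,
    i.e. the average winter total under the assumption stated above.
    """
    winter_months = (12, 1, 2)
    by_month = defaultdict(list)
    for (year, month), total in monthly.items():
        if start_year <= year <= end_year and month in winter_months:
            by_month[month].append(total)

    missing = [m for m in winter_months if not by_month[m]]
    if missing:
        raise ValueError(f"no data for months {missing}")

    # Sum of the three long-term monthly means.
    return sum(sum(by_month[m]) / len(by_month[m]) for m in winter_months)


# Made-up monthly totals (mm) for a single grid cell, for illustration only.
sample = {}
for year in range(1980, 2000):
    sample[(year, 12)] = 50.0
    sample[(year, 1)] = 40.0
    sample[(year, 2)] = 30.0
    sample[(year, 7)] = 100.0  # non-winter month, ignored
winter_mean = djf_climatology(sample)
```

Applying this to every grid cell in the watershed yields one value per cell, which is the input a clustering routine would then partition. Daily data, such as UW-RegCM4, would first need to be summed to monthly totals.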

In the near future, we will use the MET tool to compare the lake-effect clusters in the models against those in the observations, to determine whether the models place the clusters in the right locations and whether they are of similar size.
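
To make the idea of comparing cluster location and size concrete, here is a small sketch that is independent of the MET tool and does not reproduce its methods. It assumes the observed and simulated cluster labels have already been placed on a common grid and stored as dictionaries keyed by `(lat, lon)`. The function name `compare_cluster` and the chosen measures (cell count, centroid, and Jaccard overlap) are illustrative assumptions.

```python
def compare_cluster(obs_labels, model_labels, cluster=4):
    """Compare one cluster between observations and a model.

    Both inputs map (lat, lon) -> cluster label on a shared grid
    (an assumption of this sketch). Returns cell counts, centroids,
    and the Jaccard overlap of the two sets of grid cells.
    """
    obs_cells = {c for c, lab in obs_labels.items() if lab == cluster}
    mod_cells = {c for c, lab in model_labels.items() if lab == cluster}

    def centroid(cells):
        # Simple unweighted mean of cell coordinates; None if empty.
        if not cells:
            return None
        lats, lons = zip(*cells)
        return (sum(lats) / len(lats), sum(lons) / len(lons))

    union = obs_cells | mod_cells
    return {
        "obs_size": len(obs_cells),
        "model_size": len(mod_cells),
        "jaccard": len(obs_cells & mod_cells) / len(union) if union else None,
        "obs_centroid": centroid(obs_cells),
        "model_centroid": centroid(mod_cells),
    }


# Made-up labels on a three-cell grid, for illustration only.
obs = {(43.0, -76.0): 4, (43.5, -76.0): 4, (44.0, -76.0): 3}
model = {(43.0, -76.0): 4, (43.5, -76.0): 3, (44.0, -76.0): 4}
summary = compare_cluster(obs, model, cluster=4)
```

A Jaccard value near 1 would indicate that the model places the cluster on nearly the same cells as the observations, while the size and centroid fields separate errors in extent from errors in position.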