We recently released a framework called windML (http://www.windml.org), which provides easy-to-use access to wind data sources from Python and builds upon numpy, scipy, sklearn, and matplotlib. It contains data mining methods and examples for various learning tasks such as time-series prediction, classification, clustering, and dimensionality reduction. We currently use two data sets: the NREL western wind integration study and an Australian data set named AEMO. The NREL data set is great, with 10-minute data for 32,000 wind turbines over 3 years, but it is partly based on simulations. Does anyone know of another large, public data set with spatially distributed wind turbines? Thanks in advance.