I'm currently writing my Master's thesis, and after an extensive literature study I've finally found my topic. The wind power company I'm doing my thesis at has accumulated a large amount of data over the years they've had their turbines up and running. Almost 600 parameters are logged every 5 minutes and stored in a database that I have access to.

Their initial wish was for me to find a way to optimise their turbines using this data, but I realised pretty quickly that real-time data and extensive on-site measurements are needed for a conventional optimisation. On top of that, it didn't seem like a very academic study, but more a way to avoid hiring a consultant.

At first I thought of a lot of different ways to give my study a new take on the matter, but the more articles I read, the more I realised that there have been a lot of studies in this field, and my ideas had already been researched. While reading all those articles, I noticed that all groups (as far as I could see) used only the mean values of the parameters in their models.

Most of the groups used the ANN only to find a better fit of the wind turbine's power curve, which is defined as the power output as a function of wind speed. But since the power output depends on many more variables (albeit none as much as wind speed), the power curve isn't as smooth as turbine makers show in their specifications, but rather highly variable. Another contributing factor is that averaged values are used (usually over 5 or 10 minutes). A lot of the groups put much effort into trimming outliers from the data, finding a narrower fit to the curve.
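
For reference, the physics behind the curve is the standard aerodynamic relation

P = (1/2) * rho * A * Cp(lambda, beta) * v^3

where rho is the air density, A the rotor swept area, Cp the power coefficient (itself a function of the tip-speed ratio lambda and pitch angle beta), and v the wind speed. The cubic dependence on v is why wind speed dominates, while temperature (through rho), pitch and rotor speed enter through the other factors.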

My take is that if I use the min, max and standard deviation values found in the data, in addition to the averages, I will get a better fit and be able to better predict power output. What I've found so far is that my model can predict the samples that other groups discarded as outliers, which means I only have to discard points where production is zero. I will also use more parameters that are coupled to the production, and my hope is to end up with a very sensitive model that can find deviations from normal production.

Now that you've got a picture of the project I have in mind, hopefully I can get some help on the architecture of the neural network, since I'm new to the concept.

I've got pretty good Matlab skills, so that's where I'll be doing my work. So far I've only used the standard nftool GUI to construct my ANNs, and have tried to figure out how best to choose the number of hidden neurons. To begin with I have a rather small dataset of about 800 samples, to keep the computation time down as long as I'm bound to a slow computer.
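
In case it helps anyone answering: as I understand it, everything nftool does can also be scripted with fitnet/train from the Neural Network Toolbox, which would make my experiments easier to repeat. A minimal sketch, where X (a 17 x N input matrix, one column per 5-min sample) and T (a 1 x N vector of average power) are placeholders for my data:

% X: 17 x N input matrix, T: 1 x N target vector
net = fitnet(10);              % one hidden layer with 10 neurons
net.trainFcn = 'trainlm';      % Levenberg-Marquardt ('trainbr' for Bayesian Regularisation)
[net, tr] = train(net, X, T);  % trains with the default random 70/15/15 split
Y = net(X);
testMSE = perform(net, T(tr.testInd), Y(tr.testInd));  % error on the held-out test set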

My thought is to use the first year of data from a turbine to train my model; this is to ensure I have data points from the whole operating range and also get the seasonal changes in there. I've tried to pick inputs that I know affect the power output and are not directly coupled to other inputs. I've used average values for parameters that do not fluctuate greatly over 5 minutes, and min/max/avg/std for parameters with large fluctuations.
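
Since the samples form a time series, my thinking is that the train/validation/test split should respect time order rather than be random, so that testing happens on "future" data. With the net from the sketch above, that would look something like this (the ratios are just an example, and the columns of X are assumed to be in chronological order):

net.divideFcn = 'divideblock';       % contiguous blocks instead of random picks
net.divideParam.trainRatio = 0.70;   % earliest samples for training
net.divideParam.valRatio   = 0.15;
net.divideParam.testRatio  = 0.15;   % latest samples held out for testing
[net, tr] = train(net, X, T);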

My inputs are (see the matrix-assembly sketch after the list):

Grid phase angle (average) - generator

Grid frequency (average) - electrodynamical torque

Pitch angle blade 1 (average) - Coefficient of power 

Pitch angle blade 2 (average)

Pitch angle blade 3 (average) 

Rotor RPM (average) - Coefficient of power, inertia

Generator torque (average) - mechanical torque

Generator torque (min) 

Generator torque (max) 

Generator torque (standard deviation)

Nacelle direction (average) - performance when compared to wind direction

Wind direction (average) - different wind shear and turbulence patterns

Outdoor temp (average) - air density -> power output

Wind speed (average) - power output

Wind speed (min)

Wind speed (max)

Wind speed (Standard deviation)

Output:

Power output (average)
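
For concreteness, this is roughly how I pull the 17 inputs and the target out of a database export; the file and column names below are made-up placeholders for whatever the actual export uses:

D = readtable('turbine_year1.csv');   % 5-min SCADA statistics (placeholder file name)
X = [D.PhaseAngleAvg, D.GridFreqAvg, ...
     D.Pitch1Avg, D.Pitch2Avg, D.Pitch3Avg, D.RotorRpmAvg, ...
     D.GenTorqueAvg, D.GenTorqueMin, D.GenTorqueMax, D.GenTorqueStd, ...
     D.NacelleDirAvg, D.WindDirAvg, D.OutdoorTempAvg, ...
     D.WindSpeedAvg, D.WindSpeedMin, D.WindSpeedMax, D.WindSpeedStd]';  % 17 x N
T = D.PowerAvg';                      % 1 x N average power output
keep = T > 0;                         % discard zero-production points, as described above
X = X(:, keep);
T = T(keep);

One thing I'm unsure about: nacelle and wind direction are circular quantities (they wrap at 360 degrees), so perhaps encoding them as sin/cos pairs would be kinder to the network.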

So, by using this ANN model, I hope to detect under-production from the turbine. As an extension, I'd like to use ANN pattern recognition to classify certain common faults, but that's a whole other story.
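
My rough idea for the detection step is to monitor the residual between measured and predicted power and flag large negative deviations; a sketch, where the 3-sigma threshold is just a placeholder I'd need to tune:

Ypred = net(Xnew);                                      % predictions for new 5-min samples
res   = Tnew - Ypred;                                   % negative = producing less than the model expects
sigma = std(T(tr.trainInd) - net(X(:, tr.trainInd)));   % residual spread on the training data
alarm = res < -3*sigma;                                 % candidate under-production samples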

Questions and thoughts:

Do you think that this model would benefit from more than one hidden layer? 

Is nftool the best tool to use here? (in Matlab)

Any thoughts on the number of hidden neurons?

I've tried both L-M and Bayesian Regularisation as training algorithms; is there a better one for this kind of problem?

I did a test trying 1 to 20 hidden neurons with both L-M and BR and picked out the ones I thought were best (computation time not considered), but I'm not quite sure what to look for in the performance plot. Is the behaviour the plots show what I should look for, i.e. the training and test curves staying close to each other? (file attached)
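
If I move away from the GUI, I imagine the sweep could be scripted like this (building on the X, T and divideblock setup above); I'd be glad to hear whether comparing held-out test MSE like this is the right criterion:

algs = {'trainlm', 'trainbr'};
mseTest = zeros(20, numel(algs));
for a = 1:numel(algs)
    for h = 1:20
        net = fitnet(h, algs{a});
        net.divideFcn = 'divideblock';
        net.trainParam.showWindow = false;    % no GUI pop-up during the sweep
        [net, tr] = train(net, X, T);
        Y = net(X);
        mseTest(h, a) = perform(net, T(tr.testInd), Y(tr.testInd));
    end
end
plot(1:20, mseTest); legend(algs);
xlabel('hidden neurons'); ylabel('test MSE');

(As I understand it, trainbr disables validation stopping by default, but the test block is still held out. Results also vary with the random initial weights, so each configuration should probably be trained a few times and averaged.)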

Sorry for the long post, but I'm short on feedback and would greatly appreciate any kind of help!

Best regards,

Daniel
