We are looking into time series of vertical buoy (Datawell) or radar displacement measurements of the ocean surface and are trying to discriminate between outliers and extreme waves. Has anybody got experience with this task?
I hope you've read the book, "The Wave," by Susan Casey. I personally experienced a rogue wave in a 40 foot sailboat and lived to tell the story. If we hadn't been floating like a little cork we'd now be sailing the Flying Dutchman with Davy Jones.
Yes, there are many resources out there. I assume you have a time series of sea surface elevation (displacement, as you say), and you want to remove bad data without throwing out data that may be interesting. A simple statistical threshold such as 3 or 4 standard deviations will not work. Sometimes this is accomplished by looking at physical limits; for example, the acceleration should not exceed gravity. Please see the following references:
Casas‐Prat, M. and L. H. Holthuijsen (2010), Short‐term statistics of waves observed in deep water, Journal of Geophysical Research: Oceans (1978–2012), 115(C9).
This is an excellent paper. It describes quality control procedures and shows the expected form of short-term statistics.
Brodtkorb, P. A., P. Johannesson, G. Lindgren, I. Rychlik, J. Rydén, and E. Sjö (2000), WAFO - a Matlab toolbox for analysis of random waves and loads, Seattle.
This reference is not so useful, but as a practical matter, the MATLAB toolbox WAFO is enormously useful and it comes with an excellent manual which gives examples of quality control procedures.
Lastly, I have written a paper (shameless self-promotion) which addresses complications with buoy data. The review may be useful to you, but the details are probably unimportant since you are using commercial buoys.
Collins III, C. O., B. Lund, T. Waseda, and H. C. Graber (2014), On recording sea surface elevation with accelerometer buoys: lessons from ITOP (2010), Ocean Dynamics, 64(6), 895-904.
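For what it's worth, the acceleration limit mentioned above could be sketched roughly as follows. This is only a minimal Python illustration and is not taken from any of the references: the function name `flag_by_acceleration`, the use of a second-order central difference on the displacement record, and the default limit of 9.81 m/s^2 are all my own assumptions.

```python
G = 9.81  # gravitational acceleration in m/s^2 (assumed limit)


def flag_by_acceleration(eta, dt, limit=G):
    """Return indices whose finite-difference acceleration exceeds `limit`.

    eta   : list of vertical displacements in metres
    dt    : sampling interval in seconds
    limit : largest physically plausible acceleration in m/s^2

    Hypothetical helper for illustration only.
    """
    flagged = []
    for i in range(1, len(eta) - 1):
        # Second-order central difference approximates d2(eta)/dt2.
        acc = (eta[i + 1] - 2.0 * eta[i] + eta[i - 1]) / (dt * dt)
        if abs(acc) > limit:
            flagged.append(i)
    return flagged
```

Note that a single spike also pushes its two neighbours over the limit, because each of them uses the spike in its own difference. A real extreme wave would need to be resolved by several samples and should stay below the limit, but that depends on the sampling rate, so treat the threshold as something to tune.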
We have already implemented an acceleration criterion; we call it the 'breaker criterion'. We have also implemented a wave period criterion. Both work reasonably well, but not for really bad data sets. We also have the WAFO and DIWASP toolboxes at hand. I will have a look at the papers. Yes, what you call sea surface elevation is our vertical displacement. But as the raw data from a Datawell directional Waverider also contain two horizontal displacements, we call them all 'displacements', following the 'slang' of the Datawell manuals ;-)
I do not have experience with very bad data. Let's say you have a procedure for identifying and rectifying (maybe by interpolation) bad data points. If the number of flags, or bad data points, exceeds some fraction of the total data (perhaps 10%), then I think the corrected data will be difficult to interpret. At some point the data are the result of corrections rather than actual measurements; this is a balance, of course. Good luck, and let me know if you develop an effective procedure for very problematic data.
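To make the idea concrete, here is a rough sketch of such a procedure in Python. Everything in it is an assumption for illustration: the name `repair_record`, the choice of linear interpolation, and the 10% cut-off (which is only the "perhaps" figure from above, not an established standard).

```python
def repair_record(eta, bad, max_bad_fraction=0.10):
    """Linearly interpolate flagged samples, or reject the whole record.

    eta : list of displacements
    bad : iterable of indices flagged as bad
    Returns a repaired copy of `eta`, or None if the flagged fraction
    exceeds `max_bad_fraction` (hypothetical threshold).
    """
    n = len(eta)
    bad = set(bad)
    if n == 0 or len(bad) / n > max_bad_fraction:
        return None  # too many corrections to trust the result
    good = [i for i in range(n) if i not in bad]
    if not good:
        return None  # nothing left to interpolate from
    fixed = list(eta)
    for i in sorted(bad):
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is None:
            fixed[i] = eta[right]  # leading gap: hold first good value
        elif right is None:
            fixed[i] = eta[left]  # trailing gap: hold last good value
        else:
            w = (i - left) / (right - left)
            fixed[i] = eta[left] + w * (eta[right] - eta[left])
    return fixed
```

Returning `None` rather than a heavily patched series forces the caller to decide what to do with a rejected record, instead of silently analysing data that are mostly interpolation.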
Among my peers and professors there is Leonid Lopatoukhin, who specializes in rogue waves. As far as I remember, he used platform data in his research.
Interesting question. If you have tried every method of identifying physical reasons for outliers, then you have to accept them as data points. A method I have used to show that a data point is not an outlier is to check for first-order dependency using Lathrop's technique. Once you have removed the identified outliers, then I'm afraid you have to be subjective and use, say, Grubbs' test, or an extension of it. If your data are contemporary, then I assume you've cross-compared. The problem with both radar and buoy measurements is that the standard filters (and system characteristics) remove the most interesting high-frequency information that could be useful in identifying breaking 'events'.
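In case it helps, here is a minimal sketch of the Grubbs statistic in plain Python. The function names are made up for this example. The critical value uses the standard two-sided formula, but the Student-t quantile `t_crit` (upper alpha/(2N) point with N-2 degrees of freedom) has to be supplied by the caller from tables or a statistics package. Bear in mind that the test assumes independent, roughly normal samples, which raw surface elevation is not, so this is an illustration of the calculation rather than a recommendation for how to apply it.

```python
import math
import statistics


def grubbs_statistic(x):
    """Return (G, index) for the sample furthest from the mean.

    G = max|x_i - mean| / s, with s the sample standard deviation.
    """
    mean = statistics.mean(x)
    s = statistics.stdev(x)
    idx = max(range(len(x)), key=lambda i: abs(x[i] - mean))
    return abs(x[idx] - mean) / s, idx


def grubbs_critical(n, t_crit):
    """Critical G for sample size n, given a caller-supplied t quantile.

    t_crit is assumed to be the upper alpha/(2n) quantile of the
    Student t distribution with n - 2 degrees of freedom.
    """
    return (n - 1) / math.sqrt(n) * math.sqrt(
        t_crit ** 2 / (n - 2 + t_crit ** 2)
    )
```

A sample would be called an outlier if its `G` exceeds `grubbs_critical(n, t_crit)`. The test handles one suspect point at a time, so it would have to be repeated after each removal.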
R G Lathrop - typically in ‘First order response dependencies’ Journal of Exp. Psychology, Vol. 72, 1967. All the best stuff goes back in time :) It's a simple way of checking independence of sequential data. I used it for a set of short sequence time series data - but every little helps when you are trying to sort out recorded data. I'll try to find the original paper.