I have modelled flight planning from gate to gate:
Taxi-out, take-off, climb, cruise, descent, approach + landing, taxi-in
The model combines data from the European Environment Agency (EEA) for landing and take-off with the Base of Aircraft Data (BADA) for climb, cruise and descent.
My intention is to compare the flight times from my model with those of the ROADEF challenge and with historical data.
My model uses data from 2006 and 2008, like the ROADEF data set; by "data" I mean the EEA values for taxi-out, take-off, approach + landing and taxi-in from 2006 and 2008.
I have built a module to retrieve real data from flightaware.com, but I can only scrape one month of data, e.g. June 2019.
Summarizing:
My model uses 2006/2008 EEA data for taxi-out, take-off, approach + landing and taxi-in, plus BADA, and returns flight times.
The real data is scraped from flightaware.com and covers flight times for June 2019.
I also checked, and in some cases flightaware.com has no historical data for 2006 or 2008.
When I compared my values with those from flightaware.com, I found that my values are in the 90th percentile.
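A minimal sketch of what such a comparison could look like, assuming a model flight time is ranked against a sample of observed flight times for the same route. The function name `percentile_rank` and all the numbers are hypothetical placeholders, not real model output or flightaware.com data.

```python
from bisect import bisect_right


def percentile_rank(model_value, observed):
    """Return the percentage of observed values that are <= model_value."""
    if not observed:
        raise ValueError("observed sample is empty")
    ordered = sorted(observed)
    # bisect_right counts how many sorted observations are <= model_value
    return 100.0 * bisect_right(ordered, model_value) / len(ordered)


# Placeholder flight times in minutes (illustrative only, not real data)
observed_minutes = [118, 121, 123, 125, 126, 128, 130, 131, 135, 142]
model_minutes = 135

rank = percentile_rank(model_minutes, observed_minutes)
print(f"Model time sits at the {rank:.0f}th percentile of the observed sample")
```

A rank near 90 would mean the model time is longer than or equal to about 90% of the observed flights in that sample.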
Is this comparison scientifically correct?
How can my model values (obtained with partial data from 2006 and 2008) be compared with those collected from flightaware.com for June 2019?
BR