I have modelled flight planning from gate to gate:

Taxi-out, take-off, climb, cruise, descent, approach + landing, taxi-in

The model integrates data from the European Environment Agency (EEA) for landing and take-off, and from the Base of Aircraft Data (BADA) for climb, cruise and descent;
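A minimal sketch of this structure, assuming the gate-to-gate time is simply the sum of the durations of the phases listed above. The function name, the dictionary keys and all numbers are illustrative placeholders, not EEA or BADA values and not the actual implementation:

```python
# Phase order follows the gate-to-gate list above (hypothetical key names).
PHASES = [
    "taxi_out", "take_off", "climb", "cruise",
    "descent", "approach_landing", "taxi_in",
]


def gate_to_gate_minutes(phase_minutes):
    """Sum the per-phase durations (in minutes) into one gate-to-gate time."""
    missing = [p for p in PHASES if p not in phase_minutes]
    if missing:
        raise ValueError(f"missing phases: {missing}")
    return sum(phase_minutes[p] for p in PHASES)


# Placeholder durations invented for illustration only.
example_flight = {
    "taxi_out": 15.0, "take_off": 1.0, "climb": 20.0, "cruise": 60.0,
    "descent": 18.0, "approach_landing": 4.0, "taxi_in": 7.0,
}
total = gate_to_gate_minutes(example_flight)
```

In this layout the ground and LTO phases would be filled from the EEA figures and the airborne phases from BADA, so each source can be swapped or updated independently.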

My intention is to compare the flight times of my model with those of the ROADEF challenge and the historic data;

My model uses data from 2006 and 2008, like the data used in the ROADEF data set; by data I mean the EEA values for taxi-out, take-off, approach + landing and taxi-in from 2006 and 2008;

I have built a module to retrieve real data from flightaware.com, but I can only scrape one month of data, e.g. June 2019.

Summarizing:

My model uses 2006/2008 data for taxi-out, take-off, approach + landing and taxi-in, plus BADA, and returns flight times;

The real data is scraped from flightaware.com and covers flight times for June 2019;

I also checked, and in some cases flightaware.com has no historic data for 2006 or 2008.

When I compared my values with those of flightaware.com, I found that my values fall around the 90th percentile of the observed flight times;
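For clarity, this is a minimal sketch of what "90th percentile" means here, assuming the position is measured as the share of observed flight times that are less than or equal to the model value. The function name and the observed values are made up for illustration; they are not scraped data:

```python
from bisect import bisect_right


def percentile_rank(observed, value):
    """Percentage of observed flight times that are <= value."""
    if not observed:
        raise ValueError("no observed flight times")
    ordered = sorted(observed)
    return 100.0 * bisect_right(ordered, value) / len(ordered)


# Invented flight times in minutes for one route (placeholder data).
observed_minutes = [110, 112, 115, 117, 118, 120, 121, 123, 126, 140]
model_minutes = 126
rank = percentile_rank(observed_minutes, model_minutes)
```

A rank near 90 means the model time is longer than most of the observed flights on that route, i.e. the model sits in the upper tail of the June 2019 distribution rather than near its median.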

Is this comparison scientifically correct?

How can my model values (retrieved with partial data from 2006 and 2008) be compared with those collected for June 2019 from flightaware.com?

BR
