This question concerns how to measure estimation and forecast accuracy in (freight) transportation network models.
Input data: origin-destination (O-D) matrices at the European regional NUTS2 level for different transportation modes (road, rail, inland waterways). These matrices contain the tons transported in a given year between each O-D pair. They exist for different types of commodity, but this is of less importance for my question. Consider this input as the "observed real-world" data.
Model: the matrices are merged, so that I have one single matrix per commodity category that includes everything transported by the three modes. Several mode-choice and assignment procedures are applied in a network model. Each mode-choice/assignment model that is tested is calibrated on a sample of 5% of the O-D pairs, so that it reproduces the "observed" modal split found in the input data for that sample. The calibrated cost functions are then applied to the total matrix.
Output: for each mode-choice/assignment model, I retrieve the tons allocated to each O-D pair, for each mode. I can also retrieve the tons assigned to each link (road, rail, waterway) of the network, but I have no observed counts along these segments, so I cannot use this output.
It turns out that, even if classical estimators such as R2 (computed on a per-O-D basis: observed tons against tons obtained by the models) can be very similar from model to model, the distribution of errors can be very different.
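To make the comparison concrete, here is a minimal sketch of what I compute per model and per mode. It assumes R2 is taken as the squared Pearson correlation between observed and modelled tons over all O-D pairs (other definitions exist), and the function names and the choice of quantiles are purely illustrative, not part of any particular package.

```python
import numpy as np

def r_squared(observed, modelled):
    """R2 taken here as the squared Pearson correlation (an assumption)."""
    o = np.asarray(observed, dtype=float)
    m = np.asarray(modelled, dtype=float)
    r = np.corrcoef(o, m)[0, 1]
    return r ** 2

def error_summary(observed, modelled):
    """Quantiles of the per-O-D error (modelled minus observed), in tons."""
    o = np.asarray(observed, dtype=float)
    m = np.asarray(modelled, dtype=float)
    errors = m - o
    levels = [0.05, 0.25, 0.50, 0.75, 0.95]
    names = ["p05", "p25", "p50", "p75", "p95"]
    return dict(zip(names, np.quantile(errors, levels)))
```

Calling both functions on the same pair of flattened O-D vectors, once per tested model, gives one R2 and one set of error quantiles per model, which is how the similar R2 values and the differing error distributions show up side by side.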
Question: I would like to compare the (relative) quality of the different models I test. I came across a series of estimators used in forecast accuracy (MSE, MAE, MAPE, MsAPE, sMAPE, ...) and, more recently, MAPE-R and MASE, but I'm not sure that these estimators can be applied to the problem I describe here. Does anyone have any experience with this?
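For reference, this is roughly how I would compute some of these measures on a per-O-D basis. It is only a sketch under my own assumptions: the function name is hypothetical, MAPE skips O-D pairs with zero observed tons (where it is undefined), and sMAPE uses the 2|m - o| / (|o| + |m|) form, which is only one of several definitions in use.

```python
import numpy as np

def accuracy_measures(observed, modelled):
    """MAE, RMSE, MAPE and sMAPE between observed and modelled tons."""
    o = np.asarray(observed, dtype=float)
    m = np.asarray(modelled, dtype=float)
    e = m - o

    # MAPE is undefined where observed tons are zero, so skip those pairs.
    nonzero = o != 0
    # sMAPE is undefined only where observed and modelled are both zero.
    denom = np.abs(o) + np.abs(m)
    valid = denom != 0

    return {
        "MAE": np.mean(np.abs(e)),
        "RMSE": np.sqrt(np.mean(e ** 2)),
        "MAPE": 100.0 * np.mean(np.abs(e[nonzero]) / np.abs(o[nonzero])),
        "sMAPE": 100.0 * np.mean(2.0 * np.abs(e[valid]) / denom[valid]),
    }
```

MAE and RMSE are expressed in tons and are therefore dominated by the largest flows, whereas MAPE and sMAPE are relative and can be dominated by very small flows, so the ranking of the models may depend on which measure is used.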