I am training a model with Mask-RCNN (the dataset has multiple classes). For evaluation purposes, I need to know how to correctly calculate mAP (mean Average Precision), mAR (mean Average Recall), and the F1 score under k-fold cross-validation. I have noticed different code segments in the issues section of the official repository on this matter, but there are mainly two approaches to calculating the F1 score, and the discussion about which one is correct is still ongoing. The source code below is extracted from the issues section of the Mask-RCNN repository (link: https://github.com/matterport/Mask_RCNN/issues/2474). Whichever of the two approaches is correct, to my knowledge F1 is defined as follows.

PR = precision, RC = recall

F1 score = (2 × PR × RC) / (PR + RC), multiplied by 100 if expressed as a percentage
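As a plain-Python sanity check of that formula (the PR and RC values below are made-up examples):

    PR, RC = 0.85, 0.78                           # example precision/recall values
    f1 = 2 * PR * RC / (PR + RC)                  # F1 as a fraction in [0, 1]
    print("F1 = %.3f (%.1f%%)" % (f1, f1 * 100))  # multiply by 100 only for a percentage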

So I need to know,

1) Does PR = mAP and RC = mAR?

2) If yes, does calculating PR for a model mean calculating the mAP, and calculating RC for a model mean calculating the mAR? Is my argument correct?

3) What do the precisions and recalls arrays contain?

4) What is the correct way to calculate the mAP, mAR, and F1 metrics?

5) If I am using k-fold cross-validation, should I calculate each of these values at the end of each fold and then average them? (See the sketch after this list for what I mean.)
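To make question 5 concrete, here is a minimal sketch of the averaging scheme I have in mind. This is my own assumption of how the folds would be aggregated, not code from the repository; make_fold_datasets and train_model are hypothetical placeholders for the actual splitting and training code, and evaluate_model is the function from Method 1 below.

    from numpy import mean, std

    k = 5  # number of folds
    fold_mAPs, fold_mARs, fold_F1s = [], [], []
    for fold in range(k):
        train_set, val_set = make_fold_datasets(fold)  # hypothetical helper
        model = train_model(train_set, cfg)            # hypothetical helper
        mAP, mAR, F1_scores = evaluate_model(val_set, model, cfg)
        fold_mAPs.append(mAP)
        fold_mARs.append(mAR)
        fold_F1s.append(mean(F1_scores))

    # report the mean (and spread) across folds rather than a single split
    print("mAP: %.3f +/- %.3f" % (mean(fold_mAPs), std(fold_mAPs)))
    print("mAR: %.3f +/- %.3f" % (mean(fold_mARs), std(fold_mARs)))
    print("F1:  %.3f +/- %.3f" % (mean(fold_F1s), std(fold_F1s)))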

Method 1

from numpy import expand_dims, mean
from mrcnn.model import load_image_gt, mold_image
from mrcnn.utils import compute_ap, compute_recall

def evaluate_model(dataset, model, cfg):
    APs = list()
    ARs = list()
    F1_scores = list()
    for image_id in dataset.image_ids:
        # load the ground-truth image, class ids, boxes, and masks
        image, image_meta, gt_class_id, gt_bbox, gt_mask = load_image_gt(
            dataset, cfg, image_id, use_mini_mask=False)
        # subtract the mean pixel values expected by the model and add a batch dimension
        scaled_image = mold_image(image, cfg)
        sample = expand_dims(scaled_image, 0)
        # run detection; detect() returns one result dict per image in the batch
        yhat = model.detect(sample, verbose=0)
        r = yhat[0]
        # average precision plus the precision/recall curve arrays (IoU threshold 0.5)
        AP, precisions, recalls, overlaps = compute_ap(
            gt_bbox, gt_class_id, gt_mask,
            r["rois"], r["class_ids"], r["scores"], r["masks"])
        # recall of the predicted boxes against the ground truth at IoU 0.2
        AR, positive_ids = compute_recall(r["rois"], gt_bbox, iou=0.2)
        APs.append(AP)
        ARs.append(AR)
        # Method 1: per-image F1 from the means of the precision/recall arrays
        F1_scores.append((2 * mean(precisions) * mean(recalls)) /
                         (mean(precisions) + mean(recalls)))
    mAP = mean(APs)
    mAR = mean(ARs)
    return mAP, mAR, F1_scores

Method 2

mAP, mAR, F1_score = evaluate_model(dataset_val, model, inference_config)
print("mAP: %.3f" % mAP)
print("mAR: %.3f" % mAR)
print("first way to calculate f1-score: ", F1_score)
F1_score_2 = (2 * mAP * mAR) / (mAP + mAR)  # Method 2
print("second way to calculate f1-score: ", F1_score_2)
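To show why the two methods can disagree, here is a toy numeric illustration I put together (the precision/recall pairs are invented): Method 1 averages per-image F1 scores, while Method 2 takes the harmonic mean of the dataset-level mAP and mAR, and the harmonic mean of averages is generally not the average of harmonic means.

    per_image = [(0.9, 0.5), (0.5, 0.9)]  # invented (precision, recall) pairs

    def f1(p, r):
        return 2 * p * r / (p + r)

    # Method 1: mean of the per-image F1 scores
    method1 = sum(f1(p, r) for p, r in per_image) / len(per_image)
    # Method 2: F1 of the mean precision (mAP) and mean recall (mAR)
    mAP = sum(p for p, _ in per_image) / len(per_image)
    mAR = sum(r for _, r in per_image) / len(per_image)
    method2 = f1(mAP, mAR)

    print(method1, method2)  # ~0.643 vs 0.700 -- the two methods differ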
