You can rely on classical prediction metrics such as:
Precision, also known as 'positive predictive value', measures the proportion of correctly predicted positive instances out of all instances predicted as positive. It is calculated as the number of true positive results divided by the number of all samples predicted to be positive. You want high precision in scenarios where minimizing false positives is essential.
Recall, also known as 'sensitivity', quantifies the proportion of correctly predicted positive instances out of all actual positive instances. It is calculated as the number of true positive results divided by the number of all actual positive samples (true positives plus false negatives). You want high recall in scenarios where missing actual positives (false negatives) is costly.
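As a minimal sketch of both metrics, here is how they can be computed with scikit-learn; the labels `y_true` and `y_pred` are made-up values purely for illustration:

```python
from sklearn.metrics import precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]  # hypothetical ground-truth labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # hypothetical model predictions

# Precision: true positives / (true positives + false positives)
print("precision:", precision_score(y_true, y_pred))
# Recall: true positives / (true positives + false negatives)
print("recall:", recall_score(y_true, y_pred))
```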
A suitable combination of these metrics for your purposes is the F2-score, a variant of the F1 score that weights recall more heavily than precision. This emphasis makes it suitable for tasks where capturing all positive instances is crucial.
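A sketch of the F2-score using scikit-learn's `fbeta_score` with `beta=2`, reusing the same hypothetical labels as above:

```python
from sklearn.metrics import fbeta_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]  # hypothetical ground-truth labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # hypothetical model predictions

# beta=2 weights recall twice as heavily as precision:
# F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)
print("F2:", fbeta_score(y_true, y_pred, beta=2))
```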
The quantity typically used to characterize imbalance in a dataset is the imbalance ratio (or imbalance index), which quantifies the ratio of minority-class instances to majority-class instances in the dataset.
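A minimal sketch of that ratio on a hypothetical dataset, dividing the minority-class count by the majority-class count as described above:

```python
from collections import Counter

labels = [0] * 90 + [1] * 10  # hypothetical dataset: 90 negatives, 10 positives
counts = Counter(labels)

minority = min(counts.values())
majority = max(counts.values())
print("imbalance ratio:", minority / majority)  # 10 / 90 ≈ 0.11
```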