19 September 2023

I am currently working on a prediction project in which I apply machine learning classification techniques.

I have already computed various classification metrics, such as accuracy, precision, recall, AUC-ROC, and F1 score. What I am struggling with is how to interpret these values objectively in terms of their quality. In frequentist statistics, for instance, there are established conventions for interpreting effect sizes (e.g., Cohen's thresholds for small, medium, and large effects).
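For reference, here is a minimal, self-contained sketch of how such metrics are typically computed with scikit-learn; the toy dataset and the `clf` model below are purely illustrative stand-ins, not my actual setup:

```python
# Illustrative sketch: computing the metrics mentioned above with scikit-learn
# on a toy dataset (placeholder data and model, not my actual project).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = clf.predict(X_test)               # hard class labels
y_score = clf.predict_proba(X_test)[:, 1]  # probability of the positive class

print(f"Accuracy : {accuracy_score(y_test, y_pred):.3f}")
print(f"Precision: {precision_score(y_test, y_pred):.3f}")
print(f"Recall   : {recall_score(y_test, y_pred):.3f}")
print(f"F1 score : {f1_score(y_test, y_pred):.3f}")
print(f"AUC-ROC  : {roc_auc_score(y_test, y_score):.3f}")  # needs scores, not labels
```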

Is there a similar set of guidelines or conventions, with a citable source, for interpreting classification metrics? I am particularly interested in categories such as "poor", "sufficient", "satisfactory", "good", and "excellent", or something along those lines.
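To make concrete the kind of convention I have in mind, here is a hypothetical sketch; the thresholds below are invented purely for illustration, and a citable source for thresholds like these is exactly what I am missing:

```python
# Hypothetical sketch of the kind of interpretation scheme I am looking for.
# NOTE: the thresholds below are invented for illustration only; they do NOT
# come from any citable source -- such a source is precisely what I am asking for.
def interpret_metric(value: float) -> str:
    """Map a metric in [0, 1] to a (hypothetical) quality label."""
    if value < 0.6:
        return "poor"
    elif value < 0.7:
        return "sufficient"
    elif value < 0.8:
        return "satisfactory"
    elif value < 0.9:
        return "good"
    return "excellent"

print(interpret_metric(0.85))  # -> good
```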

I understand that the context and the specific task are crucial for any interpretation, but I still need a citable source that provides general guidelines, especially because this work is in an educational context.

Thank you in advance!
