Different aims are possible. The most common one is to find a cut-off value for a (continuous) biomarker to diagnose a disease. After finding that cut-off, you classify those whose values exceed (or fall below) it as diseased, and the rest as healthy.
I'll try to explain ROC curve analysis using a simple example.
Imagine a study evaluating a new test that screens people for a disease. Each person taking the test either has or does not have the disease. The test outcome can be positive (classifying the person as having the disease) or negative (classifying the person as not having the disease).
For now, suppose the medical test yields a continuous-scale measurement. Let t be a threshold (sometimes called a cutoff) value of the diagnostic test used to classify subjects. Assume that subjects with diagnostic test values less than or equal to t are classified as non-diseased and that subjects with diagnostic test values greater than t are classified as diseased, and let m and n denote the number of subjects in each group.
The test results for each subject may or may not match the subject's actual status. In that setting:
True positive: Sick people correctly identified as sick
False positive: Healthy people incorrectly identified as sick
True negative: Healthy people correctly identified as healthy
False negative: Sick people incorrectly identified as healthy
In general, Positive = identified and negative = rejected. Therefore:
True positive = correctly identified
False positive = incorrectly identified
True negative = correctly rejected
False negative = incorrectly rejected
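The four outcomes above can be counted directly once a threshold t is fixed. A minimal sketch with hypothetical test values and disease status (assuming values greater than t are called positive, as in the setup above):

```python
# Hypothetical test values and true disease status (1 = diseased, 0 = healthy)
scores = [0.2, 0.9, 0.4, 0.7, 0.1, 0.8, 0.6, 0.3]
status = [0,   1,   0,   1,   0,   0,   1,   0]

t = 0.5  # chosen threshold: values > t are classified as diseased

tp = sum(1 for s, d in zip(scores, status) if s > t and d == 1)   # sick, called sick
fp = sum(1 for s, d in zip(scores, status) if s > t and d == 0)   # healthy, called sick
tn = sum(1 for s, d in zip(scores, status) if s <= t and d == 0)  # healthy, called healthy
fn = sum(1 for s, d in zip(scores, status) if s <= t and d == 1)  # sick, called healthy

print(tp, fp, tn, fn)
```

With these made-up numbers, sensitivity is tp / (tp + fn) and specificity is tn / (tn + fp); both depend on the choice of t.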
To estimate classification accuracy using standard ROC methods, the disease status for each patient must be measured without error. The true disease status is often referred to as the gold standard. The gold standard may come from clinical follow-up, surgical verification, or autopsy; in some cases, it is adjudicated by a committee of experts.
The ROC curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various settings of the discrimination threshold, i.e., the cutoff t used to classify subjects. Equivalently, an ROC curve is a plot of sensitivity on the y axis against (1 − specificity) on the x axis for varying values of the threshold t.
When evaluating a continuous-scale diagnostic test, we need to account for the changes of specificity and sensitivity when the test threshold t varies.
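One way to see how sensitivity and specificity change with t is to sweep the threshold across the observed test values and record the resulting (FPR, TPR) pair at each step. A minimal sketch, assuming larger values indicate disease (the data are hypothetical):

```python
def roc_points(scores, status):
    """Return (FPR, TPR) pairs as the threshold t sweeps over the data.

    TPR (sensitivity) = TP / (TP + FN); FPR (1 - specificity) = FP / (FP + TN).
    Values strictly greater than t are classified as positive (diseased).
    """
    pos = sum(status)            # number of diseased subjects
    neg = len(status) - pos      # number of healthy subjects
    points = [(1.0, 1.0)]        # t below every score: everyone called positive
    for t in sorted(set(scores)):
        tp = sum(1 for s, d in zip(scores, status) if s > t and d == 1)
        fp = sum(1 for s, d in zip(scores, status) if s > t and d == 0)
        points.append((fp / neg, tp / pos))
    return points
```

Plotting these pairs (FPR on x, TPR on y) traces the ROC curve; as t rises, fewer subjects are called positive, so both rates fall from (1, 1) toward (0, 0).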
The 45° diagonal line connecting (0,0) to (1,1) is the ROC curve corresponding to random chance. The ROC curve for the gold standard is the line connecting (0,0) to (0,1) and (0,1) to (1,1). Generally, ROC curves lie between these two extremes. The area under the ROC curve is a summary measure that essentially averages diagnostic accuracy across the spectrum of test values.
See also http://circ.ahajournals.org/content/115/5/654
Dear @Alexander Egoyan, Greetings! Thank you for your detailed feedback, which is very helpful for me. Can you kindly tell me how we can use this test in public health research? Looking forward to hearing from you soon. With best regards, Palash
I appreciate your interest in our work and thank you for your comments.
An ROC curve is a graphical plot that illustrates the diagnostic ability of a binary classifier system as its discrimination threshold is varied.
The diagnostic performance of a test is the accuracy of a test to discriminate diseased cases from normal controls.
ROC curves can also be used to compare the diagnostic performance of two or more laboratory tests.
ROC curves plot the true positive rate (sensitivity) against the false positive rate (1 − specificity) for the different possible cutpoints of a diagnostic test. Each point on the ROC curve represents a sensitivity/specificity pair.
The closer the curve follows the left side border and the top border, the more accurate the test.
The closer the curve is to the 45-degree diagonal, the less accurate the test.
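Coming back to the cut-off question raised at the start of the thread: one common (though not the only) criterion is to pick the threshold maximizing Youden's J statistic, J = sensitivity + specificity − 1. A minimal sketch with hypothetical data, assuming larger test values suggest disease:

```python
def youden_cutoff(scores, status):
    """Return the threshold maximizing Youden's J = sensitivity + specificity - 1.

    Values strictly greater than the threshold are classified as positive.
    """
    pos = sum(status)
    neg = len(status) - pos
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, d in zip(scores, status) if s > t and d == 1)
        tn = sum(1 for s, d in zip(scores, status) if s <= t and d == 0)
        j = tp / pos + tn / neg - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j

# Hypothetical data: higher values suggest disease
scores = [0.2, 0.9, 0.4, 0.7, 0.1, 0.8, 0.6, 0.3]
status = [0,   1,   0,   1,   0,   0,   1,   0]
best_t, best_j = youden_cutoff(scores, status)
print(best_t, best_j)
```

Other criteria exist (for example, weighting sensitivity and specificity by the clinical cost of each error), so the "best" cutpoint depends on the application.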
See also https://www.mailman.columbia.edu/research/population-health-methods/evaluating-risk-prediction-roc-curves
and http://www.scielo.br/scielo.php?pid=S0021-75572009000100008&script=sci_arttext&tlng=en