I am looking for well-tested methods to exclude outliers statistically. Could someone suggest suitable methods? Links to instructions or clarifications would be appreciated.
Certainly, there are several well-established statistical methods for identifying and excluding outliers from a dataset. Here are some common approaches; a short, illustrative Python sketch for each follows the list:
Z-Score or Standard Deviation Method: Calculate the z-score for each data point, i.e. how far it lies from the mean in units of standard deviations. Data points with absolute z-scores beyond a chosen threshold (commonly 2 or 3) are flagged as outliers and can be excluded.
Modified Z-Score: A variation of the z-score method that uses the median and the median absolute deviation (MAD) instead of the mean and standard deviation. It is less affected by extreme values and therefore better suited to non-normally distributed data.
Interquartile Range (IQR) Method: Calculate the IQR (the range between the first and third quartiles) and flag data points falling more than a chosen multiple of the IQR (commonly 1.5) below the first quartile or above the third quartile as outliers.
Tukey's Fences: Essentially the same idea as the IQR method: thresholds ("fences") are set at 1.5 × IQR for possible outliers and 3 × IQR for far-out points, and values beyond them are flagged as potential outliers.
Grubbs' Test: A formal statistical test for whether a single data point is an outlier. It calculates a test statistic and compares it to a critical value derived from the t-distribution; it assumes the underlying data are approximately normal.
Dixon's Test (Q Test): A method for identifying a single outlier in a small dataset. It uses the ratio of the gap between the suspect value and its nearest neighbour to the overall range of the data.
Mahalanobis Distance: A multivariate method that measures the distance of each data point from the multivariate mean while taking the covariance structure into account. Points with unusually large Mahalanobis distances are flagged as outliers.
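To make the z-score method concrete, here is a minimal sketch in Python (the data values and the threshold of 3 are illustrative assumptions, not recommendations):

```python
import numpy as np
from scipy import stats

data = np.array([9.5, 10.1, 9.8, 10.3, 10.0, 9.9, 10.2,
                 9.7, 10.4, 9.6, 10.0, 9.9, 10.1, 25.0])

# z-score: distance from the mean in units of standard deviation
z = np.abs(stats.zscore(data))

threshold = 3.0  # 2.0 or 3.0 are common conventions
outliers = data[z > threshold]
cleaned = data[z <= threshold]
print("outliers:", outliers)  # flags only 25.0
```

Note that in very small samples a single extreme value inflates the standard deviation enough that its z-score may never reach 3; that masking effect is one motivation for the modified z-score below.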
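A minimal sketch of the modified z-score, computing the MAD directly with NumPy; the 0.6745 scaling constant and the 3.5 cutoff follow the convention usually attributed to Iglewicz and Hoaglin (the data are again made up):

```python
import numpy as np

data = np.array([10.1, 9.8, 10.3, 10.0, 9.9, 15.7, 10.2])

median = np.median(data)
mad = np.median(np.abs(data - median))  # median absolute deviation

# 0.6745 scales the MAD to be comparable to the standard deviation
# under normality; 3.5 is the commonly cited cutoff
modified_z = 0.6745 * (data - median) / mad
outliers = data[np.abs(modified_z) > 3.5]
print("outliers:", outliers)  # flags 15.7
```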
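The IQR method and Tukey's fences can be illustrated together; this sketch uses the conventional 1.5 × IQR inner fences (outer fences at 3 × IQR would be built the same way):

```python
import numpy as np

data = np.array([10.1, 9.8, 10.3, 10.0, 9.9, 15.7, 10.2])

q1, q3 = np.percentile(data, [25, 75])
iqr = q3 - q1

# Tukey's inner fences: values beyond them are "possible" outliers
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = data[(data < lower) | (data > upper)]
cleaned = data[(data >= lower) & (data <= upper)]
print("outliers:", outliers)  # flags 15.7
```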
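A sketch of Grubbs' test; the function name and example data are mine, but the test statistic and the t-distribution-based critical value follow the standard two-sided formulation:

```python
import numpy as np
from scipy import stats

def grubbs_test(data, alpha=0.05):
    """Two-sided Grubbs' test for a single outlier.

    Assumes the data (apart from the suspect point) are
    approximately normal. Returns the suspect value and
    whether it is a significant outlier at level alpha.
    """
    x = np.asarray(data, dtype=float)
    n = len(x)
    mean, sd = x.mean(), x.std(ddof=1)

    # Test statistic: largest absolute deviation from the mean,
    # in sample-standard-deviation units
    idx = np.argmax(np.abs(x - mean))
    g = abs(x[idx] - mean) / sd

    # Critical value derived from the t-distribution
    t = stats.t.ppf(1 - alpha / (2 * n), n - 2)
    g_crit = (n - 1) / np.sqrt(n) * np.sqrt(t**2 / (n - 2 + t**2))

    return x[idx], g > g_crit

suspect, is_outlier = grubbs_test([10.1, 9.8, 10.3, 10.0, 9.9, 15.7, 10.2])
print(suspect, is_outlier)  # 15.7 True
```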
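A sketch of Dixon's Q test; the hard-coded critical values are the commonly tabulated 95% values for n = 3 to 10, so please verify them against a published table before relying on them:

```python
import numpy as np

# Critical Q values at the 95% confidence level (standard Dixon
# table, r10 statistic); the test is intended for small samples
Q_CRIT_95 = {3: 0.970, 4: 0.829, 5: 0.710, 6: 0.625,
             7: 0.568, 8: 0.526, 9: 0.493, 10: 0.466}

def dixon_q_test(data):
    """Dixon's Q test for one outlier at either end of a small sample."""
    x = np.sort(np.asarray(data, dtype=float))
    n = len(x)
    rng = x[-1] - x[0]
    q_low = (x[1] - x[0]) / rng     # gap at the low end over the range
    q_high = (x[-1] - x[-2]) / rng  # gap at the high end over the range
    q, suspect = max((q_low, x[0]), (q_high, x[-1]))
    return suspect, q > Q_CRIT_95[n]

suspect, is_outlier = dixon_q_test([10.1, 9.8, 10.3, 10.0, 9.9, 15.7, 10.2])
print(suspect, is_outlier)  # 15.7 True
```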
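Finally, a sketch of Mahalanobis-distance screening for multivariate data; the chi-squared cutoff assumes approximate multivariate normality, and in practice robust estimates of the mean and covariance (e.g. MCD) are often preferred to the plain sample estimates used here:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
# Two correlated variables plus one planted multivariate outlier:
# (4, -4) is extreme *given the positive correlation*, even though
# neither coordinate is extreme on its own
X = rng.multivariate_normal([0, 0], [[1, 0.8], [0.8, 1]], size=200)
X = np.vstack([X, [4.0, -4.0]])

mean = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))

# Squared Mahalanobis distance of every row from the mean
diff = X - mean
d2 = np.einsum('ij,jk,ik->i', diff, cov_inv, diff)

# Under multivariate normality, d2 is roughly chi-squared with
# p degrees of freedom; use its 97.5% quantile as a cutoff
cutoff = chi2.ppf(0.975, df=X.shape[1])
outliers = X[d2 > cutoff]
print(len(outliers), "flagged points")
```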
Remember that the choice of method depends on the nature of your data, your specific goals, and any assumptions you're making about your data distribution. It's also important to consider the potential impact of excluding outliers on your analysis and results. Always document and justify your outlier removal process.
Whether a measurement is an "outlier" depends entirely on the model you use. A point may be far away from the "best-fitting" line through your data, yet lie close to an exponential function through the same data.
The only reason I know of to remove outliers is that they are genuinely erroneous measurements. A visual inspection of a scatterplot may suggest which points could be bad, although such points can also sit in the middle of your data cloud...
Statistical methods can help you identify points that are "far off from the rest". Whether these values should or should not be excluded is not a statistical question but a subject-matter question. More often than not in research, such "outliers" are desperately trying to tell you that your assumptions are bad. Excluding such values just makes your wrong assumptions look better and your conclusions more wrong.
Generally, outliers should be rare. If they are, then they are usually not a problem in your analysis. If they are not rare, your whole experiment/assay is in doubt.
There is no statistical method that excludes outliers for you. It is a matter of judgment based on the background of the respective data. The decision is easy if you can identify a single aberrant case or a measurement error, but if the data are real, the decision is difficult.
My advice: change the question, and determine exclusion criteria in advance.
For example, in Chile, a land of earthquakes, what counts as an atypical value? According to the measurements, anything over magnitude 8.0, up to the 9.5 that has been recorded. These values do occur, but a 9.5 appears perhaps once every 60 or 70 years, or even less often, whereas magnitude-8 events are more frequent. So what decision do you make?