One way to detect outliers in a dataset is the Z-score. However, I can't find any reference that gives a cut-off level. Could you please tell me the most reliable and widely used cut-off level among scholars?
If there were a single such cut score which unambiguously separated genuine cases from outliers (or, "outright liars"), I believe we all would have heard of it!
There isn't one; you simply have to make a judgment call. No matter what you select as a "critical" z-score threshold, do understand that there will inevitably be instances of false positive ("outlier") and false negative ("non-outlier") cases involved. As well, the shape of the distribution matters.
If a distribution is normal in shape, then cases having a z-score magnitude of 2.58 or more would occur less than 1% of the time. A threshold of +/-2.81 would represent a value beyond which no more than 0.5% of cases would fall. Assuming a normal distribution, I don't think many people would consider a z-score threshold of +/-3 too aggressive a choice for flagging cases as outliers.
However, for non-normal distributions, these thresholds would under-estimate the number/proportion of false positives for outlying cases. As well, if your research aim is to sort cases from overlapping distributions (e.g., mixture models), then a different approach should be applied.
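To make the thresholds above concrete, here is a minimal sketch of flagging cases by z-score; the data, the seed, and the threshold of 3 are all made up for illustration:

```python
import numpy as np

# Simulated data: 1000 roughly normal values plus two artificial outliers
rng = np.random.default_rng(0)
data = rng.normal(loc=50, scale=10, size=1000)
data = np.append(data, [120.0, -15.0])

# Standardize, then flag cases whose |z| exceeds the chosen threshold
z = (data - data.mean()) / data.std(ddof=1)
threshold = 3.0
outliers = data[np.abs(z) > threshold]
print(outliers)
```

Note that with 1000 genuinely normal points and a threshold of 3, a couple of perfectly legitimate cases will typically be flagged as well, which is exactly the false-positive issue described above.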
If you are looking to identify outlier data, drawing a simple box plot or using the Grubbs' test or the ROUT method can be helpful.
Most statistical software packages can run these methods as well.
Some of them (Grubbs' test or Dixon's Q test) are formal statistical tests and therefore yield a probability value (p-value), so you do not have to choose a cut-off yourself.
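Grubbs' test is short enough to sketch by hand if your software lacks it. The helper below is a hypothetical implementation of the two-sided test (the `grubbs_test` name and the sample data are made up), using the standard critical-value formula based on the t distribution:

```python
import numpy as np
from scipy import stats

def grubbs_test(x, alpha=0.05):
    """Two-sided Grubbs' test for a single outlier."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    mean, sd = x.mean(), x.std(ddof=1)
    G = np.max(np.abs(x - mean)) / sd                # Grubbs statistic
    t = stats.t.ppf(1 - alpha / (2 * n), n - 2)      # critical t value
    G_crit = ((n - 1) / np.sqrt(n)) * np.sqrt(t**2 / (n - 2 + t**2))
    return G, G_crit, G > G_crit

sample = [9.8, 10.1, 10.0, 9.9, 10.2, 15.0]          # 15.0 is the suspect point
G, G_crit, is_outlier = grubbs_test(sample)
```

Keep in mind that Grubbs' test assumes the underlying (non-outlier) data are normally distributed, and it tests for one outlier at a time.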
I agree that there is no formal cut-off value for outliers. Instead, I would plot the distribution of the Z-scores and look for values that are "detached" from the rest of the distribution (which is literally what "outliers" means).
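One rough way to operationalize "detached" without plotting is to sort the z-scores and look at the gaps between consecutive values; a detached point sits after an unusually large gap. A sketch with made-up data:

```python
import numpy as np

# 200 roughly normal values plus one clearly detached point
rng = np.random.default_rng(1)
x = np.append(rng.normal(size=200), [8.0])
z = (x - x.mean()) / x.std(ddof=1)

# Sort the z-scores and find the value sitting after the largest gap
z_sorted = np.sort(z)
gaps = np.diff(z_sorted)
value_after_largest_gap = z_sorted[np.argmax(gaps) + 1]
print(value_after_largest_gap)
```

This is only a heuristic (the largest gap may fall in the bulk of a small sample), but it mirrors what the eye does on a plot.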
Since z-scores go out to infinity, there is no single "cutoff". A lot depends on your risk. Traditionally, two standard deviations (z-score of < -2 or > +2) is used; that represents a spread of about 95% of the data IF your data are normal. Chebyshev's inequality gives you a worst case of 75% (1 - 1/z²) within two standard deviations, for any distribution. Statistical Process Control uses 3 standard deviations. The "Six Sigma" folks use 6. I once heard a Boeing (aircraft) presentation where they were using 9 for certain aircraft manufacturing tests.
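The contrast between the normal-theory coverage and Chebyshev's worst-case guarantee is easy to tabulate; a small sketch for a few of the thresholds mentioned above:

```python
import math

for k in (2, 3, 6):
    # Coverage within k standard deviations if the data are exactly normal
    normal_cov = math.erf(k / math.sqrt(2))
    # Chebyshev's guaranteed minimum coverage for ANY distribution
    chebyshev = 1 - 1 / k**2
    print(f"k={k}: normal {normal_cov:.5f}, Chebyshev >= {chebyshev:.5f}")
```

At k = 2 the gap is dramatic (about 95% vs a guaranteed 75%), which is why the shape of the distribution matters so much when choosing a threshold.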
BOTTOM LINE - it all depends upon your risk level and to some extent the distribution of the data (Normal vs ???). There is no "ANSWER" to your question.
The approach you adopt, the sensitivity of the subject, the dispersion of the data, the number of outliers at different cut-offs, and so on determine the suitable cut-off. Sometimes, to avoid losing a significant percentage of the data, we use wide bands. Sometimes, when the distribution is close to normal, narrower bands are preferable.