I understand that we use cross-validation to estimate uncertainty during the model selection phase. Could the technique be used to produce confidence intervals as well?
I am trying to estimate the plantation area of a crop in a region from satellite data. I want to choose a model first and then quantify the uncertainty of the final model at the end of the season. I settled on K-fold cross-validation on a training set (sampled from the region with stratification, to approximate the real situation) for model selection, and bootstrapping on a test set (separately and randomly sampled from the same region) for the confidence intervals.
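For concreteness, here is a minimal sketch of the workflow I have in mind (scikit-learn; the synthetic data, the two candidate models, and the MAE metric are placeholders standing in for my satellite features and crop-area labels):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import KFold, cross_val_score, train_test_split

rng = np.random.default_rng(0)

# Placeholder data; in my case the training set is a stratified sample of
# the study area and the test set is a separate random sample.
X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# --- Step 1: model selection via K-fold CV on the training set ---
candidates = {
    "ridge": Ridge(alpha=1.0),
    "forest": RandomForestRegressor(n_estimators=200, random_state=0),
}
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = {
    name: cross_val_score(model, X_train, y_train, cv=cv,
                          scoring="neg_mean_absolute_error").mean()
    for name, model in candidates.items()
}
best_name = max(scores, key=scores.get)
best_model = candidates[best_name].fit(X_train, y_train)

# --- Step 2: bootstrap the held-out test set for a confidence interval ---
n_boot = 2000
n_test = len(y_test)
preds = best_model.predict(X_test)  # predict once; resample (truth, prediction) pairs
boot_errors = np.empty(n_boot)
for b in range(n_boot):
    idx = rng.integers(0, n_test, n_test)  # resample test cases with replacement
    boot_errors[b] = mean_absolute_error(y_test[idx], preds[idx])

ci_low, ci_high = np.percentile(boot_errors, [2.5, 97.5])
print(f"selected model: {best_name}")
print(f"test MAE 95% percentile CI: [{ci_low:.2f}, {ci_high:.2f}]")
```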
What confuses me is why I shouldn't simply merge the training set with the test set and use cross-validation to estimate the uncertainty. Would the cross-validation estimate be biased in that case, because the training set was not sampled randomly, or for some other reason? Or, on the contrary, could it be a more reliable estimate, since the merged sample is larger?
Any resources or suggestions on this topic are highly appreciated. Thanks!