Cross validation is a method applied to a model and a data set to estimate the out-of-sample error. It has become quite popular because of its simplicity and utility; there is even a popular statistics message board named after the method (Cross Validated: stats.stackexchange.com).
When we fit a model to a data set, we do so by minimizing some sort of loss function; most often, we will use the squared error loss function for simplicity. It is well known, and should be quite obvious, that estimating the resulting prediction error using the same data that we used to fit the model will produce overly optimistic results. Therefore, it is common practice to test the model on a new data set to provide a better estimate of the out-of-sample prediction error. However, when data collection is cost prohibitive, we may prefer not to "throw away" a significant portion of our data in a test set. In this case we may turn to k-fold cross validation, the most popular flavor of which is 10-fold cross validation.

In k-fold cross validation, the data set is split randomly into k partitions. We then fit our model to a data set consisting of k-1 of the original k parts and use the remaining part for validation. That is, we estimate the out-of-sample error using the portion of the data left out of the fitting procedure. We repeat this k times, and our estimate of the out-of-sample error is the average over the k validation runs.
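The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `k_fold_cv_mse` and the ordinary-least-squares model used in the example are my own assumptions, chosen only to make the sketch runnable.

```python
import numpy as np

def k_fold_cv_mse(X, y, fit, predict, k=10, seed=0):
    """Estimate out-of-sample mean squared error by k-fold cross validation.

    `fit(X_train, y_train)` returns a fitted model; `predict(model, X_test)`
    returns predictions. Both are supplied by the caller.
    """
    # Split the row indices randomly into k (nearly equal) partitions.
    folds = np.array_split(np.random.default_rng(seed).permutation(len(y)), k)
    errors = []
    for i in range(k):
        test = folds[i]                                       # held-out part
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train], y[train])                       # fit on k-1 parts
        pred = predict(model, X[test])                        # validate on the rest
        errors.append(np.mean((y[test] - pred) ** 2))         # fold-level MSE
    return np.mean(errors)                                    # average over k runs

# Illustrative use with ordinary least squares on synthetic data.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))
y = X @ np.array([1.0, -2.0]) + rng.normal(scale=0.1, size=100)
mse = k_fold_cv_mse(
    X, y,
    fit=lambda Xt, yt: np.linalg.lstsq(Xt, yt, rcond=None)[0],
    predict=lambda beta, Xt: Xt @ beta,
)
```

Because the true noise here has variance 0.01, the cross-validated MSE estimate should come out close to that value.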
There are many excellent references available. The following book by Hastie et al. is quite good and freely available online; it has a short section on cross validation that may be of interest.
Hastie, T., & Tibshirani, R. (2005). The elements of statistical learning: Data mining, inference and prediction (2nd ed.). Springer. Retrieved from http://www.springerlink.com/index/D7X7KX6772HQ2135.pdf
Your answer is really informative, but I still have a doubt. I understand the reason for k-fold or 10-fold cross validation. What is the need for 10-times 10-fold cross validation, and what is the actual procedure? Is it simply running the 10-fold cross validation method 10 times? That would mean the loop runs 100 times in total, and finally we take the average of all 100 outputs?
10-fold cross validation performs the fitting procedure a total of ten times, with each fit performed on a training set consisting of 90% of the total data set selected at random, and the remaining 10% used as a hold-out set for validation.
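On the "10-times" part of the question: the usual reading of 10-times 10-fold cross validation is exactly the one suggested above, i.e. repeat the whole 10-fold procedure 10 times, each time with a fresh random partition of the data, and average all 100 fold estimates. Repeating with different partitions reduces the variance that comes from any one particular random split. A minimal self-contained sketch (the function name and the ordinary-least-squares model are illustrative assumptions, not anything from this thread):

```python
import numpy as np

def repeated_kfold_mse(X, y, k=10, repeats=10, seed=0):
    """Repeat k-fold cross validation `repeats` times, each repeat using a
    fresh random partition, and average all k * repeats fold-level MSEs.
    The model is a plain least-squares fit, just to keep the sketch runnable.
    """
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(repeats):
        # New random partition into k folds for each repeat.
        folds = np.array_split(rng.permutation(len(y)), k)
        for i in range(k):
            test = folds[i]
            train = np.concatenate([folds[j] for j in range(k) if j != i])
            beta = np.linalg.lstsq(X[train], y[train], rcond=None)[0]
            errors.append(np.mean((y[test] - X[test] @ beta) ** 2))
    # For k=10, repeats=10 this averages 100 fold estimates in total.
    return np.mean(errors), len(errors)
```

So yes: with k=10 and 10 repeats, the model is fit 100 times, and the final estimate is the average over those 100 validation runs.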
Thanks Shane. But my doubt is still not cleared up. I understand 10-fold cross validation as you have explained it. But what is 10-times 10-fold cross validation? Is it doing 10-fold cross validation 10 times? That is, do we need to do a total of 100 (10 x 10-fold) cross validation runs?