What is the statistical meaning of testing in machine learning?

J. A. Hageman

The idea is that your validation sample is representative of the population. In that sense, poor accuracy on the validation sample indicates that your machine learning model has not generalized beyond the training set. Trying one validation set after another until one yields sufficient accuracy will lead to false positives.
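The false-positive risk can be put in numbers with a small sketch. It assumes, purely for illustration, a binary classifier with no real skill (true accuracy 0.5), independent validation sets of 20 samples each, and a "sufficient accuracy" cutoff of 14 correct out of 20; the function names are hypothetical.

```python
from math import comb

def binom_tail(n, k_min, p):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

def prob_false_positive(n, min_correct, p_true, n_sets):
    """Chance that at least one of n_sets independent validation sets of
    size n shows min_correct or more correct predictions, when the model's
    true accuracy is p_true."""
    single = binom_tail(n, min_correct, p_true)
    return 1 - (1 - single) ** n_sets

# Illustrative assumptions: chance-level model, 20 samples per validation set,
# "success" declared at 14/20 correct (70% accuracy).
one_try = prob_false_positive(n=20, min_correct=14, p_true=0.5, n_sets=1)
ten_tries = prob_false_positive(n=20, min_correct=14, p_true=0.5, n_sets=10)
print(f"one validation set:  {one_try:.3f}")
print(f"ten validation sets: {ten_tries:.3f}")
```

Under these assumptions a single validation set passes the cutoff by luck only a few percent of the time, but the chance that at least one of ten does is several times larger.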

Marina Sapir

How would this idea that "validation sample is representative of the population" be justified for a small sample? In biomedical studies, for example, both training and test sets are necessarily small.
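To make the small-sample concern concrete, here is a minimal sketch under one common framing that is an assumption here, not something established in this thread: each test prediction is treated as an independent Bernoulli trial, so test accuracy is an estimate of a binomial proportion. The Wilson score interval then shows how wide the uncertainty is for small test sets. The function name and the example counts are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def wilson_interval(correct, n, confidence=0.95):
    """Wilson score interval for a binomial proportion (here: test accuracy).
    Assumes the n test predictions are independent draws from the population."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = correct / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# Same observed accuracy (80%) on a small and a large test set.
small = wilson_interval(16, 20)
large = wilson_interval(1600, 2000)
print(f"n=20:   {small[0]:.3f} to {small[1]:.3f}")
print(f"n=2000: {large[0]:.3f} to {large[1]:.3f}")
```

With 16 of 20 correct, the 95% interval runs from roughly 0.58 to 0.92, so the same observed 80% is compatible with both a mediocre and a very good model; with 2000 samples the interval is far narrower.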

Adwaye Rambojun

The testing part of a machine learning model may refer to model evaluation while training or model evaluation once trained. I will explain the first, as the second is quite case specific. Say you are using data X_train to train some model, and assume that you have some additional data X_test that you keep aside.

While training, people often report the test set accuracy (accuracy of the model on X_test) and the train set accuracy (accuracy of the model on X_train). If these two diverge while training, then the model is said to be over-fitting the data. That is, it is too specific to X_train, and does not generalise well to data it has not seen.
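A minimal sketch of such a gap, using only the standard library. Everything in it is an illustrative assumption: a one-dimensional synthetic problem with 20% label noise, and a 1-nearest-neighbour classifier chosen because it memorises its training data. `train_set` and `test_set` play the roles of X_train and X_test (with labels attached). It shows a single snapshot of the gap, not a curve over training epochs.

```python
import random

def make_data(n, noise, rng):
    """Points x in [0, 1) labelled 1 if x > 0.5, with each label flipped
    with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < noise:
            label = 1 - label
        data.append((x, label))
    return data

def predict_1nn(train, x):
    """Return the label of the training point closest to x."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

def accuracy(train, data):
    """Fraction of points in `data` that the 1-NN model predicts correctly."""
    correct = sum(predict_1nn(train, x) == y for x, y in data)
    return correct / len(data)

rng = random.Random(0)  # fixed seed so the run is repeatable
train_set = make_data(200, noise=0.2, rng=rng)
test_set = make_data(200, noise=0.2, rng=rng)

train_acc = accuracy(train_set, train_set)  # each point is its own neighbour
test_acc = accuracy(train_set, test_set)
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

The model scores perfectly on the data it memorised, noise included, and noticeably worse on held-out data drawn the same way, which is the pattern described above as over-fitting.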

Coming to your second question: as in any experimental design, those collecting data need to make sure that the dataset is rich enough. In machine learning language, we would say it captures enough variability in the features being examined. Once a model is trained, it is still important to validate it against data coming from outside X_test and X_train, and to assimilate new information by incorporating examples that cause the model to give wrong predictions. This is akin to learning in human beings.

I hope this helps!

I never asked about training.

You never answered my only question about the statistical meaning of testing.