Well, this is the usual way of training CNNs on very large datasets. Instead of computing the gradient on the whole dataset, you estimate it on a small subset (a mini-batch, typically of size 16 to 256), which allows you to stream the training samples.
The extreme (and original) online learning scheme is to process one example at a time, but nothing prevents you from accumulating samples in a buffer and performing one gradient-descent update once the buffer is full.
For very big networks, you may not have the computational resources to process more than one sample at a time. In that case, just use a very large momentum (0.99 or so).
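Here is a minimal PyTorch-style sketch of both ideas (sample-by-sample streaming, accumulation into a buffer before each update, and a high momentum). The model, the data stream and the hyper-parameters are illustrative assumptions, not a definitive recipe:

```python
import torch

# Dummy stand-in for a real per-sample data stream (e.g. read from disk or the network).
def stream_of_samples(n=1000):
    for _ in range(n):
        yield torch.randn(128), torch.randint(0, 10, ())

model = torch.nn.Linear(128, 10)                   # placeholder for a real CNN
loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.99)

buffer_size = 32                                   # number of samples accumulated per update
optimizer.zero_grad()
for i, (x, y) in enumerate(stream_of_samples()):
    loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
    (loss / buffer_size).backward()                # gradients accumulate across the buffer
    if (i + 1) % buffer_size == 0:
        optimizer.step()                           # one gradient-descent update per full buffer
        optimizer.zero_grad()
```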
In practice, you can use file formats that do not require loading the entire file into memory, such as HDF5, or you can read your data from a network connection as they arrive.
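For instance, a minimal h5py sketch (the file name and dataset keys are assumptions) that reads mini-batches from disk without ever loading the whole file could look like this:

```python
import h5py
import numpy as np

batch_size = 64
with h5py.File("training_data.h5", "r") as f:
    images, labels = f["images"], f["labels"]               # datasets stay on disk
    for start in range(0, len(images), batch_size):
        x = np.asarray(images[start:start + batch_size])    # only this slice is read into memory
        y = np.asarray(labels[start:start + batch_size])
        # ... run one gradient update on (x, y) here ...
```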
If your code (e.g. TensorFlow run from Anaconda) reads cloud-hosted datasets (MNIST, CelebA, face datasets, ...) as it runs, you can say you are doing online training within your own code.
We usually distinguish four optimization modes in machine learning:
1) Off-line / Batch
2) On-line
3) Recursive
4) Incremental
The Off-line / Batch mode is the classical learning mode. The estimation/learning dataset is considered as a whole. The optimal estimation model can then be determined either directly (by the Moore-Penrose generalized inverse, Section 2 from https://www.researchgate.net/publication/275590644_Learning_deep_representations_via_extreme_learning_machines ) when the optimization problem is linear, or iteratively by overall/batch gradient descent when facing nonlinearities ( https://arxiv.org/pdf/1609.04747.pdf ).
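As a toy illustration of these two batch-mode options (assuming a simple linear least-squares problem with made-up data):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                      # the whole estimation dataset, held in memory
w_true = rng.normal(size=5)
y = X @ w_true + 0.01 * rng.normal(size=200)

# 1) Direct solution via the Moore-Penrose generalized inverse (linear case).
w_direct = np.linalg.pinv(X) @ y

# 2) Iterative batch gradient descent (the route taken when the model is nonlinear).
w = np.zeros(5)
lr = 0.1
for _ in range(200):
    grad = X.T @ (X @ w - y) / len(X)              # gradient over the *whole* batch
    w -= lr * grad
```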
The On-line mode consists of estimating the parameters of the model iteratively by means of stochastic gradient descent while presenting the estimation data sequentially, one by one ( https://arxiv.org/pdf/1609.04747.pdf ). This has the advantage that all the data need not be stored in memory simultaneously.
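A minimal on-line sketch of the same toy problem, where each iteration sees exactly one example and only the parameter vector is kept between steps:

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(5)
lr = 0.05
for _ in range(10_000):                            # each iteration sees exactly one example
    x = rng.normal(size=5)
    y = x @ np.array([1.0, -2.0, 0.5, 3.0, 0.0]) + 0.01 * rng.normal()
    grad = (x @ w - y) * x                         # gradient of the squared error on this sample
    w -= lr * grad
```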
The Recursive mode is an On-line mode with, in addition, continuously optimal estimation of the model parameters ( https://www.researchgate.net/publication/6666254_A_Fast_and_Accurate_Online_Sequential_Learning_Algorithm_for_Feedforward_Networks and http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.101.5176&rep=rep1&type=pdf ). This mode has the advantage of adapting the model dynamically and optimally, without requiring complete re-learning each time new input data are processed.
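A classic example of this mode is recursive least squares. The sketch below (toy data, assumed linear model; OS-ELM applies essentially this update to the outputs of a random hidden layer) keeps the parameters at the least-squares optimum over all samples seen so far without re-learning from scratch:

```python
import numpy as np

d = 5
w = np.zeros(d)
P = 1e3 * np.eye(d)                                # inverse-covariance estimate, large initial value
rng = np.random.default_rng(0)
for _ in range(1_000):                             # samples arrive one at a time
    x = rng.normal(size=d)
    y = x @ np.array([1.0, -2.0, 0.5, 3.0, 0.0]) + 0.01 * rng.normal()
    Px = P @ x
    k = Px / (1.0 + x @ Px)                        # gain vector
    w = w + k * (y - x @ w)                        # correct the current prediction error
    P = P - np.outer(k, Px)                        # rank-one update of the inverse covariance
```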
The Incremental mode denotes an optimal dynamic building of the estimation model in the course of learning ( https://www.researchgate.net/profile/Chee_Siew/publication/6928613_Universal_Approximation_Using_Incremental_Constructive_Feedforward_Networks_With_Random_Hidden_Nodes/links/00b4952f8672bc0621000000.pdf ). Such an approach is a valuable way to mitigate the well-known overfitting problems inherent to CNNs.
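A much simplified sketch of the incremental idea (loosely in the spirit of I-ELM, with made-up data and a toy target): hidden nodes with random weights are added one at a time, and each new node's output weight is fitted to the residual error of the current network:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(300, 2))
y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1]            # toy target function
residual = y.copy()
nodes, betas = [], []
for _ in range(50):                                # grow the network node by node
    a, b = rng.normal(size=2), rng.normal()        # random input weights and bias of the new node
    h = np.tanh(X @ a + b)                         # output of the new hidden node
    beta = (residual @ h) / (h @ h)                # output weight that best fits the residual
    residual -= beta * h                           # the new node absorbs part of the error
    nodes.append((a, b))
    betas.append(beta)
```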
Note: the extra qualifier Recurrent refers to a model whose outputs are fed back as inputs; such a model is similar to a temporal state model.
Article A Fast and Accurate Online Sequential Learning Algorithm for Feedforward Networks
Article Universal Approximation Using Incremental Constructive Feedforward Networks With Random Hidden Nodes
Article Learning deep representations via extreme learning machines