The goal of many algorithms in (biomedical) signal processing is, at the end of the day, to perform some sort of classification, e.g., binary classification. The binary labels (categorical response variables) could, for instance, represent the presence or absence of a disease, or the binary signal quality of a physiological recording.

When working with time series data, such as the ECG, EEG, blood pressure etc., one can extract features from these signals that are then used for classification.

With the ECG, for example, one could use the duration and height of the QRS complex as features for classifying beats as normal/abnormal. Each training sample, corresponding to one heartbeat, would consist of the feature vector and the label.
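For concreteness, here is a minimal sketch of that setup. The segmentation into beats is assumed to be given, and the feature definitions are toy stand-ins, not a recommended feature set:

```python
import numpy as np

def beat_features(beat: np.ndarray, fs: float) -> np.ndarray:
    """Toy per-beat features: QRS height and a crude width estimate."""
    qrs_height = beat.max() - beat.min()
    # Width: duration (in seconds) above half the peak amplitude.
    half_max = beat.min() + 0.5 * qrs_height
    qrs_width = np.sum(beat > half_max) / fs
    return np.array([qrs_height, qrs_width])

# beats: list of fixed-length windows centred on each detected R peak,
# labels: 0 = normal, 1 = abnormal (both assumed given here)
# X = np.vstack([beat_features(b, fs=360.0) for b in beats])
# y = np.asarray(labels)
```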

Now, when training a classifier, be it an SVM, logistic regression, or some sort of decision tree/forest, the samples are clearly NOT independent: consecutive beats from the same recording (and the same subject) are correlated. These classifiers, however, assume that the samples are i.i.d., and I observe that decision trees, for example, overfit heavily on such data. This is also problematic for ensemble methods such as random forests, which rely on bagging (resampling the training data).
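To make the overfitting concrete, here is a small self-contained sketch with synthetic data. The per-recording offsets are made up purely to mimic the dependence between beats of the same recording; scikit-learn's GroupKFold keeps all beats of one recording in a single fold:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold, GroupKFold, cross_val_score

rng = np.random.default_rng(0)
n_rec, beats_per_rec = 20, 100
groups = np.repeat(np.arange(n_rec), beats_per_rec)
# Recording-level offsets make beats within one recording strongly dependent.
offsets = rng.normal(0, 3, size=(n_rec, 5))[groups]
X = offsets + rng.normal(size=(n_rec * beats_per_rec, 5))
y = np.repeat(rng.integers(0, 2, size=n_rec), beats_per_rec)  # one label per recording

clf = RandomForestClassifier(n_estimators=200, random_state=0)
naive = cross_val_score(clf, X, y, cv=KFold(5, shuffle=True, random_state=0))
grouped = cross_val_score(clf, X, y, cv=GroupKFold(5), groups=groups)
print(f"naive K-fold accuracy: {naive.mean():.2f}")   # optimistically high
print(f"group-wise accuracy:   {grouped.mean():.2f}") # close to chance
```

In this toy setup, the naive score is inflated because the forest can match test beats to training beats from the same recording, whereas the group-wise score collapses to chance on unseen recordings.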

What are common approaches to alleviating this problem, or to working with dependent samples in a classification procedure?

I know that alternative methods such as hidden Markov models are well suited to time series data, but I am specifically interested in the supervised classification setup using this type of learner (SVM, logistic regression, …).
