The following links contain publications which fulfill your request:
1-JMLR: Workshop and Conference Proceedings 17 (2011) 48–55, 2nd Workshop on Applications of Pattern Analysis
MOA Concept Drift Active Learning Strategies for Streaming Data
Abstract
We present a framework for active learning on evolving data streams, as an extension to the MOA system. In learning to classify streaming data, obtaining the true labels may require major effort and may incur excessive cost. Active learning focuses on learning an accurate model with as few labels as possible. Streaming data poses additional challenges for active learning, since the data distribution may change over time (concept drift) and classifiers need to adapt. Conventional active learning strategies concentrate on querying the most uncertain instances, which are typically concentrated around the decision boundary. If changes do not occur close to the boundary, they will be missed and classifiers will fail to adapt. We propose a software system that implements active learning strategies, extending the MOA framework. This software is released under the GNU GPL license.
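To make the querying idea concrete, below is a minimal sketch of a variable-threshold uncertainty strategy of the kind this paper describes. The classifier interface, function names, and the adjustment step `s` are illustrative assumptions, not MOA's actual API (MOA itself is written in Java).

```python
# Sketch of a variable-threshold uncertainty strategy for stream active
# learning. The classifier is assumed to expose incremental
# predict_proba / partial_fit methods; names and defaults are illustrative.

def variable_uncertainty(stream, classifier, theta=1.0, s=0.01):
    """Query a label only when the classifier is uncertain, and adapt the
    threshold so the labeling budget is spent steadily over the stream."""
    for x, oracle_label in stream:
        max_posterior = max(classifier.predict_proba(x))
        if max_posterior < theta:
            # Uncertain instance: pay for the label and tighten the threshold.
            classifier.partial_fit(x, oracle_label)
            theta *= (1.0 - s)
        else:
            # Confident instance: skip labeling and relax the threshold so
            # queries do not stop once the classifier becomes confident.
            theta *= (1.0 + s)
```

A fixed threshold would stop querying as soon as the classifier becomes confident; letting it drift up and down is what keeps the labeling rate roughly constant over the stream.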
3-Augmented Query Strategies for Active Learning in Stream Data Mining
Abstract. Active learning is used in situations where unlabeled data is abundant but manually labeling it is costly. Depending on the available budget, only a subset of the unlabeled instances can be sent to the oracle for manual labeling. Thus, the query strategy, i.e., how relevant instances are selected to be sent to the oracle, plays an important role in active learning. Although active learning is a well-established research area, only a few works have studied it in the context of stream data mining. Active learning for stream data is more challenging than for static data because repeating a query is not feasible, as revisiting the data is almost impossible. In this paper, we propose two augmented query strategies for active learning in stream data mining, namely Margin Sampling with Variable Uncertainty (MSVU) and Entropy Sampling with Uncertainty using Randomization (ESUR). These two strategies are derived from and improve on the existing Variable Uncertainty (VU) and Uncertainty using Randomization (UR) methods, respectively. We evaluate the effectiveness of the proposed MSVU and ESUR strategies by comparing them against the original VU and UR on six different datasets, using two base classifiers: Leveraging Bagging (LB) and Single Classifier Drift (SCD). Experimental results show that our proposed strategies offer promising outcomes on various datasets and for detecting concept drift in the data.
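The abstract does not spell out the internals of MSVU and ESUR, but the ingredients they build on, margin-based and entropy-based uncertainty measures plus a randomized query threshold in the spirit of UR, are standard. The sketch below only illustrates those ingredients; the function names are ours, and the exact MSVU/ESUR rules are defined in the paper itself.

```python
import math
import random

# Illustrative ingredients of the strategies named above: margin-based and
# entropy-based uncertainty measures, plus a randomized query threshold in
# the spirit of "uncertainty using randomization".

def margin_uncertainty(probs):
    # Small gap between the two most probable classes -> high uncertainty.
    top1, top2 = sorted(probs, reverse=True)[:2]
    return 1.0 - (top1 - top2)

def entropy_uncertainty(probs):
    # High entropy of the posterior distribution -> high uncertainty.
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def should_query(uncertainty, theta, sigma=0.1):
    # Randomizing the threshold spreads queries beyond the region right at
    # the decision boundary, which helps notice drift occurring elsewhere.
    return uncertainty > theta * random.gauss(1.0, sigma)
```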
Concept drift refers to a learning problem that is non-stationary over time. The training and the application data often mismatch in real-life problems [61]. In this report we present the context of the concept drift problem. We focus on the issues relevant to adaptive training set formation. We present the framework and terminology, and formulate a global picture of how concept drift learners are designed. We start by formalizing the framework for concept drifting data in Section 1. In Section 2 we discuss the adaptivity mechanisms of concept drift learners. In Section 3 we overview the principal mechanisms of concept drift learners, give a general picture of the available algorithms, and categorize them based on their properties. Section 5 discusses the related research fields and Section 6 groups and presents major concept drift applications. This report is intended to give a bird's-eye view of the concept drift research field, provide context for the research, and position it within the broad spectrum of related research fields and applications.
As some colleagues mentioned, there are several works on active learning applied to data stream analysis, but I do not know of any work that focuses on drift detection only. Most works apply active learning to online learners or to sliding-window schemes. I think it would not be hard to apply an active learning or semi-supervised approach to a typical drift detector such as DDM, but what would the results be? Drift detectors usually assume access to the classifier's performance, and using only a portion of labeled examples will probably delay the detection of the drift. If you would like to discuss the idea, please send me a private e-mail; we can cooperate on this topic ([email protected])
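For reference, a DDM-style detector (after Gama et al.'s Drift Detection Method) looks roughly like the sketch below. In the setting discussed above it would be updated only on the instances whose labels were actually queried, so with a small labeling budget the error statistics move more slowly and a drift alarm can indeed come later. The class and parameter names are illustrative, not taken from any particular library.

```python
import math

# Rough sketch of a DDM-style detector fed only with labeled (queried)
# instances; error is 1 when the classifier misclassified that instance.

class SimpleDDM:
    def __init__(self, warning_level=2.0, drift_level=3.0, min_samples=30):
        self.warning_level = warning_level
        self.drift_level = drift_level
        self.min_samples = min_samples
        self.reset()

    def reset(self):
        self.n = 0
        self.p = 1.0                       # running error rate
        self.s = 0.0                       # its standard deviation
        self.p_min = float("inf")
        self.s_min = float("inf")

    def update(self, error):
        self.n += 1
        self.p += (error - self.p) / self.n
        self.s = math.sqrt(self.p * (1.0 - self.p) / self.n)
        if self.n < self.min_samples:
            return "in-control"
        if self.p + self.s < self.p_min + self.s_min:
            self.p_min, self.s_min = self.p, self.s
        if self.p + self.s > self.p_min + self.drift_level * self.s_min:
            self.reset()                   # signal drift and start over
            return "drift"
        if self.p + self.s > self.p_min + self.warning_level * self.s_min:
            return "warning"
        return "in-control"
```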