Model averaging via penalized regression for tracking concept drift

Kyupil Yeon, Moon Sup Song, Yongdai Kim, Hosik Choi, Cheolwoo Park

Research output: Contribution to journal › Article › peer-review

13 Scopus citations

Abstract

A supervised learning algorithm aims to build a prediction model using training examples. This paradigm typically assumes that the underlying distribution and the true input-output dependency do not change. However, these assumptions often fail to hold, especially in data streams. This phenomenon is known as concept drift. We propose a new model-combining algorithm for tracking concept drift in data streams. The final predictive ensemble model takes the form of a weighted average with a ridge regression combiner. The coefficients of the combiner are determined by ridge regression subject to the constraints that the coefficients are nonnegative and sum to 1. The proposed algorithm is devised via a new measure of concept drift: the angle between the weights estimated from data and the optimal weight vector obtained under no concept drift. It is shown that the ridge tuning parameter plays a crucial role in forcing the proposed algorithm to adapt to concept drift. Our main findings are that (i) the proposed algorithm can achieve the optimal weights in the case of no concept drift if the tuning parameter is sufficiently large, and (ii) the angle increases monotonically as the tuning parameter decreases. These imply that if the tuning parameter is well controlled, the algorithm can produce weights that reflect the degree of concept drift measured by the angle. Using various numerical examples, it is shown that the proposed algorithm can track concept drift better than other existing ensemble methods. Supplemental materials (computer code and an R package) are available online.
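The constrained combiner described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (their code and R package are in the supplemental materials); it assumes a squared-error loss, solves the constrained ridge problem by projected gradient descent, and uses the equal-weight vector as a stand-in for the no-drift optimal weights. The function names, the synthetic data, and the choice of solver are all assumptions made for illustration.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index kept positive
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def ridge_simplex_weights(F, y, lam, n_iter=3000):
    """Minimize ||y - F w||^2 + lam * ||w||^2 over the simplex.

    F holds one column of predictions per base model (hypothetical setup).
    """
    m = F.shape[1]
    w = np.full(m, 1.0 / m)
    # Step size 1/L, where L bounds the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * (np.linalg.norm(F, 2) ** 2 + lam))
    for _ in range(n_iter):
        grad = 2.0 * (F.T @ (F @ w - y)) + 2.0 * lam * w
        w = project_to_simplex(w - step * grad)
    return w

def angle_to_uniform(w):
    """Angle (radians) between w and the equal-weight direction.

    Assumption of this sketch: equal weights play the role of the
    no-drift reference vector.
    """
    u = np.full(w.size, 1.0 / np.sqrt(w.size))
    cos = np.clip(w @ u / np.linalg.norm(w), -1.0, 1.0)
    return np.arccos(cos)

# Synthetic example: three base models with different error levels.
rng = np.random.default_rng(0)
n = 200
y = rng.normal(size=n)
F = np.column_stack([y + rng.normal(scale=s, size=n) for s in (2.0, 1.0, 0.2)])

lams = [1e-2, 1.0, 1e2, 1e4, 1e6]
weights = [ridge_simplex_weights(F, y, lam) for lam in lams]
angles = [angle_to_uniform(w) for w in weights]
for lam, w, a in zip(lams, weights, angles):
    print(f"lam={lam:g}  weights={np.round(w, 3)}  angle={a:.4f}")
```

In this sketch a very large penalty pulls the weights toward equal weighting, and the angle to the equal-weight direction shrinks as the penalty grows, which mirrors the qualitative behavior the abstract reports for the tuning parameter.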

Original language: English
Pages (from-to): 457-473
Number of pages: 17
Journal: Journal of Computational and Graphical Statistics
Volume: 19
Issue number: 2
State: Published - Jun 2010

Keywords

  • Data stream
  • Drifting concept
  • Ensemble method
  • Model combiner
