The instability of the Pearson correlation coefficient in the presence of coincidental outliers

Yunmi Kim, Tae Hwan Kim, Tolga Ergün

Research output: Contribution to journal › Article › peer-review

85 Scopus citations

Abstract

It is well known that any statistic based on sample averages can be sensitive to outliers. Examples include conventional moment-based statistics such as the sample mean, the sample variance, and the sample covariance of a set of observations on two variables. Given that the sample correlation is defined as the sample covariance divided by the product of the sample standard deviations, one might suspect that the impact of outliers on the correlation coefficient would be absent or barely noticeable because of a 'dampening effect', i.e., the effects of outliers on the numerator and the denominator of the correlation coefficient can cancel each other out. In this paper, we formally investigate this issue. Contrary to such an expectation, we show analytically and by simulations that the distortion caused by outliers in the behavior of the correlation coefficient can be fairly large in some cases, especially when outliers are present in both variables at the same time. We call such outliers 'coincidental outliers.' We consider some robust alternative measures and compare their performance in the presence of such coincidental outliers.
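The effect described in the abstract can be illustrated with a small numerical sketch. The example below is not taken from the paper: the sample size, the seed, the outlier position (15, 15), and the choice of Spearman's rank correlation as the robust alternative are all assumptions made for illustration, and the abstract does not say which robust measures the authors actually compare. It draws two independent standard normal samples, appends a single observation that is extreme in both variables at once, and compares how much the Pearson and rank-based coefficients move.

```python
import numpy as np

def pearson(x, y):
    """Sample covariance divided by the product of sample standard deviations.

    The 1/(n-1) factors cancel, so only centered sums are needed.
    """
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))

def spearman(x, y):
    """Pearson correlation of the ranks.

    As written this is valid only when there are no ties, which holds
    for the continuous data used here.
    """
    rank_x = np.argsort(np.argsort(x)).astype(float)
    rank_y = np.argsort(np.argsort(y)).astype(float)
    return pearson(rank_x, rank_y)

# Illustrative settings (assumptions, not values from the paper).
rng = np.random.default_rng(0)
n = 200
x = rng.standard_normal(n)
y = rng.standard_normal(n)  # drawn independently of x

# One coincidental outlier: extreme in both variables at the same time.
x_out = np.append(x, 15.0)
y_out = np.append(y, 15.0)

pearson_clean = pearson(x, y)
pearson_out = pearson(x_out, y_out)
spearman_clean = spearman(x, y)
spearman_out = spearman(x_out, y_out)

print(f"Pearson : clean={pearson_clean:+.3f}  with outlier={pearson_out:+.3f}")
print(f"Spearman: clean={spearman_clean:+.3f}  with outlier={spearman_out:+.3f}")
```

Under these assumptions, the single added point contributes roughly 15 × 15 = 225 to the centered cross-product and about the same amount to each sum of squares, while the 200 clean observations contribute about 200 to each sum of squares and close to zero to the cross-product. The numerator therefore grows far more, in relative terms, than the denominator, so the two effects do not cancel. The rank-based coefficient shifts only slightly, because the outlier merely takes the top rank in each variable.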

Original language: English
Pages (from-to): 243-257
Number of pages: 15
Journal: Finance Research Letters
Volume: 13
State: Published - 1 May 2015

Keywords

  • Correlation
  • Outliers
  • Robust statistic
