Regularized boxplot via convex clustering

Hosik Choi, J. C. Poythress, Cheolwoo Park, Jong June Jeon, Changyi Park

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

A boxplot is a simple and effective exploratory data analysis tool for graphically summarizing a distribution of data. However, when the quartiles in a boxplot are inaccurately estimated, the errors can carry over into subsequent analyses. In this paper, we consider the problem of constructing boxplots in a bivariate setting with a categorical covariate with multiple subgroups, and assume that some of these boxplots can be clustered. We propose to use this grouping property to improve the estimation of the quartiles. We demonstrate that the proposed method estimates the quartiles more accurately than the usual boxplot. We also show that the proposed method identifies outliers effectively as a consequence of the more accurate quartiles, and that it possesses a clustering effect due to the grouping property. We then apply the proposed method to annual maximum precipitation data in South Korea and present its clustering results.
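
For orientation, the "usual boxplot" that the abstract uses as its baseline estimates the quartiles of each subgroup independently and flags points beyond Tukey's fences. The sketch below illustrates only that baseline, not the proposed regularized method; the function name `boxplot_summary`, the conventional 1.5 whisker multiplier, the choice of quantile interpolation rule, and the toy data are all assumptions made for illustration.

```python
from statistics import quantiles


def boxplot_summary(values, whisker=1.5):
    """Quartiles, Tukey fences, and flagged outliers for one subgroup.

    Assumption: linear-interpolation ("inclusive") sample quartiles and the
    conventional 1.5 * IQR fences; the paper may use a different convention.
    """
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - whisker * iqr
    upper = q3 + whisker * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_fence": lower,
        "upper_fence": upper,
        "outliers": outliers,
    }


# Hypothetical toy data: one categorical covariate with two subgroups.
groups = {
    "A": [1, 2, 3, 4, 5, 6, 7, 8, 9, 50],
    "B": [2, 4, 4, 5, 6, 7, 9],
}

# Each subgroup is summarized on its own; no information is shared.
summaries = {name: boxplot_summary(vals) for name, vals in groups.items()}

for name, summary in summaries.items():
    print(name, summary)
```

Because each subgroup is summarized in isolation, a small subgroup yields noisy quartiles and therefore noisy fences. That is the weakness the abstract proposes to address by borrowing strength across subgroups whose boxplots can be clustered.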

Original language: English
Pages (from-to): 1227-1247
Number of pages: 21
Journal: Journal of Statistical Computation and Simulation
Volume: 89
Issue number: 7
State: Published - 3 May 2019

Keywords

  • Box-whisker plot
  • convex clustering
  • group comparison
  • shrinkage estimator
