Abstract
A boxplot is a simple and effective exploratory data analysis tool for graphically summarizing a distribution of data. However, when the quartiles in a boxplot are inaccurately estimated, the errors can affect subsequent analyses. In this paper, we consider the problem of constructing boxplots in a bivariate setting with a categorical covariate that has multiple subgroups, and assume that some of these boxplots can be clustered. We propose to use this grouping property to improve the estimation of the quartiles. We demonstrate that the proposed method estimates the quartiles more accurately than the usual boxplot. We also show that the proposed method identifies outliers effectively as a consequence of the more accurate quartiles, and that it possesses a clustering effect due to the grouping property. We then apply the proposed method to annual maximum precipitation data in South Korea and present its clustering results.
Original language | English |
---|---|
Pages (from-to) | 1227-1247 |
Number of pages | 21 |
Journal | Journal of Statistical Computation and Simulation |
Volume | 89 |
Issue number | 7 |
State | Published - 3 May 2019 |
Keywords
- Box-whisker plot
- convex clustering
- group comparison
- shrinkage estimator