CHOOSING THE MOST REPRESENTATIVE AND EFFICIENT AVERAGES OF NUMERIC UNIVARIATE DATA SETS: VOTING AND BOOTSTRAPPING TECHNIQUES
Abstract
Numeric univariate data sets exhibit different characteristics, which are expected to be summarized by a typical or representative value called an average. These characteristics change as a data set departs from symmetry to asymmetry, with or without outliers, which makes it difficult for any single average to be accepted as representative of the subjects it stands for. In this research, voting and bootstrapping techniques are adopted as methods through which every data set can choose its best averages in terms of representativeness and efficiency. The bootstrapping method identifies the most efficient average as the one with the least standard error. The voting technique gives every subject in a data set the opportunity to choose one and only one of the averages as its best representative; the most representative average of the data set is then the one with the highest count. The techniques are illustrated with eighteen (18) data sets of different characteristics sourced from https://artofstat.com/web-apps. Results show that the most representative average could be any of the mode, mid-range, median, Lehmer mean, and harmonic mean, and that the most efficient average could be any of the harmonic mean, geometric mean, arithmetic mean, quadratic mean, Lehmer mean, mid-range, and median. The study therefore recommends that every numeric data set be allowed to choose its most representative average using the voting technique and its most efficient average using the bootstrapping method, as both techniques give the averages a better opportunity to interact with the data set and compete for selection as the best averages.
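The two techniques can be illustrated with a minimal Python sketch. It is not the study's implementation, and it rests on several assumptions that the abstract does not specify: each observation votes for the average nearest to it in absolute distance (ties go to the average listed first), the Lehmer mean is taken with order p = 2, the data are strictly positive so that the geometric and harmonic means are defined, and the bootstrap uses 1000 resamples. The function names `vote` and `bootstrap_se` and the sample data are hypothetical.

```python
import random
import statistics
from collections import Counter


def quadratic_mean(xs):
    # Root mean square of the observations.
    return (sum(x * x for x in xs) / len(xs)) ** 0.5


def lehmer_mean(xs, p=2):
    # Lehmer mean of order p; p = 2 is an assumption made for this sketch.
    return sum(x ** p for x in xs) / sum(x ** (p - 1) for x in xs)


def mid_range(xs):
    # Midpoint between the smallest and largest observation.
    return (min(xs) + max(xs)) / 2


# Candidate averages named in the abstract (requires Python 3.8+).
AVERAGES = {
    "arithmetic": statistics.fmean,
    "geometric": statistics.geometric_mean,
    "harmonic": statistics.harmonic_mean,
    "quadratic": quadratic_mean,
    "lehmer": lehmer_mean,
    "mid_range": mid_range,
    "median": statistics.median,
    "mode": statistics.mode,
}


def vote(xs):
    """Count, for each average, how many observations lie nearest to it.

    Assumed voting rule: nearest in absolute distance, one vote per
    observation, ties resolved in favor of the average listed first.
    """
    values = {name: f(xs) for name, f in AVERAGES.items()}
    ballots = Counter()
    for x in xs:
        winner = min(values, key=lambda name: abs(x - values[name]))
        ballots[winner] += 1
    return ballots


def bootstrap_se(xs, n_resamples=1000, seed=0):
    """Estimate each average's standard error from bootstrap resamples."""
    rng = random.Random(seed)
    replicates = {name: [] for name in AVERAGES}
    for _ in range(n_resamples):
        # Resample with replacement, keeping the original sample size.
        sample = rng.choices(xs, k=len(xs))
        for name, f in AVERAGES.items():
            replicates[name].append(f(sample))
    return {name: statistics.stdev(vals) for name, vals in replicates.items()}


if __name__ == "__main__":
    data = [2, 3, 3, 4, 5, 9, 20]  # hypothetical right-skewed sample
    ballots = vote(data)
    errors = bootstrap_se(data)
    print("most representative:", ballots.most_common(1)[0][0])
    print("most efficient:", min(errors, key=errors.get))
```

Under these assumptions, the most representative average is the one with the highest vote count and the most efficient is the one with the smallest bootstrap standard error. A different voting rule or Lehmer order could change which average wins.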