A DUAL EMPIRICAL–SIMULATION FRAMEWORK FOR ROBUST PREDICTIVE CLASSIFICATION IN MULTIVARIATE DISCRIMINANT ANALYSIS

  • Maryjane Nneoma Chika, Department of Statistics, Ignatius Ajuru University of Education, Rumuolumeni Port Harcourt, Rivers State
  • Uyodhu Amekauma Victor-Edema, Department of Statistics, Ignatius Ajuru University of Education, Rumuolumeni Port Harcourt, Rivers State

Abstract

The growing reliance on data-driven decision-making has intensified the demand for statistical
models that combine predictive accuracy with interpretability. Traditional Multivariate
Discriminant Analysis (MDA), notably Linear Discriminant Analysis (LDA) and Quadratic
Discriminant Analysis (QDA), offers a transparent framework for classification but performs
poorly on high-dimensional, imbalanced, and non-normal datasets. This study
aimed to evaluate the performance, robustness, and interpretability of classical, regularized,
and robust MDA methods through an integrated empirical–simulation framework. Empirical
analyses were conducted on two purposively sampled secondary health-related datasets: the
Centers for Disease Control and Prevention (CDC) Heart Disease Indicators Dataset (10,000
records) and the Nigeria Demographic and Health Survey (NDHS) Children Anaemia Dataset
(3,856 cases). In addition, 1,000 Monte Carlo simulations were performed under varying
sample sizes, covariance structures, and contamination levels. Models were evaluated using
accuracy, precision, recall, F1-score, and Area Under the Curve (AUC). Findings revealed that
LDA was computationally efficient but had poor recall on imbalanced data (27%), whereas
QDA slightly improved recall (37%) but was unstable under heteroscedasticity. Robust MDA
variants employing Minimum Covariance Determinant (MCD) estimators demonstrated
greater resilience to outliers and violations of normality assumptions, maintaining predictive
stability. The study concludes that robust and regularized MDA models provide dependable
and transparent alternatives for decision-making in imperfect data environments and
recommends simulation-based validation protocols to enhance model reliability across diverse
data conditions.
Keywords: Multivariate Discriminant Analysis, Linear Discriminant Analysis, Quadratic
Discriminant Analysis, Robust Statistics, Monte Carlo Simulation.
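
The contrast between classical and MCD-based discriminant rules under contamination can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation design. It is a minimal univariate example under assumed settings: two classes drawn from N(0, 1) and N(2, 1), 10% of the class-0 training points replaced by outliers near 15, equal priors, and 200 replications. The function names (classical_fit, mcd_fit, accuracy, run_simulation) are hypothetical. In one dimension the MCD estimate can be computed exactly as the run of h consecutive sorted points with the smallest variance, which keeps the example free of external libraries.

```python
import math
import random


def classical_fit(values):
    """Sample mean and (population) variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return mean, var


def mcd_fit(values, keep=0.75):
    """Univariate MCD: mean and variance of the h consecutive sorted points
    with the smallest variance (no consistency correction is applied)."""
    xs = sorted(values)
    n = len(xs)
    h = max(2, int(keep * n))
    best_mean, best_var = 0.0, math.inf
    for start in range(n - h + 1):
        mean, var = classical_fit(xs[start:start + h])
        if var < best_var:
            best_mean, best_var = mean, var
    return best_mean, best_var


def accuracy(fits, test_sets):
    """Accuracy of the LDA rule (pooled variance, equal priors) on test data.
    fits[k] is the (mean, variance) pair estimated for class k."""
    pooled = sum(var for _, var in fits) / len(fits)
    correct = total = 0
    for label, xs in enumerate(test_sets):
        for x in xs:
            # Linear discriminant score of x for each class.
            scores = [x * m / pooled - m * m / (2 * pooled) for m, _ in fits]
            correct += scores.index(max(scores)) == label
            total += 1
    return correct / total


def run_simulation(n_reps=200, n=100, eps=0.10, seed=0):
    """Average test accuracy of classical and MCD-based LDA (assumed setup)."""
    rng = random.Random(seed)
    n_out = int(eps * n)
    totals = {"classical": 0.0, "robust": 0.0}
    for _ in range(n_reps):
        # Contaminated training data for class 0, clean data for class 1.
        train0 = [rng.gauss(0, 1) for _ in range(n - n_out)]
        train0 += [rng.gauss(15, 1) for _ in range(n_out)]
        train1 = [rng.gauss(2, 1) for _ in range(n)]
        # Clean test data from both classes.
        test = [[rng.gauss(0, 1) for _ in range(n)],
                [rng.gauss(2, 1) for _ in range(n)]]
        totals["classical"] += accuracy(
            [classical_fit(train0), classical_fit(train1)], test)
        totals["robust"] += accuracy(
            [mcd_fit(train0), mcd_fit(train1)], test)
    return {name: total / n_reps for name, total in totals.items()}


results = run_simulation()
print(results)
```

In this assumed setup the outliers pull the classical class-0 mean toward class 1 and shift the decision boundary, while the MCD fit discards them. With equal priors the boundary depends only on the class means, so the missing consistency correction on the MCD variance does not affect the comparison. A multivariate study would need a full MCD implementation and unequal priors to reproduce the imbalance effects described in the abstract.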

Author Biographies

Maryjane Nneoma Chika, Department of Statistics, Ignatius Ajuru University of Education, Rumuolumeni Port Harcourt, Rivers State

Uyodhu Amekauma Victor-Edema, Department of Statistics, Ignatius Ajuru University of Education, Rumuolumeni Port Harcourt, Rivers State

Published
2026-04-22
Issue
Section
Articles