A STUDY ON THE PERFORMANCE OF THE PARTIAL LEAST SQUARES REGRESSION IN HANDLING MULTICOLLINEARITY USING SIMULATED DATA

  • I. A Sadiq

Abstract

Multicollinearity is a common problem in regression analysis. It arises when the assumption that the explanatory variables are uncorrelated with one another is violated, and it leaves the least squares estimates of the parameters imprecise and unreliable, with unstable, inflated variances. Biased regression techniques, which stabilize the variance of the parameter estimates, are therefore employed. This study focuses mainly on Partial Least Squares Regression (PLSR), a biased regression technique for overcoming multicollinearity: its strengths and limitations, and its performance relative to Principal Component Regression (PCR). A simulation study using normally distributed data with varying levels of multicollinearity was conducted to evaluate the accuracy, interpretability, and robustness of PLSR models and to compare them with PCR, using the Root Mean Square Error (RMSE) as the performance metric. The results indicate that PLSR is more robust to multicollinearity than PCR, as it is less likely to produce unstable parameter estimates in highly correlated datasets. The technique can therefore be applied to the same distribution used in this study with varying sample sizes, and it can also be used to examine the behavior of distributions other than those used in this study.
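The kind of simulation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the study's actual code: the sample size, number of predictors, true coefficients, correlation levels, number of retained components, and the function names (`simulate`, `pcr_fit`, `pls_fit`, `run_study`) are all assumptions, since the abstract does not state them. PLSR is implemented here as single-response PLS via the NIPALS algorithm, and PCR as least squares on the leading principal components.

```python
import numpy as np

def simulate(n, p, rho, rng):
    """Draw n rows of p normal predictors with common pairwise correlation rho."""
    cov = np.full((p, p), rho) + (1.0 - rho) * np.eye(p)
    X = rng.multivariate_normal(np.zeros(p), cov, size=n)
    beta = np.ones(p)                      # assumed true coefficients
    y = X @ beta + rng.normal(size=n)      # assumed unit-variance noise
    return X, y

def pcr_fit(X, y, k):
    """PCR: regress centered y on the first k principal component scores."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T                           # loadings of the first k components
    gamma = np.linalg.lstsq(Xc @ V, yc, rcond=None)[0]
    return V @ gamma, x_mean, y_mean       # coefficients mapped back to X space

def pls_fit(X, y, k):
    """PLS1 via NIPALS: components chosen by covariance with y."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yc = X - x_mean, y - y_mean
    p = X.shape[1]
    W, P, q = np.zeros((p, k)), np.zeros((p, k)), np.zeros(k)
    for a in range(k):
        w = Xk.T @ yc
        w /= np.linalg.norm(w)             # weight vector
        t = Xk @ w                         # score vector
        tt = t @ t
        P[:, a] = Xk.T @ t / tt            # X loadings
        q[a] = yc @ t / tt                 # y loading
        W[:, a] = w
        Xk = Xk - np.outer(t, P[:, a])     # deflate X
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean

def rmse(model, X, y):
    """Root mean square prediction error of a fitted (coef, x_mean, y_mean)."""
    coef, x_mean, y_mean = model
    pred = (X - x_mean) @ coef + y_mean
    return float(np.sqrt(np.mean((y - pred) ** 2)))

def run_study(n=100, p=6, k=2, rhos=(0.0, 0.5, 0.9, 0.99), seed=0):
    """Return {rho: (PLSR test RMSE, PCR test RMSE)} for each correlation level."""
    rng = np.random.default_rng(seed)
    results = {}
    for rho in rhos:
        X_tr, y_tr = simulate(n, p, rho, rng)
        X_te, y_te = simulate(n, p, rho, rng)
        results[rho] = (rmse(pls_fit(X_tr, y_tr, k), X_te, y_te),
                        rmse(pcr_fit(X_tr, y_tr, k), X_te, y_te))
    return results

if __name__ == "__main__":
    for rho, (pls_err, pcr_err) in run_study().items():
        print(f"rho={rho:<5} PLSR RMSE={pls_err:.3f}  PCR RMSE={pcr_err:.3f}")
```

A single run like this is noisy; a fuller comparison would repeat each setting over many replications and average the RMSE values. When `k` equals the number of predictors, both fits reduce to ordinary least squares, which is a convenient sanity check.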

 

Keywords: Regression; Multicollinearity; Partial Least Squares Regression; Principal Component Regression; Simulation

Published
2025-04-09
Section
Articles