Comparative Assessment of Random Forest and Support Vector Regression Models for Rainfall Time-Series Forecasting and Flood Risk Implications in Maiduguri, Nigeria
Abstract
Rainfall variability is a major driver of hydrological extremes, particularly in semi-arid regions,
which are highly sensitive to changes in the global climate. Accurate rainfall forecasting
is an essential component of flood risk mapping. This study assessed the applicability of
machine learning models to rainfall time-series forecasting in Maiduguri,
northeastern Nigeria, using monthly rainfall data from 1981 to 2023 obtained from the
Nigerian Meteorological Agency. Two machine learning models, Random Forest (RF) and
Support Vector Regression (SVR), were evaluated using both regression and classification
frameworks. Forecast accuracy was assessed using Mean Squared Error (MSE), Root Mean Squared Error (RMSE),
Mean Absolute Error (MAE), and the coefficient of determination (R²); classification performance was
assessed using a confusion matrix after binning rainfall into flood classes based on a specified
rainfall threshold. Results indicate that RF outperformed SVR in forecasting rainfall in Maiduguri,
with smaller prediction errors (MSE = 1571.61 mm², RMSE = 39.64 mm, MAE = 21.99 mm,
R² = 0.80) than SVR (MSE = 1956.12 mm², RMSE = 44.23 mm, MAE = 24.80 mm, R² =
0.77). The confusion-matrix analysis of the classification results also showed that RF was better
able to identify flood-prone rainfall events. These
results demonstrate the stronger predictive capability of RF for rainfall forecasting,
particularly for flood risk mapping in semi-arid areas sensitive to changes in the global
climate.
Royal Statistical Society Nigeria Local Group 2026 Conference Proceeding
Keywords: Rainfall forecasting; Machine learning; Random Forest; Support Vector Regression; Flood
risk assessment
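
The evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the sample rainfall values, and the 100 mm flood threshold are all hypothetical, chosen only to show how the four regression metrics and the threshold-based confusion matrix relate to one another.

```python
import math


def regression_metrics(observed, predicted):
    """Return MSE, RMSE, MAE and R² for paired observed/predicted values."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    sse = sum(e * e for e in errors)           # sum of squared errors
    mse = sse / n
    rmse = math.sqrt(mse)                      # same units as rainfall (mm)
    mae = sum(abs(e) for e in errors) / n
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - sse / ss_tot                    # coefficient of determination
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "R2": r2}


def flood_confusion(observed, predicted, threshold_mm):
    """Bin rainfall into flood / no-flood classes and count the outcomes."""
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for o, p in zip(observed, predicted):
        actual = o >= threshold_mm             # observed flood-prone month
        forecast = p >= threshold_mm           # forecast flood-prone month
        if actual and forecast:
            counts["TP"] += 1
        elif forecast:
            counts["FP"] += 1
        elif actual:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


# Made-up monthly rainfall totals in mm (illustrative only, not study data).
observed = [0.0, 12.0, 85.0, 210.0, 150.0, 5.0]
predicted = [2.0, 20.0, 105.0, 180.0, 95.0, 0.0]

metrics = regression_metrics(observed, predicted)
confusion = flood_confusion(observed, predicted, threshold_mm=100.0)  # assumed threshold
print(metrics)
print(confusion)
```

Because RMSE is the square root of MSE, the two always rank models identically; MAE and the confusion-matrix counts add information about typical error size and about how often flood-class months are missed or falsely flagged.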