Predictive ARIMA Model with a Machine Learning (ML) Approach for COVID-19 Data in Pakistan

  • Muhammad Ilyas Department of Mathematics, Government College University Hyderabad, Sindh, Pakistan
  • Shaheen Abbas Mathematical Sciences Research Centre, Federal Urdu University of Arts, Sciences and Technology, Karachi, Pakistan
  • Faisal Nawaz Department of Mathematics, Dawood University of Engineering and Technology, Karachi, Sindh
Keywords: ARIMA (p, d, q), Machine Learning (ML), Predictive model, trained and validation data

Abstract

This study applies an ARIMA (p, d, q) based machine learning (ML) approach to evaluate the dynamics of the COVID-19 pandemic. The focus is on estimating epidemic trends and performing diagnostic checks alongside model fitting. The data analyzed cover all four waves of the pandemic in Pakistan, across all four provinces (Sindh, Punjab, Khyber Pakhtunkhwa, and Balochistan) as well as Gilgit Baltistan, Azad Jammu Kashmir, and the capital city Islamabad, collected from February 26, 2020, to September 30, 2021. The ML algorithm is used to optimize the results of the augmented Dickey-Fuller (ADF) unit root test, supported by the autocorrelation function (ACF) and partial autocorrelation function (PACF) graphs of the data series. The fitted models, ARIMA (1, 1, 1) and ARIMA (1, 1, 7), are applied to the 1st through 4th waves of daily infected cases across the entire dataset of Pakistan. The cumulative training observations span the 1st wave (February 26, 2020, to October 20, 2020), 2nd wave (October 21, 2020, to March 16, 2021), 3rd wave (March 17, 2021, to July 10, 2021), and 4th wave (July 11, 2021, to September 30, 2021), followed by a 14-day forecast (October 1 to October 14, 2021). The results show a strong correlation between the trained and predicted values, ranging from 0.8789 to 0.99236. To select the predictive model parameters, the model with the minimum Bayesian Information Criterion (BIC) value is identified, and the residuals obtained after removing unnecessary errors, together with the 95% confidence interval (CI) for the forecasting error, are calculated. These values help to decide the best-fitted predictive model.
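The workflow described in the abstract (differencing, model fitting, BIC comparison, a 14-day forecast with a 95% interval, and a correlation check) can be illustrated with a minimal sketch. It makes several assumptions that are not from the study: the data are synthetic rather than the Pakistan case counts, the model is a simplified ARIMA (1, 1, 0) fitted by conditional least squares rather than the ARIMA (1, 1, 1) and (1, 1, 7) models reported above, and all function names are hypothetical.

```python
import math
import random
from statistics import mean

def difference(series):
    """First difference (d = 1): y_t - y_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]

def fit_ar1(w):
    """Conditional least squares for w_t = c + phi * w_{t-1} + e_t."""
    x, y = w[:-1], w[1:]
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    phi = sxy / sxx
    c = my - phi * mx
    resid = [b - (c + phi * a) for a, b in zip(x, y)]
    return c, phi, resid

def gaussian_bic(resid, n_params):
    """BIC = -2 log L + k log n, with sigma^2 at its Gaussian ML estimate."""
    n = len(resid)
    sigma2 = sum(e * e for e in resid) / n
    log_lik = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
    return -2 * log_lik + n_params * math.log(n)

def forecast_levels(series, c, phi, sigma2, steps=14, z=1.96):
    """Point forecasts and approximate 95% intervals for ARIMA(1,1,0)."""
    level, last_diff = series[-1], series[-1] - series[-2]
    out, psi, var_sum = [], 0.0, 0.0
    for _ in range(steps):
        last_diff = c + phi * last_diff   # forecast of the next difference
        level += last_diff                # integrate back to the level
        psi = 1.0 + phi * psi             # psi_h = 1 + phi + ... + phi^h
        var_sum += psi * psi              # forecast error variance / sigma^2
        half = z * math.sqrt(sigma2 * var_sum)
        out.append((level, level - half, level + half))
    return out

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Synthetic cumulative counts (an assumption, NOT the study's data).
random.seed(0)
diffs = [10.0]
for _ in range(199):
    diffs.append(5.0 + 0.6 * diffs[-1] + random.gauss(0.0, 3.0))
cases = [100.0]
for d in diffs:
    cases.append(cases[-1] + d)

w = difference(cases)
c, phi, resid_ar1 = fit_ar1(w)

# Baseline candidate: ARIMA(0,1,0) with drift, scored on the same observations.
drift = mean(w[1:])
resid_drift = [v - drift for v in w[1:]]

# Parameter counts include the residual variance.
bic_ar1 = gaussian_bic(resid_ar1, 3)
bic_drift = gaussian_bic(resid_drift, 2)

sigma2 = sum(e * e for e in resid_ar1) / len(resid_ar1)
forecasts = forecast_levels(cases, c, phi, sigma2, steps=14)

# One-step-ahead fitted levels compared with the observed levels.
fitted = [cases[t] + c + phi * w[t - 1] for t in range(1, len(w))]
r = pearson(cases[2:], fitted)

print(f"phi = {phi:.3f}, BIC AR(1) = {bic_ar1:.1f}, BIC drift = {bic_drift:.1f}")
print(f"correlation between observed and fitted levels = {r:.4f}")
for day, (point, lo, hi) in enumerate(forecasts, start=1):
    print(f"day {day:2d}: {point:.1f} [{lo:.1f}, {hi:.1f}]")
```

The candidate with the smaller BIC would be preferred, mirroring the selection rule in the abstract. A full analysis would more likely rely on an established time series library, add a moving-average term, and run an ADF test before choosing d; this sketch only shows the shape of the procedure.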

Published
2024-09-26
How to Cite
Ilyas M, Abbas S, Nawaz F. Predictive ARIMA Model with a Machine Learning (ML) Approach for COVID-19 Data in Pakistan. Sci Inquiry Rev. [Internet]. 2024 Sep 26 [cited 2025 Jan 21];8(3):25-7. Available from: https://journals.umt.edu.pk/index.php/SIR/article/view/5605
Section
Original Article