Hybrid time series and machine learning models for forecasting cardiovascular mortality in India: an age specific analysis | BMC Public Health

The dataset classifies heart disease-related deaths in India into five age brackets: 0–5, 6–15, 16–49, 50–69, and 70+. We converted the data for each age group into a time series to study trends and patterns over time, which made it possible to assess temporal dynamics and forecast trends. A vital aspect of time series analysis is ensuring stationarity, since most time series models, including ARIMA, require the data to have a constant mean and variance over time [30]. We evaluated stationarity for each age group using the Augmented Dickey-Fuller (ADF) test.

Table 1 summarizes the results of the ADF test. The test yields the Dickey-Fuller test statistic, lag order, and associated p-value for each age cohort. The null hypothesis of the ADF test posits the existence of a unit root, indicating that the data is non-stationary. If the p-value is less than the significance level (often 0.05), the null hypothesis is rejected, signifying that the series is stationary.
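The decision rule above can be illustrated with a minimal sketch of the Dickey-Fuller regression. It is a simplification and not the procedure used in the study: it omits the lagged difference terms that make the test "augmented", includes only a constant, and compares the statistic with the approximate asymptotic 5% critical value of -2.86 instead of computing a p-value. The function name and the simulated series are hypothetical.

```python
import math
import random

def dickey_fuller_stat(y):
    """t-statistic on rho in: diff(y)_t = alpha + rho * y_{t-1} + e_t.

    Simplified (unaugmented) form: no lagged differences, constant only.
    """
    x = y[:-1]                                   # lagged levels y_{t-1}
    dy = [b - a for a, b in zip(y[:-1], y[1:])]  # first differences
    n = len(dy)
    mx = sum(x) / n
    mdy = sum(dy) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (di - mdy) for xi, di in zip(x, dy))
    rho = sxy / sxx                              # OLS slope
    alpha = mdy - rho * mx                       # OLS intercept
    rss = sum((di - alpha - rho * xi) ** 2 for xi, di in zip(x, dy))
    se = math.sqrt(rss / (n - 2) / sxx)          # standard error of rho
    return rho / se

# Simulated data (hypothetical): white noise is stationary, its cumulative sum is not.
rng = random.Random(0)
noise = [rng.gauss(0, 1) for _ in range(200)]
walk = []
level = 0.0
for e in noise:
    level += e
    walk.append(level)

CRITICAL_5PCT = -2.86  # approximate asymptotic value, constant-only regression
stat_noise = dickey_fuller_stat(noise)
stat_walk = dickey_fuller_stat(walk)
print("white noise:", round(stat_noise, 2), "reject unit root:", stat_noise < CRITICAL_5PCT)
print("random walk:", round(stat_walk, 2), "reject unit root:", stat_walk < CRITICAL_5PCT)
```

A statistic more negative than the critical value corresponds to a small p-value and to rejecting the unit-root hypothesis.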

Table 1 ADF test results for different age groups

All age groups have p-values at or below 0.05, so we reject the null hypothesis of non-stationarity. This indicates that the time series for each age group is stationary, making it suitable for ARIMA modeling. The evidence for stationarity is strongest in the older age cohorts (50–69 and 70+), which have a notably low p-value (0.01). This finding holds across all age groups and establishes a basis for the subsequent modeling and forecasting.

ACF, PACF and residuals analysis

We produced Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots to analyze the time series for each age group in more depth. These plots help identify the correlation structure within the data and inform the selection of suitable parameters for the ARIMA model. The ACF measures the correlation between the series and its own values at each lag, while the PACF measures the direct correlation between an observation and its value at a given lag after removing the effect of the intermediate lags.
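As an illustration of how ACF values are compared with confidence limits, the following minimal sketch computes the sample ACF and the usual approximate 95% white-noise bounds of ±1.96/√n. The series is hypothetical, the function names are illustrative, and the PACF (which requires solving for partial coefficients) is not included.

```python
import math

def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

def white_noise_bounds(n, z=1.96):
    """Approximate 95% limits (LL, UL) for the ACF of white noise."""
    limit = z / math.sqrt(n)
    return -limit, limit

# Hypothetical series, for illustration only.
series = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 12.0, 15.0, 14.0, 17.0]
r = acf(series, max_lag=3)
ll, ul = white_noise_bounds(len(series))

# Lags whose autocorrelation falls outside the limits.
significant = [k for k in range(1, len(r)) if not ll <= r[k] <= ul]
print("ACF:", [round(v, 3) for v in r])
print("LL, UL:", round(ll, 3), round(ul, 3), "significant lags:", significant)
```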

Figures 1a to 5b illustrate the ACF and PACF plots for each age group: 0–5, 6–15, 16–49, 50–69, and 70+. In every instance, the autocorrelation and partial autocorrelation values lie within the confidence bounds, denoted by the lower limit (LL) and upper limit (UL). This indicates that there are no lagged correlations beyond what randomness would produce, which suggests that the series do not exhibit significant seasonal or cyclical patterns.

Figure 6a to e provides histograms and normal Q-Q plots of the residuals for each age cohort. Across all age groups, the Q-Q plots show that most residuals lie close to the red reference line, indicating that the residuals are approximately normally distributed. This supports the residual normality assumption and the suitability of ARIMA-based models for analyzing patterns of heart disease mortality across age groups.
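The construction behind a normal Q-Q plot can be sketched as follows. The sorted residuals are paired with standard normal quantiles, and points that fall near a straight line suggest approximate normality. The plotting positions (i - 0.5)/n and the residual values are assumptions made for illustration.

```python
from statistics import NormalDist

def qq_points(residuals):
    """Pair theoretical standard normal quantiles with the sorted residuals."""
    n = len(residuals)
    std_normal = NormalDist()
    # Plotting positions (i - 0.5) / n are one common convention (assumed here).
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, sorted(residuals)))

# Hypothetical residuals, for illustration only.
points = qq_points([0.3, -1.1, 0.8, -0.2, 0.1])
for q, r in points:
    print(round(q, 3), r)
```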

The results show that the time series for all age groups are suitable for ARIMA modeling, because no autocorrelations or partial autocorrelations fall outside the confidence limits. This conclusion supports the use of time-series forecasting methods for heart disease mortality trends across the age groups.

ARIMA analysis

We fitted ARIMA models to the time series for each age cohort (0–5, 6–15, 16–49, 50–69, and 70+) to determine the optimal model for predicting heart disease mortality trends. Model selection for each age group was based on the Akaike Information Criterion (AIC), favoring the model with the lowest AIC value. Table 2 presents the chosen ARIMA models and their associated parameters, including the first-order moving average (MA1) coefficient, drift, log-likelihood, AIC, corrected AIC (AICc), and Bayesian Information Criterion (BIC).
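The three criteria are simple functions of the maximized log-likelihood. The sketch below uses the standard definitions; the input values are hypothetical, and the parameter count k = 3 assumes that the MA1 coefficient, the drift, and the innovation variance are all counted, which is one common convention and is not stated in the source.

```python
import math

def information_criteria(log_likelihood, k, n):
    """Return (AIC, AICc, BIC) for k estimated parameters and n observations."""
    aic = 2 * k - 2 * log_likelihood
    aicc = aic + (2 * k * (k + 1)) / (n - k - 1)   # small-sample correction
    bic = k * math.log(n) - 2 * log_likelihood
    return aic, aicc, bic

# Hypothetical fit: log-likelihood -100, 3 parameters, 30 observations.
aic, aicc, bic = information_criteria(log_likelihood=-100.0, k=3, n=30)
print(round(aic, 2), round(aicc, 2), round(bic, 2))
```

Among candidate models fitted to the same series, the one with the lowest value is preferred. AICc and BIC penalize additional parameters more heavily than AIC when the series is short.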

The ARIMA(0,1,1) and ARIMA(0,1,1) with drift models fit most age groups best, as shown by their AIC and log-likelihood values in Table 2. The drift term was retained in the models for age groups 0–5, 16–49, 50–69, and 70+, which means that these series showed a consistent upward or downward trend. The MA1 coefficient varied among age groups while capturing short-term dependence in each.

Table 2 ARIMA model selection for different age group and their related results

The age group 6–15 was modeled without drift, as incorporating a drift term did not substantially enhance the model’s fit. This indicates that the trend component for this age group is either less pronounced or absent compared to the other categories. The chosen models illustrate ARIMA’s capacity to precisely represent the dynamics of heart disease mortality across several age groups, establishing a basis for exact forecasting and trend analysis.
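For reference, the selected model family can be written out explicitly. The notation below is a standard textbook form and is not taken from the source: B is the backshift operator, c the drift, θ₁ the MA1 coefficient, and ε_t the innovation. The sign convention for θ₁ is an assumption, since software packages differ.

```latex
% ARIMA(0,1,1) with drift (set c = 0 for the model without drift)
(1 - B)\, y_t = c + (1 + \theta_1 B)\, \varepsilon_t
\quad\Longleftrightarrow\quad
y_t = y_{t-1} + c + \varepsilon_t + \theta_1 \varepsilon_{t-1}

% h-step-ahead point forecast from the last observation y_T
\hat{y}_{T+h \mid T} = y_T + h\,c + \theta_1 \hat{\varepsilon}_T
```

The forecast expression shows why drift matters here: with c = 0 the point forecasts are flat after the first step, whereas a nonzero c produces a linear upward or downward trend.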

We used the Breusch-Pagan (BP) test to determine whether heteroscedasticity was present in the residuals of the ARIMA models across age groups. As shown in Table 3, all five age groups (0–5, 6–15, 16–49, 50–69, and 70+) had p-values above the usual cutoff of 0.05, so there was no strong evidence of heteroscedasticity. This is consistent with the ARIMA residuals having constant variance (homoscedasticity), which supports the validity of the modeling assumptions and the reliability of the time series models for forecasting trends in heart disease mortality.
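The logic of the test can be sketched in a few lines. This example uses the studentized (Koenker) form with a single regressor: the squared residuals are regressed on the regressor, and n·R² is compared with a chi-square distribution with one degree of freedom. Using the time index as the regressor is an assumption, since the source does not state which regressors were used, and both residual series are hypothetical.

```python
import math

def breusch_pagan(residuals, x):
    """Studentized Breusch-Pagan test with one regressor x.

    Returns (LM statistic, p-value), where LM = n * R^2 from regressing
    the squared residuals on x, referred to chi-square with 1 df.
    """
    n = len(residuals)
    u2 = [e * e for e in residuals]
    mx = sum(x) / n
    mu = sum(u2) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (ui - mu) for xi, ui in zip(x, u2))
    syy = sum((ui - mu) ** 2 for ui in u2)
    r_squared = (sxy * sxy) / (sxx * syy)
    lm = n * r_squared
    p_value = math.erfc(math.sqrt(lm / 2))  # chi-square(1) upper tail
    return lm, p_value

time_index = list(range(1, 13))
# Hypothetical residuals: stable spread versus spread that grows over time.
steady = [1.0, -2.0] * 6
growing = [t * (-1) ** t for t in time_index]

lm_steady, p_steady = breusch_pagan(steady, time_index)
lm_growing, p_growing = breusch_pagan(growing, time_index)
print("steady: ", round(lm_steady, 3), round(p_steady, 3))
print("growing:", round(lm_growing, 3), round(p_growing, 5))
```

A p-value above 0.05, as for the steady series, gives no evidence against constant variance. The growing series produces a small p-value.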

Table 3 Breusch-Pagan test results for heteroscedasticity across age groups

ARIMA with hybrid models

After identifying the best ARIMA model for each age group, we combined ARIMA with machine learning techniques (random forests, support vector machines, and XGBoost) and with a GARCH volatility model to build hybrid models. The hybrid approach pairs the strength of ARIMA in modeling linear and temporal dependencies with the ability of machine learning models to capture complex nonlinear patterns and correlations in the data. The aim is to improve forecasts of heart disease mortality across age groups.

We used the residuals from the ARIMA model as input for the machine learning models in each age group (0–5, 6–15, 16–49, 50–69, and 70+). These residuals represent the variation left unexplained by the ARIMA model. We trained the machine learning models to predict the residuals, then added the predicted residuals to the ARIMA forecasts to produce the final hybrid forecasts.
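The two-stage structure described above can be sketched with deliberately simple stand-ins. Here a random walk with drift replaces the fitted ARIMA model, and a nearest-neighbour regressor on the previous residual replaces the random forest, SVM, or XGBoost learner. Both substitutions, the series, and all names are assumptions made so that the example stays self-contained. Only the structure (linear forecast plus predicted residual) reflects the text.

```python
def fit_drift(y):
    """Linear stage stand-in: random walk with drift, drift = mean first difference."""
    diffs = [b - a for a, b in zip(y[:-1], y[1:])]
    return sum(diffs) / len(diffs)

def knn_predict(pairs, query, k=2):
    """Residual stage stand-in: average the targets of the k nearest inputs."""
    nearest = sorted(pairs, key=lambda p: abs(p[0] - query))[:k]
    return sum(target for _, target in nearest) / len(nearest)

# Hypothetical annual series, for illustration only.
y = [100.0, 104.0, 103.0, 109.0, 108.0, 114.0, 113.0, 119.0]

# Stage 1: linear model and its in-sample one-step residuals.
drift = fit_drift(y)
residuals = [y[t] - (y[t - 1] + drift) for t in range(1, len(y))]

# Stage 2: learn the next residual from the previous one.
pairs = list(zip(residuals[:-1], residuals[1:]))
residual_forecast = knn_predict(pairs, residuals[-1])

# Hybrid forecast = linear forecast + predicted residual.
linear_forecast = y[-1] + drift
hybrid_forecast = linear_forecast + residual_forecast
print(round(linear_forecast, 3), round(residual_forecast, 3), round(hybrid_forecast, 3))
```

In this toy series the residuals alternate in sign, a pattern the linear stage cannot represent, so the residual model shifts the forecast from about 121.7 to 118.0.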

Forecasts of future values were produced using both the standalone ARIMA and the hybrid ARIMA-ML models. The accuracy of each hybrid model was assessed using the root mean squared error (RMSE) and the mean absolute percentage error (MAPE). The projected values for each age group are reported in the accompanying tables and shown in the figures.
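Both metrics follow directly from their standard definitions, as in the minimal sketch below. The values are hypothetical, and reporting MAPE as a percentage is an assumption about the convention used in the tables.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error, in the units of the series."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical observed and forecast values.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
print(round(rmse(actual, predicted), 3), round(mape(actual, predicted), 3))
```

RMSE depends on the scale of the series, so it is comparable between models within an age group but not between age groups with very different death counts. MAPE is scale-free.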

Error Metrics Performances

Table 4 reports the predictive performance of the standalone ARIMA and hybrid models (ARIMA + RF, ARIMA + SVM, ARIMA + XGBoost, and ARIMA + GARCH) across age groups, measured by RMSE and MAPE. In the 0–5 age group, the ARIMA + GARCH model performed best, with the lowest RMSE (255.8) and MAPE (5.89), slightly outperforming standalone ARIMA, which suggests that it captures volatility in early-age mortality. For ages 6–15, the hybrid models outperformed standalone ARIMA, with ARIMA + SVM the most accurate (RMSE: 110.57, MAPE: 4.35), closely followed by ARIMA + RF. In the 16–49 age group, ARIMA + SVM attained the lowest RMSE (326.90) and MAPE (0.16), indicating greater accuracy than the alternative models. In the 50–69 age group, the ARIMA + XGBoost model was the most accurate, with the lowest RMSE (239.72) and MAPE (0.05), closely followed by ARIMA + SVM. In the 70+ age group, ARIMA + SVM surpassed all other models, achieving the lowest RMSE (25,254.45) and MAPE (1.77), while ARIMA + RF also improved on the standalone ARIMA model. Overall, the results indicate that hybrid models, particularly those integrating SVM and XGBoost with ARIMA, substantially improve predictive accuracy, while GARCH offers moderate improvements, primarily where the variance changes over time.

Table 4 Evaluation of forecasting precision among age demographics utilizing standalone ARIMA and hybrid models (ARIMA + RF, SVM, XGBoost, GARCH) According to RMSE and MAPE Metrics

Statistical significance testing for forecasting error

The paired t-test and the Wilcoxon signed-rank test were used to statistically evaluate the performance differences between the hybrid models, using RMSE values from cross-validation. The comparison of ARIMA + RF and ARIMA + XGBoost produced p-values of 0.598 (t-test) and 0.9219 (Wilcoxon test), indicating no statistically significant difference in performance. The comparison of ARIMA + SVM and ARIMA + XGBoost produced p-values of 0.179 (t-test) and 0.2324 (Wilcoxon test), again indicating no statistically significant difference. Although the models differ in their error metrics, these differences are not statistically significant at the 0.05 level, likely because of the small sample size or the similar performance of the models. The ARIMA + GARCH model targets volatility and is not designed to minimize RMSE during cross-validation, so it was excluded from this comparison to keep the methods consistent.
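Both tests operate on the paired differences between the per-fold RMSE values of two models. The sketch below computes the paired t statistic (without its p-value, which requires the t distribution) and an exact two-sided Wilcoxon signed-rank p-value by enumerating all sign assignments, which is feasible only for small samples. The fold values are hypothetical and are not the study's cross-validation results.

```python
import math
from itertools import product

def paired_t_statistic(a, b):
    """Return (t statistic, degrees of freedom) for paired samples a and b."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean = sum(d) / n
    var = sum((v - mean) ** 2 for v in d) / (n - 1)
    return mean / math.sqrt(var / n), n - 1

def wilcoxon_exact(a, b):
    """Return (W+, exact two-sided p-value); zero differences are dropped."""
    d = [x - y for x, y in zip(a, b) if x != y]
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):                     # assign average ranks to ties
        j = i
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        for m in range(i, j + 1):
            ranks[order[m]] = (i + j) / 2 + 1
        i = j + 1
    w_plus = sum(r for r, v in zip(ranks, d) if v > 0)
    centre = sum(ranks) / 2
    observed = abs(w_plus - centre)
    extreme = 0
    for signs in product((0, 1), repeat=len(d)):   # all 2^n sign patterns
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - centre) >= observed - 1e-12:
            extreme += 1
    return w_plus, extreme / 2 ** len(d)

# Hypothetical per-fold RMSE values for two models.
model_a = [10.0, 12.0, 9.0, 11.0, 13.0]
model_b = [9.0, 14.0, 6.0, 15.0, 8.0]
t_stat, df = paired_t_statistic(model_a, model_b)
w_plus, p_wilcoxon = wilcoxon_exact(model_a, model_b)
print(round(t_stat, 4), df, w_plus, p_wilcoxon)
```

With only a handful of folds, both tests have little power, which is consistent with the explanation that a small sample size may account for the non-significant results.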

Forecasting results for all Age groups

Tables 5, 6, 7, 8, 9 and Fig. 7a to e present the forecasting results for all age groups (0–5, 6–15, 16–49, 50–69, and 70+) using standalone ARIMA and hybrid models (ARIMA + RF, ARIMA + SVM, ARIMA + XGBoost, and ARIMA + GARCH), showing distinct trends and model performance across age groups. Tables 5 and 6 show a steady decrease in projected deaths for the 0–5 and 6–15 age groups. The standalone ARIMA model performs best for the youngest cohort, as depicted in Fig. 7a to b, whereas the ARIMA + SVM and ARIMA + RF models are more accurate for the 6–15 age group. Tables 7 and 8 show that the models predict a steady increase in mortality in the 16–49 and 50–69 age groups, highlighting the growing burden of cardiovascular disease in these groups, as shown in Fig. 7c to d. For these groups, ARIMA combined with SVM consistently performed well, yielding more precise forecasts with lower error metrics. For the 70+ age group, shown in Table 9, all models predict a significant rise in mortality, as shown in Fig. 7e. ARIMA + XGBoost predicts the largest rise, while ARIMA + SVM gives the most stable predictions.

Table 5 Forecast values from different models for the 0–5 age group

Table 6 Forecast values from different models for the 6–15 age group

Table 7 Forecast values from different models for the 16–49 age group

Table 8 Forecast values from different models for the 50–69 age group

Table 9 Forecast values from different models for the 70+ age group

The results support the idea that mortality trends vary with age, and that hybrid models, especially ARIMA + SVM, forecast more accurately than standalone ARIMA across most age groups. These findings underscore the need to tailor prediction models to the characteristics of each group to support better healthcare planning and treatment.

We measured the computation time of each forecasting model across all age groups to assess efficiency alongside accuracy. The standalone ARIMA and hybrid ARIMA-based models all ran quickly, with every execution time under one second, as indicated in Table 10. Machine learning models such as RF and SVM ran almost instantaneously (approximately 0 s), whereas XGBoost took somewhat longer (up to 0.27 s). Among the hybrid models, ARIMA + XGBoost often had the longest execution time (e.g., 0.47 s for the 0–5 age group), while ARIMA + SVM and ARIMA + RF were consistently faster. The execution time for ARIMA + GARCH ranged from 0.1 to 0.24 s, depending on the age group. The results indicate that all models are computationally efficient and suitable for real-time or near-real-time applications, with only minor differences in runtime.
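Execution times of this kind can be collected with a small timing helper such as the one sketched below. The helper, the placeholder forecasting function, and the choice to keep the fastest of several repeats are assumptions for illustration. The source does not describe how the timings in Table 10 were taken.

```python
import time

def time_call(fn, *args, repeats=5):
    """Run fn(*args) several times; return its result and the fastest wall-clock time."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = fn(*args)
        best = min(best, time.perf_counter() - start)
    return result, best

def naive_forecast(series, horizon):
    """Placeholder model: repeat the last observed value."""
    return [series[-1]] * horizon

forecast, seconds = time_call(naive_forecast, [1.0, 2.0, 3.0], 4)
print(forecast, f"{seconds:.6f} s")
```

For sub-second workloads, repeating the call and keeping the minimum reduces the influence of background activity on the measurement.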

Table 10 Computational Time (in seconds) for forecasting models across different Age groups

The improved forecasting accuracy of hybrid models over standalone ARIMA matters for public health planning. Using these estimates, policymakers can forecast cardiovascular mortality for specific age groups, allowing for targeted allocation of medical resources, early intervention strategies, and awareness campaigns. For instance, if higher mortality is anticipated in older age groups, healthcare organizations can proactively expand geriatric services, allocate intensive care capacity, and strengthen cardiac screening programs. Integrating these forecasting models into existing healthcare systems through dashboards or automated reporting tools would also support real-time monitoring and long-term strategic planning. To further refine intervention strategies, future work may combine these models with socioeconomic and environmental data.