A deep learning approach for blood glucose monitoring and hypoglycemia prediction in glycogen storage disease

Traditionally, patients with GSD have had to be hospitalized or rely on finger-prick tests at home to monitor and manage their glucose levels. However, these methods are limited because they provide only isolated measurements rather than a comprehensive view of glucose levels throughout the day. Although CGM was originally designed for patients with diabetes, it is also considered applicable to patients with GSD. Derks’s research on 15 patients with GSD demonstrated that an in-depth analysis using CGM data can effectively evaluate glucose management25. Additionally, a study in 2022 involving 10 adult patients with GSD showed that CGM may be beneficial26. Other work demonstrated that an artificial neural network can predict glucose levels in patients with type 1 diabetes27,28.

Therefore, we hypothesized that metabolic predictions could be made based on the results of previous studies and data from patients with GSD accumulated at the Yonsei University Wonju Severance Christian Hospital. Our objective was to validate and analyze this hypothesis using the results generated by the DL models. This study yielded meaningful outcomes, confirming that by analyzing the CGM data from patients with GSD, it is possible to manage and predict their glucose levels. This finding supports the feasibility of precision medicine tailored to each individual, highlighting the potential for customized care that addresses the unique metabolic needs of each patient.

From a big data-based perspective, a dataset is typically split at the level of individuals, with some subjects used for training and others for testing. However, to predict aspects related to human metabolism, we considered individualized learning to be more appropriate. Therefore, we adopted an approach in which an independent model was trained for each study subject and quantitative metrics were subsequently calculated. Because training a separate model per subject multiplies the time and computational cost with every additional architecture tested, we focused our efforts on the latest models that have achieved SOTA performance in the field of time-series forecasting.
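
The per-subject setup described above can be sketched as follows. This is a minimal illustration rather than the study's pipeline: the function names, the 8-reading lookback, and the 80/20 chronological split are assumptions made for the example. With readings every 15 min, a horizon of 1, 2, 3, or 4 steps corresponds to 15, 30, 45, or 60 min ahead.

```python
import numpy as np

def make_windows(series, lookback, horizon_steps):
    """Slice one subject's CGM series into (input window, future target) pairs."""
    series = np.asarray(series, dtype=float)
    n = len(series) - lookback - horizon_steps + 1
    if n <= 0:
        return np.empty((0, lookback)), np.empty(0)
    # Row i holds readings i .. i+lookback-1; its target lies horizon_steps later.
    X = np.stack([series[i:i + lookback] for i in range(n)])
    first_target = lookback + horizon_steps - 1
    y = series[first_target:first_target + n]
    return X, y

def split_per_subject(cgm_by_subject, lookback=8, horizon_steps=1, train_frac=0.8):
    """Chronological train/test split, done separately for every subject.

    lookback and train_frac are hypothetical values, not taken from the study.
    """
    splits = {}
    for subject_id, series in cgm_by_subject.items():
        cut = int(len(series) * train_frac)
        # Windows are built after cutting, so no test reading enters training.
        train = make_windows(series[:cut], lookback, horizon_steps)
        test = make_windows(series[cut:], lookback, horizon_steps)
        splits[subject_id] = (train, test)
    return splits

# Toy data standing in for one subject's readings (hypothetical values).
demo = split_per_subject({"subject_01": np.arange(100.0)}, horizon_steps=4)
(train_X, train_y), (test_X, test_y) = demo["subject_01"]
print(train_X.shape, test_X.shape)
```

One model would then be fitted on each subject's training windows and scored on that subject's test windows only, so metrics are computed per individual before being aggregated.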

We conducted both forecasting and classification tasks to assess the performance of the models in predicting and managing immediate blood glucose levels (Fig. 1). In the forecasting task, the LTSF N-Linear model demonstrated relatively strong forecasting ability. For blood glucose level prediction 15 min ahead, the Pearson’s R correlation coefficient was 0.887 (95% CI, 0.886–0.888), indicating high predictive accuracy (Fig. 3). Even at the 30- and 45-min prediction horizons, the Pearson’s R values remained higher than those of the compared models, at 0.721 (95% CI, 0.719–0.724) and 0.617 (95% CI, 0.614–0.621), respectively. However, at the 60-min prediction horizon, the predictive power decreased significantly, with a Pearson’s R of 0.561 (95% CI, 0.557–0.565). The LTSF N-Linear focuses solely on the temporal relationships between data points in a linear manner. Owing to the simplicity of its architecture, it performs well when complex modeling of the data structure is not required. The participants in this study continuously managed their diet to maintain stable blood sugar levels. Since changes in blood sugar primarily occur through food intake29,30, it is likely that the LTSF N-Linear benefited structurally from this dietary regularity. The blood sugar data in this study followed a relatively linear trend over short prediction periods, with few complex nonlinear relationships. As a result, LTSF N-Linear delivered excellent short-term prediction results, although its performance at longer horizons leaves room for improvement. In this study, however, only the variables directly related to blood sugar levels and time were used. As the input data were already time-series data, temporal information may have been inherently included, making it difficult for a more complex model to capture additional dependencies, particularly those related to metabolism. 
Furthermore, the relatively complex architecture of such a model may have led to overfitting, making it more sensitive to noise than to meaningful patterns. Patch TST, which employs a patch-based approach, was designed to capture both short- and long-term trends. Although its performance was slightly lower than that of the LTSF N-Linear, the difference between the two models was not significant. Patch TST produced reasonably accurate predictions, demonstrating its capability to handle the task effectively.
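
To make the N-Linear idea concrete, the sketch below subtracts the last observed value from each input window, applies a single linear map, and adds the value back. Two points are assumptions, not the study's method: the weights are fitted here by ordinary least squares instead of gradient-based training, and the confidence interval for Pearson's R uses the Fisher z-transform, which the text does not specify. The data are synthetic.

```python
import numpy as np

def _design(X):
    """Subtract each window's last value and append a bias column."""
    last = X[:, -1:]
    return np.hstack([X - last, np.ones((len(X), 1))]), last[:, 0]

def fit_nlinear(X, y):
    """Least-squares fit of one linear layer on last-value-normalized windows."""
    A, last = _design(X)
    w, *_ = np.linalg.lstsq(A, y - last, rcond=None)
    return w

def predict_nlinear(w, X):
    """Apply the linear layer, then add the last observed value back."""
    A, last = _design(X)
    return A @ w + last

def pearson_with_ci(y_true, y_pred, z_crit=1.96):
    """Pearson's R with an approximate 95% CI from the Fisher z-transform."""
    r = float(np.corrcoef(y_true, y_pred)[0, 1])
    half_width = z_crit / np.sqrt(len(y_true) - 3)
    z = np.arctanh(r)
    return r, float(np.tanh(z - half_width)), float(np.tanh(z + half_width))

# Synthetic stand-in for a smooth, diet-controlled glucose trace (hypothetical).
rng = np.random.default_rng(0)
t = np.arange(600)
series = 100 + 20 * np.sin(2 * np.pi * t / 96) + rng.normal(0, 2, t.size)

lookback = 8  # hypothetical window length; one step ahead = 15 min
X = np.stack([series[i:i + lookback] for i in range(len(series) - lookback)])
y = series[lookback:]

w = fit_nlinear(X[:400], y[:400])            # earlier part for fitting
r, lo, hi = pearson_with_ci(y[400:], predict_nlinear(w, X[400:]))
print(f"r = {r:.3f} (95% CI {lo:.3f} to {hi:.3f})")
```

Because the model only learns a linear correction relative to the most recent reading, it suits series that drift smoothly over short horizons, which is consistent with the explanation given above for its short-term performance.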

In the classification task, the TS Mixer demonstrated strong classification performance. Notably, it achieved an AUROC of 0.866 (95% CI, 0.829–0.904) for the 15-min prediction horizon (Fig. 4). Although the predictive power decreased for the 45- and 60-min prediction horizons, it still achieved specificities of 74.09% (95% CI, 66.97–81.22%) and 75.42% (95% CI, 69.75–81.1%) for these horizons, respectively. Additionally, the model maintained NPVs of 91.21% (95% CI, 86.52–95.90%) and 92.72% (95% CI, 90.72–94.71%), even at these longer time intervals (Table 3). Conversely, the LTSF N-Linear model showed little classification ability, with AUROC values close to 0.5. Specifically, it recorded an AUROC of 0.556 (95% CI, 0.600–0.644) for the near-term prediction horizon. The TS Mixer constructs a predictive decision boundary for binary classification based on nonlinear patterns. In contrast, LTSF N-Linear, which operates on a simple linear basis, is well suited for trend prediction but struggles in classification tasks where nonlinear patterns are crucial. In this study, even after an additional linear layer was added, its performance deteriorated, with AUROC values close to 0.5. This outcome reflects the challenges that linear classifiers face in achieving good performance in binary classification31,32. Patch TST sits between the relatively simple LTSF N-Linear and the more complex TS Mixer, and its quantitative performance indicators fall in the middle range. While Patch TST is capable of capturing dependencies and nonlinear patterns in waveforms, its reliance on patch-based processing likely makes it less effective than the TS Mixer at capturing global patterns. Additionally, across the overall metrics, certain models demonstrated effective performance in hypoglycemia classification. However, a slightly low PPV was observed in some cases. 
A low PPV means that a larger share of predicted hypoglycemic events are false positives, which could lead to false alarms or unnecessary treatments or interventions, highlighting the need for further refinement and improvement.
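
The classification metrics discussed above can be computed as in the following sketch. It is an illustration under stated assumptions: the labels and scores are toy values, the function names are hypothetical, and AUROC is obtained from the rank-sum identity, which may differ from the tooling used in the study. The last helper shows why PPV can be low while NPV stays high when hypoglycemic events are rare; the rates passed to it are hypothetical, not study results.

```python
import numpy as np

def auroc(y_true, scores):
    """AUROC via the rank-sum (Mann-Whitney U) identity, with averaged tie ranks."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores, kind="mergesort")
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    for s in np.unique(scores):              # give tied scores their mean rank
        tied = scores == s
        ranks[tied] = ranks[tied].mean()
    n_pos, n_neg = y_true.sum(), (~y_true).sum()
    return float((ranks[y_true].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg))

def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, PPV, and NPV from binary labels and predictions."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)
    tn = np.sum(~y_true & ~y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

def ppv_npv_from_rates(sensitivity, specificity, prevalence):
    """PPV and NPV implied by sensitivity, specificity, and event prevalence."""
    tp = sensitivity * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    fn = (1 - sensitivity) * prevalence
    return tp / (tp + fp), tn / (tn + fn)

# Toy labels (1 = hypoglycemia within the horizon) and model scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, scores))
print(confusion_metrics([1, 1, 0, 0, 0, 0], [1, 0, 1, 0, 0, 0]))
# Hypothetical rates: with rare events, PPV falls well below NPV.
print(ppv_npv_from_rates(sensitivity=0.8, specificity=0.75, prevalence=0.1))
```

When events are rare, even a moderate false-positive rate produces more false alarms than true alarms, which is why PPV is the metric most exposed to class imbalance.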

This study has some limitations. First, the input data available for predicting changes in glucose levels were limited; human metabolism is highly complex, and incorporating additional information, such as meal timing, physical activity, and other events, could improve the model’s predictive accuracy. For example, in the OhioT1DM dataset33, a well-known dataset for blood glucose prediction, biometric information, such as insulin dosage and heart rate, and behavioral factors, such as meal intake, sleep, and physical activity, were systematically monitored. In contrast, this study relied solely on the blood glucose data obtained from the CGM device and time annotations. Although the patients in this study had already maintained optimal metabolic control, the data were retrospective, and it was difficult to obtain additional inputs owing to the limitations of the CGM device. As a result, information such as biometric data or behavioral factors could not be incorporated into this study. The absence of this information is considered one of the primary reasons for the poor predictive power of the model over longer prediction horizons in this study34,35. Several studies on glucose level prediction have achieved acceptable or expected results using blood glucose data alone. However, given the characteristics of patients with GSD, who frequently require dietary control, it is expected that incorporating additional information could help compensate for the model’s limited predictive capability. Second, this study was conducted using retrospective data. Future studies should incorporate prospective data to verify the applicability of the model to real-world GSD cases. If future studies can predict glucose levels using CGM data from patients who are not yet in optimal metabolic control, this could lead to a groundbreaking management method. 
Third, a relatively long sampling interval was used in this study: the CGM device was set to measure blood glucose every 15 min in real time. In general, comparable studies use a shorter sampling interval21,27,36. Furthermore, this study aimed to explore the potential of DL models to help patients with GSD efficiently manage their glucose levels in daily life and proactively prevent possible hypoglycemic symptoms. Therefore, the sampling interval used in this study is considered a factor that directly influences the model’s predictive power. In future research, it will be important to adjust the sampling interval, considering both model performance and real-life applicability, and to evaluate whether the adjusted interval enhances efficiency37,38. Fourth, the general applicability and interpretability of error grid analysis are limited. Various analysis methods, such as the clinical error grid methodology, have been proposed to validate glucose predictions in diabetes39,40. In contrast, this study was conducted in a group of patients with GSD, a condition with an extremely low prevalence. Type 2 diabetes is typically associated with elevated blood sugar levels, while hypoglycemia is a major concern in type 1 diabetes. GSD, by comparison, is characterized by low blood sugar levels caused by genetic mutations affecting glucose storage and release. Consequently, the blood sugar range in patients with GSD differs significantly from that in patients with diabetes41, making it difficult to apply the analytical frameworks designed for diabetes to GSD. When the clinical error grid methodology was applied to the model used in this study, most predictions for GSD patients fell within Region A (Fig. S1). However, the extents of Regions B and D, which are critical for identifying in-application treatment failures or instances of hypoglycemia, did not match the average blood glucose range of GSD patients. 
While some quantification results demonstrated excellent accuracy, the overall persuasiveness of the methodology is limited by these constraints. These challenges limit the applicability of previously proposed glycemic analysis methodologies to GSD and consequently restrict the analysis of model outcomes in this study. Fifth, further research on interpretability is needed to understand how patient-specific factors influence the model’s performance. To investigate whether patient-specific factors, such as gender, affect glucose level fluctuations and subsequently impact model outcomes, we analyzed test results by gender (Tables 4 and 5). However, no significant differences were observed between the two groups in any task. One potential explanation is that the statistical analysis in Table 1 revealed no significant differences between the two groups, and this similarity is likely reflected in the model results. Additionally, owing to the limitations of the retrospective dataset, the collected data lack explanatory power, as they include only glucose levels and no additional relevant information. Consequently, these constraints make it challenging to analyze patient-specific factors and limit the examination of variables other than gender. In the future, follow-up studies should not only refine the research design but also enhance model interpretability and incorporate additional analyses using explainable AI methods. Finally, this study requires external validation. Although many DL models produce promising results on specific datasets, it is crucial to verify whether they generalize well to data from other sources42. This study was no exception. Although an independent model was developed for each individual to achieve personalized optimization, all data used for learning were obtained from a single hospital. 
In particular, owing to the rarity of GSD, there is a bias in the age range of the participants, and the study population was limited to Asian participants. The data used in this study were primarily collected from medical institutions in Korea that diagnose and treat the largest number of GSD patients. In contrast, other medical institutions rarely manage patients with rare diseases such as GSD, making it relatively difficult to collect external datasets. This limitation may reduce reliability owing to the small number of contributors. Additionally, there is a potential bias related to the CGM device used, particularly with respect to the sampling rate. Therefore, external validation is necessary to address these biases and verify the model’s performance across a more diverse set of subjects.
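
For the error grid analysis raised in the fourth limitation, the sketch below computes the share of predictions that fall in Region A. It assumes the commonly used Clarke-style rule (prediction within 20% of the reference, or both values below 70 mg/dL) and values in mg/dL; the text does not state which grid variant or thresholds were applied, and the numbers are hypothetical.

```python
import numpy as np

def zone_a_fraction(reference, predicted, hypo_threshold=70.0):
    """Share of pairs in Region A under a Clarke-style rule (values in mg/dL)."""
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    # Rule 1: prediction within 20% of the reference value.
    within_20_percent = np.abs(predicted - reference) <= 0.2 * reference
    # Rule 2: both values in the hypoglycemic range, so the clinical action agrees.
    both_low = (reference < hypo_threshold) & (predicted < hypo_threshold)
    return float(np.mean(within_20_percent | both_low))

# Hypothetical reference/prediction pairs, not study data.
reference = [100, 60, 200, 50, 40]
predicted = [115, 65, 150, 90, 55]
print(zone_a_fraction(reference, predicted))
```

Because the 70 mg/dL cut-off and the remaining region boundaries were drawn for diabetes, a population whose readings cluster near the low end interacts with the grid differently, which is the mismatch described above.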

Table 4 Performance of the LTSF N-Linear by gender for the forecasting task with 95% CI.
Table 5 Performance of the TS Mixer by gender for the classification task with 95% CI.

Despite these limitations, this study is significant, as it is the first to demonstrate the ability to forecast glucose levels using CGM and DL in patients with GSD. GSD is a genetic disorder for which no definitive treatment is currently available. Even if gene therapy becomes available in the future, its cost may be prohibitive for many patients. However, CGM, which is relatively affordable and easily accessible, combined with DL-based personalized blood glucose management, can offer an accessible method for managing this condition. If this method is further refined, hypoglycemia can be predicted in advance and alerts can be provided, enabling better management of cornstarch intake, diet, and exercise based on blood glucose predictions. Although patients must continue to consume cornstarch and follow an appropriate diet, this approach is highly beneficial from a cost perspective. In this study, predictions were made using a single variable, glucose level. While some DL models demonstrated promising results depending on the task, it is crucial to acknowledge that fluctuations in glucose levels can be highly sensitive to patient-specific factors such as fat metabolism, physical activity, and cornstarch intake. Follow-up studies are planned to address these variables. Additionally, we plan to address interpretability by leveraging explainable machine learning techniques. This approach is expected to provide clinicians with intuitive and actionable insights through the development of visualization dashboards.
