A comparative analysis of parametric survival models and machine learning methods in breast cancer prognosis

Theophilus Gyedu Baidoo and Hansapani⁸ evaluated both survival-specific and machine learning models using performance metrics such as the concordance index (c-index), integrated Brier score (IBS), and area under the curve (AUC). The Cox proportional hazards (CPH) model, random survival forest (RSF), and DeepSurv demonstrated strong performance, with RSF achieving a c-index of 0.72. Both Cox and RSF recorded the lowest IBS value of 0.08. However, while machine learning classifiers such as random forest (AUC 0.74) and XGBoost (AUC 0.69) showed moderate discrimination, they lacked mechanisms for handling censored data, a key limitation in survival analysis. In a related study, the authors applied five machine learning classifiers using 13 selected features, with LightGBM optimized via a tree-structured Parzen estimator, achieving 99.86% accuracy, 100% precision, and 99.60% recall, demonstrating high potential in distinguishing between malignant and benign tumors with minimal human intervention⁹.
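To make the role of censoring in the c-index concrete, the following is a minimal sketch of Harrell's concordance index in plain Python. It is an illustration only and not the implementation used in any of the cited studies; the function name `harrell_c_index` and the toy data are assumptions, and pairs with tied observed times are simply skipped for brevity.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data (simplified sketch).

    times: observed follow-up times
    events: 1 if the event was observed, 0 if the observation was censored
    risk_scores: higher score means higher predicted risk (shorter survival)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed event;
        # tied times are skipped here (a simplification).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # model ranks the earlier failure as riskier
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: the third patient is censored at time 6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
print(harrell_c_index(times, events, [0.9, 0.1, 0.4, 0.7]))
```

The pair formed by the censored patient and the patient with the longer follow-up is excluded, because it is unknown which of the two failed first. A standard classifier trained on a binary outcome has no equivalent way to use this partial information.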

Jialong Xiao, Miao Mo, et al.¹⁰ compared machine learning algorithms with the Cox model for predicting overall survival in a large breast cancer cohort of 22,176 patients. Their findings revealed that the RSF slightly outperformed the Cox model in terms of discrimination, with a c-index of 0.827 compared to 0.814, which supports the utility of the RSF in prognostic modeling. Another study explored a modified Weibull distribution capable of modeling various hazard rate shapes, including increasing, decreasing, constant, or bathtub-shaped patterns, with results closely aligned with Kaplan-Meier survival curves¹¹. Tizi and Abdelaziz Berrado¹² compared machine learning techniques with conventional statistical methods for cancer survival prediction. The study evaluated models, including random survival forests and Cox regression with ridge regularization, using the c-index for performance comparison. The results indicated that both approaches performed similarly, although Cox regression struggled with high-dimensional data. A separate study applied machine learning models to predict invasive disease-free events in 145 patients, showing that random survival forest with gradient boosting outperformed the Cox model (c-index, 0.68 vs. 0.57). These findings suggest that clinical data alone can enhance prediction accuracy and reduce the need for costly genetic testing¹³.
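The hazard shapes mentioned above can be illustrated with the standard two-parameter Weibull distribution, whose hazard is h(t) = (k/λ)(t/λ)^(k-1). The sketch below is an assumption-laden illustration, not the modified Weibull of the cited study¹¹: the standard form yields only monotone or constant hazards and cannot produce a bathtub shape, which is the motivation for modified variants. The function names and evaluation points are hypothetical.

```python
import math


def weibull_hazard(t, shape, scale=1.0):
    """Hazard of the standard Weibull: (k / lam) * (t / lam) ** (k - 1)."""
    if t <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("t, shape, and scale must be positive")
    return (shape / scale) * (t / scale) ** (shape - 1)


def hazard_trend(shape, scale=1.0):
    """Classify the hazard as increasing, decreasing, or constant over time."""
    # Two arbitrary time points are enough because the hazard is monotone.
    early = weibull_hazard(0.5, shape, scale)
    late = weibull_hazard(2.0, shape, scale)
    if math.isclose(early, late):
        return "constant"
    return "increasing" if late > early else "decreasing"


for k in (0.5, 1.0, 2.0):
    print(k, hazard_trend(k))
```

A shape parameter below 1 gives a decreasing hazard, exactly 1 reduces to the exponential model with a constant hazard, and above 1 gives an increasing hazard.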

Surbhu Gupta and Manoj K. Gupta¹⁴ assessed deep learning models, including the restricted Boltzmann machine (RBM), for predicting post-operative survival in breast cancer. Using cross-validation, the RBM achieved the highest accuracy (0.97), reinforcing the need for continued evaluation of deep learning architectures for optimal predictive performance.

A study by Sahar A. and El Rahman¹⁵ investigated early breast cancer detection using machine learning algorithms and feature selection across four datasets. Classifier performance varied across datasets: random forest with a genetic algorithm achieved 96.82% on WBC, C-SVM with RBF kernel reached 99.04% on WDBC, random forest with recursive feature elimination scored 74.13% on WPBC, and decision tree achieved 83.74%. Another comparative study¹⁶ reported SVM and LDA achieving 93% accuracy, random forest 98%, and logistic regression 86%, indicating strong but uneven performance across models.

Gunjan et al.¹⁷ highlighted the importance of early breast cancer detection and reviewed advancements in AI-based computer-aided diagnosis (CAD) systems. They compared machine learning and deep learning approaches with conventional methods, discussing their benefits, limitations, and future directions for medical image analysis. Nermin Abdelhakim Othman et al.¹⁸ proposed a hybrid deep learning model for predicting breast cancer survival using multi-omics data from the METABRIC dataset. The framework combines convolutional neural network (CNN)-based feature extraction with long short-term memory (LSTM) and gated recurrent unit (GRU) classifiers, achieving an accuracy of 98.0% through decision-level fusion. This model significantly improved survival prediction over single-modality approaches, offering a more robust and accurate tool for personalized breast cancer prognosis.

Another study using the Wisconsin breast cancer dataset¹⁹ evaluated several classifiers, including SVM, k-nearest neighbors, random forest, and logistic regression. SVM emerged as the most accurate, achieving 95% accuracy, reaffirming the role of CAD systems in early detection. A separate comparison of linear and nonlinear models²⁰ found that while SVM had higher sensitivity, artificial neural networks offered better overall diagnostic performance, underscoring the value of nonlinear models in complex datasets.

Using Surveillance, Epidemiology, and End Results (SEER) data from 2010 to 2019, a study²¹ developed an XGBoost model to predict survival in patients with bone metastatic breast cancer (BMBC). The model achieved AUC scores above 0.79. Prognostic factors such as treatment delays and income levels were significant, with neoadjuvant chemotherapy plus surgery improving outcomes in select subgroups.

Jain et al.²² aimed to identify optimal machine learning models for automatic breast cancer diagnosis using the Wisconsin dataset. Their results showed that hyperparameter-tuned models and boosting algorithms, such as XGBoost, consistently achieved high accuracy for both benign and malignant classifications. A study using The Cancer Genome Atlas Breast Invasive Carcinoma (TCGA-BRCA) dataset²³ explored multimodal machine learning systems for survival prediction by integrating six biomedical modalities. Dimensionality reduction techniques and classifiers (SVM, random forest) improved accuracy and robustness. However, these models lacked prospective validation on primary datasets, indicating the need for real-world testing.

Yinan Huang, Jieni Li, Mai Li, and Rajender R.²⁴ reviewed 28 studies applying machine learning models to real-world healthcare data for time-to-event outcomes. Random survival forests and neural networks were the most commonly used methods, particularly in oncology. The review noted the underuse of ML for treatment prediction and emphasized the need for methodological advances to enhance clinical utility.

Chirag Nagpal, Xinyu Li, and Artur Dubrawski²⁵ proposed a fully parametric deep learning approach for time-to-event prediction, circumventing the proportional hazards assumption of the Cox model. Their model accurately estimated survival risks in datasets with complex censoring and competing risks, offering a significant advancement in parametric survival modeling. M. Darshan Teja and G. Mokesh Rayalu²⁶ utilized University of California, Irvine data to evaluate eight machine learning models for cardiovascular disease prediction. Ensemble methods such as random forest and bagged trees achieved the highest accuracy and ROC-AUC. K-fold cross-validation confirmed model reliability, emphasizing the effectiveness of ensemble techniques in prediction tasks.

Keren Evangeline I., S. P. Angeline Kirubha, and J. Glory Precious²⁷ used the METABRIC dataset to identify the predictive variables in breast cancer. They compared the Cox proportional hazards (CoxPH) model, RSF, and DeepHit. RSF and DeepHit marginally outperformed CoxPH, both achieving a c-index of 0.86 compared with 0.85 for CoxPH. Key predictors included relapse-free status (RSF) and age at diagnosis, estrogen and progesterone receptor status, and tumor stage (CoxPH), aiding clinical decision-making. Recent studies have also focused on enhancing survival prediction through frailty modeling²⁸. Another study²⁹ revealed that patients in non-manual occupations had better survival (hazard ratio < 0.85), with technicians and associate professionals situated at the manual and non-manual intersection.

A study³⁰ employed machine learning to predict survival duration using tumor-related clinical features such as stage, size, and age. Kernel ridge regression, k-nearest neighbors, lasso, and decision tree models demonstrated high predictive accuracy owing to effective data integration techniques. Finally, a study using data from the University of Ilorin Teaching Hospital³¹ applied several machine learning algorithms to predict breast cancer survival. AdaBoost outperformed the other models, achieving 98.3% accuracy and an AUC of 99.9%, confirming its potential for clinical application.

Although survival analysis has been widely used in breast cancer studies, it has been less studied in the context of invasive lobular carcinoma (ILC). Existing literature commonly employs Cox proportional hazards models and random survival forests, with fewer studies examining the performance of other established parametric models, such as Weibull, exponential, logistic, log-logistic, Gaussian, and log-Gaussian distributions. Additionally, the application of formal model selection criteria, such as the Akaike information criterion (AIC) and Bayesian information criterion (BIC), is less common in studies involving machine learning approaches. Accordingly, further exploration of diverse modeling techniques and evaluation metrics may contribute to a more comprehensive understanding of survival prediction. This study aims to address this need by comparing multiple parametric and machine learning models for ILC survival prediction, using AIC/BIC and performance metrics to support model evaluation and interpretability in a clinically meaningful context. The objectives of this study were as follows:

1. To investigate the prognostic significance of clinical and pathological factors, such as age, tumor grade, AJCC stage, and treatment, on breast cancer survival outcomes.

2. To conduct a comparative evaluation of parametric survival models and machine learning algorithms in predicting patient survival, utilizing statistical criteria, including AIC, BIC, and ROC-based measures.

3. To identify the most suitable predictive model by assessing the trade-off between model interpretability and predictive accuracy across various machine learning methods.
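As a rough illustration of how AIC and BIC can be obtained for a parametric survival model, the sketch below fits the simplest case, an exponential model, to right-censored data by maximum likelihood and applies the standard definitions AIC = 2k - 2 log L and BIC = k ln(n) - 2 log L. It is a minimal sketch under stated assumptions and not the procedure of this study: the function names and toy data are hypothetical, and n is taken here as the number of subjects, although some authors use the number of observed events instead.

```python
import math


def exponential_fit(times, events):
    """MLE of an exponential survival model with right censoring.

    Returns (rate, log_likelihood). The log-likelihood is
    d * log(rate) - rate * sum(times), where d is the number of events.
    """
    d = sum(events)
    total_time = sum(times)
    if d == 0 or total_time <= 0:
        raise ValueError("need at least one event and positive follow-up")
    rate = d / total_time  # closed-form maximum likelihood estimate
    loglik = d * math.log(rate) - rate * total_time
    return rate, loglik


def aic(loglik, k):
    """Akaike information criterion for k estimated parameters."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * loglik


# Hypothetical toy data: the third patient is censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
rate, loglik = exponential_fit(times, events)
print(rate, aic(loglik, k=1), bic(loglik, k=1, n=len(times)))
```

Lower values indicate a better balance between fit and complexity, so the same two functions can be applied to the log-likelihood of each candidate distribution, with k set to its number of estimated parameters.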
