Distributed training of foundation models for ophthalmic diagnosis
