Deep-Learning-Based Multi-Class Classification for Neonatal Respiratory Diseases on Chest Radiographs in Neonatal Intensive Care Units.
Summary
Using 43,338 NICU chest radiographs labeled by 20 neonatologists across 10 centers, a ResNet50-based model achieved 83.96% accuracy and an F1 score of 83.68% across six neonatal respiratory classes. Performance was strongest for bronchopulmonary dysplasia (BPD) and air leak syndrome (ALS) and lowest for transient tachypnea of the newborn (TTN), demonstrating feasibility for AI-assisted triage and decision support.
Key Findings
- A multicenter dataset of 43,338 NICU chest radiographs, labeled by 20 neonatologists across 10 centers, supported model training and testing.
- Overall test accuracy 83.96% and F1 83.68% across six classes; class-wise F1 ranged from 70.84% (TTN) to 92.19% (BPD).
- The modified ResNet50 framework integrated demographic data (gestational age, birth weight) with imaging.
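The overall and class-wise figures above are standard multi-class metrics. As a minimal sketch of how such numbers are typically derived, the following computes accuracy, per-class F1, and two common averages from paired label lists. The labels and predictions are toy values for illustration only, not study data, and the helper names are hypothetical. The summary does not say whether the reported overall F1 is macro- or support-weighted, so both are shown.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, labels):
    """Return {label: F1}, treating each class one-vs-rest."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = {}
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears
        scores[c] = 2 * tp[c] / denom if denom else 0.0
    return scores


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(scores):
    """Unweighted mean of the per-class F1 scores."""
    return sum(scores.values()) / len(scores)


def weighted_f1(scores, y_true):
    """Mean of per-class F1 scores weighted by each class's true-label count."""
    support = Counter(y_true)
    return sum(scores[c] * support[c] for c in scores) / len(y_true)


# Toy example with three of the six classes (illustrative values only).
labels = ["BPD", "TTN", "ALS"]
y_true = ["BPD", "BPD", "TTN", "TTN", "ALS", "ALS"]
y_pred = ["BPD", "BPD", "TTN", "ALS", "ALS", "ALS"]

scores = per_class_f1(y_true, y_pred, labels)
print("accuracy:", round(accuracy(y_true, y_pred), 4))
print("per-class F1:", {c: round(s, 4) for c, s in scores.items()})
print("macro F1:", round(macro_f1(scores), 4))
print("weighted F1:", round(weighted_f1(scores, y_true), 4))
```

With balanced class counts, as in this toy example, the macro and weighted averages coincide. On an imbalanced dataset they can differ, which is why the averaging method matters when comparing reported F1 values.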
Clinical Implications
The model could prioritize reads, flag high-risk cases (e.g., suspected air leak syndrome or BPD), and standardize interpretation across centers, potentially reducing time-to-treatment. Prospective validation and assessment of domain shift are needed before deployment.
Why It Matters
A large, multicenter, expert-annotated dataset, combined with strong multi-class performance in a clinically urgent domain, positions this work to influence diagnostic workflows. It connects AI methods with neonatal care, a high-need area.
Limitations
- Retrospective design without prospective clinical impact evaluation
- Generalizability to different devices and sites has not yet been assessed, and the relatively lower performance for TTN has not been addressed
Future Directions
Future work includes prospective, multi-country impact trials; domain adaptation to new scanners and sites; incorporation of temporal imaging and clinical trajectories; and calibration for triage thresholds.
Study Information
- Study Type: Cohort
- Research Domain: Diagnosis
- Evidence Level: III - Large multicenter retrospective diagnostic development and validation study without randomized intervention.
- Study Design: OTHER