Top Public Chest X-Ray Datasets: A Comprehensive Guide for Researchers

Chest radiography is the most common and cost-effective medical imaging examination worldwide. For AI researchers, access to high-quality data is the first step toward building models for computer-aided diagnosis (CAD).

Labels: 14 distinct disease labels (e.g., Atelectasis, Cardiomegaly, Effusion) mined from radiology reports via NLP.
Format: PNG.
Access: Available directly from the NIH Box site or through Kaggle.

2. MIMIC-CXR (Large-Scale with Reports)

The MIMIC-CXR database is a large, de-identified dataset that includes both images and the original free-text radiology reports. It is published on PhysioNet as MIMIC-CXR Database v2.1.0.
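
For the 14-label dataset available from the NIH Box site, a model usually needs each image's findings as a fixed-length multi-hot vector rather than as text. The following is a minimal sketch of that conversion, not the dataset's official tooling. It assumes the labels for one image arrive as a single pipe-separated string (for example "Cardiomegaly|Effusion"), and the full list of 14 names in FINDINGS goes beyond the three named in this guide; both are assumptions that should be checked against the metadata file you download. The function name encode_findings is hypothetical.

```python
# Assumed list of the 14 finding names; only Atelectasis, Cardiomegaly and
# Effusion are named in this guide, so verify the rest against your metadata.
FINDINGS = [
    "Atelectasis", "Cardiomegaly", "Effusion", "Infiltration", "Mass",
    "Nodule", "Pneumonia", "Pneumothorax", "Consolidation", "Edema",
    "Emphysema", "Fibrosis", "Pleural_Thickening", "Hernia",
]

# Map each finding name to its position in the output vector.
INDEX = {name: i for i, name in enumerate(FINDINGS)}


def encode_findings(label_string):
    """Turn a pipe-separated label string into a 14-element multi-hot list.

    Names that are not in FINDINGS (such as a "No Finding" marker) are
    ignored, so an image without any listed finding maps to all zeros.
    """
    vector = [0] * len(FINDINGS)
    for name in label_string.split("|"):
        name = name.strip()
        if name in INDEX:
            vector[INDEX[name]] = 1
    return vector


if __name__ == "__main__":
    # Hypothetical label string for one image.
    print(encode_findings("Cardiomegaly|Effusion"))
```

Because the labels were mined from reports via NLP, some of them will be wrong, so it is worth treating these vectors as noisy targets when evaluating a model.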