Understanding severe asthma through small and Big Data in Spanish hospitals - PAGE Study
Melero Moreno C1,2, Almonacid Sánchez C3, Bañas Conejero D4, Quirce S5, Álvarez Gutiérrez FJ6, Cardona V7, Sánchez-Herrero MG4, Soriano JB8, on behalf of the PAGE Study group9
1Hospital Universitario La Princesa, Madrid, Spain
2Hospital Universitario 12 de Octubre, Madrid, Spain
3Hospital Universitario de Toledo, Toledo, Spain
4Specialty Care Medical Department, GlaxoSmithKline, Madrid, Spain
5Hospital Universitario La Paz, IdiPAZ, and CIBER of Respiratory Diseases (CIBERES), Madrid, Spain
6Hospital Universitario Virgen del Rocío, Sevilla, Spain
7Hospital Universitario Vall d’Hebron, Barcelona, Spain
8Hospital Universitario de La Princesa, Facultad de Medicina, Universidad Autónoma de Madrid, CIBER of Respiratory Diseases (CIBERES), Instituto de Salud Carlos III (IS-CIII), Madrid, Spain
9Collaborators listed in Supplementary Appendix
J Investig Allergol Clin Immunol 2023; Vol. 33(5)
Background: Data on the prevalence of severe asthma (SA) are limited. The implementation of electronic health records (EHRs) offers a unique research opportunity to test machine learning (ML) tools in epidemiological studies. The aim was to estimate SA prevalence amongst asthma patients seen in hospital asthma units, using both ML and traditional research methodologies. Secondary objectives were to describe non-severe asthma (NSA) and SA patients during a follow-up period of 12 months.
Methods: The PAGE study is a multicenter, controlled, observational study conducted in 36 Spanish hospitals and split into two phases: a first, cross-sectional phase to estimate SA prevalence, and a second, prospective phase (3 visits in 12 months) for the follow-up and characterization of SA and NSA patients. An ML sub-study was included in 6 hospitals. The ML tool uses EHRead technology, which extracts clinical concepts from EHRs and standardizes them to SNOMED CT.
Results: An SA prevalence of 20.1% was obtained amongst asthma patients in Spanish hospitals, compared with a prevalence of 9.7% estimated by the ML tool. The proportion of SA phenotypes and the features of followed-up patients were consistent with previous studies. Clinical predictions of patients’ clinical course were unreliable, and the ML tool found only two predictive models with discriminatory potential to predict outcomes.
Conclusion: This study is the first to estimate SA prevalence in a hospital population of asthma patients and to predict patient outcomes using both standard and ML techniques. These findings offer relevant insights for further epidemiological and clinical research in SA.
Key words: Severe asthma. Prevalence. Big data. Machine learning. Natural language processing. Predictive models