D3.481 - Machine Learning–Based Plasma Proteomic Signatures for Prediction of Chronic Rhinosinusitis

Poster abstract

Background

Chronic rhinosinusitis (CRS) imposes a significant burden on quality of life, yet scalable, non-invasive biomarkers for early detection and risk stratification remain elusive. While tissue-based studies have characterized local inflammation, the systemic proteomic signature of CRS has not been comprehensively defined at a population scale. This study aimed to identify circulating protein signatures associated with CRS and evaluate their predictive utility using machine learning.

Method

We analyzed plasma proteomic data from the UK Biobank Pharma Proteomics Project (UKB-PPP), comprising 44,344 participants (725 CRS cases and 43,619 controls). Protein profiling was performed using the Olink Explore 3072 platform. We conducted differential expression and gene set enrichment analyses to characterize systemic biological pathways. Five machine learning algorithms, including random forest and support vector machine, were trained to distinguish CRS cases from controls. We further assessed the incremental predictive value of proteomic features when added to routine blood markers and clinical covariates, and used SHAP-based explainable AI to interpret feature contributions.
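Testing several thousand proteins for differential expression requires some form of multiple-testing control. The abstract does not state which correction was applied, so the following is only a minimal sketch that assumes a Benjamini-Hochberg false discovery rate adjustment; the protein names, p-values, and the 0.05 threshold are hypothetical and are not study data.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-protein p-values (illustrative only, not from the study).
p = {
    "protein_a": 0.001,
    "protein_b": 0.008,
    "protein_c": 0.039,
    "protein_d": 0.041,
    "protein_e": 0.60,
}
q = dict(zip(p, benjamini_hochberg(list(p.values()))))

# Assumed FDR threshold of 0.05.
significant = [name for name, value in q.items() if value < 0.05]
print(significant)
```

In this toy input, protein_c has a raw p-value below 0.05 but does not survive adjustment, which is the kind of case the correction is meant to catch.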

Results

We identified a distinct systemic proteomic signature of CRS, consisting of 46 differentially expressed proteins, predominantly characterized by upregulated type 2 inflammatory mediators (e.g., MMP10, Periostin, CLC, PRG2) and markers of tissue remodeling and innate immune activation. Pathway analysis revealed enrichment in leukocyte proliferation, extracellular matrix disassembly, and neutrophil degranulation. In predictive modeling, the random forest classifier achieved the highest performance (AUC = 0.747). Adding proteomic features improved predictive performance in all five models relative to baselines that used only routine blood markers and clinical covariates. Feature importance analysis identified MMP10 and eosinophil-related proteins as the primary drivers of prediction.
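For readers less familiar with the metric, the AUC is the probability that a randomly chosen case receives a higher predicted score than a randomly chosen control. The sketch below computes it by direct pairwise comparison and contrasts a baseline model with a proteomics-augmented one. All labels and scores are hypothetical toy values, and the function is an illustration of the metric, not the study's evaluation code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties count 0.5)."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical labels: 1 = CRS case, 0 = control.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical scores from a baseline model (blood markers and covariates only).
baseline_scores = [0.60, 0.40, 0.30, 0.50, 0.35, 0.20, 0.10, 0.30]
# Hypothetical scores after adding proteomic features.
augmented_scores = [0.70, 0.55, 0.30, 0.50, 0.35, 0.20, 0.10, 0.25]

auc_base = roc_auc(labels, baseline_scores)
auc_aug = roc_auc(labels, augmented_scores)
print(f"baseline AUC = {auc_base:.3f}, augmented AUC = {auc_aug:.3f}, "
      f"difference = {auc_aug - auc_base:.3f}")
```

The pairwise loop is written for clarity. At the cohort size reported here (725 cases and 43,619 controls, roughly 32 million pairs), a rank-based implementation from an established library would be the more practical choice.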

Conclusion

This study defines a robust systemic proteomic fingerprint of CRS, reflecting eosinophilic inflammation, matrix remodeling, and metabolic alterations. Our findings demonstrate that plasma proteomics captures non-redundant biological information beyond standard clinical variables, highlighting its potential as a tool for non-invasive risk stratification in CRS.