D2.516 - Can social media track seasonal allergy burden? An evidence-weighted geo-location pipeline and validation against airborne pollen exposure in Augsburg, Germany

Poster abstract

Background

Digital traces from social media may complement traditional aeroallergen surveillance by capturing population symptom experience and public awareness. However, separating symptom reporting from general attention and disentangling the effect of pollen from that of co-exposures remain key challenges. We aimed to evaluate whether geo-located Twitter posts can track seasonal allergy-related burden in a European city by linking daily tweet-derived signals to pollen monitoring and environmental covariates.

Method

We collected geo-referenced Twitter posts within predefined city bounding boxes using a multilingual allergy-related keyword list. Tweets were stored as a partitioned Parquet dataset and assigned to a location via a hierarchical geo-assignment framework that uses coordinates and place bounding-box overlap and attaches an uncertainty score (0–3) to each assignment. Daily symptom-flag and pollen-awareness signals were derived from the assigned tweets. For Augsburg (2 March–16 May 2023), these signals were linked with daily pollen totals and taxa, meteorology, and air-quality data. Associations were examined using time-series overlays, lagged correlations, partial Spearman correlation, and exploratory negative binomial regression.
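The hierarchical geo-assignment can be illustrated with a minimal sketch. The abstract does not define the tiers behind the 0–3 uncertainty score, so the mapping below (0 = exact coordinates inside the city box, 1 = place box fully inside, 2 = at least half of the place box inside, 3 = weaker overlap) is an assumption made for illustration, as are the names `Tweet`, `overlap_fraction`, `assign_uncertainty`, the `min_overlap` threshold, and the example city box.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# (min_lon, min_lat, max_lon, max_lat)
BBox = Tuple[float, float, float, float]


@dataclass
class Tweet:
    """Hypothetical record holding only the geo fields used for assignment."""
    coords: Optional[Tuple[float, float]] = None  # (lon, lat)
    place_bbox: Optional[BBox] = None


def overlap_fraction(place: BBox, city: BBox) -> float:
    """Share of the place box's area that lies inside the city box."""
    width = min(place[2], city[2]) - max(place[0], city[0])
    height = min(place[3], city[3]) - max(place[1], city[1])
    area = (place[2] - place[0]) * (place[3] - place[1])
    if width <= 0 or height <= 0 or area <= 0:
        return 0.0
    return (width * height) / area


def assign_uncertainty(tweet: Tweet, city: BBox,
                       min_overlap: float = 0.5) -> Optional[int]:
    """Return an assumed uncertainty score (0 = most certain), or None
    when the tweet cannot be assigned to the city at all."""
    # Tier 1: exact coordinates take precedence over place metadata.
    if tweet.coords is not None:
        lon, lat = tweet.coords
        inside = city[0] <= lon <= city[2] and city[1] <= lat <= city[3]
        return 0 if inside else None
    # Tier 2: fall back to the overlap of the place bounding box.
    if tweet.place_bbox is not None:
        frac = overlap_fraction(tweet.place_bbox, city)
        if frac >= 1.0 - 1e-9:
            return 1
        if frac >= min_overlap:
            return 2
        if frac > 0.0:
            return 3
    return None


# Illustrative city box only, not the bounding box used in the study.
CITY: BBox = (10.0, 48.0, 11.0, 49.0)
score = assign_uncertainty(Tweet(place_bbox=(10.5, 48.0, 11.5, 49.0)), CITY)
```

Keeping the score alongside each tweet, rather than discarding uncertain assignments, would allow downstream analyses to be repeated at different certainty thresholds.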

Results

Symptom-flag tweets increased during the main pollen season and visually aligned with pronounced pollen peaks. After adjustment for meteorology and air quality, symptom-flag tweets showed a weak positive association with total pollen (partial Spearman r = 0.120; 76 complete-case days). Lag analyses indicated heterogeneous timing and strength across pollen taxa, with modest effect sizes and no universal lag structure. In contrast, the pollen-awareness signal remained elevated even during periods of low measured pollen, consistent with information-seeking, seasonal anticipation, and contextual drivers beyond contemporaneous exposure. Exploratory multi-exposure models suggested non-linear pollen–symptom relationships and highlighted the potential relevance of co-exposures, particularly ozone and nitrogen dioxide in combination with pollen.
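For readers unfamiliar with the statistics reported here, the following is a minimal sketch of a lagged, partial Spearman correlation. It adjusts for a single covariate using the first-order partial-correlation formula applied to rank correlations; the study adjusted for several meteorological and air-quality variables, which would require a regression-on-ranks or recursive approach instead. All function names and the toy series are illustrative assumptions, not the study's code or data.

```python
from math import sqrt


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))


def partial_spearman(x, y, z):
    """Spearman correlation of x and y, adjusted for one covariate z."""
    r_xy, r_xz, r_yz = spearman(x, y), spearman(x, z), spearman(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def lagged(exposure, outcome, lag):
    """Pair the outcome on day t with the exposure on day t - lag (lag >= 0)."""
    if lag == 0:
        return list(exposure), list(outcome)
    return list(exposure[:-lag]), list(outcome[lag:])


# Toy daily series, made up purely to exercise the functions.
pollen = [5, 12, 40, 33, 8, 60, 22, 15]
tweets = [1, 2, 6, 7, 3, 9, 8, 2]
ozone = [30, 45, 38, 52, 41, 47, 35, 50]

pollen_lag1, tweets_lag1 = lagged(pollen, tweets, lag=1)
ozone_lag1, _ = lagged(ozone, tweets, lag=1)
r_lag1 = partial_spearman(pollen_lag1, tweets_lag1, ozone_lag1)
```

Repeating the last three lines over a range of lags and pollen taxa would give the kind of lag profile described above, under the single-covariate simplification.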

Conclusion

Geo-located Twitter data can complement traditional aeroallergen monitoring, but only when symptom-like posts are distinguished from attention-driven activity and environmental co-exposures are accounted for. Our findings highlight the potential of social media as an additional indicator of seasonal allergy burden and motivate broader multi-city, multi-year validation.