American Association for Aerosol Research - Abstract Submission

AAAR 39th Annual Conference
October 18 - October 22, 2021

Virtual Conference


New Application of Gaussian Mixture Regression to Bias-Correct Low Cost PM2.5 Monitoring Data in sub-Saharan Africa

Celeste McFarlane, DANIEL WESTERVELT, Columbia University

     Abstract Number: 219
     Working Group: Urban Aerosols

Abstract
Reference-grade PM2.5 monitors can serve as an important basis for correcting and calibrating low-cost sensors (LCS) used in air quality monitoring. LCS readings are affected by environmental factors such as temperature and relative humidity (RH), so a correction factor is needed to obtain high-quality data. LCS also have enormous potential to improve data coverage in resource-limited parts of the world such as sub-Saharan Africa. Beginning in March 2020, a low-cost PurpleAir PM2.5 monitor was collocated for at least one year with a Met One Beta Attenuation Monitor (BAM) in Accra, Ghana. The raw PurpleAir data showed mediocre correlation with, and moderate bias relative to, the BAM PM2.5 data (R² = 0.66, MAE = 6 µg m⁻³). Multiple linear regression and quadratic regression, both of which have previously been shown to reduce bias and increase correlation between PurpleAir and reference monitors in many regions including sub-Saharan Africa, yielded only limited improvement in the correlation for the Accra collocation (R² = 0.71 and R² = 0.83, respectively). Here, we develop a Gaussian mixture regression (GMR) model, incorporating temperature and relative humidity, that improves the collocated PM2.5 correlation to R² = 0.90 and reduces the bias to MAE = 2.5 µg m⁻³. Gaussian mixture models (GMMs) are a probabilistic clustering method that can help capture relationships between the data and heterogeneous variables for which we lack data. Using the probability distributions generated within each cluster, we use GMR to build a nonlinear regression model for the data. When coupled with time, temperature, and RH, GMR can provide valuable insights into the limitations of PurpleAir monitors. We present the first application of GMR to geophysical data and demonstrate a substantial improvement over traditional methods without succumbing to overfitting.
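
The prediction step of Gaussian mixture regression can be illustrated with a minimal sketch. It is not the authors' model: it assumes a mixture that has already been fitted over the joint distribution of a single input x (raw low-cost PM2.5) and the target y (reference PM2.5), and the component parameters, the COMPONENTS list, and the function names normal_pdf and gmr_predict are all hypothetical, chosen only for illustration. Each cluster contributes its own linear conditional mean of y given x, and the clusters are blended by how probable each one is at the given x, which is what makes the overall fit nonlinear.

```python
import math

# Hypothetical, hand-picked mixture parameters (NOT fitted values from the study).
# Each component is a 2-D Gaussian over (x, y) = (raw LCS PM2.5, reference PM2.5):
# w = mixture weight, mx/my = means, vxx/vyy = variances, vxy = covariance.
COMPONENTS = [
    {"w": 0.6, "mx": 20.0, "my": 15.0, "vxx": 25.0, "vxy": 15.0, "vyy": 16.0},
    {"w": 0.4, "mx": 60.0, "my": 40.0, "vxx": 100.0, "vxy": 50.0, "vyy": 49.0},
]


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmr_predict(x, components=COMPONENTS):
    """Return E[y | x] under the Gaussian mixture (the GMR prediction)."""
    # Unnormalized responsibility of each component for this input:
    # mixture weight times the marginal density of x under that component.
    resp = [c["w"] * normal_pdf(x, c["mx"], c["vxx"]) for c in components]
    total = sum(resp)
    # Conditional mean of y given x inside each component (a local linear fit).
    cond = [c["my"] + c["vxy"] / c["vxx"] * (x - c["mx"]) for c in components]
    # Blend the local linear fits using the normalized responsibilities.
    return sum(r * m for r, m in zip(resp, cond)) / total


if __name__ == "__main__":
    for raw in (20.0, 40.0, 60.0):
        print(raw, round(gmr_predict(raw), 2))
```

With several inputs, such as temperature and RH alongside the raw reading, the same expression holds in matrix form: the scalar variance of x becomes the input covariance block, and the ratio vxy / vxx becomes the cross-covariance multiplied by the inverse of that block. In practice the component parameters would come from fitting a mixture to collocation data, for example with scikit-learn's GaussianMixture, rather than being set by hand.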