Advancing Low-cost Air Quality Monitor Calibration with Artificial Intelligence

SINAN SOUSAN, Rui Wu, Ciprian Popoviciu, Sarah Fresquez, Yoo Min Park, Department of Public Health, East Carolina University

     Abstract Number: 276
     Working Group: Instrumentation and Methods

Abstract
Low-cost sensors for measuring airborne contaminants have become popular in recent years due to their price, portability, and ease of use. However, these sensors often exhibit large biases relative to their more expensive counterparts and are typically calibrated against a high-cost or reference instrument. Calibration models are affected by aerosol type, composition, and particle size, which renders them ineffective outside the original calibration environment. Therefore, low-cost monitors must be calibrated and validated in their immediate environment for best accuracy, which can be challenging due to the high cost and limited availability of reference instruments nationwide. This work proposes a novel machine-learning calibration method that uses one high-cost instrument with groups of low-cost sensors to perform the calibration. The machine learning methods employed were random forest and gradient boosting trees, which were compared with simple linear regression. The conceptual model was demonstrated in a chamber study using three electronic cigarette (ECIG) brands that generate different aerosol and volatile organic compound (VOC) exposures. Measurements were performed with 30 low-cost GeoAir2 monitors, arranged in 10 groups of 3 monitors each, alongside the high-cost personal DataRAM (pDR-1500) and MiniRAE monitors. The proposed method employed two sets of regression models. The first set was built by collocating all groups with the high-cost monitor using the first ECIG brand. Then, the regression models of the remaining 9 groups (all except group 1) were used to measure the exposure from the second and third ECIG brands, while a second set of regression models applied the errors calculated between group 1 and the high-cost monitor. The method showed a substantial improvement in some cases, with up to a 109% increase in r² and decreases of up to 45% and 41% in RMSE and precision values, respectively, depending on the machine learning model used.
This work shows promising results that can be applied in environmental studies.
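
The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is one possible reading of the method, not the authors' implementation: the data are synthetic, the sensor response model (per-group gain and offset, plus an aerosol-dependent factor) is an assumption, and simple linear regression stands in for the random forest and gradient boosting models. All names (fit_linear, simulate, aerosol_factor, and so on) are hypothetical. Group index 0 plays the role of "group 1", the only group that stays with the high-cost monitor after the first aerosol.

```python
import numpy as np

rng = np.random.default_rng(0)
n_groups, n_samples = 10, 200

def fit_linear(x, y):
    """Least-squares fit of y = slope * x + intercept."""
    slope, intercept = np.polyfit(x, y, 1)
    return slope, intercept

def apply_linear(model, x):
    slope, intercept = model
    return slope * x + intercept

def rmse(pred, truth):
    return float(np.sqrt(np.mean((pred - truth) ** 2)))

# Assumed sensor behaviour: each group has its own gain and offset,
# and the aerosol type scales the response of every group alike.
group_gain = rng.uniform(0.4, 0.7, n_groups)
group_offset = rng.uniform(1.0, 5.0, n_groups)

def simulate(true_conc, aerosol_factor):
    noise = rng.normal(0.0, 1.0, (n_groups, true_conc.size))
    return (aerosol_factor * group_gain[:, None] * true_conc
            + group_offset[:, None] + noise)

# Stage 1: all groups collocated with the reference for the first aerosol.
ref_a = rng.uniform(10, 200, n_samples)
raw_a = simulate(ref_a, aerosol_factor=1.0)
stage1 = [fit_linear(raw_a[g], ref_a) for g in range(n_groups)]

# New aerosol: the stage-1 models are now biased for every group.
ref_b = rng.uniform(10, 200, n_samples)
raw_b = simulate(ref_b, aerosol_factor=1.6)
cal_b = np.array([apply_linear(stage1[g], raw_b[g]) for g in range(n_groups)])

# Stage 2: only group 0 sits next to the reference. Model its error
# as a function of its stage-1 output, then apply that correction
# to all groups (groups 1..9 are the ones without a reference).
error_g0 = ref_b - cal_b[0]
stage2 = fit_linear(cal_b[0], error_g0)
corrected_b = cal_b + apply_linear(stage2, cal_b)

# ref_b is used here only to score the groups that were not collocated.
rmse_before = np.mean([rmse(cal_b[g], ref_b) for g in range(1, n_groups)])
rmse_after = np.mean([rmse(corrected_b[g], ref_b) for g in range(1, n_groups)])
print("mean RMSE, stage 1 only:", rmse_before)
print("mean RMSE, with stage 2:", rmse_after)
```

In this toy setup the correction transfers well because the aerosol effect is assumed to be identical across groups; with real monitors the benefit would depend on how similarly the groups respond to the new aerosol. A tree-based regressor could replace fit_linear and apply_linear in either stage without changing the overall structure.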