American Association for Aerosol Research - Abstract Submission

AAAR 37th Annual Conference
October 14 - October 18, 2019
Oregon Convention Center
Portland, Oregon, USA

Spatiotemporal Modeling of PM2.5, CO and NO2 Concentrations Measured by a Low-cost Sensor Network: Comparison of Linear and Machine-learning Enabled Land Use Models

SAKSHI JAIN, Albert Presto, Naomi Zimmerman, University of British Columbia

     Abstract Number: 638
     Working Group: Air Quality Sensors: Low-cost != Low Complexity

Abstract
Many previous studies have characterized spatial patterns of pollution by building land use regression (LUR) models from distributed passive samplers or filter samplers. These models can be generated with high spatial resolution, thereby producing estimates of long-term (e.g., annual average) spatial patterns in concentration, but generally have poor temporal resolution. Deployment of low-cost sensors, which typically sample in real time, creates the possibility of time-resolved and/or real-time modeling of concentration surfaces.

The aim of this study was to develop spatiotemporal models for PM2.5, CO and NO2 using measurements collected by a network of low-cost sensors in Pittsburgh, Pennsylvania. Models were developed for daily average concentrations for periods spanning August 2016 to December 2017 across 50 unique sites. The predictors comprised 15 different time-independent (e.g., elevation) and time-dependent (e.g., temperature) variables. We examined two different models: LUR and a machine-learning-enabled land use model (land use random forest, LURF) that uses random forests to link observed concentrations to land use variables. A hybrid LUR-LURF model was generated to address the shortcomings of the individual models. The models were also evaluated using time-decomposed signals (e.g., short-lived spikes vs. long-term enhancements).
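
To illustrate what separating short-lived spikes from long-term enhancements can look like, the following is a minimal sketch only. The abstract does not state which decomposition method was used; the centered rolling median, the window length, the function name `decompose`, and the sample values are all assumptions made for illustration.

```python
from statistics import median


def decompose(series, window=7):
    """Split a daily series into a slow baseline and a short-lived residual.

    Assumption (not from the abstract): the baseline at day i is the median
    of a centered window around i, truncated at the series edges. The
    residual is whatever the baseline does not explain.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    baseline = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        baseline.append(median(series[lo:hi]))
    residual = [x - b for x, b in zip(series, baseline)]
    return baseline, residual


# Made-up daily PM2.5 values (ug/m3) with a single one-day spike.
daily_pm25 = [8.0, 9.0, 8.5, 30.0, 9.5, 9.0, 10.0, 9.5, 10.5]
baseline, residual = decompose(daily_pm25, window=5)
```

Under this sketch, the baseline would stand in for the long-term enhancement and the residual for the short-lived spikes, and each component could then be modeled separately.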

Time decomposition of signals resulted in equal (PM2.5, CO) or better (NO2) R2 values, especially for LUR models. LURF models outperformed LUR models in all cases. PM2.5 LURF and hybrid models were characterized by high R2 (median ~0.7), low normalized mean absolute error (CvMAE, median ~20%), and low variability. NO2 time-decomposed models had higher R2 (~20% improvement), slightly lower CvMAE values (~10%), and lower variability compared with their standard-signal counterparts. The results of our study show that a combination of low-cost sensors and novel data analytics can be successfully used to build more robust land use models, locate hotspots and provide preliminary information about air pollution gradients to policymakers.
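
For readers unfamiliar with the two evaluation metrics, the sketch below shows one common way to compute them. It assumes that R2 is the coefficient of determination and that CvMAE is the mean absolute error divided by the mean observed concentration; the abstract does not spell out either definition, and the function names and sample values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def cv_mae(observed, predicted):
    """Mean absolute error normalized by the mean observation (assumed definition)."""
    mae = sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)
    mean_obs = sum(observed) / len(observed)
    return mae / mean_obs


# Made-up observed and modeled daily concentrations for illustration only.
observed = [10.0, 12.0, 8.0, 10.0]
predicted = [11.0, 11.0, 9.0, 9.0]
r2 = r_squared(observed, predicted)
cvmae = cv_mae(observed, predicted)
```

Because CvMAE is normalized by the mean concentration, it allows error levels to be compared across pollutants measured on different scales, such as PM2.5 and CO.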