American Association for Aerosol Research - Abstract Submission

AAAR 38th Annual Conference
October 5 - October 9, 2020

Virtual Conference

Abstract View


Application of Machine Learning for Future Air Quality Predictions in Southern California

KHANH DO, Arash Kashfi Yeganeh, Cesunica E. Ivey, University of California, Riverside

     Abstract Number: 496
     Working Group: Urban Aerosols

Abstract
California’s South Coast Air Basin (SoCAB) is well known for extremely poor air quality, which results from its unique terrain and high levels of anthropogenic emissions. In this study, we use machine learning (ML) to identify patterns in ambient air pollutants in SoCAB and to explore the links among precursor emissions, meteorology, and PM2.5 and ozone. We investigated historical changes in PM2.5 and ozone using 25 years of air pollutant, emissions, and meteorological data. We tested the random forest regression (RFR) algorithm under multiple configurations to tune the model and obtain the best air quality predictions. We first trained the RFR model with hourly meteorology and air pollutant data from 1994 to 2018. Meteorological data were retrieved from the Ontario and Los Angeles International Airport monitoring stations, and air quality data were retrieved from the San Bernardino (CA) air monitoring station. The RFR training features were NO, NO2, surface temperature, dew point temperature, visibility, surface pressure, relative humidity, wind speed, and wind direction. The RFR model was trained in five-year increments from 1994 to 2018. The R² ranged from 0.6 to 0.7 for historical hourly predictions. The model also enabled predictions of 2023 PM2.5 and ozone using input data from a 2023 CMAQ simulation. The freedom to choose input features enabled the investigation of PM2.5 and ozone sensitivity to fluctuations in key variables, such as temperature and NOx. These promising results indicate that ML can accelerate air quality research by augmenting traditional air quality modeling, reducing simulation time, and exploiting large datasets for historical simulations and future air quality predictions.
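
The five-year training increments and the R² score mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes that "five-year increments" means consecutive, non-overlapping windows (the abstract does not say whether the windows are disjoint or cumulative). The function names `five_year_windows` and `r_squared` are hypothetical, and the RFR model itself is omitted.

```python
from typing import List, Sequence, Tuple


def five_year_windows(start: int = 1994, end: int = 2018,
                      span: int = 5) -> List[Tuple[int, int]]:
    """Split the inclusive year range [start, end] into consecutive windows.

    Assumption: windows are non-overlapping; the abstract does not specify this.
    """
    windows = []
    first = start
    while first <= end:
        # The last window is clipped if the range is not a multiple of `span`.
        last = min(first + span - 1, end)
        windows.append((first, last))
        first = last + 1
    return windows


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when observations have zero variance")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    for first, last in five_year_windows():
        print(f"train on {first}-{last}")
```

Under this assumption, the 25-year record yields five windows, and each window's hourly predictions would be scored against observations with `r_squared`.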