Predicting Current and Future Concentrations of Biogenic Aerosol Precursors Using Machine Learning and Detailed Chemical Modeling

SINA TAYYEBI NIA, Nazifa Sayeed, Namrata Shanmukh Panji, Chenyang Bi, Gabriel Isaacman-VanWertz, Virginia Tech

     Abstract Number: 582
     Working Group: Advancing Aerosol Science through Data Analysis Tools

Abstract
Biogenic and anthropogenic sources continuously emit reactive volatile organic compounds (VOCs) into the atmosphere, where they undergo oxidation to form oxygenated VOCs and secondary organic aerosol (SOA). Quantifying the concentrations and chemical composition of these precursors is crucial for constraining the formation pathways and loadings of secondary aerosols. Using ~5 years of in-canopy concentration measurements from a tower at a mixed-forest site in central Virginia, we examine discrepancies between modeled emissions and real-world observations of concentrations. We apply machine-learning approaches (Random Forest, feature importance) to (i) uncover seasonal and regime-specific patterns in the measurement record, (ii) train predictive models for individual biogenic VOC species, and (iii) identify specific environmental drivers that current emissions models may not fully capture. Perhaps unsurprisingly, with sufficient training data, machine-learning models outperform conventional mechanistic approaches in predicting VOC concentrations; we use the trained models to test how well concentrations can be predicted for future periods beyond the training data set and to examine differences between mechanistic and machine-learning predictions. Permutation and feature-importance analyses reveal that the variables explaining VOC variability in our measurements are not entirely captured by conventional emissions and chemistry models (e.g., the importance of humidity and soil parameters is higher than expected for certain VOCs). To understand the impacts of these discrepancies on aerosol formation, we are automating the implementation of the SAPRC MechGen mechanism generator in a zero-dimensional (0-D) box model, alongside dry- and wet-deposition modules, to include a far larger array of chemical pathways and aerosol precursors than previously possible. This addition improves detailed modeling of the impacts of changes in emissions and will enable studies of future emission scenarios, as well as of the impact of meteorology on multi-generational chemistry.
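
As a rough illustration of what a 0-D box model with a deposition term computes, the sketch below tracks a single VOC in a well-mixed layer with a constant emission source, first-order loss to OH oxidation, and first-order dry and wet deposition. It is a minimal single-species toy under stated assumptions, not the MechGen-based model described in the abstract, which would involve a far larger generated mechanism. The function names and all parameter values (emission flux, mixing height, OH concentration, deposition velocity) are hypothetical placeholders; only the OH rate constant is chosen to be of the order commonly cited for isoprene near room temperature.

```python
import math


def steady_state(emission_flux, mixing_height, k_oh, oh_conc, v_dep, k_wet=0.0):
    """Steady-state concentration (molecules cm^-3) of one VOC in the box.

    emission_flux: molecules cm^-2 s^-1, mixing_height: cm,
    k_oh: cm^3 molecule^-1 s^-1, oh_conc: molecules cm^-3,
    v_dep: dry-deposition velocity in cm s^-1, k_wet: wet-removal rate in s^-1.
    """
    source = emission_flux / mixing_height                 # molecules cm^-3 s^-1
    loss = k_oh * oh_conc + v_dep / mixing_height + k_wet  # s^-1
    return source / loss


def integrate_box(c0, emission_flux, mixing_height, k_oh, oh_conc, v_dep,
                  dt, n_steps, k_wet=0.0):
    """Advance dC/dt = E/h - (k_OH*[OH] + v_d/h + k_wet) * C in time.

    With constant coefficients the equation is linear, so each step uses
    the exact exponential relaxation toward the steady state.
    """
    loss = k_oh * oh_conc + v_dep / mixing_height + k_wet
    c_ss = steady_state(emission_flux, mixing_height, k_oh, oh_conc, v_dep, k_wet)
    decay = math.exp(-loss * dt)
    series = [c0]
    c = c0
    for _ in range(n_steps):
        c = c_ss + (c - c_ss) * decay
        series.append(c)
    return series


# Illustrative placeholder inputs (assumptions, not values from the abstract).
params = dict(
    emission_flux=1.0e11,  # molecules cm^-2 s^-1 (hypothetical)
    mixing_height=1.0e5,   # cm, i.e. a 1 km layer (hypothetical)
    k_oh=1.0e-10,          # cm^3 molecule^-1 s^-1, isoprene-like magnitude
    oh_conc=1.5e6,         # molecules cm^-3 (hypothetical daytime value)
    v_dep=0.1,             # cm s^-1 (hypothetical)
)

c_ss = steady_state(**params)
# Start from zero and run 24 h in 10-minute steps.
series = integrate_box(0.0, dt=600.0, n_steps=144, **params)
lifetime_hours = 1.0 / (params["k_oh"] * params["oh_conc"]) / 3600.0

print(f"chemical lifetime against OH: {lifetime_hours:.2f} h")
print(f"concentration after 24 h relative to steady state: {series[-1] / c_ss:.6f}")
```

In a full mechanism each species would have its own production and loss terms coupled through the chemistry, so the system would need a stiff ODE solver rather than the closed-form update used here; the single-species case is shown only because its behavior is easy to check by hand.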