Using Machine Learning to Derive Long-Term Aerosol Liquid Water Concentrations from Aerosol Optical Properties

LIFEI YIN, Bin Bai, Shreya Suri, Yuhan Yang, James Sherman, Robert Swarthout, Pengfei Liu, Georgia Institute of Technology

     Abstract Number: 562
     Working Group: Advancing Aerosol Science through Data Analysis Tools

Abstract
Aerosol liquid water content (ALWC) is a critical component of ambient particulate matter, influencing the optical properties, radiative forcing, multiphase chemistry, and aging processes of atmospheric aerosols. However, long-term ALWC datasets are scarce due to the absence of direct measurement techniques. Because ALWC is closely tied to aerosol size, hygroscopicity, and ambient relative humidity (RH), all of which affect aerosol optical properties (AOPs), we propose a novel machine-learning approach that estimates ALWC from a decade-long AOP dataset collected at the Appalachian Atmospheric Interdisciplinary Research Program (AppalAIR) site in Boone, NC. This work aims to understand decadal trends in ALWC in the southeastern United States, a region marked by high aerosol burden, elevated humidity, and substantial changes in aerosol chemical composition over the past ten years.

The AppalAIR site hosts one of the longest records (since 2012) of AOPs and the humidified scattering enhancement factor, f(RH), in the U.S. Since August 2024, particle number size distribution (PNSD) and chemical composition data have also been collected, enabling us to build a physically based model that uses Mie theory to simulate aerosol optical properties under varying RH. Using this optical model constrained by observations, we trained Random Forest models to predict dry aerosol volume concentrations and hygroscopicity from AOPs, allowing ALWC to be estimated directly from AOP measurements without real-time size distribution or chemical composition data.
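
The final step, converting predicted dry aerosol volume and hygroscopicity into ALWC, can be illustrated with a minimal sketch. The abstract does not state which water-uptake relation is used, so the sketch assumes the single-parameter kappa-Köhler form, assumes that hygroscopicity is expressed as the parameter kappa, and assumes that water activity equals RH/100 (Kelvin effect neglected). The function name, argument names, and input values are hypothetical.

```python
# Minimal sketch (an assumption, not the authors' stated method):
# ALWC from dry aerosol volume, hygroscopicity parameter kappa, and RH,
# using the kappa-Koehler relation V_w = V_dry * kappa * a_w / (1 - a_w).

WATER_DENSITY_G_PER_CM3 = 1.0  # liquid water density, g cm^-3


def alwc_from_kappa(dry_volume_um3_per_cm3, kappa, rh_percent):
    """Return ALWC in ug m^-3.

    dry_volume_um3_per_cm3: dry aerosol volume concentration (um^3 cm^-3)
    kappa: dimensionless hygroscopicity parameter
    rh_percent: ambient relative humidity in percent, 0 <= RH < 100
    """
    if not 0.0 <= rh_percent < 100.0:
        raise ValueError("rh_percent must be in [0, 100)")
    if dry_volume_um3_per_cm3 < 0.0 or kappa < 0.0:
        raise ValueError("dry volume and kappa must be non-negative")

    # Assumption: water activity equals RH (Kelvin effect neglected).
    water_activity = rh_percent / 100.0

    # Water volume concentration in um^3 cm^-3.
    water_volume = (
        dry_volume_um3_per_cm3 * kappa * water_activity / (1.0 - water_activity)
    )

    # 1 um^3 cm^-3 of a substance with density 1 g cm^-3 equals 1 ug m^-3.
    return water_volume * WATER_DENSITY_G_PER_CM3


if __name__ == "__main__":
    # Hypothetical inputs chosen only for illustration.
    print(alwc_from_kappa(5.0, 0.3, 80.0))
```

With these illustrative inputs (5 µm³ cm⁻³ dry volume, kappa of 0.3, 80% RH) the relation gives roughly 6 µg m⁻³ of aerosol water. In the workflow described above, the dry volume and hygroscopicity would come from the Random Forest predictions rather than from fixed values.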

Our trained and validated model will be applied to the long-term AOP dataset at AppalAIR to generate the first continuous, multi-year record of ALWC in the southeastern U.S. The resulting dataset will offer new opportunities to investigate long-term trends in aerosol water uptake, radiative properties, and aerosol-cloud interactions in a region heavily impacted by biogenic and anthropogenic emissions.