A Lagrangian Time-Series Machine Learning Framework for Predicting Concentrations and Exploring Drivers of Cloud Condensation Nuclei in Marine Boundary Layer
Shengqian Zhou, Dong Qi, Hanyang Liu, Chenyang Lu, Yevgeniy Vorobeychik, JIAN WANG, Washington University in St. Louis
Abstract Number: 623
Working Group: Advancing Aerosol Science through Data Analysis Tools
Abstract
The concentration of cloud condensation nuclei (CCN) can strongly influence the albedo, lifetime, and coverage of marine low clouds and consequently the climate. At present, the key factors and processes driving CCN concentration remain poorly understood, and global models generally perform poorly in predicting CCN concentrations in the remote marine boundary layer (MBL), contributing to substantial uncertainty in simulated aerosol indirect forcing. While machine learning is a powerful tool for prediction and for deciphering complex relationships in a multivariate system, its application to studying CCN processes remains limited. Additionally, existing machine learning models often use only local environmental variables as input features and therefore cannot capture processes occurring during long-range transport, which can strongly influence CCN concentrations in the remote MBL. Here we present a Lagrangian time-series machine learning framework to better represent the dynamic processes driving CCN concentration in the atmosphere. Three-dimensional back trajectories of airmasses arriving at the receptor site were derived using a Lagrangian transport model. The time series of relevant environmental variables along the airmass trajectories were used as input features to a long short-term memory (LSTM) model to predict the CCN concentrations at the receptor site. The model was trained and evaluated using multiyear CCN measurements on Graciosa Island in the eastern North Atlantic (ENA), and it demonstrates high prediction accuracy. In addition, although the model was built on measurements in the ENA, it can reproduce a large fraction of the CCN variation at another site in the tropical South Atlantic, indicating that it captures the general processes that drive marine CCN concentration. In addition to the previously recognized role of wet removal, the model reveals that short-wave radiation is another key factor controlling CCN variations in the MBL of the remote ENA.
This Lagrangian time-series framework, which combines physics-driven airmass trajectory modeling with a data-driven LSTM, can be applied to the prediction and mechanistic understanding of other atmospheric components.
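
To make the data flow of the framework concrete, the following is a minimal sketch of how a time series of environmental variables along one back trajectory could be passed through a single LSTM layer to give one prediction at the receptor site. It is an illustration under stated assumptions, not the authors' implementation: the weights are random and untrained, the three feature columns (precipitation, short-wave radiation, wind speed), the hidden size, and the function names `init_params` and `lstm_predict` are all hypothetical, and a practical model would be built and trained with a deep-learning library rather than plain Python.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_params(n_features, n_hidden, seed=0):
    """Random, untrained weights for one LSTM layer plus a linear readout."""
    rng = random.Random(seed)
    n_in = n_features + n_hidden  # each gate sees [x_t, h_prev]

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        # gates: i = input, f = forget, o = output, g = candidate cell state
        "W": {gate: mat(n_hidden, n_in) for gate in "ifog"},
        "b": {gate: [0.0] * n_hidden for gate in "ifog"},
        "w_out": [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)],
        "b_out": 0.0,
    }


def affine(W, b, z):
    """Matrix-vector product plus bias, on plain lists."""
    return [sum(w * v for w, v in zip(row, z)) + bi for row, bi in zip(W, b)]


def lstm_predict(trajectory, params):
    """Run the LSTM over one trajectory (oldest step first) and return a
    scalar read out from the final hidden state, i.e. at the receptor site."""
    W, b = params["W"], params["b"]
    n_hidden = len(params["w_out"])
    h = [0.0] * n_hidden  # hidden state
    c = [0.0] * n_hidden  # cell state (memory carried along the trajectory)
    for x_t in trajectory:
        z = list(x_t) + h
        i = [sigmoid(v) for v in affine(W["i"], b["i"], z)]
        f = [sigmoid(v) for v in affine(W["f"], b["f"], z)]
        o = [sigmoid(v) for v in affine(W["o"], b["o"], z)]
        g = [math.tanh(v) for v in affine(W["g"], b["g"], z)]
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return sum(w * hk for w, hk in zip(params["w_out"], h)) + params["b_out"]


# Hypothetical standardized features along one back trajectory, oldest first:
# [precipitation, short-wave radiation, wind speed] (made-up numbers).
trajectory = [
    [0.2, -1.0, 0.5],
    [1.5, -0.8, 0.3],
    [0.0, 0.4, -0.2],
    [-0.3, 1.1, 0.1],
    [-0.5, 0.9, 0.0],
]
params = init_params(n_features=3, n_hidden=4)
prediction = lstm_predict(trajectory, params)
print(prediction)  # untrained output; meaningful only after fitting to measured CCN
```

The point of the sketch is the input layout: each sample is a sequence ordered along the airmass history, so the cell state can accumulate the effect of upwind conditions (for example, precipitation encountered hours earlier) before the prediction is read out at the final step.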