Calibration of Low-Cost Particulate Matter (PM2.5) Sensors under West Africa Weather: A Year-Long Evaluation Integrating κ-Köhler Hygroscopic Correction and Machine Learning

JAMES NIMO, Yusif Ibrahim-Anyass, Mathias A. Borketey, Nancy Owusuaa, Nathaniel Obeng Anim, Emmanuel Banahene Asante, Selina Amoah, Esi Nerquaye-Tetteh, Victoria Owusu-Tawiah, Michael R. Giordano, Daniel Westervelt, Md. Aynul Bari, Raphael E Arku, Allison Felix Hughes, University at Albany, State University of New York

     Abstract Number: 291
     Working Group: Instrumentation and Methods

Abstract
With regulatory monitors scarce, air quality management in West Africa increasingly relies on low-cost PM2.5 sensors (LCS). Yet their accuracy is frequently compromised by regional meteorological extremes, notably high humidity and temperature, along with dust transported during the dry Harmattan season. This study addresses the need for robust LCS calibration under these conditions. We conducted a year-long (2024-2025) field evaluation of nine distinct commercially available LCS at the University of Ghana's Air Sensor Evaluation and Training Facility (Afri-SET) in Accra, Ghana. We systematically investigated how applying Köhler theory-based hygroscopic growth correction factors affects the assessment of sensor performance at low (20-60%) and high (60-99%) relative humidity in both the wet and dry seasons. Subsequently, four machine learning algorithms, Multiple Linear Regression (MLR), XGBoost, Random Forest (RF), and LightGBM (LGB), were developed and compared to derive sensor-specific calibration models. We also examined long-term sensor drift among the evaluated LCS brands and developed practical guidance for end-users on effective re-calibration strategies in West African contexts. Preliminary results indicate that the integrated approach, which combines hygroscopic pre-correction with machine learning models, particularly LGB, substantially improves LCS data accuracy. For example, the LGB model performed strongly across sensors, achieving coefficients of determination (R²) as high as 0.987, root mean square errors (RMSE) as low as 5.56 µg/m³, and mean absolute errors (MAE) as low as 3.36 µg/m³ under overall seasonal conditions. The ensemble methods (LGB, RF, and XGBoost) consistently outperformed MLR, underscoring the benefit of the integrated calibration strategy for data reliability, particularly under challenging and varied humidity conditions. These findings demonstrate a viable pathway for improving LCS data reliability in West Africa, which is crucial for strengthening air quality assessments and supporting evidence-based environmental policy.
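
As an illustration only, the hygroscopic pre-correction and the reported error metrics can be sketched in a few lines of Python. The abstract does not state which form of the correction was used; the sketch assumes the commonly cited κ-Köhler mass growth factor C = 1 + (κ/ρ) · a_w / (1 − a_w), with water activity a_w = RH/100. The values κ = 0.3 and particle density ρ = 1.65 g/cm³, the RH clamp at 99%, and all function names are hypothetical placeholders, not the study's settings.

```python
import math


def kohler_growth_factor(rh_percent, kappa=0.3, density=1.65):
    """Assumed kappa-Kohler mass growth factor C = 1 + (kappa/density) * a_w / (1 - a_w).

    kappa and density are illustrative defaults, not values from the study.
    """
    # Clamp RH to [0, 99] % so the water activity stays below 1 and C stays finite.
    a_w = min(max(rh_percent, 0.0), 99.0) / 100.0
    return 1.0 + (kappa / density) * a_w / (1.0 - a_w)


def correct_pm25(pm_raw, rh_percent, kappa=0.3, density=1.65):
    """Divide the raw sensor reading by the growth factor to estimate dry PM2.5."""
    return pm_raw / kohler_growth_factor(rh_percent, kappa, density)


def rmse(y_true, y_pred):
    """Root mean square error, in the units of the inputs."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def mae(y_true, y_pred):
    """Mean absolute error, in the units of the inputs."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

In a workflow of the kind described above, the corrected reading would typically be passed, together with relative humidity and temperature, as a feature to a regression model such as MLR or LGB, and the three metrics would be computed against the reference monitor on held-out data.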