10th International Aerosol Conference
September 2 - September 7, 2018
America's Center Convention Complex
St. Louis, Missouri, USA

Application of Boosted Regression Trees Technique to Analyse Particle Number Count Concentrations [PNC] at the East Coast of Malaysia

NOOR ZAITUN YAHAYA, Siew Moi Phang, Azizan Abu Samah, Intan Nabila Azman, Senior Lecturer, Universiti Malaysia Terengganu, Malaysia

     Abstract Number: 1116
     Working Group: Aerosol Modeling

Abstract
A study of Particle Number Count Concentrations ([PNC]) was conducted at the Institute of Ocean and Earth Sciences (IOES), University of Malaya, in Bachok, Kelantan, Malaysia, which is located approximately 100 m from the edge of the South China Sea on the east coast of Peninsular Malaysia.

The study aimed to understand particle patterns, their temporal variation, and the relationship between [PNC] and the meteorological factors (wind speed, wind direction, humidity, temperature and pressure) and gases that influence it in a coastal environment. One-minute data from 6 January to 5 July 2015 (n = 259,200) were collected at the IOES station. [PNC] data were obtained using a GRIMM Model EDM180 particle counter, which has 31 built-in channels and was located an estimated 20 m above ground level. [PNC] data were grouped into fine particles (FPNC) and coarse particles (CPNC), with diameters of 0.265-2.25 µm and 2.75-9.25 µm respectively. Statistical analysis was carried out using the R programming language and its packages.
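The grouping of channel counts into FPNC and CPNC can be sketched as follows. This is a minimal Python illustration, not the authors' R workflow: the function name `group_counts`, the example channel diameters and the example counts are all hypothetical, and only the two diameter ranges and the record count are taken from the abstract.

```python
# Diameter ranges in µm, as stated in the abstract.
FINE_RANGE = (0.265, 2.25)
COARSE_RANGE = (2.75, 9.25)


def group_counts(record):
    """Sum per-channel counts into fine (FPNC) and coarse (CPNC) totals.

    `record` maps a channel diameter in µm to its count per liter.
    Channels outside both ranges are ignored.
    """
    fpnc = sum(c for d, c in record.items()
               if FINE_RANGE[0] <= d <= FINE_RANGE[1])
    cpnc = sum(c for d, c in record.items()
               if COARSE_RANGE[0] <= d <= COARSE_RANGE[1])
    return fpnc, cpnc


# Illustrative values only, not measurements from the study.
# The real instrument reports 31 channels per one-minute record.
example_record = {0.265: 1200, 0.9: 300, 2.25: 40, 2.75: 6, 9.25: 1, 11.0: 2}
fpnc, cpnc = group_counts(example_record)
print("FPNC:", fpnc, "CPNC:", cpnc)

# Sanity check on the sample size: n = 259,200 one-minute records
# corresponds to a whole number of days of continuous data.
MINUTES_PER_DAY = 24 * 60
days_of_data = 259_200 // MINUTES_PER_DAY
print("days of one-minute data:", days_of_data)
```

Treating the range endpoints as inclusive is an assumption of this sketch; the abstract does not say how channels at the boundaries were assigned.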

FPNC was found to be the dominant particle class in this area compared with CPNC, with a maximum FPNC of 5,204,079 counts/liter against only 907 counts/liter for CPNC. A consistent pattern was observed throughout the period, in which the number of fine particles increased from early May to mid-June due to the South-West Monsoon. By contrast, particle numbers decreased during the North-East Monsoon (January to March). This shows that particle concentrations are influenced mostly by wind speed and by the direction from which the wind blows.

The Boosted Regression Trees (BRT) model was constructed from multiple regression tree models, and the best iteration of the BRT model was selected by optimizing prediction performance. The FPNC and CPNC models were developed with nine variables in three categories: time (time of day, Julian day), meteorological factors (humidity, temperature, pressure, wind speed, wind direction) and gases (SO2, NOx). For both [PNC] models, the BRT algorithm with learning rate lr = 0.001, tree complexity tc = 5 and number of trees nt = 10,000 achieved the minimum predictive error and was found to fit the data best. The R values for FPNC and CPNC were 0.87 (R2 = 0.75) and 0.85 (R2 = 0.72) respectively, which indicates good correlation between observed and modelled [PNC]. The FAC2 values for the two [PNC] models are 0.81 and 0.76, meaning that 81% and 76% of predictions fall within a factor of two of the observations (a modelled-to-observed ratio between 0.5 and 2). The COE values are 0.56 and 0.54 respectively, which shows that the models have a predictive advantage, as the values approach 1. The developed models were therefore within the acceptable range for predictive performance. The analysis demonstrates significant variations in FPNC, largely influenced by SO2 (64.12%) and prevailing wind direction (11.82%), with the physical strength index H-Index = 0.097. The other variables each show less than 10% influence on FPNC. In contrast to FPNC, CPNC is largely influenced by wind speed, Julian day and wind direction (29.84%, 22.46% and 22.17% respectively), followed by other parameters at 10% or less. For CPNC, the H-Index value for wind speed and wind direction is 0.219.
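The evaluation statistics quoted above can be illustrated with a short sketch. It is written in Python rather than the R packages the authors used, and it assumes the commonly used definitions: Pearson correlation for R, FAC2 as the fraction of predictions with a modelled-to-observed ratio between 0.5 and 2, and COE as one minus the ratio of summed absolute errors to summed absolute deviations of the observations from their mean. The function names and the example numbers are hypothetical and are not data from the study.

```python
import math


def pearson_r(obs, mod):
    """Pearson correlation coefficient between observed and modelled values."""
    n = len(obs)
    mean_o, mean_m = sum(obs) / n, sum(mod) / n
    cov = sum((o - mean_o) * (m - mean_m) for o, m in zip(obs, mod))
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    sd_m = math.sqrt(sum((m - mean_m) ** 2 for m in mod))
    return cov / (sd_o * sd_m)


def fac2(obs, mod):
    """Fraction of pairs with 0.5 <= mod/obs <= 2 (zero observations skipped)."""
    pairs = [(o, m) for o, m in zip(obs, mod) if o != 0]
    within = sum(1 for o, m in pairs if 0.5 <= m / o <= 2.0)
    return within / len(pairs)


def coe(obs, mod):
    """Coefficient of Efficiency: 1 - sum|mod - obs| / sum|obs - mean(obs)|."""
    mean_o = sum(obs) / len(obs)
    abs_err = sum(abs(m - o) for o, m in zip(obs, mod))
    abs_dev = sum(abs(o - mean_o) for o in obs)
    return 1.0 - abs_err / abs_dev


# Illustrative values only, not results from the study.
obs = [100, 200, 300, 400]
mod = [110, 190, 650, 390]

print("R    =", round(pearson_r(obs, mod), 3))
print("FAC2 =", fac2(obs, mod))   # three of four ratios lie in [0.5, 2]
print("COE  =", round(coe(obs, mod), 3))
```

Under these definitions a perfect model gives R = 1, FAC2 = 1 and COE = 1, while a COE of 0 means the model does no better than predicting the observed mean.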

The BRT model can identify the most influential variables and rank them by percentage, which is very useful for planners, designers and policy makers to take into consideration in their work. The boosted regression trees model has proven itself as a statistical tool for predicting particle concentrations in the coastal environment specifically and in other areas in general.
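How boosting yields a percentage ranking of variables can be sketched with a deliberately small example. This is not the authors' model: it is a pure-Python toy that boosts single-split trees (equivalent to a tree complexity of 1, not the tc = 5 used in the study), uses far fewer trees and a larger learning rate than lr = 0.001 and nt = 10,000, and runs on synthetic data. The names `fit_stump`, `boost` and the variable labels are hypothetical. Relative influence is taken here as each variable's share of the total squared-error reduction over all splits.

```python
import random


def fit_stump(X, residuals):
    """Find the single split that most reduces the squared error.

    Returns (gain, feature, threshold, left_mean, right_mean) or None.
    """
    n = len(X)
    total = sum(residuals)
    best = None
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for k in range(1, n):
            left_sum += residuals[order[k - 1]]
            lo, hi = X[order[k - 1]][j], X[order[k]][j]
            if lo == hi:
                continue  # cannot split between equal values
            right_sum = total - left_sum
            # Reduction in squared error from splitting into k and n - k points.
            gain = (left_sum ** 2 / k + right_sum ** 2 / (n - k)
                    - total ** 2 / n)
            if best is None or gain > best[0]:
                best = (gain, j, lo, left_sum / k, right_sum / (n - k))
    return best


def boost(X, y, n_trees=200, learning_rate=0.1):
    """Boost stumps on residuals; return intercept, stumps, influence in %."""
    n = len(y)
    f0 = sum(y) / n
    pred = [f0] * n
    influence = [0.0] * len(X[0])
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        gain, j, thr, left, right = stump
        influence[j] += gain
        stumps.append((j, thr, left, right))
        for i in range(n):
            # Shrink each tree's contribution by the learning rate.
            pred[i] += learning_rate * (left if X[i][j] <= thr else right)
    total = sum(influence)
    rel = [100.0 * v / total if total > 0 else 0.0 for v in influence]
    return f0, stumps, rel


# Synthetic data: the first variable drives the response strongly,
# the second weakly, and the third not at all.
random.seed(0)
names = ["driver_a", "driver_b", "irrelevant"]
X = [[random.random() for _ in names] for _ in range(200)]
y = [5.0 * x[0] + 1.0 * x[1] + random.gauss(0.0, 0.1) for x in X]

f0, stumps, rel = boost(X, y)
for name, pct in sorted(zip(names, rel), key=lambda t: -t[1]):
    print(f"{name}: {pct:.1f}%")
```

The percentages sum to 100, so they can be read in the same way as the influence figures reported for SO2, wind speed and the other predictors.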