Multiview Conformal Prediction (MVCP) Utilizes Infrared and Raman Spectra for Improved Atmospheric Microplastic Identification
REBECCA L. PARHAM, Eduardo Ochoa Rivera, Madeline E. Clough, Abbygail Ayala, Anne J. McNeil, Ambuj Tewari, Andrew P. Ault, University of Michigan
Abstract Number: 372
Working Group: Advancing Aerosol Science through Data Analysis Tools
Abstract
Spectroscopy techniques, such as infrared (IR) and Raman, are often used to analyze ambient atmospheric samples for microplastics (MPs). Unknown spectra are commonly identified by database matching, where the label with the highest similarity score is chosen if that score exceeds a set threshold. However, these scores have no statistical confidence associated with them, and most reported thresholds are arbitrary. Moreover, using both IR and Raman spectral inputs is recommended to improve the accuracy of MP identification because the two techniques are complementary. Multiview conformal prediction (MVCP) is a machine learning method that uses scores from multiple spectral inputs to return prediction sets containing labels at a user-defined theoretical confidence. Herein, MVCP is assessed for identifying MPs in laboratory-generated samples of aerosolized MP particle types and atmospherically relevant proxies. Spectra were collected with optical photothermal infrared and Raman (O-PTIR+Raman) spectroscopy, which utilizes a pump-probe system to collect IR and Raman spectra simultaneously for individual particles. The likeness between the collected spectra and an in-house reference library was calculated using the nearest neighbor similarity metric. A continuum of paired IR and Raman thresholds was then established to evaluate scores in a two-dimensional space. The sizes of the prediction sets returned by MVCP were compared with those of its single-view counterparts (i.e., only IR or only Raman), and the overall accuracy of MVCP was verified using the empirical confidence. The ability of MVCP to remain unaffected by low-quality spectra (due to noise, fluorescence, or oversaturation) was also assessed. Finally, the method was used to demonstrate MP identification in environmental matrices by analyzing a sample of ambient particles spiked with aerosolized MP particles. MVCP greatly improved the quality of MP identification, and the approach should be applied to spectral database matching more broadly.
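The idea of returning a prediction set at a user-defined confidence can be illustrated with a minimal split conformal sketch in Python. This is not the authors' implementation: the rule used to combine the two views (taking the larger of the IR and Raman dissimilarities, which corresponds to a square region in the two-dimensional score space), the function names, the polymer labels, and all similarity values below are assumptions made purely for illustration.

```python
import math


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the order statistic to use
    if k > n:
        return float("inf")  # too few calibration points: keep every label
    return sorted(scores)[k - 1]


def combined_score(ir_sim, raman_sim):
    """Hypothetical two-view nonconformity: the worse of the two dissimilarities."""
    return max(1.0 - ir_sim, 1.0 - raman_sim)


def prediction_set(ir_sims, raman_sims, q_hat):
    """Labels whose combined nonconformity is at or below the calibrated threshold."""
    return {
        label
        for label in ir_sims
        if combined_score(ir_sims[label], raman_sims[label]) <= q_hat
    }


# Invented calibration pairs: (IR similarity, Raman similarity) to the TRUE label.
calibration = [
    (0.95, 0.90), (0.88, 0.93), (0.97, 0.85),
    (0.80, 0.91), (0.92, 0.96), (0.90, 0.70),
    (0.99, 0.94), (0.86, 0.89), (0.93, 0.92),
]
alpha = 0.2  # target miscoverage, i.e. 80% theoretical confidence

cal_scores = [combined_score(ir, raman) for ir, raman in calibration]
q_hat = conformal_quantile(cal_scores, alpha)

# Invented similarities of one unknown particle to each library label.
unknown_ir = {"PE": 0.94, "PP": 0.85, "PS": 0.40}
unknown_raman = {"PE": 0.91, "PP": 0.83, "PS": 0.35}

labels = prediction_set(unknown_ir, unknown_raman, q_hat)
print(sorted(labels))
```

Under the usual exchangeability assumption, any fixed scoring rule calibrated this way yields sets that contain the true label with probability at least 1 - alpha. The choice of combination rule mainly affects how large the returned sets are, which is the quantity the abstract compares between the multiview and single-view cases.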