Poster in Workshop: AI for Accelerated Materials Design (AI4Mat-2023)
High throughput decomposition of spectra
Dumitru Mirauta · Vladimir Gusev
Keywords: [ spectroscopic data ] [ optimal transport ] [ unmixing ] [ basis variation ] [ decomposition ]
To fully utilise the potential throughput of automated synthesis and characterisation, data analysis must have matching throughput; at present it consumes excessive (human) expert time even for small datasets. One such analysis task is unmixing: separating, from a sample consisting of multiple components, the individual patterns characteristic of its constituent parts. Doing so quickly and reliably is important both for identifying samples containing unknown materials in large parallel batches (e.g. spray deposition) and for autonomous/closed-loop refinement (e.g. flow synthesis). The problem can be akin to finding a needle in a haystack: only a minuscule proportion of the many samples accessed by automated synthesis contain any of the unknown material, and then only in a small amount by mass, which may not even be proportionally reflected in the spectra. Even if the patterns corresponding to each chemical component are known ahead of time, they are not trivial to separate, as they change from sample to sample (e.g. peak shifting) due to small modifications in the component produced or in the processing conditions. Conventional approaches can be narrowly applicable (unless severely modified or retrained) and suffer from excessive local minima. We propose instead a non-parametric approach based on exact optimal transport (OT), which allows for arbitrary variation through flexible patterns and has better-defined local minima.
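
As a rough illustration of why an OT-based comparison may behave better under peak shifting, the sketch below compares a pointwise (L2) distance with the exact 1D Wasserstein-1 distance for a single shifted peak. It is not the authors' method: the Gaussian peak shape, the grid, and the helper names (`gaussian_peak`, `wasserstein_1d`, `l2_distance`) are assumptions made purely for illustration, and the closed form used for the OT cost holds only for normalised 1D signals on a shared uniform grid.

```python
import numpy as np

def gaussian_peak(x, centre, width=0.5):
    """Hypothetical single-peak pattern, normalised to unit total mass."""
    y = np.exp(-0.5 * ((x - centre) / width) ** 2)
    return y / y.sum()

def wasserstein_1d(p, q, dx):
    """Exact 1D Wasserstein-1 cost between two unit-mass signals on a
    shared uniform grid: the area between their cumulative sums."""
    return np.sum(np.abs(np.cumsum(p) - np.cumsum(q))) * dx

def l2_distance(p, q):
    """Pointwise Euclidean distance, which ignores how far mass has moved."""
    return np.sqrt(np.sum((p - q) ** 2))

# Assumed measurement axis (arbitrary units) and a reference peak at 30.
x = np.linspace(0.0, 100.0, 2001)
dx = x[1] - x[0]
reference = gaussian_peak(x, 30.0)

# Shift the same peak by increasing amounts and compare both distances.
shifts = [5.0, 10.0, 20.0, 40.0]
w1 = [wasserstein_1d(reference, gaussian_peak(x, 30.0 + s), dx) for s in shifts]
l2 = [l2_distance(reference, gaussian_peak(x, 30.0 + s)) for s in shifts]

for s, w, d in zip(shifts, w1, l2):
    print(f"shift={s:5.1f}  W1={w:7.3f}  L2={d:.4f}")
```

In this toy setting the OT cost grows in proportion to the shift, whereas the L2 distance stays essentially flat once the peaks no longer overlap and so gives an optimiser no indication of which direction reduces the mismatch.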