Untargeted metabolomics plays a crucial role in precision medicine and biomarker discovery. However, compound identification remains challenging because existing spectral reference libraries are incomplete. To address this issue, a research team from the Federal Institute for Materials Research and Testing (BAM) in Germany and Freie Universität Berlin jointly developed FIORA, an open-source graph neural network (GNN) that simulates fragmentation in tandem mass spectrometry (MS/MS) to improve the accuracy of compound identification.
At its core, FIORA uses the local neighborhood of each bond within a molecule to learn fragmentation patterns and predict the probabilities of fragment ions. Compared with existing fragmentation algorithms such as ICEBERG and CFM-ID, FIORA predicts mass spectra more accurately and can also predict additional features such as retention time (RT) and collision cross section (CCS). The research was published in Nature Communications on March 7, 2025.
Running on high-performance GPUs, FIORA can rapidly validate putative compound annotations and substantially expand spectral reference libraries with high-quality predicted spectra. This is important for untargeted metabolomics research, particularly for the analysis of unknown compounds, where progress over the past decade has been slow because high-quality reference spectra are scarce. For example, computational methods achieved a recall of only 34% in the 2016 CASMI challenge and less than 30% in the 2022 challenge, which underscores the need for new approaches.
What sets FIORA apart is that it assesses each bond dissociation event independently, based on the local structure of the compound around that bond. This simulates the physical fragmentation process in mass spectrometry more directly than many existing algorithms do. FIORA also generalizes well: it performs strongly not only on compounds similar to those it has seen but also on unfamiliar structures.
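The idea of treating each bond break as a separate event can be illustrated with a toy example. The sketch below is not FIORA's implementation: it assumes a hypothetical helper, `fragments_after_break`, and a hand-written three-atom graph for ethanol (hydrogens folded into the heavy atoms). It only enumerates the neutral fragment masses that result from breaking one bond at a time; a model such as FIORA would additionally assign each candidate a probability, which is not shown here.

```python
from collections import defaultdict

def fragments_after_break(atom_mass, bonds, broken):
    """Return sorted masses of the connected components left after removing one bond.

    Hypothetical helper for illustration only. Masses are neutral fragment
    masses, not the m/z values of charged ions.
    """
    adjacency = defaultdict(set)
    for a, b in bonds:
        if (a, b) != broken:  # skip the bond that is being broken
            adjacency[a].add(b)
            adjacency[b].add(a)

    seen, masses = set(), []
    for start in atom_mass:
        if start in seen:
            continue
        # Depth-first search collects one connected component (one fragment).
        stack, total = [start], 0.0
        seen.add(start)
        while stack:
            node = stack.pop()
            total += atom_mass[node]
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        masses.append(round(total, 4))
    return sorted(masses)

# Toy molecule: ethanol as CH3-CH2-OH, monoisotopic masses per group.
atom_mass = {0: 15.023475, 1: 14.01565, 2: 17.00274}
bonds = [(0, 1), (1, 2)]

# Each bond is considered on its own, one break at a time.
candidates = {bond: fragments_after_break(atom_mass, bonds, bond) for bond in bonds}
for bond, masses in candidates.items():
    print(bond, masses)
```

In a ring, removing a single bond leaves the molecule in one piece, so this toy enumeration would return a single mass for such bonds.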
To evaluate its effectiveness, the team tested FIORA on multiple datasets. The median similarity between its predicted spectra and the reference spectra exceeded 0.8, and in some cases FIORA outperformed competing algorithms by 10% to 49%. In addition, FIORA's modular design allows it to be adapted flexibly to different prediction targets.
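To make the similarity figure concrete, the following minimal sketch compares a predicted spectrum with a reference spectrum. It assumes cosine similarity on binned peaks, a common choice for spectral comparison; the article does not state which metric or binning was used, and the function names, bin width, and example peaks are all illustrative.

```python
import math
from statistics import median

def bin_spectrum(peaks, bin_width=0.01):
    """Sum intensities of (m/z, intensity) peaks that fall into the same m/z bin."""
    binned = {}
    for mz, intensity in peaks:
        key = round(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned

def cosine_similarity(predicted, reference, bin_width=0.01):
    """Cosine similarity between two spectra; 1.0 means identical relative intensities."""
    a = bin_spectrum(predicted, bin_width)
    b = bin_spectrum(reference, bin_width)
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Made-up peaks for illustration; not data from the study.
predicted = [(29.04, 1.0), (31.02, 0.5)]
reference = [(29.04, 0.8), (45.03, 0.6)]

score = cosine_similarity(predicted, reference)
print(round(score, 3))

# A benchmark would report the median score over many spectrum pairs.
scores = [score, cosine_similarity(reference, reference)]
print(round(median(scores), 3))
```

Rounding to fixed-width bins is a simplification: two peaks that lie very close together but on opposite sides of a bin edge are treated as non-matching, which tolerance-based peak matching would avoid.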
The introduction of FIORA not only fills a gap in mass spectrometry analysis but also provides a powerful tool for future compound identification and research.