Abstract
Comprehensive two-dimensional gas chromatography (GC × GC) produces large and complex multidimensional data that requires computer software for visualization, processing, and analysis. For visualization, one-dimensional, time-ordered data from the detector(s) must be rasterized for the two chromatographic dimensions. Additionally, mass spectrometry (MS) data has a spectral dimension that requires indexing for access and visualization. Computer visualization techniques include one-dimensional graphs, two-dimensional images, and three-dimensional projections that can use colour and time dimensions to improve visual communication of the chromatographic and spectral features. Fundamental data processing includes data preprocessing, peak detection, and analyte identification. Two important data preprocessing steps are modulation-phase adjustment and detector-baseline correction. For GC × GC, peak detection requires delineating two-dimensional, unimodal regions (blobs) that are the chromatographic responses to analytes. Coeluted analytes may require unmixing or deconvolving blobs. Analyte identification typically involves recognizing patterns of retention times and/or spectra, e.g., by matching a predefined template representing the retention-time pattern and other characteristics of target peaks to detected peaks or by matching detected mass spectra to those in a spectral library. Higher-level analysis uses the metadata from detected analytes, e.g., compiled in a table, for tasks such as identification, classification, and regression. For example, a chemical fingerprint might be used to identify the source of an olive oil, to classify its quality, or to assess ripeness. Machine learning is an important area of pattern analysis research in which chromatographic assays of many samples from different individuals or classes are used to develop methods for identification, classification, and regression analysis.
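
The rasterization step mentioned above can be illustrated with a minimal sketch. It assumes a constant detector sampling rate and a modulation period that spans a whole number of samples. The function name `rasterize` and its parameters are hypothetical, chosen only for illustration, and do not come from any particular GC × GC software package.

```python
import numpy as np


def rasterize(signal, sampling_rate_hz, modulation_period_s):
    """Fold a 1D, time-ordered detector trace into a 2D array.

    Assumption: the sampling rate is constant and each modulation cycle
    covers a whole number of samples. Rows index second-dimension
    retention time; columns index first-dimension retention time
    (one column per modulation cycle).
    """
    samples_per_modulation = int(round(sampling_rate_hz * modulation_period_s))
    if samples_per_modulation < 1:
        raise ValueError("modulation period is shorter than one sample")

    signal = np.asarray(signal)
    n_modulations = len(signal) // samples_per_modulation

    # Drop any incomplete final modulation cycle.
    trimmed = signal[: n_modulations * samples_per_modulation]

    # Each row of the reshaped array is one modulation cycle; transpose so
    # that each cycle becomes a column of the image.
    return trimmed.reshape(n_modulations, samples_per_modulation).T


# Example: 10 samples at 1 Hz with a 3 s modulation period give a 3 x 3 image;
# the last, incomplete cycle (one sample) is discarded.
image = rasterize(np.arange(10), sampling_rate_hz=1.0, modulation_period_s=3.0)
```

In this layout, `image[i, j]` holds the detector response at second-dimension sample `i` within modulation cycle `j`. Modulation-phase adjustment would amount to shifting the starting sample before folding, which this sketch does not attempt.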