Targeted Data Extraction of the MS/MS Spectra Generated by Data-independent Acquisition: A New Concept for Consistent and Accurate Proteome Analysis

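Proteome, Tandem mass spectrometry, Chemistry, Analyte, Mass spectrometry, Sample preparation, Multiplexing, Computer science, Chromatography, Data acquisition, Analytical Chemistry (journal), Biological system, Bioinformatics, Biology, Biochemistry, Operating system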

Authors

Ludovic Gillet, Pedro Navarro, Stephen Tate, Hannes Röst, Nathalie Selevsek, Lukas Reiter, R. F. Bonner, Ruedi Aebersold

Source

Journal: Molecular & Cellular Proteomics [Elsevier]
Date: 2012-01-20 Volume (issue): 11 (6): O111.016717-O111.016717 Citations: 2583

Links

mcponline.org europepmc.org nih.gov doi.org

Identifiers

DOI: 10.1074/mcp.o111.016717

Abstract

Most proteomic studies use liquid chromatography coupled to tandem mass spectrometry to identify and quantify the peptides generated by the proteolysis of a biological sample. However, with the current methods it remains challenging to rapidly, consistently, reproducibly, accurately, and sensitively detect and quantify large fractions of proteomes across multiple samples. Here we present a new strategy that systematically queries sample sets for the presence and quantity of essentially any protein of interest. It consists of using the information available in fragment ion spectral libraries to mine the complete fragment ion maps generated using a data-independent acquisition method. For this study, the data were acquired on a fast, high resolution quadrupole-quadrupole time-of-flight (TOF) instrument by repeatedly cycling through 32 consecutive 25-Da precursor isolation windows (swaths). This SWATH MS acquisition setup generates, in a single sample injection, time-resolved fragment ion spectra for all the analytes detectable within the 400–1200 m/z precursor range and the user-defined retention time window. We show that suitable combinations of fragment ions extracted from these data sets are sufficiently specific to confidently identify query peptides over a dynamic range of 4 orders of magnitude, even if the precursors of the queried peptides are not detectable in the survey scans. We also show that queried peptides are quantified with a consistency and accuracy comparable with that of selected reaction monitoring, the gold standard proteomic quantification method. Moreover, targeted data extraction enables ad libitum quantification refinement and dynamic extension of protein probing by iterative re-mining of the once-and-forever acquired data sets. 
This combination of unbiased, broad range precursor ion fragmentation and targeted data extraction alleviates most constraints of present proteomic methods and should be equally applicable to the comprehensive analysis of other classes of analytes, beyond proteomics.

Abbreviations used: LC-MS/MS, liquid chromatography coupled to tandem mass spectrometry; DDA, data-dependent acquisition; DIA, data-independent acquisition; SRM, selected reaction monitoring; RT, retention time; LOD, limit of detection.

Liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) is considered the method of choice for the identification and quantification of proteins and proteomes (1. Aebersold R. Mann M. Mass spectrometry-based proteomics. Nature. 2003; 422: 198-207; 2. MacCoss M.J. Matthews D.L. Teaching a new dog old tricks. Anal. Chem. 2005; 77: 295A-302A; 3. Han X. Aslanian A. Yates 3rd, J.R. Mass spectrometry for proteomics. Curr. Opin. Chem. Biol. 2008; 12: 483-490; 4. Walther T.C. Mann M. Mass spectrometry-based proteomics in cell biology. J. Cell Biol. 2010; 190: 491-500) and for the analysis of metabolites, lipids, glycans, and many other types of (bio)molecules. For proteomics, two main LC-MS/MS strategies have been used thus far. They have in common that the sample proteins are converted by proteolysis into peptides, which are then separated by (capillary) liquid chromatography. They differ in the mass spectrometric method used.
The first and most widely used strategy is known as shotgun or discovery proteomics. For this method, the MS instrument is operated in data-dependent acquisition (DDA) mode, where fragment ion (MS2) spectra for selected precursor ions detectable in a survey (MS1) scan are generated (5. Domon B. Aebersold R. Mass spectrometry and protein analysis. Science. 2006; 312: 212-217). The resulting fragment ion spectra are then assigned to their corresponding peptide sequences by sequence database searching (6. Kapp E. Schutz F. Overview of tandem mass spectrometry (MS/MS) database search algorithms: Current Protocols in Protein Science. John Wiley & Sons, Inc, Hoboken, New Jersey, USA, 2007: 25.2.1-25.2.19; 7. Nesvizhskii A.I. Protein identification by tandem mass spectrometry and sequence database searching. Methods Mol. Biol. 2007; 367: 87-119).

The second main strategy is referred to as targeted proteomics. There, the MS instrument is operated in selected reaction monitoring (SRM) (also called multiple reaction monitoring) mode. With this method, a sample is queried for the presence and quantity of a limited set of peptides that have to be specified prior to data acquisition. SRM does not require the explicit detection of the targeted precursors but proceeds by the acquisition, sequentially across the LC retention time domain, of predefined pairs of precursor and product ion masses, called transitions, several of which constitute a definitive assay for the detection of a peptide in a complex sample (8. Lange V. Picotti P. Domon B. Aebersold R. Selected reaction monitoring for quantitative proteomics: a tutorial. Mol. Syst. Biol. 2008; 4 (222): 1-14). Data analysis in targeted proteomics essentially consists of computing the likelihood that a group of transition signal traces are derived from the targeted peptide (9. Reiter L. Rinner O. Picotti P. Hüttenhain R. Beck M. Brusniak M.Y. Hengartner M.O. Aebersold R. mProphet: Automated data processing and statistical validation for large-scale SRM experiments. Nat. Methods. 2011; 8: 430-435).

Both methods have different and largely complementary preferred uses and performance profiles that have been extensively discussed elsewhere (10. Domon B. Aebersold R. Options and considerations when selecting a quantitative proteomics strategy. Nat. Biotechnol. 2010; 28: 710-721). Specifically, shotgun proteomics is the method of choice for discovering the maximal number of proteins from one or a few samples. It does, however, have limited quantification capabilities on large sample sets because of stochastic and irreproducible precursor ion selection (11. Liu H. Sadygov R.G. Yates 3rd, J.R. A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Anal. Chem. 2004; 76: 4193-4201) and under-sampling (12. Michalski A. Cox J. Mann M. More than 100,000 detectable peptide species elute in single shotgun proteomics runs but the majority is inaccessible to data-dependent LC-MS/MS. J. Proteome Res. 2011; 10: 1785-1793). In contrast, targeted proteomics is well suited for the reproducible detection and accurate quantification of sets of specific proteins in many samples, as is the case in biomarker or systems biology studies (13. Addona T.A. Abbatiello S.E. Schilling B. Skates S.J. Mani D.R. Bunk D.M. Spiegelman C.H. Zimmerman L.J. Ham A.J. Keshishian H. Hall S.C. Allen S. Blackman R.K. Borchers C.H. Buck C. Cardasis H.L. Cusack M.P. Dodder N.G. Gibson B.W. Held J.M. Hiltke T. Jackson A. Johansen E.B. Kinsinger C.R. Li J. Mesri M. Neubert T.A. Niles R.K. Pulsipher T.C. Ransohoff D. Rodriguez H. Rudnick P.A. Smith D. Tabb D.L. Tegeler T.J. Variyath A.M. Vega-Montoto L.J. Wahlander A. Waldemarson S. Wang M. Whiteaker J.R. Zhao L. Anderson N.L. Fisher S.J. Liebler D.C. Paulovich A.G. Regnier F.E. Tempst P. Carr S.A. Multi-site assessment of the precision and reproducibility of multiple reaction monitoring-based measurements of proteins in plasma. Nat. Biotechnol. 2009; 27: 633-641; 14. Cima I. Schiess R. Wild P. Kaelin M. Schüffler P. Lange V. Picotti P. Ossola R. Templeton A. Schubert O. Fuchs T. Leippold T. Wyler S. Zehetner J. Jochum W. Buhmann J. Cerny T. Moch H. Gillessen S. Aebersold R. Krek W. Cancer genetics-guided discovery of serum biomarker signatures for diagnosis and prognosis of prostate cancer. Proc. Natl. Acad. Sci. U.S.A. 2011; 108: 3342-3347; 15. Picotti P. Bodenmiller B. Mueller L.N. Domon B. Aebersold R. Full dynamic range proteome analysis of S. cerevisiae by targeted proteomics. Cell. 2009; 138: 795-806). At present, however, the method is limited to the measurement of a few thousand transitions per LC-MS/MS run (16. Kiyonami R. Schoen A. Prakash A. Peterman S. Zabrouskov V. Picotti P. Aebersold R. Huhmer A. Domon B. Increased selectivity, analytical precision, and throughput in targeted proteomics. Mol. Cell. Proteomics. 2011; 10 (M110.002931)). It therefore lacks the throughput to routinely quantify large fractions of a proteome.

To alleviate the limitations of either method, strategies have been developed that rely on neither detection nor knowledge of the precursor ions to trigger acquisition of fragment ion spectra. Those methods operate via unbiased “data-independent acquisition” (DIA), in the cyclic recording, throughout the LC time range, of consecutive survey scans and fragment ion spectra for all the precursors contained in predetermined isolation windows. Various implementations of DIA methods have already been described using isolation windows of various widths, ranging from the complete m/z range to a few daltons (17. Purvine S. Eppel J.T. Yi E.C. Goodlett D.R. Shotgun collision-induced dissociation of peptides using a time of flight mass analyzer. Proteomics. 2003; 3: 847-850; 18. Venable J.D. Dong M.Q. Wohlschlegel J. Dillin A. Yates J.R. Automated approach for quantitative analysis of complex peptide mixtures from tandem mass spectra. Nat. Methods. 2004; 1: 39-45; 19. Plumb R.S. Johnson K.A. Rainville P. Smith B.W. Wilson I.D. Castro-Perez J.M. Nicholson J.K. UPLC/MS(E): A new approach for generating molecular fragment information for biomarker structure elucidation. Rapid Commun. Mass Spectrom. 2006; 20: 1989-1994; 20. Panchaud A. Scherl A. Shaffer S.A. von Haller P.D. Kulasekara H.D. Miller S.I. Goodlett D.R. Precursor acquisition independent from ion count: How to dive deeper into the proteomics ocean. Anal. Chem. 2009; 81: 6481-6488; 21. Geiger T. Cox J. Mann M. Proteomics on an Orbitrap benchtop mass spectrometer using all ion fragmentation. Mol. Cell. Proteomics. 2010; 9: 2252-2261; 22. Bern M. Finney G. Hoopmann M.R. Merrihew G. Toth M.J. MacCoss M.J. Deconvolution of mixture spectra from ion-trap data-independent-acquisition tandem mass spectrometry. Anal. Chem. 2010; 82: 833-841; 23. Carvalho P.C. Han X. Xu T. Cociorva D. Carvalho Mda G. Barbosa V.C. Yates 3rd, J.R. XDIA: Improving on the label-free data-independent analysis. Bioinformatics. 2010; 26: 847-848; 24. Panchaud A. Jung S. Shaffer S.A. Aitchison J.D. Goodlett D.R. Faster, quantitative, and accurate precursor acquisition independent from ion count. Anal. Chem. 2011; 83: 2250-2257) (Table I). Using such scans, the link between the fragment ions and the precursors from which they originate is lost, complicating the analysis of the acquired data sets. Also, large selection window widths increase the number of concurrently fragmented precursors and therefore the complexity of the acquired composite fragment ion spectra. To date, the composite spectra generated by DIA methods have been principally analyzed with the standard database searching tools developed for DDA, either by searching the composite MS2 spectra directly (18, 20) or by searching pseudo MS2 spectra reconstituted postacquisition based on the co-elution profiles of precursor ions (from the survey scans) and of their potentially corresponding fragment ions (22; 25. Wong J.W. Schwahn A.B. Downard K.M. ETISEQ: An algorithm for automated elution time ion sequencing of concurrently fragmented peptides for mass spectrometry-based proteomics. BMC Bioinformatics. 2009; 10 (244): 1-10; 26. Geromanos S.J. Vissers J.P. Silva J.C. Dorschel C.A. Li G.Z. Gorenstein M.V. Bateman R.H. Langridge J.I. The detection, correlation, and comparison of peptide precursor and product ions from data independent LC-MS with data dependant LC-MS/MS. Proteomics. 2009; 9: 1683-1695; 27. Li G.Z. Vissers J.P. Silva J.C. Golick D. Gorenstein M.V. Geromanos S.J. Database searching and accounting of multiplexed precursor and product ion spectra from the data independent analysis of simple and complex peptide mixtures. Proteomics. 2009; 9: 1696-1719; 28. Blackburn K. Mbeunkui F. Mitra S.K. Mentzel T. Goshe M.B. Improving protein and proteome coverage through data-independent multiplexed peptide fragmentation. J. Proteome Res. 2010; 9: 3621-3637).

Table I. LC time-resolved data-independent acquisition setups: description and current performance profiles

Here, we report an alternative approach to proteome quantification that combines a high specificity DIA method with a novel targeted data extraction strategy to mine the resulting fragment ion data sets. For the data acquisition, we implement the sequential isolation window acquisition principle introduced by former DIA studies (18, 20) on a high resolution MS instrument. This time- and mass-segmented acquisition method generates, in a single injection, fragment ion spectra of all precursor ions within a user-defined precursor RT and m/z space and records the ensemble of these fragment ion spectra as complex fragment ion maps. Using computer simulations we show that the resulting maps achieve the highest fragment ion specificity of any DIA method described to date. We term this acquisition strategy “SWATH MS,” in reference to the swaths, the term used conceptually to designate the series of isolation windows acquired for a given precursor mass range across the LC.

To analyze the high specificity, multiplexed data sets generated by SWATH MS, we developed a novel data analysis strategy that fundamentally differs from the database search approaches used so far to identify peptides from DIA data sets. It consists of using a targeted data extraction strategy to query the acquired fragment ion maps for the presence and quantity of specific peptides of interest, using a priori information contained in spectral libraries. Practically, the fragment ion signals, their relative intensities, chromatographic concurrence, and other information accessible from a spectral library for each targeted peptide are used to mine the DIA fragment ion maps for constellations of signals that precisely correlate with the known coordinates of a targeted peptide, thus uniquely identifying the peptide in the map. The extraction of fragment ion traces from data-independently acquired sample sets has been reported for the quantification of formerly identified peptides (18); however, this strategy has never been purposely used to systematically search and identify peptides from the fragment ion maps of DIA data sets. Indeed, it is only with the increasing availability of proteome-wide spectral libraries that this targeted data extraction strategy becomes largely applicable to mine the acquired data sets for peptides never identified thus far with regular shotgun proteomics approaches. We show that the combination of high specificity fragment ion maps and targeted data analysis using information from spectral libraries of complete organisms offers unprecedented possibilities for the qualitative and quantitative probing of proteomes. This approach should be applicable beyond proteomics to other “omics” measurements, including metabolomics and lipidomics, or to forensics or biomedical analytics fields, which require accurate quantitative analysis of as many analytes as possible from a single LC-MS/MS sample injection.

A commercial 5600 TripleTOF™ (ABSciex, Concord, Canada) was used for all the experiments. The instrument was coupled with an Eksigent 1D+ Nano LC system (Eksigent, Dublin, CA) for the stable isotope dilution experiments or with an Eksigent NanoLC-2DPlus with nanoFlex cHiPLC system for the diauxic shift sample acquisition. The same solvents were used on both LC systems, with solvent A being composed of 0.1% (v/v) formic acid in water and solvent B comprising 95% (v/v) acetonitrile with 0.1% (v/v) formic acid. The serial dilution experiments were performed with a customer-packed emitter, which was created using a laser puller to an orifice of 4 μm and packed with 3-μm Zorbax C18 material using a pressure bomb. The samples were loaded directly onto this column from the nano LC system at a flow rate of 500 nl·min−1.
The loaded material was eluted from this column in a linear gradient of 5% solvent B to 30% solvent B over 90 min. The column was regenerated by washing at 90% solvent B for 10 min and re-equilibrated at 5% solvent B for 10 min. The diauxic shift sample acquisitions were performed using a “trap and elute” configuration on the nanoFlex system. The trap column (200 μm × 0.5 mm) and the analytical column (75 μm × 15 cm) were packed with 3 μm ChromXP C18 medium. The samples were loaded at a flow rate of 2 μl·min−1 for 10 min and eluted from the analytical column at a flow rate of 300 nl·min−1 in a linear gradient of 5% solvent B to 35% solvent B in 155 min. The column was regenerated by washing at 80% solvent B for 10 min and re-equilibrated at 5% solvent B for 10 min.

For standard data-dependent analysis experiments, the mass spectrometer was operated such that a 250-ms survey scan (TOF-MS) was collected, from which the top 20 ions were selected for automated MS/MS in subsequent experiments, where each MS/MS event consisted of a 50-ms scan. The selection criteria for parent ions were an intensity greater than 150 counts/s, a charge state greater than 1+, and absence from the dynamic exclusion list. Once an ion had been fragmented by MS/MS, its mass and isotopes were excluded for a period of 15 s. Ions were isolated using a quadrupole resolution of 0.7 Da and fragmented in the collision cell using collision energy ramped from 15 to 45 eV within the 50-ms accumulation time. In the instances where fewer than 20 parent ions met the selection criteria, those ions that did were subjected to longer accumulation times to maintain a constant total cycle time of 1.25 s.

For SWATH MS-based experiments, the mass spectrometer was operated in a looped product ion mode. In this mode, the instrument was specifically tuned to allow a quadrupole resolution of 25 Da/mass selection.
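For comparison with the SWATH MS mode, the data-dependent selection rules stated above (top 20 ions, more than 150 counts/s, charge above 1+, 15-s dynamic exclusion, 250-ms survey scan plus 50-ms MS/MS scans) can be sketched as follows. This is only an illustrative sketch, not the instrument's acquisition software: the `Precursor` type, the `mz_tol` matching tolerance for the exclusion list, and the even split of the remaining time in `msms_accumulation_s` are assumptions introduced here.

```python
from dataclasses import dataclass

SURVEY_SCAN_S = 0.250      # survey (TOF-MS) scan, from the text
MSMS_SCAN_S = 0.050        # one MS/MS event, from the text
TOP_N = 20                 # precursors selected per cycle, from the text
MIN_INTENSITY_CPS = 150.0  # intensity threshold in counts/s, from the text
EXCLUSION_S = 15.0         # dynamic exclusion period, from the text


@dataclass(frozen=True)
class Precursor:           # hypothetical container for a survey-scan peak
    mz: float
    charge: int
    intensity: float


def select_precursors(survey, now_s, exclusion, mz_tol=0.05):
    """Return (picked precursors, updated exclusion list).

    `exclusion` maps an excluded m/z to its expiry time in seconds.
    `mz_tol` is an assumed matching tolerance, not a value from the text.
    """
    # Forget exclusions that have expired.
    active = {mz: expiry for mz, expiry in exclusion.items() if expiry > now_s}
    eligible = [
        p for p in survey
        if p.intensity > MIN_INTENSITY_CPS
        and p.charge > 1
        and not any(abs(p.mz - mz) <= mz_tol for mz in active)
    ]
    picked = sorted(eligible, key=lambda p: p.intensity, reverse=True)[:TOP_N]
    for p in picked:
        active[p.mz] = now_s + EXCLUSION_S
    return picked, active


def cycle_time_s(n_msms=TOP_N):
    """Cycle time for one survey scan followed by n_msms MS/MS scans."""
    return SURVEY_SCAN_S + n_msms * MSMS_SCAN_S


def msms_accumulation_s(n_selected):
    """Assumed even split of the MS/MS time budget when fewer than TOP_N
    precursors qualify, keeping the total cycle time constant."""
    return (cycle_time_s() - SURVEY_SCAN_S) / n_selected
```

With 20 qualifying precursors, `cycle_time_s()` reproduces the 1.25-s cycle quoted in the text; with, say, 10 qualifying precursors the assumed split gives each MS/MS event 100 ms.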
In SWATH MS mode, the stability of the mass selection was maintained by operating the radio frequency (RF) and direct current (DC) voltages on the isolation quadrupole independently. Using an isolation width of 26 Da (25 Da of optimal ion transmission efficiency + 1 Da for the window overlap), a set of 32 overlapping windows was constructed covering the mass range 400–1200 Da. Consecutive swaths need to be acquired with some precursor isolation window overlap to ensure the transfer of the complete isotopic pattern of any given precursor ion in at least one isolation window and thereby to maintain optimal correlation between parent and fragment isotope peaks at any LC time point (supplemental Fig. S1, a–f). This overlap was reduced to a minimum of 1 Da, which experimentally matched the almost square shape of the fragment ion transmission profile achieved through the specific quadrupole tuning developed for SWATH MS (supplemental Fig. S1, g and h). The window setups used for these runs were as follows: Experiment 1: MS1 scan (see below); Experiment 2: 400–426; Experiment 3: 425–451… Experiment 33: 1175–1201. Those isolation windows of 26-Da width (25 Da + 1 Da) are the “nominal” windows used to compute the RF/DC voltages that drive the isolation quadrupole during the acquisition. However, because the isolation windows are only “almost square shapes” (supplemental Fig. S1, g and h), ∼0.3–0.5 Da of ion transmission can be estimated as being “lost” on either side of the windows. The “100% efficient” transmission of precursor ions therefore happens over only 25 Da. In other words, the “effective” isolation windows can be considered as being 400.5–425.5, 425.5–450.5, etc. (plus the potential overlap left from the nominal window transmission). The collision energy for each window was determined based on the appropriate collision energy for a 2+ ion centered upon the window with a spread of 15 eV.
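The window layout described above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only; the function names are hypothetical, and `edge_loss=0.5` simply takes the upper end of the ∼0.3–0.5 Da transmission loss quoted in the text so that the result matches the stated effective windows.

```python
def nominal_windows(start=400.0, stop=1200.0, width=25.0, overlap=1.0):
    """Nominal isolation windows: 25-Da steps, each widened by 1 Da of overlap."""
    n = round((stop - start) / width)
    return [(start + i * width, start + (i + 1) * width + overlap)
            for i in range(n)]


def effective_windows(nominal, edge_loss=0.5):
    """Effective windows after trimming the assumed edge loss on both sides."""
    return [(lo + edge_loss, hi - edge_loss) for lo, hi in nominal]


def windows_containing(mz, windows):
    """Indices of the windows that include a given precursor m/z."""
    return [i for i, (lo, hi) in enumerate(windows) if lo <= mz <= hi]


if __name__ == "__main__":
    nominal = nominal_windows()
    for i, (lo, hi) in enumerate(nominal[:3]):
        # Experiment 1 is the MS1 scan, so the first swath is Experiment 2.
        print(f"Experiment {i + 2}: {lo:g}-{hi:g}")
```

A precursor at m/z 425.3 falls inside two nominal windows (400–426 and 425–451) but only one effective window (400.5–425.5), which illustrates why the 1-Da overlap is needed at the window edges.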
This collision energy setting ensured optimal fragmentation for the broad range of precursors co-selected within the isolation windows. An accumulation time of 100 ms was used for each fragment ion scan and for the (optional) survey scans acquired at the beginning of each cycle. This results in a total duty cycle of 3.3 s (3.2 s total for stepping through the 32 isolation windows + 0.1 s for the optional survey scan). The mass resolution was between 15,000 and 30,000 for the MS/MS scans, depending on the mode used to record the SWATH MS data sets (high sensitivity or high resolution). For this study, the high sensitivity mode was used, which still allows accurate extraction of the fragment ion masses at 10–50 ppm accuracy (optimal extraction for the area under curve of the MS/MS profile signals at half peak width).

To generate the background for our simulations, the Saccharomyces cerevisiae protein sequences were downloaded from ensembl.org (release 57_1j). The peptide set resulting from trypsin proteolysis (no missed cleavages) was generated in silico using carbamidomethyl cysteine as fixed modification. We then selected the peptides with theoretical precursor ion charge states 2+ and 3+ and with the monoisotopic and the first 13C isotopic masses (+0 and +1 Da) within the mass range of 400 to 1,200 m/z. For each of those precursor ions, the theoretical set of fragment ions was generated (all b and y ions of charge 1+ and 2+), giving rise to transition pairs. This data set contained 111,880 peptides (corresponding to 6,557 proteins) resulting in 194,314 doubly and triply charged precursors (388,781 overall, taking into account the monoisotope and first 13C isotope) and in 10,004,504 transitions altogether, which thus constituted the background of our simulations. We also prepared a reduced data set that only contained the precursors of peptides that were reported in the PeptideAtlas (Yeast PeptideAtlas 200904 build, also containing the MS-identified modifications and nontryptic peptides).
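The in silico steps described above (tryptic digestion without missed cleavages, carbamidomethyl cysteine, 2+ and 3+ precursors at +0 and +1 13C within 400–1,200 m/z, and b/y fragments at 1+ and 2+) can be sketched as below. This is a minimal illustration, not the authors' pipeline. Several choices are assumptions made here: the cleavage rule (after K or R unless followed by P), filtering each precursor isotope individually against the m/z range, and reusing the monoisotopic fragment masses for the 13C precursor.

```python
import re

# Monoisotopic residue masses in Da; C includes carbamidomethylation (+57.02146).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 160.03065, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276
C13_SPACING = 1.003355


def digest(protein):
    """Tryptic peptides with no missed cleavages (assumed rule: after K/R, not before P).

    Splitting on a zero-width pattern requires Python 3.7 or later.
    """
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def residue_sum(residues):
    return sum(MONO[aa] for aa in residues)


def to_mz(neutral_mass, charge):
    return (neutral_mass + charge * PROTON) / charge


def precursor_mz(peptide, charge, isotope=0):
    return to_mz(residue_sum(peptide) + WATER + isotope * C13_SPACING, charge)


def fragments(peptide, charges=(1, 2)):
    """All b and y ions as (label, charge, m/z) tuples."""
    out = []
    for i in range(1, len(peptide)):
        b_neutral = residue_sum(peptide[:i])
        y_neutral = residue_sum(peptide[i:]) + WATER
        for z in charges:
            out.append((f"b{i}", z, to_mz(b_neutral, z)))
            out.append((f"y{len(peptide) - i}", z, to_mz(y_neutral, z)))
    return out


def transitions(peptide, lo=400.0, hi=1200.0):
    """(Q1, fragment label, fragment charge, Q3) for in-range 2+/3+ precursors."""
    result = []
    for z in (2, 3):
        for isotope in (0, 1):
            q1 = precursor_mz(peptide, z, isotope)
            if lo <= q1 <= hi:
                # Simplification: fragment m/z values are not isotope-shifted.
                result.extend((q1, label, fz, q3)
                              for label, fz, q3 in fragments(peptide))
    return result
```

For a peptide of length n this yields 4(n − 1) fragments per precursor, which is how a few hundred thousand precursors expand into millions of transitions in the background sets described in the text.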
The reduced data set contained 48,087 peptides (corresponding to 3,898 proteins), resulting in 93,875 doubly and triply charged precursors (187,777 overall, taking into account the monoisotope and first 13C isotope) and in 5,476,964 transitions altogether, which we used as a more realistic proteomic background. The retention times of the peptides were computed using the SSRCalc algorithm (50).

To estimate the number of SRM interferences, we generated in silico query assays for all proteotypic peptides of the yeast genome as targets. We considered all singly charged b and y transitions of the monoisotopic 2+ precursor of those peptides as targets, ran them against the computed backgrounds (theoretical yeast digest or PeptideAtlas), and recorded an interference whenever a transition from the background (that did not belong to the query peptide) was within a specified distance of Q1, Q3, and RT from the target queried peptide. For each target peptide, the number of transitions that were interfered with was recorded and later used to compute the statistics. The detailed algorithm for the computation of the product ion interferences will be the subject of a separate study (51. Rost H.L. Malmstrom L. Aebersold R. A computational tool to detect and avoid redundancy in selected reaction monitoring. Mol. Cell. Proteomics. 2012; mcp.M111.013045, first published on April 24, 2012, 10.1074/mcp.M111.013045). This algorithm essentially expands on the principle of the “unique ion signatures” described by Sherman et al. (29. Sherman J. McKay M.J. Ashman K. Molloy M.P. Unique ion signature mass spectrometry, a deterministic method to assign peptide identity. Mol. Cell. Proteomics. 2009; 8: 2051-2062) by taking into account peptide RT as an additional constraint for the calculation of fragment ion interferences.
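The interference criterion stated above (a background transition from a different peptide lying within given Q1, Q3, and RT distances of a target transition) can be illustrated with a brute-force check. This is not the algorithm of reference 51, whose details are not given here; the `Transition` type and the tolerance values in the usage note are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:      # hypothetical record for one precursor/fragment pair
    peptide: str
    q1: float          # precursor m/z
    q3: float          # fragment m/z
    rt: float          # predicted retention time


def is_interfered(target, background, q1_tol, q3_tol, rt_tol):
    """True if any background transition of another peptide is within all tolerances."""
    return any(
        b.peptide != target.peptide
        and abs(b.q1 - target.q1) <= q1_tol
        and abs(b.q3 - target.q3) <= q3_tol
        and abs(b.rt - target.rt) <= rt_tol
        for b in background
    )


def count_interfered(targets, background, q1_tol, q3_tol, rt_tol):
    """Number of target transitions with at least one interference."""
    return sum(
        is_interfered(t, background, q1_tol, q3_tol, rt_tol) for t in targets
    )
```

Calling `count_interfered` with a narrow Q1 and wide Q3 tolerance mimics an SRM-like scenario, whereas a wide Q1 tolerance (for example half a swath) with a narrow Q3 tolerance mimics a SWATH-like scenario. The scan is quadratic in the number of transitions, so a background of millions of transitions would need an index sorted on Q1 or Q3 rather than this direct comparison.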
It should be acknowledged that the interference algorithm used for these simulations does not simulate the peptide signal intensities. Even if, in theory, the different MS response factors of those peptides could be retrieved from, for example, the PeptideAtlas database, it is unlikely that those response factors can be extrapolated from one sample to another or from one study to another, because of ion suppression effects during the ionization. However, it was not the aim of these theoretical simulations to perfectly depict the reality, but rather to give an impression of the overall ranking of the different Q1/Q3 scenarios. In this respect, the simulations are valid because upon increasing the background for the fragment ion simulations (from 93,875 to 194,314 precursors), the overall ranking of the scenarios is maintained. This means that those simulations may not capture the exact reality but can be perfectly used a
