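Mascot
False positive paradox
Computer science
Tandem mass spectrometry
Identification (biology)
Context (archaeology)
Sequence database
Tandem mass tag
Data mining
Database search engine
Proteomics
Peptide sequence
Sequence (biology)
Protein sequencing
Mass spectrometry
Search engine
Chemistry
Artificial intelligence
Information retrieval
Quantitative proteomics
Chromatography
Biology
Paleontology
Gene
Law
Botany
Biochemistry
Political science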
Authors
David N. Perkins, Darryl Pappin, David M. Creasy, John S. Cottrell
Source
Journal: Electrophoresis
[Wiley]
Date: 1999-12-01
Volume (issue): pages: 20 (18): 3551-3567
Citations: 7725
Identifier
DOI:10.1002/(sici)1522-2683(19991201)20:18<3551::aid-elps3551>3.0.co;2-2
Abstract
Several algorithms have been described in the literature for protein identification by searching a sequence database using mass spectrometry data. In some approaches, the experimental data are peptide molecular weights from the digestion of a protein by an enzyme. Other approaches use tandem mass spectrometry (MS/MS) data from one or more peptides. Still others combine mass data with amino acid sequence data. We present results from a new computer program, Mascot, which integrates all three types of search. The scoring algorithm is probability based, which has a number of advantages: (i) A simple rule can be used to judge whether a result is significant or not. This is particularly useful in guarding against false positives. (ii) Scores can be compared with those from other types of search, such as sequence homology. (iii) Search parameters can be readily optimised by iteration. The strengths and limitations of probability-based scoring are discussed, particularly in the context of high throughput, fully automated protein identification.
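The "simple rule" for significance mentioned in the abstract can be illustrated with a short sketch. The abstract itself gives no formula, so everything below is an assumption for illustration: the score is taken to be -10·log10(P), where P is the probability that a match is a chance event, and the threshold is derived from a chosen false positive rate `alpha` and the number of candidate entries searched. The function names are hypothetical and are not part of Mascot.

```python
import math


def score_from_probability(p: float) -> float:
    """Convert a chance-match probability into a -10*log10(p) score.

    Assumption: this logarithmic convention is used for illustration only;
    the abstract does not state the scoring formula.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the interval (0, 1]")
    return -10.0 * math.log10(p)


def significance_threshold(num_candidates: int, alpha: float = 0.05) -> float:
    """Score above which a chance match is expected less often than alpha.

    The per-candidate probability is alpha / num_candidates, so the
    threshold rises as the searched database grows.
    """
    if num_candidates < 1:
        raise ValueError("num_candidates must be at least 1")
    return score_from_probability(alpha / num_candidates)


def is_significant(score: float, num_candidates: int, alpha: float = 0.05) -> bool:
    """Apply the rule: a result is significant if it exceeds the threshold."""
    return score > significance_threshold(num_candidates, alpha)


if __name__ == "__main__":
    # Hypothetical database of 500,000 entries at a 1-in-20 false positive rate.
    threshold = significance_threshold(500_000)
    print(f"threshold: {threshold:.1f}")
    print("score 85 significant:", is_significant(85.0, 500_000))
```

Under these assumptions, a larger database raises the threshold, which is one way a probability-based score guards against false positives when many candidates are tested.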