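Promoter
Random hexamer
Biology
RNA polymerase
Sequence (biology)
Mathematics
Genetics
Computational biology
Escherichia coli
Molecular biology
Gene
Gene expression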
Authors
Ramit Bharanikumar, Keshav Aditya R Premkumar, Ashok Palaniappan
Source
Journal: PeerJ
[PeerJ]
Date: 2018-11-07
Volume/issue: 6: e5862-e5862
Citations: 18
Abstract
We present PromoterPredict, a dynamic multiple regression approach to predict the strength of Escherichia coli promoters binding the σ⁷⁰ factor of RNA polymerase. σ⁷⁰ promoters are ubiquitously used in recombinant DNA technology, but characterizing their strength is demanding in terms of both time and money. We parsed a comprehensive database of bacterial promoters for the −35 and −10 hexamer regions of σ⁷⁰-binding promoters and used these sequences to construct the respective position weight matrices (PWM). Next, we used a well-characterized set of promoters to train a multivariate linear regression model and learn the mapping between PWM scores of the −35 and −10 hexamers and the promoter strength. We found that the log of the promoter strength is significantly linearly associated with a weighted sum of the −10 and −35 sequence profile scores. We applied our model to 100 sets of 100 randomly generated promoter sequences to generate a sampling distribution of mean strengths of random promoter sequences and obtained a mean of 6E-4 ± 1E-7. Our model was further validated by cross-validation and on independent datasets of characterized promoters. PromoterPredict accepts −10 and −35 hexamer sequences and returns the predicted promoter strength. It is capable of dynamic learning from user-supplied data to refine the model construction and yield more robust estimates of promoter strength. PromoterPredict is available as both a web service (https://promoterpredict.com) and standalone tool (https://github.com/PromoterPredict). Our work presents an intuitive generalization applicable to modelling the strength of other promoter classes.
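The pipeline summarized in the abstract (hexamer PWMs, then a linear fit of log strength on the two PWM scores) can be illustrated with a minimal Python sketch. This is not the PromoterPredict code. The function names, the pseudocount, the uniform 0.25 background, the use of the natural log, and every sequence and strength value below are assumptions invented for illustration; only the model form, log(strength) = b0 + b1·score(−35) + b2·score(−10), is taken from the abstract.

```python
import math

BASES = "ACGT"

def build_pwm(hexamers, pseudocount=1.0, background=0.25):
    """Log-odds position weight matrix from aligned, equal-length sequences.
    Pseudocount and uniform background are assumptions, not the paper's settings."""
    pwm = []
    for pos in range(len(hexamers[0])):
        counts = {base: pseudocount for base in BASES}
        for seq in hexamers:
            counts[seq[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pwm

def pwm_score(pwm, seq):
    """Sum of per-position log-odds weights for one sequence."""
    return sum(column[base] for column, base in zip(pwm, seq))

def solve(a, b):
    """Solve a small dense system a x = b (Gauss-Jordan, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(m[r][i]))
        m[i], m[pivot] = m[pivot], m[i]
        for r in range(n):
            if r != i:
                factor = m[r][i] / m[i][i]
                m[r] = [x - factor * y for x, y in zip(m[r], m[i])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_log_linear(scores35, scores10, strengths):
    """Least-squares fit of ln(strength) = b0 + b1*s35 + b2*s10 (normal equations)."""
    rows = [[1.0, s35, s10] for s35, s10 in zip(scores35, scores10)]
    y = [math.log(s) for s in strengths]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve(xtx, xty)

def predict_strength(coef, pwm35, pwm10, hex35, hex10):
    """Back-transform the linear prediction to the strength scale."""
    b0, b1, b2 = coef
    return math.exp(b0 + b1 * pwm_score(pwm35, hex35) + b2 * pwm_score(pwm10, hex10))

# Toy data, invented for illustration only (not from the paper or any database).
TOY_MINUS35 = ["TTGACA", "TTGACT", "TTTACA", "CTGACA", "TTGCCA", "TTGACG"]
TOY_MINUS10 = ["TATAAT", "TATGAT", "TAAAAT", "TATACT", "GATAAT", "TATAAA"]
TOY_PROMOTERS = [  # (-35 hexamer, -10 hexamer, made-up relative strength)
    ("TTGACA", "TATAAT", 1.00),
    ("TTGACT", "TATGAT", 0.40),
    ("CTGACA", "TATAAT", 0.50),
    ("TTGACA", "GATAAT", 0.30),
    ("GCGACA", "TACGAT", 0.05),
    ("TTTACA", "TATACT", 0.20),
]

pwm35 = build_pwm(TOY_MINUS35)
pwm10 = build_pwm(TOY_MINUS10)
s35 = [pwm_score(pwm35, p[0]) for p in TOY_PROMOTERS]
s10 = [pwm_score(pwm10, p[1]) for p in TOY_PROMOTERS]
coef = fit_log_linear(s35, s10, [p[2] for p in TOY_PROMOTERS])
print(predict_strength(coef, pwm35, pwm10, "TTGACA", "TATAAT"))
```

TTGACA and TATAAT are the canonical σ⁷⁰ consensus hexamers for the −35 and −10 regions, which is why the toy sets are built as variants of them. The "dynamic learning" mentioned in the abstract would correspond, in this sketch, to appending user-supplied promoters to the training list and refitting; how the actual tool does this is not described in the abstract.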