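Promoter
Random hexamer
Biology
RNA polymerase
Sequence (biology)
Mathematics
Genetics
Computational biology
Escherichia coli
Molecular biology
Gene
Gene expression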
Authors

Ramit Bharanikumar, Keshav Aditya R Premkumar, Ashok Palaniappan
         
                    
Source

Journal: PeerJ [PeerJ, Inc.]
Date: 2018-11-07
Volume/Issue: 6: e5862-e5862
Citations: 18
                                 
         
        
    
            
        
                
Abstract
            
We present PromoterPredict, a dynamic multiple regression approach to predict the strength of Escherichia coli promoters binding the σ70 factor of RNA polymerase. σ70 promoters are ubiquitously used in recombinant DNA technology, but characterizing their strength is demanding in terms of both time and money. We parsed a comprehensive database of bacterial promoters for the −35 and −10 hexamer regions of σ70-binding promoters and used these sequences to construct the respective position weight matrices (PWM). Next, we used a well-characterized set of promoters to train a multivariate linear regression model and learn the mapping between PWM scores of the −35 and −10 hexamers and the promoter strength. We found that the log of the promoter strength is significantly linearly associated with a weighted sum of the −10 and −35 sequence profile scores. We applied our model to 100 sets of 100 randomly generated promoter sequences to generate a sampling distribution of mean strengths of random promoter sequences and obtained a mean of 6E-4 ± 1E-7. Our model was further validated by cross-validation and on independent datasets of characterized promoters. PromoterPredict accepts −10 and −35 hexamer sequences and returns the predicted promoter strength. It is capable of dynamic learning from user-supplied data to refine the model construction and yield more robust estimates of promoter strength. PromoterPredict is available as both a web service (https://promoterpredict.com) and standalone tool (https://github.com/PromoterPredict). Our work presents an intuitive generalization applicable to modelling the strength of other promoter classes.
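The pipeline described in the abstract (build PWMs for the −35 and −10 hexamers, score each promoter, then regress the log of the strength on the two scores) can be illustrated with a minimal sketch. This is not the PromoterPredict code. The function names, the uniform 0.25 background, the pseudocount of 1, the natural log, and every sequence and strength value below are assumptions made up for illustration; only the overall model form, log(strength) = b0 + b1·score(−35) + b2·score(−10), comes from the abstract.

```python
import math

BASES = "ACGT"

def build_pwm(hexamers, pseudocount=1.0):
    """Log-odds PWM from aligned, equal-length sequences.
    Assumes a uniform background (0.25 per base) and an additive pseudocount."""
    pwm = []
    for pos in range(len(hexamers[0])):
        counts = {b: pseudocount for b in BASES}
        for seq in hexamers:
            counts[seq[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / 0.25) for b in BASES})
    return pwm

def pwm_score(pwm, seq):
    """Sum of per-position log-odds for one sequence."""
    return sum(col[base] for col, base in zip(pwm, seq))

def fit_linear(X, y):
    """Ordinary least squares with intercept, solved via the normal
    equations and Gauss-Jordan elimination. Returns [b0, b1, ...]."""
    rows = [[1.0] + list(x) for x in X]
    n = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

def predict_strength(coef, pwm35, pwm10, minus35, minus10):
    """Predicted strength = exp(b0 + b1*score35 + b2*score10)."""
    b0, b1, b2 = coef
    return math.exp(b0 + b1 * pwm_score(pwm35, minus35)
                    + b2 * pwm_score(pwm10, minus10))

# Hypothetical hexamer collections (variants around the TTGACA / TATAAT consensus).
pwm35 = build_pwm(["TTGACA", "TTGACA", "TTTACA", "TTGACT", "CTGACA", "TTGCCA"])
pwm10 = build_pwm(["TATAAT", "TATAAT", "TACAAT", "TATACT", "GATAAT", "TATGAT"])

# Hypothetical training set: (-35 hexamer, -10 hexamer, made-up relative strength).
training = [
    ("TTGACA", "TATAAT", 1.00),
    ("TTTACA", "TATAAT", 0.60),
    ("TTGACA", "TACAAT", 0.50),
    ("CTGACA", "TATACT", 0.20),
    ("TTGCCA", "GATAAT", 0.25),
]
X = [(pwm_score(pwm35, a), pwm_score(pwm10, b)) for a, b, _ in training]
y = [math.log(s) for _, _, s in training]
coef = fit_linear(X, y)

print(predict_strength(coef, pwm35, pwm10, "TTGACT", "TATGAT"))
```

The "dynamic learning" mentioned in the abstract would correspond, in this sketch, to appending user-supplied (−35, −10, strength) triples to `training` and calling `fit_linear` again; how the actual tool handles this is not specified here.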
         
            
 
                 
                
                    