AGEseq: Analysis of Genome Editing by Sequencing

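Biology; Computational biology; Genome; DNA sequencing; Genome editing; Genetics; Evolutionary biology; Gene
Authors
Liang-Jiao Xue, Chung-Jui Tsai
Source
Journal: Molecular Plant [Elsevier]
Volume (Issue): 8 (9): 1428-1430; Cited by: 45
Identifier
DOI: 10.1016/j.molp.2015.06.001
Abstract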

Knockout experiments are critical for the evaluation of gene function. Researchers have increasingly relied on genome editing technologies for precise mutagenesis at loci of interest, using engineered nucleases such as zinc finger nucleases, transcription activator-like effector nucleases (TALENs), and CRISPR (clustered regularly interspaced short palindromic repeats)-associated proteins. Sequence-specific targeting and cleavage by these systems generate double-stranded breaks and trigger endogenous repair machineries, resulting in small indels that can disrupt reading frames and gene function. These methods have been successfully applied to plants; the CRISPR system is particularly powerful for non-model species (Belhaj et al., 2013; Lozano-Juste and Cutler, 2014). Several tools, such as TALENT (Cermak et al., 2011) and CRISPR-P (Lei et al., 2014), have been developed to facilitate the design of genome editing experiments. However, few tools are available to evaluate the outcome of genome editing.
Amplicon sequencing is commonly employed for genome editing analysis, in which genomic sequences that span the target loci are amplified, sometimes cloned, and sequenced. A number of programs have been developed to decode heterozygous chromatograms from direct sequencing of PCR products for identification of sequence polymorphisms (Crowe, 2005; Dmitriev and Rakitov, 2008; Ma et al., 2015). However, the throughput of Sanger sequencing, even without cloning, is not amenable to screening large numbers of transgenic lines, especially with increasingly sophisticated multiplex targeting (Xie et al., 2015). No open-source programs are currently available for analysis of amplicon-sequencing data from high-throughput sequencers. After quality-control filtering and demultiplexing, amplicon sequence analysis usually involves alignment with target/reference sequences and detection of editing events, such as indels or single nucleotide polymorphisms (SNPs). Much bioinformatic effort is required, unless commercial software is available.
A web-based tool for amplicon-sequencing data analysis was recently reported (Guell et al., 2014). However, only one reference sequence is accepted at a time, which makes application to large datasets cumbersome. Here, we report a versatile and user-friendly tool, Analysis of Genome Editing by Sequencing (AGEseq), to address this limitation. AGEseq is available from AspenDB (http://aspendb.uga.edu) as a standalone program or a Galaxy (Goecks et al., 2010)-based web tool. AGEseq supports both Sanger and deep-sequencing reads. For deep sequencing, degenerate primers can be designed to amplify both alleles of the target gene as well as closely related gene(s). Amplicons from unrelated genes or across samples are then barcoded and pooled for sequencing (Figure 1A). For data analysis, AGEseq requires a design file and a directory of read files as inputs. The design file describes the reference sequences, usually containing 30–40 bp flanking regions of the target editing site(s) (Figure 1B). The read files are stored in a directory named “reads” by default, and multiple file formats are accepted (Figure 1A). AGEseq uses BLAT to align reference and read sequences. Aligned reads are assigned to the best hit among the reference sequences provided in the design file, and matching regions are extracted for indel or SNP calling. The output file reports the aligned (target and read) sequences and detection frequency for each editing event (Figure 1C and 1D).
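The best-hit assignment and frequency reporting described above can be illustrated with a minimal Python sketch. It is not AGEseq's Perl implementation: the `Hit` record, the numeric scores, and the event labels ("WT", "1D") are hypothetical stand-ins for whatever an aligner such as BLAT would report, and ties are resolved arbitrarily in favor of the first hit seen.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One read-to-reference alignment (hypothetical record, not BLAT output)."""
    read_id: str
    ref_id: str
    score: int   # higher is better (assumed scoring)
    event: str   # illustrative label, e.g. "WT" or "1D" for a 1-nt deletion


def assign_best_hits(hits):
    """Keep, for each read, the alignment with the highest score.

    On a tie the first hit encountered is kept (an arbitrary choice).
    """
    best = {}
    for hit in hits:
        current = best.get(hit.read_id)
        if current is None or hit.score > current.score:
            best[hit.read_id] = hit
    return best


def event_frequencies(best):
    """Tally events per reference and convert the counts to fractions."""
    counts = defaultdict(Counter)
    for hit in best.values():
        counts[hit.ref_id][hit.event] += 1
    return {
        ref: {event: n / sum(c.values()) for event, n in c.items()}
        for ref, c in counts.items()
    }


# Toy data: reads amplified with degenerate primers may align to both paralogs.
hits = [
    Hit("r1", "4CL1", 60, "1D"),
    Hit("r1", "4CL5", 52, "WT"),
    Hit("r2", "4CL1", 58, "1D"),
    Hit("r3", "4CL5", 61, "WT"),
    Hit("r4", "4CL1", 59, "WT"),
    Hit("r4", "4CL5", 50, "WT"),
]
best = assign_best_hits(hits)
freqs = event_frequencies(best)
```

In this toy input, each read is counted once, against the reference it matches best, so reads from a paralog do not dilute the editing frequency reported for the target.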
Our laboratory has recently applied CRISPR-based genome editing to lignin biosynthesis perturbations in Populus. A gene-specific guide RNA (gRNA) was designed to target 4-coumarate:CoA ligase 1 (4CL1), but not the paralogous 4CL5 (Zhou et al., 2015). Degenerate primers were designed to amplify both 4CL1 (target) and 4CL5 (off-target) sequences from independent transgenic lines to assess editing specificity. AGEseq successfully distinguished the duplicates as well as their alleles (Figure 1B), and confirmed biallelic mutations in all transgenic lines examined, with no off-target cleavage of 4CL5 (Figure 1E). In support of a null 4CL1, all primary transformants exhibited a reddish-brown wood discoloration (Figure 1F) known to be associated with lignin modification (Zhou et al., 2015). As a further test, AGEseq was applied to amplicon data of soybean with DDM1 (Decrease in DNA Methylation) editing in one or two homoeologous loci as described in Jacobs et al. (2015). The editing patterns detected by AGEseq were consistent with those obtained by Geneious R7 (Biomatters Ltd.) used in that study, ranging from small indels (<5 nt) to large deletions (>10 nt), with varying (1–98%) editing efficiencies (Supplemental Table 1) (Jacobs et al., 2015). AGEseq flags events with a long stretch of indels and/or mismatches as “strange events” that require manual examination, and three such cases were identified. Manual inspection confirmed a large (44 nt) deletion in one case, while the other two were found by Jacobs et al. (2015) to harbor unusual insertions from the Agrobacterium rhizogenes root-inducing plasmid after additional cloning and sequencing. These results demonstrate the versatility of AGEseq in detecting or flagging genome editing patterns across a wide range of data scenarios. Detailed instructions on AGEseq are provided for all operating systems (Supplemental Text). The analysis sensitivity can be adjusted by two user-configurable parameters: mismatch allowance (default at 10%) and minimum read coverage (default at 0). Systematic errors introduced by PCR during amplicon library preparation and sequencing, or by base-calling algorithms, are common in deep-sequencing data, and they will appear as “SNPs” in the AGEseq report (Figure 1C and Supplemental Text). For this reason, AGEseq considers indels as potential genome editing events by default, although SNPs are also reported. If SNPs are of interest, setting a minimum read coverage is recommended to reduce random errors.
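To make the two sensitivity parameters concrete, the following Python sketch shows one plausible way such filters could act on called events. Only the default values (10% mismatch allowance, minimum read coverage of 0) come from the text; the function names, the input tuple layout, and the exact point at which each filter is applied are assumptions and may differ from what AGEseq actually does.

```python
from collections import Counter

MISMATCH_ALLOWANCE = 0.10  # default stated in the text
MIN_READ_COVERAGE = 0      # default stated in the text


def passes_mismatch(mismatches, aligned_length, allowance=MISMATCH_ALLOWANCE):
    """Return True if the mismatch fraction of a read is within the allowance."""
    if aligned_length <= 0:
        return False
    return mismatches / aligned_length <= allowance


def report_events(reads, allowance=MISMATCH_ALLOWANCE,
                  min_coverage=MIN_READ_COVERAGE):
    """Count events among reads that pass the mismatch filter, then drop
    events supported by fewer than `min_coverage` reads.

    `reads` is an iterable of (event_label, mismatches, aligned_length)
    tuples; this layout is hypothetical.
    """
    counts = Counter(
        event
        for event, mismatches, length in reads
        if passes_mismatch(mismatches, length, allowance)
    )
    return {event: n for event, n in counts.items() if n >= min_coverage}


# Toy reads: the last one has 12 mismatches over 80 nt (15%) and is discarded.
reads = [
    ("WT", 0, 80),
    ("WT", 1, 80),
    ("1D", 2, 80),
    ("SNP", 1, 80),
    ("3D", 12, 80),
]
default_report = report_events(reads)
strict_report = report_events(reads, min_coverage=2)
```

With the defaults, every event that survives the mismatch filter is reported, including the single-read "SNP". Raising the minimum coverage to 2 keeps only events seen in at least two reads, which is the kind of setting the text recommends when SNPs are of interest.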
A known limitation of BLAT and similar aligners is their inconsistent gap handling in the presence of homo-nucleotide runs, as shown for both 4CL1 alleles in Figure 1C (red boxes, 1-nt deletion at position 56 or 57). AGEseq does not consider these differences and reports, by default, the sum of all indel reads as well as wild-type (WT)-like (non-edited) reads from each sample in the summary (Figure 1D). User inspection is therefore recommended. As mentioned, AGEseq also facilitates identification of unusual events that require manual inspection, and sometimes follow-up experiments to confirm the editing patterns. The ability of AGEseq to effectively discriminate allelic sequences of duplicated genes suggests that it can support analysis with polyploid genomes. When only one reference sequence is provided, the AGEseq output can be mined for allelic variations, if any, in the target region. As standalone software, AGEseq is (1) easy to use; no command line or programming skill is required for Windows or Mac users; (2) versatile; multiple sequencing platforms and file types are supported for assessing genome editing, allelic variation and/or off-target cleavage; and (3) extensible; the Perl script can easily be exported to other bioinformatics pipelines. As an example, we adapted AGEseq as a utility in the Galaxy platform (Goecks et al., 2010) to support web-based analysis. It is accessible at AspenDB (http://aspendb.uga.edu/ageseq) or through the Galaxy Tool Shed (https://toolshed.g2.bx.psu.edu) for installation in local instances. A limitation of the web tool is that only one sequence read file can be processed at a time. For a multiplexed dataset with a large number of samples, the use of the standalone AGEseq program is recommended. Although developed for genome editing analysis, AGEseq can be adapted for SNP genotyping, metagenomic analysis, or other amplicon-sequencing applications.
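The homo-nucleotide gap ambiguity mentioned earlier can be illustrated with a short Python sketch. It shows why a 1-nt deletion at either of two adjacent positions in a run of identical bases describes the same edited sequence, using the common technique of shifting a deletion to its leftmost equivalent position. This normalization is not something the text attributes to AGEseq, which is stated not to consider these differences; the function names and the toy sequence are hypothetical, and coordinates are 0-based.

```python
def apply_deletion(reference, start, length):
    """Return the sequence after deleting `length` bases at 0-based `start`."""
    return reference[:start] + reference[start + length:]


def left_align_deletion(reference, start, length):
    """Shift a deletion as far left as possible without changing the result.

    A deletion of reference[start:start + length] can move one base to the
    left whenever the base entering the deleted window on the left equals
    the base leaving it on the right.
    """
    while start > 0 and reference[start - 1] == reference[start + length - 1]:
        start -= 1
    return start


# Toy reference with a run of three T's at 0-based positions 3, 4 and 5.
reference = "ACGTTTGA"

# An aligner may place the same 1-nt deletion at position 4 or position 5.
pos_a = left_align_deletion(reference, 4, 1)
pos_b = left_align_deletion(reference, 5, 1)
same_product = apply_deletion(reference, 4, 1) == apply_deletion(reference, 5, 1)
```

Both placements normalize to position 3 and yield the same edited sequence, so counting them together, as the summary of all indel reads effectively does, loses no information about the edit itself.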