Abstract
CRISPR (clustered regularly interspaced short palindromic repeats), first discovered as an immune system of prokaryotes, has become a powerful tool for genome editing in eukaryotes (Gaj et al., 2013). Co-expression of the CRISPR-associated endonuclease (Cas9) with a chimeric guide-RNA (gRNA) targeting a GN19NGG motif results in a double-strand DNA break near NGG, the protospacer adjacent motif (PAM) (Jinek et al., 2012). Processing by the endogenous DNA repair machinery generates small indels that, when located within a coding sequence, can disrupt the reading frame and render the gene nonfunctional. The CRISPR/Cas9 system has been successfully applied in several herbaceous systems (Belhaj et al., 2013; Harrison et al., 2014). Here we report its application in the woody perennial Populus using the 4-coumarate:CoA ligase (4CL) gene family as a case study. We achieved 100% mutational efficiency for two 4CL genes targeted, with every transformant examined carrying biallelic modifications. The CRISPR/Cas9 system is highly sensitive to single nucleotide polymorphisms (SNPs), as cleavage for a third 4CL gene was abolished due to SNPs in the target sequence. For outcrossing species with a highly heterozygous genome, gRNA design must take into account the frequent occurrence of SNPs to achieve efficient genome editing. Two 4CL genes, 4CL1 and 4CL2, associated with lignin and flavonoid biosynthesis, respectively (Hu et al., 1998; Harding et al., 2002), were targeted for CRISPR/Cas9 editing. The Populus tremula × alba clone 717-1B4 (717) routinely used for transformation is divergent from the genome-sequenced Populus trichocarpa (Hamzeh & Dayanandan, 2004). Therefore, the 4CL1 and 4CL2 gRNAs designed from the reference genome were interrogated with in-house 717 RNA-Seq data to ensure the absence of SNPs which could limit Cas9 efficiency (Supporting Information Fig. S1). A third gRNA designed for 4CL5, a genome duplicate of 4CL1, was also included. 
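The GN19NGG targeting rule described above can be illustrated with a short scan of both DNA strands for candidate sites. This is a minimal Python sketch and not the design procedure used in this study; the names `find_target_sites` and `reverse_complement` are hypothetical, and a real design workflow would add off-target scoring and SNP checks.

```python
import re

# Watson-Crick complements used to reverse-complement a DNA string.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# G + 19 nt (20-nt protospacer) followed by an NGG PAM.
# The lookahead lets overlapping sites be reported as well.
_SITE = re.compile(r"(?=(G[ACGT]{19}[ACGT]GG))")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_target_sites(seq: str):
    """Yield (strand, start, protospacer, pam) for every GN19NGG site.

    `start` is 0-based and counted on the strand being scanned, so for
    "-" hits it refers to the reverse-complemented sequence.
    """
    seq = seq.upper()
    for strand, scanned in (("+", seq), ("-", reverse_complement(seq))):
        for match in _SITE.finditer(scanned):
            site = match.group(1)
            yield strand, match.start(), site[:20], site[20:]


if __name__ == "__main__":
    demo = "TTT" + "G" + "A" * 19 + "TGG" + "TTT"  # made-up sequence
    for hit in find_target_sites(demo):
        print(hit)
```

Scanning the reverse complement separately keeps the pattern simple, at the cost of reporting minus-strand coordinates relative to the reverse-complemented string.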
The corresponding 717 sequence harbors one SNP in each allele near/within the PAM, both of which are expected to abolish targeting by the 4CL5-gRNA (Fig. S1). All three gRNA target sites are located within the first exon (Fig. S1a). For 717 transformation, the gRNA was expressed from the Medicago U6.6 promoter, along with a human codon-optimized Cas9 (Mali et al., 2013) under control of the CaMV 35S promoter in binary vector p201N-Cas9 (Jacobs et al., 2015) (Methods S1). Transformation with the Cas9-only vector serves as a control. Altogether, we generated over 30 independent transgenic lines per construct. Randomly selected 4CL1 and 4CL2 lines were subjected to amplicon-sequencing following the protocol of Jacobs et al. (2015). The data were then processed by AGEseq, a custom program we developed for analysis of genome editing by sequencing (L-J. Xue & C-J. Tsai, unpublished) (Methods S1), and biallelic mutations were confirmed in all cases (Fig. 1a). The indel patterns differed between alleles in most cases, with the vast majority of the mutations predicted to disrupt the open reading frame. Although 4CL1 and 4CL5 share a high level (89%) of nucleotide identity, no off-target cleavage of 4CL5 was detected in the 4CL1 mutants (Fig. S2a), supporting specificity of the 4CL1-gRNA. For plants transformed with the 4CL5-gRNA, no editing events were detected as predicted by the presence of SNPs within the corresponding 717 sequence (Fig. S2b). Five randomly selected Cas9 lines were also analyzed and no editing was detected in the three 4CL loci as shown for two representative lines (Figs 1a, S2). These results indicate that the CRISPR editing system is highly efficient and sequence-specific in poplar. Down-regulation of 4CL1 has been shown to reduce lignin accrual, alter monolignol composition and cause stem wood discoloration in several species, including Populus (Kajita et al., 1996; Hu et al., 1999; Voelker et al., 2010). 
Indeed, a reddish-brown wood color was observed in each of the 36 mutant 4CL1 lines we obtained (Fig. 1b), but with a uniformity that was unlike the patchy discoloration associated with uneven 4CL1 suppression in previous studies (Kajita et al., 1996; Voelker et al., 2010). Lignin content and syringyl-to-guaiacyl (S : G) monolignol ratio were reduced in the 4CL1 mutants (Fig. 1c,d). The lignin and wood discoloration phenotypes were remarkably uniform among independent mutant lines with varying indel patterns, consistent with a null 4CL1. This high level of mutation efficiency and phenotypic reproducibility is unmatched by previous antisense, sense co-suppression, or RNAi studies, in which typically only a small number of the transformants showed significant changes (Kajita et al., 1996; Hu et al., 1999; Voelker et al., 2010). Despite this efficiency, lignin content was reduced by only c. 23%, suggesting functional redundancy of the 4CL family. 4CL2 has been associated with flavonoid biosynthesis based on its expression in nonlignifying tissues (Hu et al., 1998; Harding et al., 2002), but this function has not been confirmed by mutant or transgenic characterization. Condensed tannins (CTs, proanthocyanidins) are among the most abundant flavonoid derivatives in Populus, especially in roots (Kao et al., 2002; Tsai et al., 2006). Consistent with a role in flavonoid biosynthesis, 4CL2 mutations led to drastically (52–92%) reduced CT levels in roots (Fig. 1e) without affecting stem lignin (Fig. 1c,d). The CT reductions among 4CL2 mutants were less uniform than the lignin reductions of 4CL1 mutants, which can be attributed to the greater diversity of metabolic fates in the flavonoid than in the lignin biosynthetic pathway (Tsai et al., 2006). Interestingly, several 4CL1 mutants also accumulated lower-than-WT levels of CTs in roots (Fig. 1e), suggesting a minor or conditional role of 4CL1 in flavonoid biosynthesis. 
Chlorogenic acids (caffeoylquinate isomers) are another class of phenylpropanoids abundant in Populus (Tsai et al., 2006). Their synthesis depends on 4CL, although exactly which isoform is involved remains unclear. We showed that total abundance of the two predominant isomers was unaffected in leaves of 4CL1 mutants, but was reduced significantly, by c. 30%, in 4CL2 mutants (Fig. 1f). Our results support a primary role of 4CL1 and 4CL2 in lignin and flavonoid biosynthesis, respectively, but also reveal different degrees of 4CL functional redundancy in support of the two pathways. Residual levels of lignin, CTs or chlorogenic acids in the 4CL1 or 4CL2 mutants suggest that other 4CLs are likely involved. The specificity and efficiency of the CRISPR/Cas9 system now afford a facile means for systematically targeting one or more members of a multi-gene family to delineate their functions. Thus, gRNA stacking experiments in existing 4CL1 and/or 4CL2 mutants are expected to shed further light on the functional redundancy of the 4CL family. An advantage of gRNA stacking is the simplicity of construct preparation, since a functional Cas9 is already present in existing mutants. The frequent occurrence of SNPs in outcrossing species represents an underappreciated impediment for efficient genome editing. While some studies recommended the use of at least two gRNAs per gene (Ran et al., 2013), we show that one carefully selected gRNA can be effective for study systems with a demanding transformation process. To assess how SNPs interfere with gRNA specificity, we adopted the CRISPR-P pipeline (Lei et al., 2014) for gRNA design using a custom, variant-substituted 717 genome based on c. 20× resequencing data. Off-target cleavage potential was assessed as described (Hsu et al., 2013), and up to 10 highest-scoring gRNA candidates (with a specificity score ≥ 0.5) per gene were checked for the presence of sequence variants in both alleles. 
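The variant check described above can be pictured as an interval lookup: for each 23-nt target (20-nt protospacer plus PAM), ask whether any known variant position falls inside it, and if so, how close it lies to the PAM. The following Python sketch is illustrative only and is not the CRISPR-P pipeline or the authors' code; the `Candidate` record, the `classify` function and the `PROXIMAL_LEN` cutoff are assumptions made for this example.

```python
from dataclasses import dataclass

TARGET_LEN = 23    # 20-nt protospacer + 3-nt PAM
PAM_LEN = 3
PROXIMAL_LEN = 8   # hypothetical cutoff for "proximal to the PAM"


@dataclass(frozen=True)
class Candidate:
    """A gRNA target site in reference coordinates (hypothetical record)."""
    name: str
    chrom: str
    start: int     # 0-based start of the 23-nt target on the reference
    strand: str    # "+" or "-"


def classify(candidate: Candidate, variant_positions: dict) -> str:
    """Return 'clean', 'pam_proximal' or 'distal' for one candidate.

    `variant_positions` maps chromosome name to a set of 0-based
    positions carrying a SNP or indel in either allele. Every PAM
    position, including the N of NGG, is conservatively treated as
    sensitive.
    """
    end = candidate.start + TARGET_LEN
    hits = [p for p in variant_positions.get(candidate.chrom, ())
            if candidate.start <= p < end]
    if not hits:
        return "clean"
    for pos in hits:
        offset = pos - candidate.start
        # On "+" the PAM sits at the high-coordinate end of the target;
        # on "-" it sits at the low-coordinate end.
        if candidate.strand == "+":
            distance_from_pam_end = TARGET_LEN - 1 - offset
        else:
            distance_from_pam_end = offset
        if distance_from_pam_end < PAM_LEN + PROXIMAL_LEN:
            return "pam_proximal"
    return "distal"


if __name__ == "__main__":
    variants = {"chr1": {121}}                       # made-up variant
    grna = Candidate("demo_gRNA", "chr1", 100, "+")  # made-up site
    print(classify(grna, variants))
```

Candidates classified as `clean` correspond to the variant-free gRNAs retained for a genome-wide set, while `pam_proximal` hits are the ones most likely to lose activity.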
Of the 394 008 exonic gRNA candidates derived from 41 264 genes, c. 42% contain indels or SNPs in the target sequences (Fig. 1g). In the majority of cases (c. 66%), SNPs are detected within or proximal to the PAM (Fig. 1g), likely rendering those gRNAs ineffective (Jinek et al., 2012). From this analysis, we extracted a subset of exonic gRNAs for 38 509 gene models (up to 30 gRNAs per gene, for a total of 886 786 gRNAs) that are predicted to be 717-specific with no known SNPs or indels (Table S1). In addition, a searchable database (http://aspendb.uga.edu/s717) is provided to facilitate screening of custom-designed 717 gRNAs for sequence variants that may compromise CRISPR efficiency. The database can also be used to locate SNP-bearing gRNA target sites that can be exploited for allele-specific gene editing. As 717 is arguably the most widely used transformation clone in Populus functional genomics research, these resources should facilitate the wide adoption of CRISPR in this effort. For woody perennials with long generation cycles, accelerated breeding is finally within reach using CRISPR genome editing. As CRISPR editing at the target loci is biallelic and the transgene (Cas9 and selectable marker) integration is hemizygous, we envision a scenario where CRISPR editing is performed in both male and female clones, such as the early-flowering FT transgenics (Hoenicka et al., 2014). Controlled crosses between male and female primary transformants with confirmed biallelic mutations should in theory produce transgene-free, biallelic mutants in 25% of the progeny (or 6.25% for two-locus crosses). Elite clones carrying targeted gene mutation(s) without foreign DNA may ultimately help increase public acceptance of bioengineered agricultural products. 
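The 25% and 6.25% figures for transgene-free progeny follow from simple Mendelian segregation, which can be checked with a few lines of Python. The sketch below assumes that every transgene insertion is hemizygous and segregates independently, and it reads "two-locus" as two unlinked insertions per parent; that reading is an assumption that happens to reproduce the stated figure, not something spelled out in the text. The function name is hypothetical.

```python
from fractions import Fraction


def transgene_free_fraction(insertions_parent_a: int,
                            insertions_parent_b: int) -> Fraction:
    """Expected fraction of progeny that inherit no transgene copy.

    Assumes each insertion is hemizygous and unlinked, so a gamete
    lacks any given insertion with probability 1/2. Because both
    parents carry biallelic mutations at the target gene, every
    progeny receives two mutant alleles regardless of transgene status.
    """
    half = Fraction(1, 2)
    return half ** insertions_parent_a * half ** insertions_parent_b


if __name__ == "__main__":
    print(transgene_free_fraction(1, 1))  # 1/4, i.e. 25%
    print(transgene_free_fraction(2, 2))  # 1/16, i.e. 6.25%
```

Linked insertions or multi-copy integration at a single locus would change these expectations, so the values are best treated as upper-level planning estimates for progeny screening.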
The authors thank Batbayar Nyamdari for chlorogenic acid analysis, Mohammad Mohebbi for implementation of the database, Wayne Parrott for supporting development of the CRISPR vectors, Wade Newbury for photographic assistance, the Analytical Services of the Complex Carbohydrate Research Center for lignin analysis, and Magdy Alabady and Jeff Wagner at the Georgia Genomics Facility for assistance with Illumina sequencing. This work was supported by the Georgia Research Alliance-Haynes Forest Biotechnology endowment.
Please note: Wiley Blackwell are not responsible for the content or functionality of any supporting information supplied by the authors. Any queries (other than missing material) should be directed to the New Phytologist Central Office.
Fig. S1 gRNA locations and alignments.
Fig. S2 CRISPR/Cas9-mediated editing did not occur at the 4CL5 locus due to SNPs.
Table S1 Gene-specific and SNP-free 717 gRNAs.
Methods S1 Methods and primer sequences.