Academic journal
Sci. Rep.

SCIENTIFIC REPORTS

Received: 09 September 2016 Accepted: 04 November 2016 Published: 01 December 2016

Efficient targeted mutagenesis of rice and tobacco genomes using Cpf1 from Francisella novicida

Akira Endo1,*, Mikami Masafumi1,2,*, Hidetaka Kaya1 & Seiichi Toki1,2,3

CRISPR/Cas9 systems are now applied extensively to genome editing in various organisms, including plants. CRISPR from Prevotella and Francisella 1 (Cpf1) is a newly characterized RNA-guided endonuclease that has two distinct features compared with Cas9. First, Cpf1 utilizes a thymidine-rich protospacer adjacent motif (PAM), whereas Cas9 prefers a guanine-rich PAM. Cpf1 could therefore be used as a sequence-specific nuclease to target AT-rich regions of a genome that Cas9 has difficulty accessing. Second, Cpf1 generates DNA ends with a 5' overhang, whereas Cas9 creates blunt DNA ends after cleavage. "Sticky" DNA ends should increase the efficiency of inserting a desired DNA fragment into the Cpf1-cleaved site using complementary DNA ends. Therefore, Cpf1 could be a potent tool for precise genome engineering. To evaluate whether Cpf1 can be applied to plant genome editing, we selected Cpf1 from Francisella novicida (FnCpf1), which recognizes a shorter PAM (TTN) than other known Cpf1 proteins, and applied it to targeted mutagenesis in tobacco and rice. Our results show that targeted mutagenesis occurred in transgenic plants expressing FnCpf1 with crRNA. Deletions of the targeted region were the most frequently observed mutations. Our results demonstrate that FnCpf1 can be applied successfully to genome engineering in plants.

Targeted mutagenesis and gene targeting using sequence specific nucleases (SSNs) are powerful strategies used to accelerate molecular breeding of crops. Several types of SSN, such as ZFNs (zinc-finger nucleases), TALENs (transcription-activator-like effector nucleases) and CRISPR/Cas9 (clustered regularly interspaced short palindromic repeats/CRISPR-associated protein 9) have been intensively adapted for use in genome editing in plants1-3. The number of papers reporting plant genome engineering with CRISPR/Cas9 has increased markedly during the last few years4, highlighting the fact that CRISPR/Cas9 is a versatile tool with which to perform targeted mutagenesis in plants5.

Several types of CRISPR/Cas systems are known to function as adaptive immune systems in archaea and bacteria6. The well-characterized CRISPR/Cas9 system is categorized as a class 2/type II immune system, which relies on a single-component effector protein, and has been engineered for genome editing7,8. Cas9 protein is an endonuclease that functions with CRISPR RNA (crRNA) and trans-activating crRNA (tracrRNA)9. The Cas9-RNA complex scans double-stranded DNA to find a DNA sequence complementary to the 20-nucleotide (nt) spacer region (target sequence) within the crRNA, as well as a protospacer adjacent motif (PAM), and then cleaves the target sequence on the invader DNA9. The recognition sequence of the PAM, which is located immediately downstream of the target sequence, varies among Cas9 proteins6,7,10. Widely used Cas9 proteins from Streptococcus pyogenes (SpCas9) and Staphylococcus aureus (SaCas9) prefer a guanine-rich PAM, with the PAM sequences of SpCas9 and SaCas9 being NGG and NGRRT, respectively6,11,12.

Recently, Cpf1, a new type of RNA-directed endonuclease, was classified as class 2/type V in the CRISPR/Cas system8,13. Some features of Cpf1 differ from those of Cas9, although their functions are similar. While the Cas9-RNA complex contains two RNA molecules in nature, Cpf1 functions with a single crRNA to search for and cleave target sequences in invading DNA. A PAM sequence, located immediately upstream of the spacer sequence (target sequence), is also necessary for Cpf1 to recognize its 24 nt target sequence13. PAM recognition sequences

1Plant Genome Engineering Research Unit, Institute of Agrobiological Sciences, National Agriculture and Food Research Organization, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan. 2Graduate School of Nanobioscience, Yokohama City University, 22-2 Seto, Yokohama, Kanagawa 236-0027, Japan. 3Kihara Institute for Biological Research, Yokohama City University, 641-12 Maioka-cho, Yokohama, Kanagawa 244-0813, Japan. *These authors contributed equally to this work. Correspondence and requests for materials should be addressed to S.T. (email: stoki@affrc.go.jp)

Figure 1. T-DNA constructs for FnCpf1 expression in tobacco and rice. (a) Construct for targeted mutagenesis in tobacco. FnCpf1 (At) was inserted downstream of the PcUbi promoter. The Athsp terminator was placed at the end of the FnCpf1 ORF. The AtADH 5'-UTR was introduced between the PcUbi promoter and FnCpf1 (At) to enhance translation. The nuclear localization signal (NLS) from the SV40 large T-antigen was fused translationally to the C-terminus of FnCpf1. The crRNA is under the control of the Arabidopsis U6-26 promoter. To isolate transformants with kanamycin resistance, an NPTII cassette was included in the construct. AtADH 5'-UTR: 5' untranslated region of the Arabidopsis thaliana ALCOHOL DEHYDROGENASE gene. Athsp ter: the terminator region of the Arabidopsis thaliana HEAT SHOCK PROTEIN 18.2 gene. (b) Construct used for targeted mutagenesis in rice. The ZmUbi-1 promoter drives expression of FnCpf1 (Os). The OsADH 5'-UTR was introduced between the ZmUbi promoter and FnCpf1 (Os) to enhance translation. An NLS was fused translationally to the C-terminus of FnCpf1 (Os). Pea3A and OsAct1 terminators were inserted in tandem downstream of FnCpf1 to terminate transcription. Expression of crRNA is driven by the rice U6-2 promoter. To screen transformants with hygromycin resistance, an HPT cassette was included in the construct. OsADH 5'-UTR: 5' untranslated region of the Oryza sativa ALCOHOL DEHYDROGENASE gene. Pea3A ter: the terminator region of the Pisum sativum rbcS 3A gene. OsAct ter: the terminator region of the Oryza sativa Actin gene.

of Cpf1 are different among bacterial species, and known Cpf1 proteins tend to utilize a thymidine-rich PAM13. The PAM sequence of Cpf1 from Francisella novicida is TTN. TTTN is recognized as PAM by Cpf1 isolated from Acidaminococcus sp. BV3L6 (AsCpf1) and Lachnospiraceae bacterium MA2020 (LbCpf1)13. In addition, Cpf1 and Cas9 generate different types of DNA ends after cleavage of the target sequence. Cpf1 creates DNA ends with a 5' overhang, while Cas9 generates blunt ends13. Cleavage by FnCpf1 occurs at the 18th base from the PAM on the non-targeted (+) strand, and at the 23rd base from the PAM on the targeted (-) strand within the 24 nt spacer sequence13. Since DSBs with compatible overhangs can be repaired via precise end joining14, the sticky DNA ends generated by Cpf1 are thought to be ideally suited to precise genome editing such as knock-in or replacement of a desired DNA fragment using compatible DNA ends. These specific features of Cpf1 can broaden the spectrum of genome editing that is possible using SSNs.
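The staggered cut described above can be made concrete with a small sketch. It assumes that each strand is cut immediately after the stated position (18 on the + strand, 23 on the - strand, counted from the PAM); the function names are hypothetical and the example spacer is the crNtSTF1-4 target sequence from Table 1.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def fncpf1_overhangs(spacer: str, cut_plus: int = 18, cut_minus: int = 23):
    """Return the 5' overhangs (written 5'->3') left on the two fragments.

    spacer: 24-nt target on the non-targeted (+) strand, written 5'->3'
            with position 1 adjacent to the PAM.
    Assumption: each strand is cut immediately after the given position.
    """
    if len(spacer) != 24:
        raise ValueError("expected a 24-nt spacer")
    stagger = spacer[cut_plus:cut_minus]        # bases lying between the two cuts
    pam_distal = stagger                        # protrudes from the (+) strand
    pam_proximal = reverse_complement(stagger)  # protrudes from the (-) strand
    return pam_proximal, pam_distal


proximal, distal = fncpf1_overhangs("AGAGAAGGATGAAGTAGAGATATC")
print(len(distal), proximal, distal)  # 5 ATATC GATAT
```

Under this reading the two cut positions leave a 5-nt 5' overhang on each fragment, and the two overhangs are complementary to each other, which is what makes the ends "compatible" for ligation-style repair or fragment insertion.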

To apply Cpf1 to plant genome engineering, we selected FnCpf1 for the following reasons. Since FnCpf1 recognizes TTN as its PAM sequence, the frequency of target sequences for FnCpf1 in plant genomes is thought to be higher than that for AsCpf1 and LbCpf1, which utilize TTTN as a PAM sequence13. The shorter PAM of FnCpf1 is thus a practical and favorable feature for targeted mutagenesis, although FnCpf1 reportedly shows lower genome editing activity than AsCpf1 and LbCpf1 in human cells13. Targeted mutagenesis of plants has been performed mostly via stable transformation, introducing T-DNA harboring SSNs into plant genomes15,16. We hypothesized that the lower genome editing activity of FnCpf1 might be compensated for by constitutive expression of FnCpf1 in plants. We first engineered a binary vector to optimize the expression of FnCpf1 in plants, then designed a targeted mutagenesis experiment in tobacco and rice.
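The argument that a shorter PAM yields more candidate targets can be checked on any sequence by counting sites directly. The sketch below is illustrative only: `count_sites` is a hypothetical helper, the toy sequence is invented, and a site is counted wherever the PAM is followed by 24 unambiguous bases on either strand.

```python
import re


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_sites(seq: str, pam: str, spacer_len: int = 24) -> int:
    """Count PAM + spacer sites on both strands (N in the PAM = any base)."""
    seq = seq.upper()
    pam_regex = pam.upper().replace("N", "[ACGT]")
    # A zero-width lookahead lets overlapping sites be counted separately.
    pattern = re.compile("(?=%s[ACGT]{%d})" % (pam_regex, spacer_len))
    strands = (seq, reverse_complement(seq))
    return sum(len(pattern.findall(strand)) for strand in strands)


toy = "TTTA" + "ACGT" * 6   # invented 28-nt sequence, not from the paper
print(count_sites(toy, "TTN"), count_sites(toy, "TTTN"))  # 2 1
```

Every TTTN site is also a TTN site, so the TTN count can never be lower. For a sequence with equal base frequencies, one would expect roughly four times as many TTN sites as TTTN sites.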

Results

FnCpf1 expression vectors for targeted mutagenesis in tobacco and rice. To perform targeted mutagenesis using FnCpf1 in tobacco and rice, we first constructed binary vectors harboring FnCpf1 and antibiotic resistance genes in the T-DNA region (Fig. 1a,b), following our previous construction of binary vectors to express SpCas9 (refs 12,17). The codon usage of the FnCpf1 ORF was optimized for effective translation in A. thaliana and rice, respectively. Codon-optimized FnCpf1 (At) and FnCpf1 (Os) were cloned into the binary vectors pRI201-AN and pPZP200, respectively. FnCpf1 (At) was driven by the ubiquitin 4-2 promoter from Petroselinum crispum (Fig. 1a)18,19. FnCpf1 (Os), on the other hand, was placed under the control of the ubiquitin promoter from Zea mays (Fig. 1b)20. To express the crRNA of FnCpf1, Arabidopsis U6-26 and rice U6-2 small nuclear RNA gene promoters were used in tobacco and rice, respectively (Fig. 1a,b)17,19.

Targeted mutagenesis in tobacco. To examine whether FnCpf1 (At) can induce targeted mutation in tobacco, 24 nt target sequences were designed to induce mutations in two genes, i.e., phytoene desaturase (NtPDS) and the STENOFOLIA ortholog in Nicotiana tabacum (NtSTF1). Mutation in NtPDS will cause an albino phenotype since a defect in carotenoid biosynthesis leads to loss of pigments such as chlorophyll21. NtSTF1 is thought to be involved in leaf blade expansion since a lam mutant having a defect in LAM (NsSTF1) shows a narrow leaf

Organism            Target gene   crRNA          Target sequence
Nicotiana tabacum   NtPDS         crNtPDS-1      TCATCCAGTCCTTAACACTTAAAC
                    NtPDS         crNtPDS-2      ACATGGCAATGAACACCTCATCTG
                    NtSTF1        crNtSTF1-1     CTAGCTGATCAAAGGAATGCCACG
                    NtSTF1        crNtSTF1-2     GCTCCATTGTCGTTCTTGGTGTTG
                    NtSTF1        crNtSTF1-3     TAAGTGGAAGAAACTCAAAAAACT
                    NtSTF1        crNtSTF1-4     AGAGAAGGATGAAGTAGAGATATC
Oryza sativa        OsDL          crOsDL-1       GTCTTTTGGGTAGCTGCAGGTTGG
                    OsDL          crOsDL-2       GGGACCTTGCACTGACTGCAGGAG
                    OsALS         crOsALS-1      ACTCTTCTTTGTTACACGGACTGC
                    OsALS         crOsALS-2      CCAACATACAGATTATAGATTAAT
                    OsNCED1       crOsNCED1-1    CCCAAGGCCATTGGGGAGCTCCAT
                    OsAO1         crOsAO1-1      GCAATGCTGTGTCATATGTTAATT

Table 1. List of target genes, guide RNAs (gRNA), target sequences and PAM sequences used in this study. Green characters in target sequences indicate PAM motif of FnCpf1.

phenotype in Nicotiana sylvestris22,23. Nicotiana tabacum is an amphidiploid species derived from ancestors that are closely related to the diploid species N. sylvestris and N. tomentosiformis. Therefore, mutant phenotypes can be observed when mutations occur in functionally identical genes located in the N. sylvestris and N. tomentosiformis genomes (S and T genomes), respectively. To select target sequences in these genes on both S and T genomes, we first searched the exons of the two genes for TTN, the PAM sequence of FnCpf1, and then selected the 24 nt sequences immediately downstream of the PAM as target sequences. Two and four target sequences were designed against the NtPDS and NtSTF1 genes, respectively (Table 1). The corresponding crRNAs were named crNtPDS-1 and crNtPDS-2, and crNtSTF1-1 to crNtSTF1-4. The mutation ratio was estimated as the number of regenerated plants with a mutation around the FnCpf1 target sequence relative to the total number of regenerated plants. The mutation frequency is the ratio of mutated clones to the total number of randomly sequenced clones.
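The target-selection procedure described above (find TTN in an exon, take the next 24 nt, keep targets present on both the S and T copies) could be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the helper names are hypothetical, only the given strand is scanned, and the flanking bases and PAMs in the toy exons are invented around the crNtSTF1-4 target sequence.

```python
def find_targets(exon: str, spacer_len: int = 24) -> dict:
    """Map each 24-nt candidate spacer to the TTN PAM found just upstream."""
    exon = exon.upper()
    targets = {}
    for i in range(len(exon) - 3 - spacer_len + 1):
        pam = exon[i:i + 3]
        if pam.startswith("TT"):  # TTN: the third base is unconstrained
            targets[exon[i + 3:i + 3 + spacer_len]] = pam
    return targets


def shared_targets(exon_s: str, exon_t: str) -> list:
    """Return spacers that occur next to a TTN PAM in both genome copies."""
    return sorted(set(find_targets(exon_s)) & set(find_targets(exon_t)))


spacer = "AGAGAAGGATGAAGTAGAGATATC"   # crNtSTF1-4 target sequence (Table 1)
exon_s = "GG" + "TTC" + spacer        # invented flank and PAM
exon_t = "CC" + "TTG" + spacer        # invented flank and PAM
print(shared_targets(exon_s, exon_t))
```

A spacer that differs by even one base between the two copies drops out of the intersection, which is the situation the text later describes for crNtSTF1-4 on the S genome.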

Transgenic T0 plants showing kanamycin resistance (between 14 and 20 lines) were isolated for each of the target loci. Genomic DNA was isolated from each T0 transgenic plant, and the target loci were then amplified by PCR. To detect mutations in NtPDS genes, PCR products were resolved by a heteroduplex mobility assay (HMA). DNA bands with higher molecular weights were observed in several transgenic lines but not in wild-type (WT) (Fig. 2a). Mutation ratios at the crNtPDS-1 and crNtPDS-2 loci were around 45%. Mutation patterns of these transgenic lines were analyzed by DNA sequencing, and mutation frequencies were estimated. Deletion mutations were observed around the cleavage site of FnCpf1 (Fig. 2b). Mutation frequencies at the crNtPDS-1 and crNtPDS-2 loci were 12.5-65.2% and 4.3-50%, respectively (Fig. 2b, top and middle). Regardless of crRNA, mutation frequencies on the S and T genomes were not consistent in any of the transgenic plants tested (Fig. 2b, top and middle). FnCpf1-induced mutations on the S or T genome seem to occur stochastically at NtPDS loci.

Among the four crRNAs designed to target NtSTF1, crNtSTF1-1 to crNtSTF1-3 did not induce mutation in any of the transgenic lines (Supplemental Fig. 1), while mutations were seen with crNtSTF1-4. This latter crRNA was able to target the NtSTF1 gene in the N. tomentosiformis genome but not in the N. sylvestris genome since crNtSTF1-4 had a one-base mismatched sequence against NtSTF1 (S) (Fig. 2b, bottom). To find mutations in NtSTF1, PCR products were subjected to CAPS assay. The mutation ratios of crNtSTF1-4 at the S and T loci were 7.1% and 71.4%, respectively (Fig. 2c). Mutation frequencies of crNtSTF1-4 on the T locus were 28.6-68.2% (Fig. 2b, bottom). These results clearly showed that FnCpf1 could induce mutation at target sites in tobacco. However, we could not recover transgenic plants harboring biallelic mutations at the target sites in the T0 generation.

We next examined whether FnCpf1-induced mutations at the crNtSTF1-4 locus are transmitted genetically to the next generation. DNA extracted from progeny of crNtSTF1-4 line #7 was amplified by PCR and used for CAPS analysis. Homoallelic mutation was observed in some of the progeny from transgenic tobacco line #7 (Fig. 2d).

Targeted mutagenesis in rice. Next, we applied FnCpf1 (Os) to induce mutation in rice. Agrobacterium-mediated transformation was performed to introduce the T-DNA harboring FnCpf1 (Os) into scutellum-derived calli. The genes OsDrooping leaf (OsDL) and OsAcetolactate synthase (OsALS) were selected as the target genes. dl mutants show a loss of midrib in the leaf blade, resulting in a drooping leaf phenotype24,25. Acetolactate synthase is involved in the synthesis of branched-chain amino acids26,27. Loss of ALS activity leads to lethality. As in tobacco, two target sequences were designed for targeted mutagenesis of each gene. The

Figure 2. Analyses of FnCpf1-induced mutations in tobacco. (a) Heteroduplex mobility assay to detect mutation at the crNtPDS-1 (upper two panels) and crNtPDS-2 (lower panel) loci. (S) and (T) indicate PCR products amplified from the loci including each target sequence on the N. sylvestris and N. tomentosiformis genomes, respectively. (b) Patterns of mutations detected at the crNtPDS-1 (top), crNtPDS-2 (middle) and crNtSTF1-4 (bottom) loci. The target DNA sequence of each crRNA is shown underlined in the wild-type (WT) sequence at the top. (S) and (T) indicate the target sequence on the S and T genomes, respectively. The PAM regions are shown in green. The mismatched nucleotide is indicated in red. Line numbers of transgenic plants are indicated with # at the left of each sequence. DNA deletions are presented as dashes. The length of each indel and the number of clones are given at the right of each sequence (+, insertion; -, deletion; x, number of clones). Mut. Freq. (%): Mutation frequency. (c) CAPS analysis of the crNtSTF1-4 locus in the T0 generation. (d) CAPS analysis of the crNtSTF1-4 locus in the T1 generation of line #7. -: Non-digested PCR products, +: EcoRV-digested PCR products. Arrowheads indicate the position of undigested PCR products. An undigested band indicates mutation at the crNtSTF1-4 locus.

corresponding crRNAs were named crOsDL-1 and crOsDL-2, and crOsALS-1 and crOsALS-2 (Table 1). The digestion sites of both FnCpf1 and restriction enzymes were designed to overlap within the target sequences, so that the restriction site would be disrupted if mutations occur at the target loci following cleavage by FnCpf1. The CAPS assay was used to assess the presence of mutations in the target sequence.
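The overlap requirement behind the CAPS design can be illustrated with a short check. This is a sketch under assumptions: the function name is hypothetical, the cut window is taken as positions 18 to 23 from the PAM (the FnCpf1 cleavage positions given in the Introduction), and the pairing of each rice target with PstI (CTGCAG) or AseI (ATTAAT) is inferred here from the target sequences in Table 1, since the figure legends name the enzymes but not the pairing.

```python
def site_overlaps_cut(spacer: str, site: str, window=(18, 23)) -> bool:
    """True if any occurrence of `site` covers a base inside the cut window.

    Positions are 1-based and counted from the PAM-proximal end of the spacer.
    """
    start = spacer.find(site)
    while start != -1:
        first, last = start + 1, start + len(site)
        if first <= window[1] and last >= window[0]:
            return True
        start = spacer.find(site, start + 1)  # look for further occurrences
    return False


designs = {
    "crNtSTF1-4": ("AGAGAAGGATGAAGTAGAGATATC", "GATATC"),  # EcoRV
    "crOsDL-1": ("GTCTTTTGGGTAGCTGCAGGTTGG", "CTGCAG"),    # PstI (inferred)
    "crOsDL-2": ("GGGACCTTGCACTGACTGCAGGAG", "CTGCAG"),    # PstI (inferred)
    "crOsALS-2": ("CCAACATACAGATTATAGATTAAT", "ATTAAT"),   # AseI (inferred)
}
for name, (spacer, site) in designs.items():
    print(name, site_overlaps_cut(spacer, site))
```

When the site overlaps the window, an indel produced at the cut is likely to destroy the recognition sequence, so mutated amplicons stay undigested in the CAPS gel.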

To examine FnCpf1-induced mutations in rice calli, genomic DNA was extracted from each callus showing hygromycin resistance. Target loci were amplified by PCR, and the products were subjected to CAPS assay (Fig. 3a). Undigested PCR products were observed in transgenic lines, indicating that the introduction of FnCpf1 with each of the crRNAs was able to induce mutations in rice calli. At the crOsDL-2 and crOsALS-2 target loci, the mutation frequency in rice calli was over 60% (Fig. 3b). Mutation frequencies at the crOsDL-1 and crOsALS-1 target loci were 8.3-25% and 15%, respectively (Fig. 3b). As shown in Fig. 3b, most mutations at the target sites of all crRNAs were deletions. In the case of regenerated plants obtained from transgenic calli harboring FnCpf1 with crOsDL-2 and crOsALS-2, the mutation ratio of the regenerated plants at the crOsDL-2 target locus was 85.7% (6/7), and the three plants with bi-allelic mutations showed a drooping leaf phenotype in the T0 generation (Supplemental Fig. 2). In addition, the ratio of regenerated plants with mutation at the crOsALS-2 target locus was 90% (9/10), and five bi-allelic mutant plants were obtained. Because all the bi-allelic mutant plants had deletions that did not generate frameshifts in the OsALS gene, these als mutants were assumed to be

Figure 3. Analyses of FnCpf1-induced mutations in rice. (a) CAPS analysis of the crOsDL-1, crOsDL-2, crOsALS-1 and crOsALS-2 loci. -: Non-digested PCR products, +: PstI- or AseI-digested PCR products. Arrowheads indicate the position of undigested PCR products. An undigested band indicates mutation in the target loci. (b) Patterns of mutations detected at the crOsDL-1, crOsDL-2, crOsALS-1 and crOsALS-2 loci. The target DNA sequence of each crRNA is shown underlined in the wild-type (WT) sequence at the top. (S) and (T) indicate the target sequence on both S and T genomes. The PAM regions are shown in green. The mismatched nucleotide is indicated in red. Line numbers of transgenic plants are indicated with # at the left of each sequence. DNA deletions are presented as dashes. The length of each indel and the number of clones are given at the right of each sequence (+, insertion; -, deletion; x, number of clones). Mut. Freq. (%): Mutation frequency.

viable (data not shown). In addition, a heteroallelic mutation at the crOsDL-1 target locus was inherited as a homoallelic mutation in the progeny of line #18 (Supplemental Fig. 3). These results clearly indicate that FnCpf1 is able to cleave target sites in rice.
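The two summary statistics used throughout the Results are simple proportions: the mutation ratio counts mutated plants among regenerated plants, and the mutation frequency counts mutated clones among sequenced clones. A minimal sketch (the helper name `percent` is hypothetical) reproduces the ratios reported for regenerated rice plants:

```python
def percent(n_mutated: int, n_total: int) -> float:
    """Return n_mutated / n_total as a percentage rounded to one decimal."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return round(100 * n_mutated / n_total, 1)


# Mutation ratios for regenerated rice plants, using the counts in the text.
print(percent(6, 7))    # crOsDL-2: 85.7
print(percent(9, 10))   # crOsALS-2: 90.0
```

The same function applies to mutation frequencies when the inputs are clone counts rather than plant counts.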

Off-target mutation analysis of FnCpf1 in rice. Zetsche et al. reported that the seed region of the FnCpf1 crRNA is within the first 5 nt of the 5'-end of the spacer sequence in vitro13. We examined the possibility of FnCpf1-induced mutations at off-target genes in rice. To explore this possibility, we selected the 9-cis-epoxycarotenoid dioxygenase (NCED) gene family (OsNCED1-3), and the aldehyde oxidase (AO) gene family (OsAO1-5)28,29. When the crOsNCED1-1 sequence of the OsNCED1 gene was defined as the target sequence, the corresponding sequence of OsNCED2 has one, and that of OsNCED3 has two, mismatched bases (Table 2). The mutation frequency using FnCpf1 with crOsNCED1-1 in rice calli was 21.4-23.3% in the target gene (OsNCED1), while those in the off-target genes (OsNCED2 and OsNCED3) were 0-6.25% and 0%, respectively (Table 2). Next, crOsAO1-1 was designed to target both the OsAO1 and OsAO2 genes. The crOsAO1-1 has one mismatched base in the corresponding region of the OsAO3 and OsAO4 genes, and two mismatched bases in the OsAO5 gene. When crOsAO1-1 was used with FnCpf1, mutation frequencies in the OsAO1, OsAO2, and OsAO4 genes were 38.8-50%, 24.1-36.6% and 0-5%, respectively (Table 2). No mutations were observed in OsAO3 and OsAO5.

Target gene   crRNA         Gene (on/off target)   PAM   Target sequence            Mutation frequency of calli (%)
OsNCED1       crOsNCED1-1   OsNCED1 (On)           TTC   CCCAAGGCCATTGGGGAGCTCCAT   21.4   23.3
                            OsNCED2 (Off)          TTC   CCCAAGGCCATCGGCGAGCTCCAT   0      6.25
                            OsNCED3 (Off)          TTC   CCCAAGGCCATCGGCGAGCTCCAC   0      0
OsAO1         crOsAO1-1     OsAO1 (On)             TTG   GCAATGCTGTGTCATATGTTAATT   38.8   50
                            OsAO2 (On)             TTG   GCAATGCTGTGTCATATGTTAATT   24.1   36.6
                            OsAO3 (Off)            TTG   GCAATGCTGTTTCATATGTTAATT   0      0
                            OsAO4 (Off)            TTG   GCAATGCTGTTTCATATGTTAATT   <5     <5
                            OsAO5 (Off)            TTG   GCAATGCTGTCTCATATGTGAATT   0      0

Table 2. Off-target mutation analysis in OsNCED or OsAAO gene families in rice. Red characters indicate mismatched nucleotide of off-target genes against each crRNA.
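Because the red highlighting of mismatches in Table 2 does not survive in plain text, the mismatch positions can be recomputed from the sequences. The sketch below does this for the OsAO family against crOsAO1-1; the function name is hypothetical, positions are 1-based from the PAM-proximal end, and the 5-nt seed length follows the in vitro report cited in the text.

```python
def mismatch_positions(spacer: str, site: str) -> list:
    """Return 1-based positions (from the PAM) where site differs from spacer."""
    if len(spacer) != len(site):
        raise ValueError("sequences must have equal length")
    return [i + 1 for i, (a, b) in enumerate(zip(spacer, site)) if a != b]


CR_OSAO1_1 = "GCAATGCTGTGTCATATGTTAATT"   # on-target sequence (Table 2)
SEED_LENGTH = 5                           # seed region reported in vitro
sites = {
    "OsAO2": "GCAATGCTGTGTCATATGTTAATT",
    "OsAO3": "GCAATGCTGTTTCATATGTTAATT",
    "OsAO4": "GCAATGCTGTTTCATATGTTAATT",
    "OsAO5": "GCAATGCTGTCTCATATGTGAATT",
}
for gene, seq in sites.items():
    positions = mismatch_positions(CR_OSAO1_1, seq)
    in_seed = any(p <= SEED_LENGTH for p in positions)
    print(gene, positions, "seed mismatch" if in_seed else "no seed mismatch")
```

For these sequences the single mismatch in OsAO3 and OsAO4 falls at position 11, outside the seed region, and OsAO5 carries mismatches at positions 11 and 20.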

Discussion

In this study, we evaluated the use of FnCpf1 in targeted mutagenesis of rice and tobacco genomes; our data showed clearly that FnCpf1 can be applied to targeted mutagenesis in these crops.

Zetsche et al. evaluated the genome editing activity of various Cpf1 proteins via transient assay in human cells13. Their results revealed that AsCpf1 and LbCpf1 induced mutations more actively than the other Cpf1 enzymes tested, FnCpf1 among them13. In our study, FnCpf1 was able to effectively induce mutations in various target genes upon constitutive expression of FnCpf1 in tobacco and rice. We previously reported that the engineering of binary vectors expressing SpCas9 significantly affected the mutation ratio in rice17. To enhance the expression of FnCpf1 in tobacco and rice, previous studies have introduced a variety of devices, such as codon-optimization of FnCpf1, addition of a nuclear localization signal sequence to FnCpf1, a translational enhancer, a transcriptional terminator and constitutive promoters, to the binary vectors used17,19. It is highly likely that the combination of a stable expression system and these additional modifications contributed to improving the genome editing activity of FnCpf1 in plant cells.

In our targeted mutagenesis experiments, the average mutation frequencies on targeted loci in tobacco and rice were 28.2% and 47.2%, respectively. The average mutation frequency in tobacco was lower than that in rice. In addition, we successfully isolated biallelic mutants in the T0 generation in rice but not in tobacco. This may be due to differences in the transformation processes between rice and tobacco. In the case of rice transformation, dedifferentiated callus showing relatively higher cell division activity was utilized for Agrobacterium infection, and the callus state was maintained until the regeneration step30. On the other hand, the tobacco transformation process started from leaf discs. Transformed leaf discs were subjected to selection on antibiotics. During this process, screening proceeded in parallel with the other processes, including callus induction and regeneration. When we applied SpCas9 to targeted mutagenesis in rice, mutation ratio and frequency increased in accordance with the duration of the callus state31. Therefore, it may be necessary to prolong the duration of the callus state in tobacco transformation in order to isolate biallelic mutants in the T0 generation.

FnCpf1 induced mostly chimeric mutations in tobacco, with various mutations being observed in each regenerated plant (Fig. 2b). On the other hand, in our previous SpCas9 study17,32, each regenerated rice plant possessed a monoallelic or biallelic mutation. This difference may be due to the difference in the transformation process as described above. Rice plants could be regenerated mostly from genetically homogeneous callus, and SSN-induced mutation rarely occurs in regenerated rice plants33, whereas tobacco might accumulate mutation events in going from regeneration to the reproductive stage. As a result, constitutively occurring mutations could create genetically chimeric plants. When targeted mutagenesis was performed with SpCas9 driven by ubiquitously active promoters, such as the 35S or ubiquitin promoter, chimeric mutations occurred similarly in tobacco and Arabidopsis12,34. To address this, several groups have already reported the use of tissue-specific promoters to express SpCas9, e.g., in flower meristem or germ line cells, and have succeeded in reducing chimeric mutation in Arabidopsis34,35. Therefore, application of promoters functioning in an inducible or tissue-specific manner to control the expression of FnCpf1 spatiotemporally could contribute to reducing chimeric mutation.

FnCpf1-induced mutations were mostly deletions. The mutation patterns of FnCpf1 were similar to those of TALENs, ZFNs and paired nickases (Cas9)33,36-38. FnCpf1 generates DNA ends with 5' overhangs; TALENs, ZFNs and paired nickases also generate sticky DNA ends after cleavage of target sequences. It is highly possible that similar DNA repair mechanisms operate after cleavage with these nucleases. DSBs activate DNA repair machinery such as homology-dependent repair (HDR) or non-homologous end joining (NHEJ)39. DSB repair is affected greatly by the structure of the DNA ends39. Cohesive DSBs with compatibility tend to be repaired by precise end joining, while non-compatible DSB ends with various deletions are repaired via NHEJ40. In our experiments, we tried to introduce mutations in a single gene with a single crRNA. Single digestion of a target locus generates a DSB with compatibility, and most such DSBs could be repaired precisely.

Four crRNAs were designed for the NtSTF1 gene, three of which did not induce mutations in the target loci. Two possible explanations can be considered. (1) There could be epigenomic modification or chromatin structure around the target regions, which could decrease the accessibility of FnCpf1 to the target site41,42. (2) The secondary structure of the FnCpf1 crRNA could interfere. FnCpf1 crRNA is 43 nt in length, while the single chimeric guide RNA of Cas9 is around 100 nt9,13. The crRNA of FnCpf1 consists of two parts, which are responsible for scaffold (19 nt) and target recognition (24 nt), respectively13. The interaction between FnCpf1 protein and crRNA could be affected by the secondary structure of the crRNA, which depends strongly on the target sequence part. Chemical modification of the crRNA, or an artificially designed crRNA, may improve the interaction between FnCpf1 and crRNA. Chemical modification of the ribonucleotide in the guide RNA of SpCas9 improved mutation frequency by altering RNA stability and the secondary structure of the guide RNA43,44. Since FnCpf1 has RNase III activity and trims its crRNA by itself45, expression of FnCpf1 crRNA with extra oligonucleotides that help to prevent unfavorable secondary structure of the crRNA may improve the mutation efficiency of FnCpf1, as long as a stable transformation system is used to express FnCpf1 with crRNA in plant cells.

Although off-target mutation by FnCpf1 was found in two of five off-target genes in rice (Table 2), mutation frequencies at off-target genes were lower than those at on-target genes (Table 2). Each crRNA had a one-base mismatch in these two off-target genes, OsNCED2 and OsAO4 (Table 2). A mismatch was found at the 11th nucleotide from the PAM. We also found an off-target mutation in the NtSTF1 gene in the S genome of tobacco. The mismatched nucleotide was found just next to the PAM sequence (Fig. 2b,c). Similar to our results, AsCpf1 exhibited consistent tolerance to a single mismatch at positions 1, 8, 9 and 19-23 within the 24 nt spacer sequence in human cells46. The problem of off-target mutation by FnCpf1 could be mitigated by increasing the fidelity of FnCpf1. In the case of SpCas9, the fidelity of target recognition depends greatly on precise recognition of the PAM47-49; as the PAM sequence becomes longer, fidelity increases. The length of the PAM restricts the number of target sites; as the PAM becomes shorter, the frequency of occurrence of target sequences in the genome increases. Crystal-structure-based rational engineering of the positively charged groove between the HNH-, RuvC- and PAM-interacting domains in SpCas9 improved the fidelity of target recognition50, suggesting that it may be possible to improve the fidelity of FnCpf1 by structure-based engineering without increasing PAM length.

It was shown recently that AsCpf1 only rarely induces off-target mutation during genome editing in human cells and mice46,51,52. AsCpf1 may have higher fidelity than FnCpf1 since the PAM sequences of FnCpf1 and AsCpf1 are TTN and TTTN, respectively. Application of AsCpf1 and LbCpf1 will be the next challenge in expanding the utility of Cpf1 in plant genome editing.

Methods

Vector construction. Two types of FnCpf1 coding sequence were synthesized to optimize codon usage for Arabidopsis thaliana and Oryza sativa, respectively. The coding sequence of each codon-optimized nuclease, FnCpf1 (At) and FnCpf1 (Os), was cloned into the binary vectors pRI201-AN (TaKaRa, Japan) and pPZP200 (ref. 53), respectively. The crRNA of FnCpf1 was placed under the control of the U6-26 promoter from Arabidopsis, or the U6-2 promoter from rice17,19; 24 nt target sequences were inserted into the Bbs I site next to the crRNA. The expression cassette of FnCpf1 crRNA with target sequences was cloned into the site generated by digestion of the binary vector using two restriction enzymes, Asc I and Pac I.

Transformation of tobacco or rice. For tobacco transformation, the binary vector harboring FnCpf1 (At) was introduced into Agrobacterium strain LBA4404. Leaf discs (8 mm diameter) collected from fully expanded leaves of tobacco (Nicotiana tabacum L. cv. Petit Havana SR-1) were used for Agrobacterium-mediated transformation as described in Kaya et al.12. For rice transformation, the Agrobacterium strain EHA105, transformed with the binary vector containing FnCpf1 (Os), was used to infect scutellum-derived rice callus (Oryza sativa L. ssp. japonica cv. Nipponbare). Details of the rice transformation procedure have been described previously32.

CAPS analysis and heteroduplex mobility assay. Genomic DNA was extracted from regenerated shoots of tobacco, hygromycin-resistant rice calli, or regenerated rice plants, using Agencourt Chloro Pure (BECKMAN COULTER, USA), and target loci were amplified by PCR using the primer sets listed in Supplemental Table 1. For cleaved amplified polymorphic sequences (CAPS) analysis, PCR products were digested by the appropriate restriction enzymes, and then analyzed by agarose gel electrophoresis. A heteroduplex mobility assay (HMA) was performed using MultiNA (SHIMADZU, Japan) according to our previous report12.

Sequencing analysis. PCR products used in CAPS analysis or HMA were cloned into pCR-BluntII-TOPO (Thermo Fisher Scientific, USA). DNA sequence was determined using a 3500xL genetic analyzer (Applied Biosystems, USA).

References

1. Lee, J., Chung, J.-H., Kim, H. M., Kim, D.-W. & Kim, H. Designed nucleases for targeted genome editing. Plant Biotechnol. J. 14, 448-462 (20l6).

2. Osakabe, Y. & Osakabe, K. Genome editing with engineered nucleases in plants. Plant Cell Physiol. 56, 389-400 (20l5).

3. Voytas, D. F. Plant genome engineering with sequence-specific nucleases. Annu. Rev. Plant Biol. 64, 327-350 (20l3).

4. Khatodia, S., Bhatotia, K., Passricha, N., Khurana, S. M. P. & Tuteja, N. The CRISPR/Cas Genome-Editing Tool: Application in Improvement of Crops. Front. Plant Sci. 7, 506 (20l6).

5. Kumar, V. & Jain, M. The CRISPR-Cas system for plant genome editing: advances and opportunities. J. Exp. Bot. 66, 47-57 (20l5).

6. Ran, F. A. et al. In vivo genome editing using Staphylococcus aureus Cas9. Nature 520, l86-l9l (20l5).

7. Fonfara, I. et al. Phylogeny of Cas9 determines functional exchangeability of dual-RNA and Cas9 among orthologous type II CRISPR-Cas systems. Nucleic Acids Res. 42, 2577-2590 (20l4).

8. Makarova, K. S. et al. An updated evolutionary classification of CRISPR-Cas systems. Nat. Rev. Microbiol. 13, 722-736 (20l5).

9. Jinek, M. et al. A Programmable Dual-RNA-Guided DNA Endonuclease in Adaptive Bacterial Immunity. Science (80-. ). 337, 8l6-82l (20l2).

10. Hou, Z. et al. Efficient genome engineering in human pluripotent stem cells using Cas9 from Neisseria meningitidis. Proc. Natl. Acad. Sci. U. S. A. 110, 15644-15649 (2013).

11. Steinert, J., Schiml, S., Fauser, F. & Puchta, H. Highly efficient heritable plant genome engineering using Cas9 orthologues from Streptococcus thermophilus and Staphylococcus aureus. Plant J. 84, 1295-1305 (2015).

12. Kaya, H., Mikami, M., Endo, A., Endo, M. & Toki, S. Highly specific targeted mutagenesis in plants using Staphylococcus aureus Cas9. Sci. Rep. 6, 26871 (2016).

13. Zetsche, B. et al. Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR-Cas system. Cell 163, 759-771 (2015).

14. Budman, J. & Chu, G. Processing of DNA for nonhomologous end-joining by cell-free extract. EMBO J. 24, 849-860 (2005).

15. Altpeter, F. et al. Advancing Crop Transformation in the Era of Genome Editing. Plant Cell 28, 1510-1520 (2016).

16. Schaeffer, S. M. & Nakata, P. A. CRISPR/Cas9-mediated genome editing and gene replacement in plants: Transitioning from lab to field. Plant Sci. 240, 130-142 (2015).

17. Mikami, M., Toki, S. & Endo, M. Comparison of CRISPR/Cas9 expression constructs for efficient targeted mutagenesis in rice. Plant Mol. Biol. 88, 561-572 (2015).

18. Kawalleck, P., Somssich, I. E., Feldbrugge, M., Hahlbrock, K. & Weisshaar, B. Polyubiquitin gene expression and structural properties of the ubi4-2 gene in Petroselinum crispum. Plant Mol. Biol. 21, 673-684 (1993).

19. Fauser, F., Schiml, S. & Puchta, H. Both CRISPR/Cas-based nucleases and nickases can be used efficiently for genome engineering in Arabidopsis thaliana. Plant J. 79, 348-359 (2014).

20. Toki, S. et al. Expression of a Maize Ubiquitin Gene Promoter-bar Chimeric Gene in Transgenic Rice Plants. Plant Physiol. 100, 1503-1507 (1992).

21. Norris, S. R., Barrette, T. R. & DellaPenna, D. Genetic dissection of carotenoid synthesis in Arabidopsis defines plastoquinone as an essential component of phytoene desaturation. Plant Cell 7, 2139-2149 (1995).

22. McHale, N. A. & Marcotrigiano, M. LAM1 is required for dorsoventrality and lateral growth of the leaf blade in Nicotiana. Development 125, 4235-4243 (1998).

23. Tadege, M. et al. STENOFOLIA regulates blade outgrowth and leaf vascular patterning in Medicago truncatula and Nicotiana sylvestris. Plant Cell 23, 2125-2142 (2011).

24. Nagasawa, N. et al. SUPERWOMAN1 and DROOPING LEAF genes control floral organ identity in rice. Development 130, 705-718 (2003).

25. Yamaguchi, T. et al. The YABBY gene DROOPING LEAF regulates carpel specification and midrib development in Oryza sativa. Plant Cell 16, 500-509 (2004).

26. McCourt, J. A. & Duggleby, R. G. Acetohydroxyacid synthase and its role in the biosynthetic pathway for branched-chain amino acids. Amino Acids 31, 173-210 (2006).

27. Tan, S., Evans, R. R., Dahmer, M. L., Singh, B. K. & Shaner, D. L. Imidazolinone-tolerant crops: history, current status and future. Pest Manag. Sci. 61, 246-257 (2005).

28. Tan, B. C. et al. Molecular characterization of the Arabidopsis 9-cis epoxycarotenoid dioxygenase gene family. Plant J. 35, 44-56 (2003).

29. Hirano, K. et al. Comprehensive transcriptome analysis of phytohormone biosynthesis and signaling genes in microspore/pollen and tapetum of rice. Plant Cell Physiol. 49, 1429-1450 (2008).

30. Toki, S. et al. Early infection of scutellum tissue with Agrobacterium allows high-speed transformation of rice. Plant J. 47, 969-976 (2006).

31. Mikami, M., Toki, S. & Endo, M. Parameters affecting frequency of CRISPR/Cas9 mediated targeted mutagenesis in rice. Plant Cell Rep. 34, 1807-1815 (2015).

32. Endo, M., Mikami, M. & Toki, S. Multigene knockout utilizing off-target mutations of the CRISPR/Cas9 system in rice. Plant Cell Physiol. 56, 41-47 (2015).

33. Nishizawa-Yokoi, A. et al. A Defect in DNA Ligase4 Enhances the Frequency of TALEN-Mediated Targeted Mutagenesis in Rice. Plant Physiol. 170, 653-666 (2016).

34. Mao, Y. et al. Development of germ-line-specific CRISPR-Cas9 systems to improve the production of heritable gene modifications in Arabidopsis. Plant Biotechnol. J. 14, 519-532 (2016).

35. Hyun, Y. et al. Site-directed mutagenesis in Arabidopsis thaliana using dividing tissue-targeted RGEN of the CRISPR/Cas system to generate heritable null alleles. Planta 241, 271-284 (2015).

36. Osakabe, K., Osakabe, Y. & Toki, S. Site-directed mutagenesis in Arabidopsis using custom-designed zinc finger nucleases. Proc. Natl. Acad. Sci. 107, 12034-12039 (2010).

37. Mikami, M., Toki, S. & Endo, M. Precision Targeted Mutagenesis via Cas9 Paired Nickases in Rice. Plant Cell Physiol. 57, 1058-1068 (2016).

38. Kim, Y., Kweon, J. & Kim, J.-S. TALENs and ZFNs are associated with different mutation signatures. Nat. Methods 10, 185 (2013).

39. Lieber, M. R. The mechanism of double-strand DNA break repair by the nonhomologous DNA end-joining pathway. Annu. Rev. Biochem. 79, 181-211 (2010).

40. Budman, J., Kim, S. A. & Chu, G. Processing of DNA for nonhomologous end-joining is controlled by kinase activity and XRCC4/ligase IV. J. Biol. Chem. 282, 11950-11959 (2007).

41. Chen, X. et al. Probing the impact of chromatin conformation on genome editing tools. Nucleic Acids Res. 44, 6482-6492 (2016).

42. Hilton, I. B. et al. Epigenome editing by a CRISPR-Cas9-based acetyltransferase activates genes from promoters and enhancers. Nat. Biotechnol. 33, 510-517 (2015).

43. Hendel, A. et al. Chemically modified guide RNAs enhance CRISPR-Cas genome editing in human primary cells. Nat. Biotechnol. 33, 985-989 (2015).

44. Doench, J. G. et al. Rational design of highly active sgRNAs for CRISPR-Cas9-mediated gene inactivation. Nat. Biotechnol. 32, 1262-1267 (2014).

45. Fonfara, I., Richter, H., Bratovic, M., Le Rhun, A. & Charpentier, E. The CRISPR-associated DNA-cleaving enzyme Cpf1 also processes precursor CRISPR RNA. Nature 532, 517-521 (2016).

46. Kleinstiver, B. P. et al. Genome-wide specificities of CRISPR-Cas Cpf1 nucleases in human cells. Nat. Biotechnol. 34, 869-874 (2016).

47. Kleinstiver, B. P. et al. Engineered CRISPR-Cas9 nucleases with altered PAM specificities. Nature 523, 481-485 (2015).

48. Hirano, S., Nishimasu, H., Ishitani, R. & Nureki, O. Structural Basis for the Altered PAM Specificities of Engineered CRISPR-Cas9. Mol. Cell 61, 886-894 (2016).

49. Anders, C., Bargsten, K. & Jinek, M. Structural Plasticity of PAM Recognition by Engineered Variants of the RNA-Guided Endonuclease Cas9. Mol. Cell 61, 895-902 (2016).

50. Slaymaker, I. M. et al. Rationally engineered Cas9 nucleases with improved specificity. Science 351, 84-88 (2016).

51. Kim, D. et al. Genome-wide analysis reveals specificities of Cpf1 endonucleases in human cells. Nat. Biotechnol. 34, 863-868 (2016).

52. Kim, Y. et al. Generation of knockout mice by Cpf1-mediated gene targeting. Nat. Biotechnol. 34, 808-810 (2016).

53. Hajdukiewicz, P., Svab, Z. & Maliga, P. The small, versatile pPZP family of Agrobacterium binary vectors for plant transformation. Plant Mol. Biol. 25, 989-994 (1994).

Acknowledgements

We thank Drs. M. Endo, A. Nishizawa-Yokoi, H. Saika, S. Hirose, K. Abe and N. Ohtsuki for valuable discussions and suggestions, and K. Amagai, R. Aoto, C. Furusawa, A. Mori, A. Nagashii, A. Nakano, F. Suzuki and R. Takahashi for general support. This work was supported by the Cabinet Office, Government of Japan, through the Cross-ministerial Strategic Innovation Promotion Program (SIP).

Author Contributions

A.E., M.M. and S.T. designed the experiments. A.E. and M.M. performed the experiments and wrote the main manuscript. All authors reviewed the manuscript.

Additional Information

Supplementary information accompanies this paper at http://www.nature.com/srep

Competing financial interests: The authors declare no competing financial interests.

How to cite this article: Endo, A. et al. Efficient targeted mutagenesis of rice and tobacco genomes using Cpf1 from Francisella novicida. Sci. Rep. 6, 38169; doi: 10.1038/srep38169 (2016).

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

© The Author(s) 2016