Abstract
Soybean [Glycine max (L.) Merr.] is an important protein source for both humans and animals. Its relatively low cost, combined with its excellent nutritive value, has made soybean the world's dominant protein feed ingredient. However, soybean protein is relatively poor in sulfur-containing essential amino acids (SCEAA), especially methionine (Met). Met is central to protein synthesis: it is encoded by the start codon that initiates translation and, hence, is essential in all living organisms, including plants. Met is the most limiting amino acid, and roughly US$100 million is spent annually by poultry and swine producers to supplement animal feed with Met. Leached Met supplements are degraded by bacteria into undesirable volatile sulfides, which can have negative effects on the environment. Hence, a goal of soybean research has been to improve the quality of soy protein by increasing Met levels to create more complete, high-quality food and feed items. Although a variety of attempts have been made, these efforts have largely failed, with little or no increase in soybean seed Met levels, suggesting a need for new strategies. The low abundance of Met codons in seed storage protein (SSP) genes and Met catabolism (degradation) are major factors that limit the production of total Met in seeds. In this dissertation, a 'push' and 'pull' strategy was used. Push refers to efforts to increase the pool of free Met (FM) available for incorporation into soybean SSPs by blocking Met catabolism. Pull refers to efforts to increase the levels of SSPs rich in Met codons by knocking out the soybean β-conglycinin genes (Gm7s), which encode the relatively Met-poor 7S SSPs. Through protein rebalancing, the lack of 7S proteins can be compensated for by increased production of the relatively Met-rich 11S proteins.
These efforts made broad use of CRISPR/Cas9 gene editing tools to knock out the genes for methionine γ-lyase (MGL), a Met catabolic enzyme, and for the 7S SSPs. Consistent with newly emerging literature, a positive connection between high Met content and the synthesis of other amino acids was observed in the generated mutant genotypes. The initial milestone of increasing overall amino acid content in soybean was achieved, as gene-edited mutant lines showed higher 11S protein and higher Met levels. The exact relationship between free amino acids (FAA) and protein-bound amino acids (PBAA), particularly in soybean, is an open question. Moreover, predicting total free amino acids (TFAA) and total protein-bound amino acids (TPBAA) from individual amino acid (AA) metabolic data is critical for planning AA biofortification, especially when designing CRISPR/Cas9 edits in which multiple genes or pathways can be targeted. Machine learning (ML) algorithms are particularly useful for studying complex biological systems, as they can efficiently capture non-linear relationships and complex interactions among the driving variables. ML predictive models for TFAA and TPBAA were developed. The TFAA model had an R² of 0.86, with FAA such as arginine, asparagine, and isoleucine showing the highest importance in TFAA predictions. The TPBAA model had an R² of 0.95, with PBAA such as Asx (i.e., the combined aspartate and asparagine measured after hydrolysis), leucine, and alanine showing the highest importance in TPBAA predictions. Mathematical equations were generated to describe the relationship of TPBAA with TFAA (TPBAA = B0 + B1 × TFAA) and of protein-bound Met (PBM) with FM (PBM = B0 + B1 × FM), where B1 is the coefficient (slope) and B0 is the intercept of each equation. In addition, an ML classification model that differentiates mutants from controls based on AA metabolomic data was developed, with an accuracy of 1.0 and a robust classification report.
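As an illustration of the linear form TPBAA = B0 + B1 × TFAA, the following minimal sketch fits an intercept and slope by ordinary least squares and reports R². It is not the modeling pipeline used in this dissertation: the function name `fit_line` and the numeric values are hypothetical and serve only to show how B0, B1, and R² relate to paired TFAA and TPBAA measurements.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = b0 + b1 * x; returns (b0, b1, r2)."""
    x_bar, y_bar = mean(x), mean(y)
    # Sum of squares of x and cross-product of x and y about their means
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx            # slope (B1)
    b0 = y_bar - b1 * x_bar   # intercept (B0)
    # Coefficient of determination: 1 - residual SS / total SS
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return b0, b1, r2


# Hypothetical values for illustration only (not data from this work)
tfaa = [1.0, 2.0, 3.0, 4.0, 5.0]
tpbaa = [30.5, 32.4, 34.6, 36.5, 38.5]

b0, b1, r2 = fit_line(tfaa, tpbaa)
print(f"TPBAA = {b0:.2f} + {b1:.2f} * TFAA (R^2 = {r2:.3f})")
```

The same function could be applied to paired FM and PBM values to obtain the coefficients of PBM = B0 + B1 × FM.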
The results presented here show that the dual-gRNA CRISPR/Cas9 system offers a rapid and highly efficient genetic tool for knocking out multiple genes simultaneously. Knockout mutations in three GmMGL genes (GmMGL1, GmMGL2, and GmMGL3) were created simultaneously and, as predicted, the resulting soybean genotypes were 'pushed' toward increased FM content. Simultaneous knockout mutations in the 7S genes were also generated to produce protein-rebalanced soybean genotypes. Furthermore, ML predictive models were developed by mining AA metabolomic data; these models can aid in planning soybean AA biofortification experiments, especially those using CRISPR/Cas9 systems in which multiple genes or pathways can be targeted simultaneously.