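Machine learning
Artificial intelligence
Leverage (statistics)
Computer science
Benchmarking
Deep learning
Domain (mathematics)
Pure mathematics
Mathematics
Business
Marketing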
Authors
Daniel J. Diaz, Anastasiya V. Kulikova, Andrew D. Ellington, Claus O. Wilke
Identifier
DOI:10.1016/j.sbi.2022.102518
Abstract
Machine and deep learning approaches can leverage the increasingly available massive datasets of protein sequences, structures, and mutational effects to predict variants with improved fitness. Many different approaches are being developed, but systematic benchmarking studies indicate that, although the specifics of the machine learning algorithms matter, the more important constraint is the availability and quality of the data used during training. In cases where little experimental data are available, models pre-trained on generic protein datasets in an unsupervised or self-supervised manner can still perform well after subsequent refinement via hybrid or transfer learning approaches. Overall, recent progress in this field has been staggering, and machine learning approaches will likely play a major role in future breakthroughs in protein biochemistry and engineering.