Keywords
Curse of dimensionality
Foundation (evidence)
Computer science
Transcriptome
Proportion (ratio)
Artificial intelligence
Computational biology
Machine learning
Gene expression
Gene
Biology
Biochemistry
Physics
Archaeology
Quantum mechanics
History
Authors
Minsheng Hao, Jianya Gong, Xin Zeng, Chi-Ming Liu, Yucheng Guo, Xingyi Cheng, Taifeng Wang, Jianzhu Ma, Le Song, Xuegong Zhang
Identifier
DOI: 10.1101/2023.05.29.542705
Abstract
Large-scale pretrained models have become foundation models, leading to breakthroughs in natural language processing and related fields. Developing foundation models in the life sciences for deciphering the “languages” of cells and facilitating biomedical research is promising yet challenging. We developed a large-scale pretrained model, scFoundation, with 100M parameters for this purpose. scFoundation was trained on over 50 million human single-cell transcriptomics profiles, which contain high-throughput observations of the complex molecular features in all known types of cells. scFoundation is currently the largest such model in terms of trainable parameter count, gene dimensionality, and the number of cells used in pre-training. Experiments showed that scFoundation can serve as a foundation model for single-cell transcriptomics, achieving state-of-the-art performance on a diverse array of downstream tasks, such as gene expression enhancement, tissue drug response prediction, single-cell drug response classification, and single-cell perturbation prediction.
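The abstract describes the foundation-model pattern: a large pretrained encoder maps each cell's expression profile to an embedding, and lightweight models are then fit on those frozen embeddings for downstream tasks such as drug response classification. The sketch below illustrates that workflow only in outline; it is not scFoundation's actual API. The `embed_cells` function, the 64-dimensional embedding size, and the toy data are all hypothetical placeholders standing in for the real pretrained model.

```python
# Minimal sketch of the "pretrained encoder + downstream classifier"
# pattern described in the abstract. Assumptions: embed_cells() is a
# stand-in for the real pretrained model; data, shapes, and the
# embedding dimension are illustrative, not from the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Toy expression matrix: 200 cells x 1,000 genes (real data span ~19k genes).
X = rng.poisson(lam=1.0, size=(200, 1000)).astype(np.float32)
y = rng.integers(0, 2, size=200)  # hypothetical binary drug-response labels

def embed_cells(expr: np.ndarray) -> np.ndarray:
    """Placeholder for the pretrained encoder: maps each cell's raw
    expression profile to a fixed-size embedding. A real foundation
    model would replace this random projection."""
    proj = rng.standard_normal((expr.shape[1], 64)).astype(np.float32)
    return np.log1p(expr) @ proj  # log-normalize, then project

Z = embed_cells(X)  # (200, 64) frozen cell embeddings

# Fit a lightweight downstream classifier on the frozen embeddings.
clf = LogisticRegression(max_iter=1000)
print(cross_val_score(clf, Z, y, cv=5).mean())
```

The key design point this illustrates is that the expensive pretraining is done once; each downstream task then only requires training a small model on the fixed embeddings, which is what makes a single pretrained model reusable across tasks like expression enhancement and perturbation prediction.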