Supervised learning of high-confidence phenotypic subpopulations from single-cell data

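Keywords: Phenotype; Computational biology; Feature selection; Dimensionality reduction; Computer science; Scalability; Categorical variable; Biology; Machine learning; Artificial intelligence; Gene; Genetics; Database
Authors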
Tao Ren,Canping Chen,Alexey V. Danilov,Susan Liu,Xiangnan Guan,Shunyi Du,Xiwei Wu,Mara H. Sherman,Paul T. Spellman,Lisa M. Coussens,Andrew Adey,Gordon B. Mills,Ling‐Yun Wu,Zheng Xia
Source
Journal: Nature Machine Intelligence [Nature Portfolio]
Volume (issue): 5 (5): 528-541 Citations: 7
Identifier
DOI:10.1038/s42256-023-00656-y
Abstract

Accurately identifying phenotype-relevant cell subsets from heterogeneous cell populations is crucial for delineating the underlying mechanisms driving biological or clinical phenotypes. Here by deploying a Learning with Rejection strategy, we developed a novel supervised learning framework called PENCIL to identify subpopulations associated with categorical or continuous phenotypes from single-cell data. By embedding a feature selection function into this flexible framework, for the first time, we were able to simultaneously select informative features and identify cell subpopulations, enabling accurate identification of phenotypic subpopulations otherwise missed by methods incapable of concurrent gene selection. Furthermore, the regression mode of PENCIL presents a novel ability for supervised phenotypic trajectory learning of subpopulations from single-cell data. We conducted comprehensive simulations to evaluate PENCIL's versatility in simultaneous gene selection, subpopulation identification and phenotypic trajectory prediction. PENCIL is fast and scalable to analyse one million cells within 1 h. Using the classification mode, PENCIL detected T-cell subpopulations associated with melanoma immunotherapy outcomes. Moreover, when applied to single-cell RNA sequencing of a patient with mantle cell lymphoma with drug treatment across multiple timepoints, the regression mode of PENCIL revealed a transcriptional treatment response trajectory. Collectively, our work introduces a scalable and flexible infrastructure to accurately identify phenotype-associated subpopulations from single-cell data.

To detect phenotype-related cell subpopulations from single-cell data, appropriate feature sets need to be chosen or learned simultaneously. Ren et al. present here a tool based on Learning with Rejection, a method that during training learns features from cells that can be predicted with high confidence, while cells that the model is not yet certain about are rejected.
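
The idea of rejecting low-confidence cells can be illustrated with the generic reject-option loss from the learning-with-rejection literature: an accepted cell costs 1 if it is misclassified, and a rejected cell costs a fixed amount. The sketch below is an assumption-laden illustration of that generic loss only; it is not PENCIL's implementation, and the function name `rejection_loss`, the cost value and the toy scores are all hypothetical.

```python
from typing import Sequence


def rejection_loss(
    scores: Sequence[float],
    reject_scores: Sequence[float],
    labels: Sequence[int],
    cost: float = 0.3,
) -> float:
    """Mean 0-1 loss with a reject option (generic form, not PENCIL's loss).

    A cell is accepted when its reject score is > 0. An accepted cell
    contributes 1 if misclassified (label * score <= 0) and 0 otherwise;
    a rejected cell contributes the fixed rejection cost.
    Labels are expected to be -1 or +1.
    """
    total = 0.0
    for h, r, y in zip(scores, reject_scores, labels):
        if r > 0:
            total += 1.0 if y * h <= 0 else 0.0
        else:
            total += cost
    return total / len(labels)


# Hypothetical toy data: four cells, two confidently predicted, two ambiguous.
scores = [2.0, -1.5, 0.2, -0.1]          # classifier outputs
reject_scores = [1.0, 0.8, -0.5, -0.3]   # > 0 means "keep this cell"
labels = [1, -1, -1, 1]                  # phenotype labels in {-1, +1}

loss = rejection_loss(scores, reject_scores, labels, cost=0.3)
loss_without_rejection = rejection_loss(scores, [1.0] * 4, labels, cost=0.3)
```

In this toy setting the two ambiguous cells are misclassified if they are kept, so rejecting them at a cost below 1 gives a lower mean loss than accepting every cell.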