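Feature selection
Consistency
Disease
Inference
Biomarker
Feature (linguistics)
Computer science
Oncology
Biomarker discovery
Medicine
Machine learning
Artificial intelligence
Internal medicine
Biology
Proteomics
Biochemistry
Linguistics
Philosophy
Gene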
Authors
Mengjiao Peng, Liming Xiang
Abstract
The increased availability of ultrahigh-dimensional biomarker data and the high demand for identifying biomarkers strongly related to survival outcomes have made feature screening methods commonplace in the analysis of cancer genome data. When survival outcomes include the endpoints of overall survival (OS) and time-to-progression (TTP), a high concordance between the two endpoints is typically found in cancer studies; namely, patients' OS is most likely to be extended when tumour progression is delayed. Existing screening procedures are often performed on a single survival endpoint only and, because they ignore disease progression, may result in biased selection of features for OS. We propose a novel feature screening method that incorporates TTP information into the selection of important biomarker predictors, for more accurate inference of OS subsequent to disease progression. The proposal is based on the rank of correlation between individual features and the conditional distribution of OS given observations of TTP. Its advantages are its flexibility, as it requires no marginal model assumption for either endpoint, and its minimal computational cost. Theoretical results show its ranking consistency, sure screening and false rate control properties. Simulation results demonstrate that the proposed screener leads to more accurate feature selection than a method that does not consider the prior observations of disease progression. An application to breast cancer genome data illustrates its practical utility and facilitates disease classification using the selected biomarker predictors.
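To make the screening idea concrete, the following is a minimal sketch of a generic rank-correlation screener. It is not the authors' statistic. It assumes fully observed (uncensored) OS and TTP times, estimates the conditional distribution of OS given TTP with a Gaussian-kernel smoother, and ranks features by the absolute Spearman-type correlation with that estimate. The function names (`conditional_cdf_score`, `screen_features`), the bandwidth rule and the `keep` parameter are all illustrative assumptions.

```python
import numpy as np


def rank_transform(a):
    """Ranks 0..n-1 along axis 0 (ties broken by order; fine for continuous data)."""
    return np.argsort(np.argsort(a, axis=0), axis=0).astype(float)


def conditional_cdf_score(os_time, ttp_time, bandwidth):
    """Kernel estimate of P(OS <= os_i | TTP = ttp_i) for each subject i.

    Assumption: no censoring. A real survival analysis would need to
    account for censored OS and TTP observations.
    """
    diff = (ttp_time[:, None] - ttp_time[None, :]) / bandwidth
    weights = np.exp(-0.5 * diff ** 2)  # weights[i, k]: closeness of ttp_k to ttp_i
    indicator = (os_time[None, :] <= os_time[:, None]).astype(float)  # 1(os_k <= os_i)
    return (weights * indicator).sum(axis=1) / weights.sum(axis=1)


def screen_features(X, os_time, ttp_time, keep, bandwidth=None):
    """Rank the columns of X by |rank correlation| with the conditional CDF score.

    Returns the indices of the `keep` top-ranked features and all utilities.
    """
    n = len(ttp_time)
    if bandwidth is None:
        # Rule-of-thumb bandwidth; an illustrative choice, not from the paper.
        bandwidth = 1.06 * ttp_time.std() * n ** (-1 / 5)
    score = conditional_cdf_score(os_time, ttp_time, bandwidth)

    # Pearson correlation of the ranks (a Spearman-type correlation).
    rx = rank_transform(X)
    rs = rank_transform(score)
    rx -= rx.mean(axis=0)
    rs -= rs.mean()
    corr = (rx * rs[:, None]).sum(axis=0) / np.sqrt(
        (rx ** 2).sum(axis=0) * (rs ** 2).sum()
    )

    utility = np.abs(corr)
    order = np.argsort(-utility)  # largest utility first
    return order[:keep], utility
```

Called as `screen_features(X, os_time, ttp_time, keep=d)` with an `n x p` feature matrix and two length-`n` arrays of event times, the sketch costs one pass over the features after an `n x n` kernel computation. This is in keeping with the low computational cost that marginal screening methods aim for, although the procedure described in the abstract may differ in every detail.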