A Conceptual Framework of Data Readiness for the Health and Aging Brain Study‐Health Disparities

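Keywords: Missing data; Imputation (statistics); Computer science; Outlier; Support vector machine; Data quality; Feature engineering; Feature (linguistics); Machine learning; Data mining; Data science; Artificial intelligence; Engineering; Deep learning; Linguistics; Operations management; Philosophy; Metric (unit)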
Authors
Fan Zhang, Melissa Petersen, Leigh Johnson, James Hall, Sid E. O’Bryant
Source
Journal: Alzheimer's & Dementia [Wiley]
Volume (Issue): 19 (S15)
Identifier
DOI: 10.1002/alz.079959
Abstract

Background: The Health and Aging Brain Study: Health Disparities (HABS‐HD) seeks to understand the biological, social and environmental factors that impact brain aging among diverse communities. Like many other NIH‐funded data‐sharing projects, HABS‐HD holds important data assets for various uses, including social, environmental and behavioral data, and has multiple data flow pathways. Machine learning (ML) develops algorithms and models that improve over time, but data quality and readiness must be determined for these models to operate efficiently. Developing a data readiness reporting methodology has therefore become an urgent task for HABS‐HD.

Method: In this study, we developed a conceptual framework of data readiness. First, we analyzed the missing data percentage and used ML‐Based Multiple Imputation (MLMI) for missing data imputation. Then, we performed SVM based on Recursive Feature Elimination and Cross Validation (SVM‐RFE‐CV) for feature elimination and outlier removal. Lastly, we rated data readiness on three metrics: missing data percentage, performance before feature engineering, and performance after feature engineering. The three scores were averaged to rate the overall readiness of the data.

Result: A framework for calculating an overall average data readiness score was presented (1 stands for completely accessible, 0 for not accessible at all, and 0.5 for neutral). Our results show that the data readiness framework was straightforward and useful in assessing how ready the HABS‐HD data is for ML.

Conclusion: Systematic analysis of data readiness before building ML models is of utmost importance, and it has a significant impact on biomarker discovery and disease prediction applications for Alzheimer’s disease. The conceptual framework of data readiness works well for our Alzheimer’s disease models in HABS‐HD. It can also be applied to data readiness reporting for other diseases.
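
The averaging step described in the Method section can be illustrated with a minimal sketch. The abstract does not say how the missing data percentage is converted into a score, so the version below assumes it is scored as completeness (1 minus the missing fraction) and that both performance values are already on a 0 to 1 scale (for example, accuracy). The function names `missing_fraction` and `readiness_score` and the sample values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def missing_fraction(rows):
    """Return the fraction of cells that are None in a list of rows."""
    total = sum(len(row) for row in rows)
    if total == 0:
        raise ValueError("no cells to inspect")
    missing = sum(1 for row in rows for value in row if value is None)
    return missing / total


def readiness_score(missing_frac, perf_before, perf_after):
    """Average three scores in [0, 1] into one overall readiness score.

    Assumption (not stated in the abstract): the missing-data metric is
    scored as completeness, i.e. 1 - missing fraction, so that a higher
    value always means "more ready".
    """
    inputs = {
        "missing_frac": missing_frac,
        "perf_before": perf_before,
        "perf_after": perf_after,
    }
    for name, value in inputs.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    completeness = 1.0 - missing_frac
    return mean([completeness, perf_before, perf_after])


# Hypothetical toy table: 4 rows x 3 columns, 3 cells missing.
rows = [
    [1.0, None, 3.0],
    [4.0, 5.0, None],
    [7.0, 8.0, 9.0],
    [None, 2.0, 6.0],
]

frac = missing_fraction(rows)
# 0.70 and 0.85 are made-up performance values, before and after
# feature engineering, used only to exercise the function.
score = readiness_score(frac, perf_before=0.70, perf_after=0.85)
print(f"missing fraction: {frac:.2f}, readiness score: {score:.3f}")
```

In a fuller pipeline the two performance values would come from evaluating a model before and after the imputation and SVM‐RFE‐CV steps; those steps are left out here to keep the sketch dependency-free.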