AI-driven synthetic data generation for accelerating hepatology research: A study of the United Network for Organ Sharing (UNOS) database

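United Network for Organ Sharing; Hepatology; Replication; Medicine; Data sharing; Database; Computer science; Synthetic data; Internal medicine; Liver transplantation; Data mining; Statistics; Artificial intelligence; Transplantation; Mathematics; Pathology; Alternative medicine
Authors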
Joseph Ahn,Yung‐Kyun Noh,Mingzhao Hu,Xiaotong Shen,Douglas A. Simonetto,Patrick S. Kamath,Rohit S. Loomba,Vijay H. Shah
Source
Journal: Hepatology [Wiley]
Citations: 1
Identifier
DOI:10.1097/hep.0000000000001299
Abstract

Background and Aims: Clinical hepatology research often faces limited data availability, underrepresentation of minority groups, and complex data-sharing regulations. Synthetic data, meaning artificially generated patient records designed to mirror real-world distributions, offers a potential solution. We hypothesized that diffusion models, a state-of-the-art generative technique, could produce synthetic liver transplant waitlist data from the United Network for Organ Sharing (UNOS) database that maintains statistical fidelity, replicates clinical correlations and survival patterns, and ensures robust privacy protection.

Methods: Diffusion models were used to generate synthetic patient cohorts mirroring the UNOS liver transplant waitlist database between 2019 and 2023. Statistical fidelity was assessed using Maximum Mean Discrepancy (MMD), Wasserstein distance, correlation analysis, and variable-level metrics. Clinical utility was evaluated by comparing transplant-free survival via Kaplan-Meier curves and the performance of the MELD score. Privacy was quantified using the Distance to Closest Record (DCR) and attribute disclosure risk assessments.

Results: The synthetic dataset was nearly indistinguishable from the original dataset (MMD=0.002, standardized Wasserstein distance<1.0), preserving clinically relevant correlations and survival patterns as evidenced by similar median survival times (110 vs. 101 days) and 5-year survival rates (22.2% vs. 22.8%). MELD-based 90-day mortality prediction was maintained (original AUC=0.839 vs. synthetic AUC=0.844). Privacy metrics indicated no identifiable patient matches, and mean DCR values ensured that synthetic individuals were not direct replicas of real patients.

Conclusion: AI-generated synthetic data derived from diffusion models can faithfully replicate complex hepatology datasets, maintain key clinical signals, and ensure strong privacy safeguards. This approach can help address data scarcity, enhance model generalizability, foster multi-institutional collaboration, and accelerate progress in hepatology research.
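
Two of the metrics named in the Methods, MMD and DCR, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the kernel, the distance, or any preprocessing, so the RBF kernel, the `gamma` bandwidth, the Euclidean distance, the function names, and the random toy data below are all assumptions chosen for illustration.

```python
import numpy as np


def rbf_mmd2(x, y, gamma=1.0):
    """Biased estimate of squared MMD between two samples.

    Assumption: an RBF kernel k(a, b) = exp(-gamma * ||a - b||^2);
    the abstract does not say which kernel was used.
    """
    def kernel(a, b):
        # Pairwise squared Euclidean distances between rows of a and b.
        sq = (
            np.sum(a ** 2, axis=1)[:, None]
            + np.sum(b ** 2, axis=1)[None, :]
            - 2.0 * a @ b.T
        )
        return np.exp(-gamma * np.maximum(sq, 0.0))

    return kernel(x, x).mean() + kernel(y, y).mean() - 2.0 * kernel(x, y).mean()


def distance_to_closest_record(synthetic, real):
    """Distance from each synthetic row to its nearest real row.

    Assumption: plain Euclidean distance on numeric features.
    """
    diff = synthetic[:, None, :] - real[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)


# Hypothetical toy data standing in for real and synthetic cohorts.
rng = np.random.default_rng(0)
real = rng.normal(size=(200, 3))
synthetic = rng.normal(size=(200, 3))  # drawn from the same distribution
shifted = real + 3.0                   # deliberately different distribution

mmd_same = rbf_mmd2(real, synthetic)
mmd_shift = rbf_mmd2(real, shifted)
dcr = distance_to_closest_record(synthetic, real)

print(f"MMD^2, same distribution:    {mmd_same:.4f}")
print(f"MMD^2, shifted distribution: {mmd_shift:.4f}")
print(f"Mean DCR: {dcr.mean():.4f}, minimum DCR: {dcr.min():.4f}")
```

In this sketch, a small MMD indicates that the two samples are hard to tell apart, while a DCR of zero would flag a synthetic row that exactly copies a real one. Real patient tables mix categorical and numeric variables, so an actual analysis would need an encoding and scaling step that is not shown here.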