Blockchain
Accountability
Supply chain
Business
Chain (unit)
Computer science
Computer security
Marketing
Political science
Law
Physics
Astronomy
Authors
Yue Liu,Dawen Zhang,Boming Xia,Julia Anticev,Tunde Adebayo,Zhenchang Xing,Moses Machao
Source
Journal: Cornell University - arXiv
Date: 2024-08-16
Identifier
DOI:10.48550/arxiv.2408.08536
Abstract
In the era of advanced artificial intelligence, highlighted by large-scale generative models like GPT-4, ensuring the traceability, verifiability, and reproducibility of datasets throughout their lifecycle is paramount for research institutions and technology companies. These organisations increasingly rely on vast corpora to train and fine-tune advanced AI models, resulting in intricate data supply chains that demand effective data governance mechanisms. The challenge intensifies because diverse stakeholders may use assorted tools, often without adequate measures to ensure the accountability of data and the reliability of outcomes. In this study, we adapt the concept of the "Software Bill of Materials" to the field of data governance and management to address these challenges, and introduce the "Data Bill of Materials" (DataBOM), which captures the dependency relationships between datasets and stakeholders by storing specific metadata. We demonstrate a platform architecture for providing blockchain-based DataBOM services, present the interaction protocol for stakeholders, and discuss the minimal requirements for DataBOM metadata. The proposed solution is evaluated for feasibility via a case study and for performance via quantitative analysis.
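To make the idea of capturing dataset dependencies through stored metadata concrete, the following is a minimal Python sketch. It is not the paper's schema or platform: the abstract does not list the DataBOM fields, so `dataset_id`, `content_hash`, `stakeholder`, and `parents` are hypothetical names, and a plain in-memory dictionary stands in for the blockchain ledger.

```python
import hashlib
import json
from dataclasses import dataclass


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class DataBOMRecord:
    # All field names are illustrative assumptions, not the paper's schema.
    dataset_id: str        # human-readable dataset name
    content_hash: str      # digest of the dataset contents
    stakeholder: str       # party accountable for this dataset version
    parents: tuple = ()    # record hashes of the datasets this one derives from

    def record_hash(self) -> str:
        """Digest over a canonical JSON encoding of the metadata."""
        payload = json.dumps(
            {
                "dataset_id": self.dataset_id,
                "content_hash": self.content_hash,
                "stakeholder": self.stakeholder,
                "parents": list(self.parents),
            },
            sort_keys=True,
            separators=(",", ":"),
        )
        return sha256_hex(payload.encode("utf-8"))


def lineage(ledger: dict, head: str) -> list:
    """Dataset ids reachable from `head` by following parent links."""
    seen, order, stack = set(), [], [head]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        record = ledger[current]
        order.append(record.dataset_id)
        stack.extend(record.parents)
    return order


# Usage: a cleaned corpus derived from a raw corpus.
raw = DataBOMRecord("raw-corpus", sha256_hex(b"raw bytes"), "collector")
clean = DataBOMRecord(
    "clean-corpus", sha256_hex(b"clean bytes"), "curator", (raw.record_hash(),)
)
# The dict below is a stand-in for an append-only ledger keyed by record hash.
ledger = {r.record_hash(): r for r in (raw, clean)}
print(lineage(ledger, clean.record_hash()))  # ['clean-corpus', 'raw-corpus']
```

Because each record embeds the hashes of its parents, changing any upstream metadata changes every downstream record hash, which is what makes the dependency chain verifiable in this sketch.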