People of Collections: Facilitators of Interoperability?

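Keywords: Metadata, Interoperability, World Wide Web, Computer science, Order (exchange), Library science, Data science, Business, Finance
Authors
Chloé Besombes, Simon Chagnoux, Gildas Illien
Source
Journal: Biodiversity Information Science and Standards [Pensoft Publishers]
Volume: 3  Cited by: 1
Identifier
DOI: 10.3897/biss.3.35268
Abstract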

In March 2019, the Muséum national d’histoire naturelle, Paris (MNHN) launched the datapoc.mnhn.fr project, funded by the French research infrastructures CollEX-Persée and E-recolnat. This proof of concept was conceived and is supported by a group of partners from the different communities working at the Muséum (specimen collection curators, librarians, researchers, data scientists, publishers). The team's initial motivation for coming together was to find a way to link the massive data produced and preserved in the Muséum's heterogeneous institutional collection databases and repositories, in order to improve global access and visibility for the benefit of end users as well as data curation processes. After a year of sharing and deliberating, the group concluded that focusing on people’s names and identification could be a promising way to explore interoperability and alignment solutions for matching data hosted in the different systems. The project thus has two main goals: first, to improve the quality of biodiversity and taxonomic data for the qualification of personal identities, publications and scientific names by resolving frequent ambiguities and issues in the assignment of people’s names; second, to develop and assess machine-driven linking strategies between specimen and authorship metadata and resources derived from various institutional data silos of interest to the research community. In order to test this idea and to experiment with innovative data computing and visualization technologies, all parties involved in the project agreed to develop a proof of concept focused on a dataset of 500 names of major MNHN naturalists from its foundation to the present day. This proof of concept will consist of building a structured authority file for people's names, which could be shared by all services producing and using biodiversity data at MNHN and reused as open data by external stakeholders and international partners.
This structured file will strengthen data and database production and maintenance workflows, and could also help improve the quality of the end-user experience by allowing individuals or machines to match, link or otherwise compute and analyse data that is still difficult to handle because of the diversity of IT applications and limited standardisation practices. It is key to the project that this structured file comply with international interoperability and semantic web standards so as to facilitate global access and data exchange with similar institutions around the world. Linked datasets and related resources derived from this work will be displayed on a public website designed for researchers as well as for the public, via diverse applications and formats (API, RDF). The project will run from April 2019 to April 2020, led by the core team of partners who initiated it, with the support of a private IT and data computing service called Logilab. Some of the challenges of this project include finding an efficient way to build the structured file and then succeeding in aligning and disambiguating names already present in existing databases. One way to approach this issue is to compare and consolidate MNHN biodiversity datasets with external repositories by using person identifier systems such as ISNI, VIAF and IdREF, which are already familiar to libraries, archives and other cultural institutions. How can these various identifier systems be used to parse MNHN "people of collections" and help disambiguate them? Is there a particular identifier system that will prove most relevant for all types of collections? Which parsing method will give the best results, and how could it scale up and possibly be reused by other institutions or even future European taxonomic infrastructures? These are some of the questions the MNHN team is eager to address, share and discuss at the Biodiversity Next Symposium.