Efficient and Scalable Alignment-Free Distributed Genotyping of SNPs and Short Indels

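Keywords: indexing; genotyping; single-nucleotide polymorphism; computational biology; scalability; computer science; SNP genotyping; genetics; biology; genotype; gene; database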

Authors

Lorenzo Di Rocco, Umberto Ferraro Petrillo

Identifier

DOI: 10.1109/tcbbio.2025.3525547

Abstract

The growing volume of sequencing data and the ever-larger size of variant databases challenge genotyping procedures to handle massive genomics datasets efficiently. Recent alignment-free solutions rely exclusively on k-mer counts to speed up the analysis, but they must trade the time gain against memory requirements to make the computation feasible on a single workstation. In this paper, we present SparkGeno+, a novel alignment-free (AF) distributed pipeline for the fast and accurate genotyping of Single Nucleotide Polymorphisms (SNPs) and indels on a large scale. Starting from a previous pipeline, we identified and evaluated the performance bottlenecks that arise when performing genotyping with a standard AF approach, and then developed and implemented several innovations to better exploit the resources of a distributed system. The effectiveness of our proposal has been validated through an experimental analysis on widely studied datasets. The results show that the accuracy of SparkGeno+ matches that of state-of-the-art alignment-free tools such as Vargeno and MALVA. Moreover, the time performance of SparkGeno+ scales well with the number of computing units, allowing execution times that are smaller, in order of growth, than those of classical genotyping tools. This indicates that SparkGeno+ is a promising solution for large-scale genotyping applications.
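As a rough illustration of what "relying exclusively on k-mer counts" can mean, the following single-machine Python sketch counts the k-mers in a set of reads and compares the support for the reference and alternate alleles of one variant. It is not the SparkGeno+ pipeline, nor the method of Vargeno or MALVA; the function names, the flanking-sequence inputs, and the `min_fraction` threshold are all hypothetical choices made for this example.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-length substring across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def allele_support(counts, left, allele, right, k):
    """Sum the read counts of the k-mers that span the allele.

    `left` and `right` are the flanking sequences around the variant site
    (hypothetical inputs for this sketch). An empty `allele` models a
    deletion: only k-mers crossing the junction are counted.
    """
    seq = left + allele + right
    start, end = len(left), len(left) + len(allele)
    total = 0
    for i in range(len(seq) - k + 1):
        # Keep only k-mers that overlap the allele (or cross the junction).
        if i < end and i + k > start or (start == end and i < start < i + k):
            total += counts[seq[i:i + k]]
    return total


def call_genotype(ref_support, alt_support, min_fraction=0.2):
    """Naive diploid call from allele support; the threshold is an assumption."""
    total = ref_support + alt_support
    if total == 0:
        return "./."  # no evidence for either allele
    alt_fraction = alt_support / total
    if alt_fraction < min_fraction:
        return "0/0"
    if alt_fraction > 1 - min_fraction:
        return "1/1"
    return "0/1"


# Usage: one read carries the reference base T, the other the alternate base A.
reads = ["ACGTTGCA", "ACGTAGCA"]
counts = count_kmers(reads, k=4)
ref = allele_support(counts, "ACGT", "T", "GCA", k=4)
alt = allele_support(counts, "ACGT", "A", "GCA", k=4)
genotype = call_genotype(ref, alt)
```

With equal support for both alleles, this toy example yields a heterozygous call. The count table is the structure that grows with the data, which is the memory pressure the abstract describes and the reason a distributed design spreads this work over several computing units.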
