Biology
Genome
Evolutionary biology
Annotation
Transposable element
Genomics
Computational biology
Comparative genomics
Genetics
Gene
Authors
John S. Sproul, Scott Hotaling, Jacqueline Heckenhauer, Ashlyn Powell, Donald M. Marshall, Amanda M. Larracuente, Joanna L. Kelley, Steffen U. Pauls, Paul B. Frandsen
Source
Journal: Genome Research
[Cold Spring Harbor Laboratory]
Date: 2023-09-22
Volume/Issue: gr.277387.122
Identifier
DOI:10.1101/gr.277387.122
Abstract
Repetitive elements (REs) are integral to the composition, structure, and function of eukaryotic genomes, yet remain understudied in most taxonomic groups. We investigated REs across 601 insect species and report wide variation in RE dynamics across groups. Analysis of associations between REs and protein-coding genes revealed dynamic evolution at the interface between REs and coding regions across insects, including notably elevated RE-gene associations in lineages with abundant long interspersed nuclear elements (LINEs). We leveraged this large, empirical data set to quantify impacts of long-read technology on RE detection and investigate fundamental challenges to RE annotation in diverse groups. In long-read assemblies we detected ~36% more REs than in short-read assemblies, with long terminal repeats (LTRs) showing 162% increased detection, while DNA transposons and LINEs showed less technology-related bias. In most insect lineages, 25-85% of repetitive sequences were unclassified following automated annotation, compared with only ~13% in Drosophila species. Although the diversity of available insect genomes has rapidly expanded, we show the rate of community contributions to RE databases has not kept pace, preventing efficient annotation and high-resolution study of REs in most groups. We highlight the tremendous opportunity and need for the biodiversity genomics field to embrace REs and suggest collective steps for making progress towards this goal.