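Index terms
Biology
Genetics
Genome
Whole-genome sequencing
Ploidy
Indel mutation
Genome evolution
Reference genome
Homologous recombination
Computational biology
Single-nucleotide polymorphism
Gene
Genotype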
Authors
Dale Muzzey, Katja Schwartz, Jonathan S. Weissman, Gavin Sherlock
Source
Journal: GenomeBiology.com (London. Print)
[Springer Nature]
Date: 2013-01-01
Volume (issue): 14 (9): R97-R97
Citations: 131
Identifier
DOI:10.1186/gb-2013-14-9-r97
Abstract
Candida albicans is a ubiquitous opportunistic fungal pathogen that afflicts immunocompromised human hosts. With rare and transient exceptions the yeast is diploid, yet despite its clinical relevance the respective sequences of its two homologous chromosomes have not been completely resolved. We construct a phased diploid genome assembly by deep sequencing a standard laboratory wild-type strain and a panel of strains homozygous for particular chromosomes. The assembly has 700-fold coverage on average, allowing extensive revision and expansion of the number of known SNPs and indels. This phased genome significantly enhances the sensitivity and specificity of allele-specific expression measurements by enabling pooling and cross-validation of signal across multiple polymorphic sites. Additionally, the diploid assembly reveals pervasive and unexpected patterns in allelic differences between homologous chromosomes. Firstly, we see striking clustering of indels, concentrated primarily in the repeat sequences in promoters. Secondly, both indels and their repeat-sequence substrate are enriched near replication origins. Finally, we reveal an intimate link between repeat sequences and indels, which argues that repeat length is under selective pressure for most eukaryotes. This connection is described by a concise one-parameter model that explains repeat-sequence abundance in C. albicans as a function of the indel rate, and provides a general framework to interpret repeat abundance in species ranging from bacteria to humans. The phased genome assembly and insights into repeat plasticity will be valuable for better understanding allele-specific phenomena and genome evolution.
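The abstract notes that phasing allows allele-specific expression signal to be pooled across multiple polymorphic sites. The sketch below illustrates that general idea only and is not the authors' pipeline: it assumes a hypothetical input of per-site read counts already assigned to haplotype A or B within one gene, sums them, and applies an exact two-sided binomial test against a 50:50 expectation. The test choice, the function names, and the example counts are all illustrative assumptions.

```python
from math import comb


def pooled_allele_counts(sites):
    """Sum read counts per haplotype across phased heterozygous sites.

    `sites` is a list of (reads_a, reads_b) tuples, one per site in the
    same gene. This input format is a hypothetical one for illustration.
    """
    total_a = sum(a for a, _ in sites)
    total_b = sum(b for _, b in sites)
    return total_a, total_b


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of every outcome that is no more likely than
    the observed count k (small tolerance guards against float ties).
    """
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))


# Made-up counts for three phased sites in one gene (not data from the paper).
sites = [(12, 4), (9, 5), (15, 6)]

a, b = pooled_allele_counts(sites)
pooled_p = binomial_two_sided_p(a, a + b)

# The same test on the first site alone, for comparison.
single_p = binomial_two_sided_p(sites[0][0], sum(sites[0]))

print(f"pooled counts: A={a}, B={b}, p={pooled_p:.4f}")
print(f"first site only: p={single_p:.4f}")
```

With these made-up counts the pooled test has more power than the single-site test, which is the intuition behind pooling. Summing counts is only valid when the phase of every site is known, and it treats reads at different sites as independent, which may not hold when one read spans several sites.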