Haplotype
Genetics
Biology
Block (permutation group theory)
Locus (genetics)
Nucleotide polymorphism
Computer science
Allele
Mathematics
Gene
Geometry
Authors
Nebojša Jojić,Vladimir Jojic,David Heckerman
Source
Journal: Cornell University - arXiv
Date: 2004-07-07
Volume/issue: 286-292
Citations: 8
Identifier
DOI:10.5555/1036843.1036878
Abstract
Haplotypes, the global patterns of DNA sequence variation, have important implications for identifying complex traits. Recently, blocks of limited haplotype diversity have been discovered in human chromosomes, intensifying research on modelling the block structure, as well as the transitions or co-occurrence of alleles in these blocks, as a way to compress the variability and infer associations more robustly. Haplotype block structure analysis is typically complicated by the fact that the phase information for each SNP is missing, i.e., the observed allele pairs are not given in a consistent order across the sequence. The techniques for circumventing this require additional information, such as family data, or a more complex sequencing procedure. In this paper we present a hierarchical statistical model and the associated learning and inference algorithms that simultaneously deal with allele ambiguity per locus, missing data, block estimation, and complex trait association. While the block structure may differ from the structures inferred by other methods, which use pedigree information or previously known alleles, the parameters we estimate, including the learned block structure and the estimated block transitions per locus, define a good model of variability in the set. The method is completely data-driven and can detect Crohn's disease from SNP data taken from human chromosome 5q31 with a detection rate of 80% and a small error variance.
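To make the missing-phase problem concrete, the following is a minimal sketch, not the paper's model: it only enumerates the haplotype pairs that are consistent with one unphased genotype. The function name `compatible_haplotype_pairs` and the toy alleles are assumptions introduced for illustration.

```python
from itertools import product


def compatible_haplotype_pairs(genotype):
    """Enumerate unordered haplotype pairs consistent with an unphased genotype.

    `genotype` is a list of allele pairs, one per SNP, whose within-pair
    order carries no information (the phase is unknown).
    """
    pairs = set()
    # At every SNP, either allele may sit on the first chromosome copy.
    orderings = [((a, b), (b, a)) for a, b in genotype]
    for choice in product(*orderings):
        h1 = "".join(alleles[0] for alleles in choice)
        h2 = "".join(alleles[1] for alleles in choice)
        # The two chromosome copies are interchangeable, so sort the pair.
        pairs.add(tuple(sorted((h1, h2))))
    return sorted(pairs)


# Toy genotype (hypothetical): two heterozygous SNPs, one homozygous SNP.
genotype = [("A", "G"), ("C", "T"), ("G", "G")]
pairs = compatible_haplotype_pairs(genotype)
for h1, h2 in pairs:
    print(h1, h2)
```

With k heterozygous SNPs there are 2^(k-1) such pairs for k ≥ 1, which is why phase is usually resolved with extra information or, as in the abstract, handled inside a statistical model rather than by enumeration.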