Mosaic
Omics
Computational biology
Computer science
Geography
Biology
Data science
Bioinformatics
Archaeology
Authors
Xuhua Yan, Kok Siong Ang, Lynn van Olst, Alex Edwards, Thomas Watson, Ruiqing Zheng, Min Li, Rong Fan, Jinmiao Chen, David Gate
Identifier
DOI:10.1101/2024.10.02.616189
Abstract
With the advent of spatial multi-omics, mosaic integration of diverse datasets with partially overlapping modalities enables construction of comprehensive multi-modal spatial atlases from heterogeneous sources. Here, we present SpaMosaic, a tool that employs contrastive learning and graph neural networks to build a modality-agnostic and batch-corrected latent space for spatial domain identification and missing modality imputation. We systematically benchmarked SpaMosaic against existing integration methods using simulated data and experimentally acquired datasets spanning RNA and protein abundance, chromatin accessibility, and histone modifications from brain, embryo, tonsil, and lymph node tissues. SpaMosaic consistently outperformed other methods in identifying coherent spatial domains by reducing noise and mitigating batch effects across diverse technologies and developmental stages. Computationally, SpaMosaic is highly scalable, capable of integrating over 100 sections and processing a single section with more than 800,000 spots. Beyond robust integration, the unified latent space generated by SpaMosaic enables accurate imputation of missing modalities. In a mosaic mouse brain dataset, the imputed histone modifications not only recapitulated expected transcriptome-epigenome correlations but also uncovered more region-specific regulatory links compared to the measured chromatin accessibility data, demonstrating the ability to infer relationships between modalities without co-profiling. In summary, SpaMosaic provides a versatile framework for unifying the rapidly accumulating heterogeneous spatial omics data into comprehensive biological atlases.