Computer science
Representation (politics)
Distillation
Natural language processing
Artificial intelligence
Contrastive analysis
Machine learning
Chemistry
Linguistics
Chromatography
Political science
Politics
Philosophy
Law
Authors
Jinjia Feng, Zhen Wang, Zhewei Wei, Yaliang Li, Bolin Ding, Hongteng Xu
Identifier
DOI:10.1145/3627673.3679725
Abstract
With the increasing application of deep learning to scientific problems in biochemistry, molecular federated learning has become popular for its ability to offer distributed, privacy-preserving solutions. However, most existing molecular federated learning methods rely on joint training with public datasets, which are difficult to obtain in practice, and they fail to leverage multi-modal molecular representations effectively. To address these issues, we propose a novel framework, Federated Heterogeneous Contrastive Distillation (FedHCD), which enables joint training of global models from clients with heterogeneous data modalities, learning tasks, and molecular models. To aggregate data representations of different modalities in a data-free manner, we design a global multi-modal contrastive strategy that aligns client representations without a public dataset. Exploiting the intrinsic characteristics of molecular data in different modalities, we tackle the exacerbated local model drift and data non-IIDness caused by multi-modal clients. We further introduce a multi-view contrastive knowledge transfer that extracts features at the atom, substructure, and molecule levels, resolving the failure of information distillation caused by dimensional biases across data modalities. Evaluations on eight real-world molecular datasets and ablation experiments show that FedHCD outperforms other state-of-the-art FL methods, irrespective of whether they use public datasets.
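For intuition only: the cross-modal alignment idea described in the abstract can be sketched as a symmetric InfoNCE-style contrastive loss between embeddings of the same molecules produced by encoders of different modalities (e.g., a 2D-graph encoder and a SMILES encoder). The snippet below is a minimal sketch under assumed names (cross_modal_infonce, z_graph, z_smiles) and dimensions; it is not the paper's actual FedHCD implementation.

    import torch
    import torch.nn.functional as F

    def cross_modal_infonce(z_graph: torch.Tensor,
                            z_smiles: torch.Tensor,
                            temperature: float = 0.1) -> torch.Tensor:
        """Contrastively align two modality embeddings of the same molecules.

        Rows of z_graph and z_smiles are assumed to correspond to the same
        molecule, so matching pairs lie on the diagonal of the similarity matrix.
        """
        # Normalize so dot products become cosine similarities.
        z_graph = F.normalize(z_graph, dim=-1)
        z_smiles = F.normalize(z_smiles, dim=-1)
        logits = z_graph @ z_smiles.t() / temperature   # (B, B) similarity matrix
        targets = torch.arange(z_graph.size(0))         # positives on the diagonal
        # Symmetric loss over both directions: graph->SMILES and SMILES->graph.
        return 0.5 * (F.cross_entropy(logits, targets) +
                      F.cross_entropy(logits.t(), targets))

    # Toy usage: 8 molecules, 128-dim embeddings from two hypothetical encoders.
    loss = cross_modal_infonce(torch.randn(8, 128), torch.randn(8, 128))

Because the loss only needs embeddings (not raw samples or a shared public dataset), an objective of this shape is compatible with the data-free aggregation setting the abstract describes.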