Adaptive heterogeneous federated learning

Huang Wenke, Ye Mang, Du Bo (1. School of Computer Science, Wuhan University, Wuhan 430072, China)

Abstract
Objective Model-heterogeneous federated learning has attracted increasing attention because it allows participants to independently design their own unique models without compromising privacy. Existing methods typically rely on publicly shared related data or a global model for communication, which greatly limits their applicability. Moreover, each participant's private data are usually collected under different distributions, leading to the problem of data heterogeneity. To handle model heterogeneity and data heterogeneity simultaneously, this paper proposes a novel adaptive heterogeneous federated learning method. Method Given a randomly generated input signal (e.g., random noise), adaptive heterogeneous federated learning enables direct communication among heterogeneous models by aligning the distributions of their output logits, thereby achieving collaborative knowledge sharing. Its main advantage is that it solves the model heterogeneity problem without relying on additional related data collection or a shared model design. To further address data heterogeneity, this paper proposes adaptive weight updating at both the model and sample levels. Adaptive heterogeneous federated learning (AHF) thus allows participants to learn rich and diverse knowledge through differences in model outputs on unrelated data and by emphasizing "meaningful" samples. Result Extensive experiments on different federated learning tasks, using random noise inputs for communication, show higher within-domain accuracy and better cross-domain generalization performance than competing methods. Conclusion The proposed method provides a simple yet effective baseline that lays the groundwork for the future development of heterogeneous federated learning.
Keywords
Adaptive heterogeneous federated learning

Huang Wenke, Ye Mang, Du Bo(School of Computer Science, Wuhan University, Wuhan 430072, China)

Abstract
Objective The current development of deep learning has caused significant changes in numerous research fields and has had profound impacts on societal and industrial sectors, including computer vision, natural language processing, multi-modal learning, and medical analysis. The success of deep learning heavily relies on large-scale data. However, the public and scientific communities have become increasingly aware of the need for data privacy. In the real world, data are commonly distributed among different entities such as edge devices and companies. With the increasing emphasis on data sensitivity, strict legislation has been proposed to govern data collection and utilization. Thus, the traditional centralized training paradigm, which requires data aggregation, is unusable in practical settings. In response to such real-world challenges, federated learning (FL) has emerged as a popular research field because it can train a global model for different participants without centralizing the data owned by the distributed parties. FL is a privacy-preserving multiparty collaboration paradigm that adheres to privacy protocols without data leakage. Typically, FL requires clients to share a global model architecture so that the central server can aggregate parameters from participants and then redistribute the global model (i.e., the averaged parameters). However, this prerequisite largely restricts the flexibility of the client model architecture. In recent years, model-heterogeneous FL has garnered substantial attention because it allows participants to independently design unique models in FL without compromising privacy. Specifically, participants may need to design special model architectures to ease the communication burden or may refuse to share the same architecture due to intellectual property concerns. However, existing methods often rely on publicly shared related data or a global model for communication, limiting their applicability.
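The standard FL aggregation step described above, in which a server averages parameters from clients sharing one architecture, can be sketched as follows. This is a minimal illustration of weighted parameter averaging (as in FedAvg-style methods); the function name and plain-list parameter representation are assumptions for exposition, not the paper's implementation.

```python
from typing import Dict, List

def average_parameters(client_params: List[Dict[str, List[float]]],
                       weights: List[float]) -> Dict[str, List[float]]:
    """Weighted average of parameters from clients with identical
    architectures; weights are typically proportional to local data size."""
    total = sum(weights)
    avg: Dict[str, List[float]] = {}
    for key in client_params[0]:
        length = len(client_params[0][key])
        avg[key] = [
            sum(w * p[key][i] for p, w in zip(client_params, weights)) / total
            for i in range(length)
        ]
    return avg
```

Note that this step is exactly what requires all clients to share one model architecture, which is the restriction that model-heterogeneous FL seeks to remove.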
In addition, FL is proposed to handle privacy concerns in the distributed learning environment. A pioneering FL method trains a global model by aggregating local model parameters. However, its performance is impeded by decentralized data, which result in non-i.i.d. distributions (called data heterogeneity). Each participant optimizes toward its local empirical risk minimum, which is inconsistent with the global direction. Therefore, the averaged global model has a slow convergence speed and achieves limited performance improvement. Method Model heterogeneity largely impedes local model selection flexibility, and data heterogeneity hinders federated performance. To address model and data heterogeneity, this paper introduces an approach called adaptive heterogeneous federated (AHF) learning, which employs a unique strategy: it utilizes a randomly generated input signal, such as random noise or public unrelated samples, to facilitate direct communication among heterogeneous model architectures. This is achieved by aligning the output logit distributions, fostering collaborative knowledge sharing among participants. The primary advantage of AHF is its ability to address model heterogeneity without depending on additional related data collection or a shared model design. To further enhance AHF's effectiveness in handling data heterogeneity, the paper proposes adaptive weight updating at both the model and sample levels, which enables AHF participants to acquire rich and diverse knowledge by leveraging dissimilarities in model outputs on unrelated data while emphasizing meaningful samples. Result Empirical validation of the proposed AHF method is conducted through a meticulous series of extensive experiments. Random noise inputs are employed in two distinct federated learning tasks: the Digits and Office-Caltech scenarios.
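The communication step described above, in which heterogeneous models align their output logit distributions on a shared random-noise input, can be sketched as below. This is a hedged illustration only: the helper names, the use of the peer-ensemble average as the alignment target, and KL divergence as the per-sample disagreement signal for adaptive weighting are assumptions for exposition, not the authors' exact formulation.

```python
import math
from typing import Callable, List, Sequence

def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw logits to a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def alignment_signals(local_model: Callable[[Sequence[float]], List[float]],
                      peer_models: List[Callable[[Sequence[float]], List[float]]],
                      noise_batch: List[List[float]]) -> List[float]:
    """Per-sample divergence between the local output distribution and the
    peer-ensemble average on random-noise inputs. Larger divergence marks a
    sample where models disagree more, i.e., a more 'meaningful' sample
    that adaptive sample-level weighting could emphasize."""
    signals = []
    for x in noise_batch:
        p_local = softmax(local_model(x))
        peer_probs = [softmax(m(x)) for m in peer_models]
        p_avg = [sum(col) / len(peer_probs) for col in zip(*peer_probs)]
        signals.append(kl_divergence(p_local, p_avg))
    return signals
```

Because only logits on shared (unrelated) inputs are exchanged, no client needs to reveal its architecture or parameters, which is what allows heterogeneous models to communicate.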
Specifically, our solution presents stable generalization performance in the more challenging scenario, Office-Caltech. Notably, when a larger domain gap exists among private data, AHF achieves higher overall generalization performance on these different unrelated data samples and obtains stable improvements on most unseen private data. By contrast, competing methods achieve limited generalization performance in the Office-Caltech scenario. The empirical findings validate our solution's ability, showcasing a marked improvement in within-domain accuracy and superior cross-domain generalization performance compared with existing methodologies. Conclusion In summary, the AHF learning method, as extensively examined in this investigation, not only presents a straightforward yet remarkably efficient foundation for future progress in the domain of federated learning but also emerges as a transformative paradigm for comprehensively addressing model and data heterogeneity. AHF lays the groundwork for more resilient and adaptable FL models and serves as a guide for the transformation of collaborative knowledge sharing in the coming era of machine learning. Studying AHF is more than an exploration of an innovative FL methodology; it opens up numerous opportunities arising from the complexities of model and data heterogeneity in the development of machine learning models.
Keywords
