Publications

Federated domain generalization: A survey (Early Access)
Journal

Ying Li, Xingwei Wang, Rongfei Zeng, Praveen Kumar Donta, Ilir Murturi, Min Huang, Schahram Dustdar.

Federated Learning

Federated Learning Domain Generalization

Proceedings of the IEEE

2025

Machine learning typically relies on the assumption that training and testing distributions are identical and that data is centrally stored for training and testing. However, in real-world scenarios, distributions may differ significantly and data is often distributed across different devices, organizations, or edge nodes. Consequently, it is imperative to develop models capable of generalizing effectively to unseen distributions when data spans various domains. In response to this challenge, there has been a surge of interest in federated domain generalization in recent years. Federated domain generalization combines federated learning and domain generalization techniques, enabling collaborative model development across diverse source domains for effective generalization to unseen domains, all while maintaining data privacy. However, generalizing the federated model under domain shifts remains a complex, underexplored issue. This paper provides a comprehensive survey of the latest advancements in this field. First, we discuss the development process from traditional machine learning to domain adaptation and domain generalization, leading to federated domain generalization, and provide the corresponding formal definition. Next, we classify recent methodologies into four categories: federated domain alignment, data manipulation, learning strategies, and aggregation optimization, detailing appropriate algorithms for each. We then give an overview of commonly used datasets, applications, evaluations, and benchmarks. Finally, this survey outlines potential future research directions.

Three-tier storage framework based on TBchain and IPFS for protecting IoT security and privacy
Journal

Ying Li, Yaxin Yu, Xingwei Wang.

Blockchain

Blockchain Security and Privacy

ACM Transactions on Internet Technology

2023

Most Internet of Things (IoT) infrastructures today are highly centralized with single points of failure, which results in serious security and privacy issues for IoT data. Fortunately, blockchain technology can provide a decentralized and secure IoT framework to deal with security issues, based on its characteristics of decentralization, non-tampering, openness, transparency, and traceability. However, although the blockchain consensus protocol guarantees the safety and reliability of data, it also brings problems such as scalability limitations and poor storage extensibility, which make it impossible to directly integrate blockchain and the IoT under existing conditions. In this article, a private three-tier local blockchain, Three-tier architecture Blockchain (TBchain), is proposed to solve the problem by splitting off part of the transactions in the public blockchain and locking them in a higher-level blockchain, TBchain. Additionally, the private blockchain TBchain is connected to the public blockchain to build a hierarchical blockchain network that provides privacy protection for the IoT data stored on the blockchain. Finally, we implement an IoT framework based on TBchain and the InterPlanetary File System (IPFS) to realize a decentralized IoT, which guarantees the user’s access control right to personal data. Experimental results show that the IoT framework based on TBchain and IPFS realizes the user’s access control right to personal data by verifying in advance to ensure the confidentiality and security of shared data, and improves the security and privacy of IoT data and transactions. Moreover, we prove that the scalability and storage extensibility of the blockchain are positively correlated with the number of data blocks in TBchain.

Communication-Efficient Federated Learning for Heterogeneous Clients
Journal

Ying Li, Xingwei Wang, Haodong Li, Praveen Kumar Donta, Min Huang, Schahram Dustdar.

Federated Learning

Federated Learning Communication-Efficient Heterogeneous clients

ACM Transactions on Internet Technology

2025

Federated learning stands out as a promising approach within the domain of edge computing, providing a framework for collaborative training on distributed datasets without requiring data sharing. However, federated learning involves the frequent transmission of machine learning model updates between the server and clients, resulting in high communication costs. Additionally, heterogeneous clients can further complicate the federated learning process and degrade performance. To address these challenges, we propose Adaptive Self-Knowledge Distillation-based Quality- and Reputation-Aware Cross-Device Federated Learning (ASDQR), an efficient communication and inference framework designed for heterogeneous clients. ASDQR first selects high-reputation, high-quality clients to participate in federated learning, which significantly affects communication efficiency and inference effectiveness. ASDQR also introduces a model of adaptive local self-knowledge distillation that incorporates multiple pieces of local personalized historical knowledge for more accurate inference, allowing the historical level to be dynamically adjusted over time. Finally, we present an inference-effective aggregation scheme that assigns higher weights to important and reliable local model updates based on clients’ contribution degrees when performing global model aggregation. ASDQR consistently outperforms baseline methods across all datasets and communication rounds, achieving 9.0% higher accuracy than FedAvg, 6.59% higher than MOON, 0.29% higher than FedProx, 0.2% higher than PFedSD, and 0.08% higher than FedMD on the MNIST dataset at 100 communication rounds. Similar improvements are observed on the CIFAR, HAR, and WISDM datasets, demonstrating the robustness and efficiency of ASDQR in federated learning with non-IID data.

VARF: An Incentive Mechanism of Cross-Silo Federated Learning in MEC
Journal

Ying Li, Xingwei Wang, Rongfei Zeng, Mingzhou Yang, Kexin Li, Min Huang, Schahram Dustdar.

Federated Learning

Cross-silo federated learning long-term multi-access edge computing repeated game

IEEE Internet of Things Journal

2023

Cross-silo federated learning (FL) is a privacy-preserving distributed machine learning paradigm in which organizations acting as clients cooperatively train a global model without uploading their raw local data. Recently, cross-silo FL in multi-access edge computing (MEC) has been used in a growing number of industrial applications. Most existing research on cross-silo FL focuses on performance and ignores the incentive mechanism for high-quality client selection and long-term participation in model training that efficient and stable FL requires, which has prevented the widespread adoption of cross-silo FL in MEC. In this article, we propose a quality-aware and reputation-aware incentive mechanism based on the infinitely repeated game for cross-silo FL, named VARF. VARF selects high-quality and high-reputation edge nodes (ENs) as candidates for model training in cross-silo FL using a heuristic algorithm and then motivates the selected ENs to actively contribute their resources. VARF also models the long-term behavior of ENs in cross-silo FL as an infinitely repeated game and derives a stable, long-term cooperative strategy for clients while maximizing the amount of local data for model learning. Extensive simulations with real-world data sets demonstrate that VARF outperforms other benchmarks. Meanwhile, experimental results show that cloud platforms (CPs) and ENs eventually form a long and stable cooperative relationship under the trigger strategy.

KDN-FLB: Knowledge-Defined Networking Through Federated Learning and Blockchain
Journal

Ying Li, Praveen Kumar Donta, Xingwei Wang, Ilir Murturi, Min Huang, Schahram Dustdar.

Federated Learning, Knowledge-Defined Networking

Cross-silo federated learning long-term multi-access edge computing repeated game

IEEE Computer

2025

In this article, we explore the opportunities and benefits of integrating federated learning and blockchain (FLB) technologies to build an adaptable and secure knowledge-defined networking (KDN) system. Our aim is to enhance network performance by ensuring self-learning, self-adapting, and self-adjustment capabilities in dynamic and decentralized network environments. The proposed conceptual architecture, KDN-FLB, also strategically addresses critical challenges in knowledge sharing and privacy preservation within network environments. We discuss the constituents, architecture, processes, and use cases of KDN-FLB in contemporary networking applications. Additionally, we analyze the benefits, challenges, and future prospects associated with KDN-FLB, making it more intelligent for large-scale, dynamic, and decentralized network environments.

Federated Learning for Internet of Things
Chapter

Ying Li, Qiyang Zhang, Xingwei Wang, Rongfei Zeng, Haodong Li, Ilir Murturi, Schahram Dustdar, Min Huang.

Federated Learning

Cross-silo federated learning long-term multi-access edge computing repeated game

Learning Techniques for the Internet of Things

2023

High Trusted Cloud Storage Model Based on TBChain Blockchain (in Chinese)
Journal

LI Ying, YU Ya-xin, ZHANG Hong-yu, and LI Zhen-guo.

Blockchain

Tiered Hierarchical Blockchain Highly Trusted Cloud Storage Scalability Storage Scalability Metadata

Computer Science

2020

Data stored in the cloud can be illegally stolen or tampered with, exposing users’ data to confidentiality threats. To store massive data more safely and efficiently, this paper first proposes a storage model, CBaaS (Cloud and Blockchain as a Service), which combines the indexing, traceability, and verifiability of cloud storage and blockchain to enhance the credibility of data in the cloud. Second, the blockchain consensus protocol leads to low throughput and slow transaction processing, which seriously restricts the development of decentralized applications. To address this, the paper implements a three-tier blockchain architecture, TChain, which improves the scalability and transaction throughput of the blockchain by splitting off part of the blockchain and locking it in a block of a higher-level blockchain. Third, because of the demands of decentralization, a blockchain occupies a large amount of storage space across a massive number of nodes, which greatly limits the development and application of database systems based on blockchain technology. TChain stores part of the transactions locally, which increases the storage scalability of the blockchain. The ETag in cloud storage object metadata identifies the contents of an object and can be used to check whether those contents have changed. By storing the object metadata on the blockchain, where data cannot be tampered with, the ETag value can be used to verify whether the data stored in the cloud is intact, which improves the reliability of that data. The experimental results show that the TChain model improves the scalability and storage scalability of the blockchain, and that the CBaaS model improves the reliability of data stored in the cloud.
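
The ETag-based check described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Ledger` class is a hypothetical in-memory stand-in for the blockchain, and a SHA-256 digest stands in for the ETag, since real cloud stores compute ETags in provider-specific ways.

```python
import hashlib


def etag_of(content: bytes) -> str:
    """Return a content fingerprint (assumption: SHA-256 stands in for a real ETag)."""
    return hashlib.sha256(content).hexdigest()


class Ledger:
    """Hypothetical append-only record list standing in for the blockchain."""

    def __init__(self):
        self._records = []  # entries are only ever appended, never edited

    def record(self, object_key: str, etag: str) -> None:
        self._records.append((object_key, etag))

    def latest_etag(self, object_key: str):
        # Scan from newest to oldest and return the most recent fingerprint.
        for key, etag in reversed(self._records):
            if key == object_key:
                return etag
        return None


def is_unmodified(ledger: Ledger, object_key: str, current_content: bytes) -> bool:
    """True only if the object's current fingerprint matches the recorded one."""
    recorded = ledger.latest_etag(object_key)
    return recorded is not None and recorded == etag_of(current_content)
```

Because the recorded fingerprint lives outside the cloud store, a change to the stored object shows up as a mismatch when the object is next checked.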

Joint optimization of multi-dimensional resource allocation and task offloading for QoE enhancement in Cloud-Edge-End collaboration
Journal

Chao Zeng, Xingwei Wang, Rongfei Zeng, Ying Li, Jianzhi Shi, Min Huang

Edge Computing

Cloud–edge–end Collaboration Task Offloading Resources Allocation Quality of Experience Multi-agent Reinforcement Learning

Future Generation Computer Systems

2024

The Cloud-Edge-End Collaboration (CEEC) computing architecture inherits many merits from both edge computing and cloud computing and is thus considered a promising candidate for future network services. In CEEC, users’ quality of experience (QoE) depends on offloading performance, which in turn depends jointly on the offloading strategy, computational resources, and network status. However, previous offloading optimization studies neglect the joint consideration of dependent task offloading, computational resources, and channel resources, and may therefore miss potential performance improvements. In this paper, we investigate the joint optimization of dependent task offloading, computational resource allocation, user transmission power control, and channel resource allocation in the CEEC scenario, with the goal of maximizing users’ QoE. First, a new QoE metric is defined to capture the impacts of delay and energy consumption on users’ QoE. Following this definition, we formulate the joint optimization problem as a Mixed Integer Nonlinear Programming (MINLP) problem and introduce a multi-agent deep reinforcement learning method to solve this MINLP problem, which has high computational complexity. Extensive experiments are performed, and the results show that our proposed scheme outperforms baselines on a series of metrics.

Pre-training enhanced unsupervised contrastive domain adaptation for industrial equipment remaining useful life prediction
Journal

Haodong Li, Peng Cao, Xingwei Wang, Ying Li, Bo Yi, Min Huang

Industrial intelligence

Domain Adaptation Contrastive Learning Industrial Intelligence

Advanced Engineering Informatics

2024

An essential task in industrial intelligence is to accurately predict the remaining useful life (RUL) of industrial equipment, and there has been tremendous progress in RUL prediction based on data-driven methods. However, these methods rely heavily on the data representation ability of the model and the assumption of consistency in data distribution. In practical industrial environments, due to different working conditions, industrial time series data exhibit high-dimensional, dynamic, and noisy characteristics, which often leads to ineffective transfer of trained models from one environment to similar yet unlabeled new environments. To tackle these issues, this paper first designs a dual parallel time–frequency feature extraction network for extracting effective time-series features with different dimensions and importance levels. Next, an enhanced pre-training framework is proposed that employs similarity contrast learning to uncover the latent representational information in industrial time-series data. Finally, a domain adaptation method based on momentum-contrast adversarial learning is proposed, which preserves the structural information specific to the target domain while adversarially learning domain-invariant features, mitigating the negative transfer effect. A series of rigorous experiments were conducted on two widely recognized industrial benchmark datasets, focusing on cross-domain scenarios. The results demonstrate that our approach achieves state-of-the-art performance in industrial cross-domain prediction scenarios.

FedCPG: A class prototype guided personalized lightweight federated learning framework for cross-factory fault detection
Journal

Haodong Li, Xingwei Wang, Peng Cao, Ying Li, Bo Yi, Min Huang

Industrial intelligence

Personalized Federated Learning Fault Detection Class Prototype Deep Learning

Computers in Industry

2024

Industrial equipment condition monitoring and fault detection are crucial to ensure the reliability of industrial production. Recently, data-driven fault detection methods have achieved significant success, but they all face challenges due to data fragmentation and limited fault detection capabilities. Although centralized data collection can improve detection accuracy, the conflicting interests brought by data privacy issues make data sharing between different devices impractical, thus forming the problem of industrial data silos. To address these challenges, this paper proposes a class prototype guided personalized lightweight federated learning framework (FedCPG). This framework decouples the local network, only uploading the backbone model to the server for model aggregation, while employing the head model for local personalized updates, thereby achieving efficient model aggregation. Furthermore, the framework incorporates prototype constraints to steer the local personalized update process, mitigating the effects of data heterogeneity. Finally, a lightweight feature extraction network is designed to reduce communication overhead. Multiple complex industrial data distribution scenarios were simulated on two benchmark industrial datasets. Extensive experiments have demonstrated that FedCPG can achieve an average detection accuracy of 95% in complex industrial scenarios, while simultaneously reducing memory usage and the number of parameters by 82%, surpassing existing methods in most average metrics. These findings offer novel perspectives on the application of personalized federated learning in industrial fault detection.

An Adaptive Federated Domain Generalization Framework for Consumer Electronics Manufacturing Equipment Cross-Factory Fault Detection
Journal

Haodong Li, Xingwei Wang, Ying Li, Bo Yi, Peng Cao, Min Huang, Keqin Li

Industrial intelligence

Federated Domain Generalization Sharpness-Aware Minimization Fault Detection

IEEE Transactions on Consumer Electronics

2025

Reliability and operational efficiency of equipment are crucial in the manufacturing of consumer electronics. Existing fault detection methods often face limitations such as dataset dependence, poor scenario generalization, and data privacy issues when addressing the complex and diverse operating conditions in product manufacturing. To address these issues, this paper proposes a cross-factory fault detection framework for consumer electronics production equipment based on adaptive federated domain generalization. This framework reconsiders the limitations of Sharpness-Aware Minimization (SAM) and, by jointly considering local personalization and global generalization objectives, designs an adaptive weighting scheme to balance the trade-off between loss minimization and sharpness during optimization, thereby improving the model’s robustness and accuracy under various working conditions. Then, a parameter momentum aggregation scheme is proposed on the server side to incorporate historical gradient information, reducing the impact of client drift and improving model convergence and stability. Finally, extensive scenario experiments were conducted on two public datasets. The results indicate that the proposed framework achieves an average improvement of 22.5% in fault detection accuracy over the baseline model across varying operating conditions and data distribution scenarios, demonstrating its effectiveness in addressing the challenges of complex condition variations and data privacy in consumer electronics manufacturing.
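
The idea of server-side aggregation with momentum can be illustrated with a minimal sketch. It shows a generic server-momentum update over plain Python lists, not the scheme proposed in the paper; the function names and the `beta` and `lr` parameters are assumptions made for illustration.

```python
def average(client_params):
    """Element-wise mean of the clients' parameter vectors."""
    n = len(client_params)
    return [sum(p[i] for p in client_params) / n for i in range(len(client_params[0]))]


def server_momentum_step(global_params, client_params, velocity, beta=0.9, lr=1.0):
    """One aggregation round that blends the current update with past updates.

    Assumption: the difference between the global model and the averaged
    client models is treated as a pseudo-gradient.
    """
    mean_client = average(client_params)
    delta = [g - c for g, c in zip(global_params, mean_client)]
    # The velocity carries information from earlier rounds into this one.
    new_velocity = [beta * v + d for v, d in zip(velocity, delta)]
    new_global = [g - lr * v for g, v in zip(global_params, new_velocity)]
    return new_global, new_velocity
```

With `beta=0` the step reduces to plain averaging of the client models; larger values let earlier rounds damp round-to-round swings caused by drifting clients.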

A personalized federated cloud-edge collaboration framework via cross-client knowledge distillation
Journal

Shining Zhang, Xingwei Wang, Rongfei Zeng, Chao Zeng, Ying Li, Min Huang

Federated Learning

Personalized Federated Learning Knowledge Distillation Non-IID Data Cloud–edge Computing

Future Generation Computer Systems

2024

As an emerging distributed machine learning paradigm, federated learning has been extensively used in the domain of cloud–edge computing to collaboratively train models without clients uploading their raw data. However, existing federated learning methods aim to train a single optimal model for all participating clients. These methods may perform poorly on some clients due to variations in data distribution and limited data availability. Moreover, assigning weights to clients merely based on the quantity of client data neglects the inter-client correlation. In this paper, we propose a personalized federated learning framework with cross-client knowledge distillation called FedCD. FedCD is composed of a local model training strategy with cross-client co-personalized knowledge fusion and a global model weighted aggregation mechanism via peer correlation. In the local model training strategy, FedCD fuses similar personalized knowledge from all clients to guide the local training of the client. In the global model weighted aggregation mechanism, the server assigns weights to clients based on their influence among clients. Extensive experiments conducted on various datasets demonstrate that FedCD improves test accuracy by approximately 0.18%–16.65% compared to the baseline methods.

Assessing the imperative of conditioning factor grading in machine learning-based landslide susceptibility modeling: A critical inquiry
Journal

Taorui Zeng, Bijing Jin, Thomas Glade, Yangyi Xie, Ying Li, Yuhang Zhu, Kunlong Yin

Geology

Landslide Susceptibility Modeling Machine Learning Model Factor Grading Standardization Guidance Three Gorges Reservoir Area

CATENA

2024

Current machine learning approaches to landslide susceptibility modeling often involve grading conditioning factors, a method characterized by substantial subjectivity and randomness. The necessity and rationality of such grading have sparked continued debate. Recognizing the potential profound impact of this grading on the results of models, we conducted an in-depth study focusing on four townships within the Wanzhou section of the Three Gorges Reservoir area. A comprehensive assessment was conducted using three traditional machine learning models, five ensemble learning models, and four deep learning models to evaluate the implications of continuous factor grading. Three grading strategies were explored: non-grading, equal intervals, and natural breaks. Further investigation was conducted to determine how various grade levels (e.g., 4, 6, 8, 12, 16, 20) affect model efficacy. Our analysis reveals that the Support Vector Machine (SVM) model performs optimally with an 8-level grading using natural breaks. In contrast, a decision tree (DT) and its associated ensemble models are more effective without grading. For Multi-Layer Perceptron Neural Network (MLPNN) and Convolutional Neural Networks (CNN) models, a natural breaks grading exceeding 8 levels is advisable. Gated Recurrent Unit (GRU) and Deep Neural Networks (DNN) models benefit from an equidistant grading strategy of over 12 levels, while Long Short-Term Memory Neural Networks (LSTM) models thrive with an equidistant grading surpassing 16 levels. This study is pioneering in introducing grading guidelines for machine learning models in landslide susceptibility modeling. Our findings offer invaluable insights for future research, setting a path towards more standardized practices in this field. This enhances the bridge between theoretical knowledge and its real-world application, promoting a more rigorous and systematic grading approach and advancing the standardization of landslide susceptibility modeling.
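
The equal-interval grading strategy mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name and the sample slope values are hypothetical, and the natural-breaks strategy is not shown.

```python
def equal_interval_grades(values, levels):
    """Map each continuous value to a grade in 0..levels-1 using equal-width bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so every sample falls in the first grade.
        return [0] * len(values)
    width = (hi - lo) / levels
    # The maximum value would land in bin `levels`, so clamp it to the top grade.
    return [min(int((v - lo) / width), levels - 1) for v in values]


# Hypothetical slope angles in degrees, graded into 4 levels.
slopes = [0, 10, 20, 30, 40]
grades = equal_interval_grades(slopes, 4)
```

Changing `levels` corresponds to the grade counts compared in the study (for example 4, 8, or 16), while the non-grading strategy simply passes the raw values to the model.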

Intelligence Inference on IoT Devices
Chapter

Qiyang Zhang, Ying Li, Dingge Zhang, Ilir Murturi, Victor Casamayor Pujol, Schahram Dustdar, Shangguang Wang

Edge Computing

Intelligent Inference IoT Edge Intelligence

Learning Techniques for the Internet of Things

2023

With the rapid advancement of artificial intelligence (AI), the proliferation of deep neural networks (DNNs) has ushered in a transformative era, revolutionizing modern lifestyles and enhancing production efficiency. However, the substantial computational and data requirements generated by Internet of Things (IoT) devices present a significant bottleneck, rendering traditional cloud-based computing models inadequate for real-time processing tasks. In response to these challenges, developers have increasingly turned to cloud offloading as a solution, despite the high infrastructure costs and heavy reliance on network conditions associated with this approach. Meanwhile, the emergence of SoCs has enabled on-device execution, particularly on high-tier platforms capable of effectively handling SOTA DNNs. This chapter offers a comprehensive review of intelligent inference approaches, with a specific emphasis on reducing inference time and minimizing transmitted bandwidth between IoT devices and the cloud. The review encompasses various aspects, including the background of inference, hardware architectures supporting inference, a diverse range of intelligent applications, inference libraries tailored for IoT devices, and different types of inference techniques for applications. Additionally, this work addresses the current challenges in intelligent inference, discusses future development trends, and provides future research directions.