A HYBRID SECURE AGGREGATION AND DIFFERENTIAL PRIVACY FRAMEWORK FOR COMMUNICATION-EFFICIENT BIG DATA ANALYTICS

Authors:

Subhajit Roy, Rupak Chakraborty, Tapan Chowdhury

DOI NO:

https://doi.org/10.26782/jmcms.2026.06.00003

Keywords:

Federated Learning, Differential Privacy, Secure Aggregation, Privacy-Preserving Machine Learning, Gradient Compression, Distributed Learning, Big Data Analytics

Abstract

Federated Learning (FL) is a form of distributed learning in which several clients (clusters) train a machine learning model without directly sharing raw data. Nevertheless, practical FL systems still face three significant issues: privacy leakage through shared model updates, a heavy communication burden, and unstable learning performance when the data are not independent and identically distributed (non-IID). To improve privacy protection and communication efficiency in distributed big data analytics, the authors introduce a Privacy-Preserving Federated Learning (PP-FL) framework that combines Differential Privacy (DP), Secure Aggregation (SA), and Adaptive Gradient Compression (AGC). Differential Privacy adds calibrated Gaussian noise to local updates, and Secure Aggregation ensures that the central entity cannot see individual client updates. Adaptive Gradient Compression reduces communication by transmitting only the most important components of a gradient. The proposed framework is tested on the high-resolution MNIST dataset and the CIFAR-10 dataset in non-IID federated settings. The experimental results demonstrate that PP-FL significantly reduces communication cost compared to standard FedAvg and DP-FedAvg. The results also illustrate a clear trade-off between privacy and utility: added noise (differential privacy), departure from an IID data distribution, or extreme compression can each reduce classification accuracy. This behaviour is explained through revised studies comprising a component-wise interpretation, observation of the gradient norm, and evaluation of the compression ratio. Overall, the results suggest that the proposed PP-FL system can deliver communication-efficient and privacy-preserving federated learning, provided that the noise and compression parameters are carefully tuned to keep learning stable.
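The three mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the algorithmic details, so the order of operations, the top-k rule standing in for "most important components", the fixed-point encoding, and every function name and constant below are assumptions made for illustration. Real secure aggregation derives the pairwise masks from key agreement between clients; here a single random generator plays that role.

```python
import math
import random

MODULUS = 2 ** 32   # assumed ring size for additive masking
SCALE = 10 ** 4     # assumed fixed-point scale for encoding floats


def clip_and_add_noise(update, clip_norm, noise_multiplier, rng):
    """Clip the update to L2 norm clip_norm, then add Gaussian noise."""
    norm = math.sqrt(sum(x * x for x in update))
    factor = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    sigma = noise_multiplier * clip_norm
    return [x * factor + rng.gauss(0.0, sigma) for x in update]


def top_k_sparsify(update, k):
    """Keep the k largest-magnitude entries and zero out the rest."""
    order = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    keep = set(order[:k])
    return [x if i in keep else 0.0 for i, x in enumerate(update)]


def encode(update):
    """Map floats to fixed-point residues modulo MODULUS."""
    return [round(x * SCALE) % MODULUS for x in update]


def decode(values):
    """Map residues back to signed fixed-point numbers, then to floats."""
    half = MODULUS // 2
    return [((v + half) % MODULUS - half) / SCALE for v in values]


def pairwise_masks(num_clients, dim, rng):
    """For each pair (i, j), i adds a random mask and j subtracts it,
    so all masks cancel when the clients' vectors are summed."""
    masks = [[0] * dim for _ in range(num_clients)]
    for i in range(num_clients):
        for j in range(i + 1, num_clients):
            for d in range(dim):
                m = rng.randrange(MODULUS)
                masks[i][d] = (masks[i][d] + m) % MODULUS
                masks[j][d] = (masks[j][d] - m) % MODULUS
    return masks


def mask_updates(encoded_updates, masks):
    """Each client adds its own mask before sending to the server."""
    return [[(u + m) % MODULUS for u, m in zip(upd, msk)]
            for upd, msk in zip(encoded_updates, masks)]


def aggregate_masked(masked_updates):
    """Server-side sum: it only ever sees the masked vectors."""
    return [sum(column) % MODULUS for column in zip(*masked_updates)]


# Toy round with 3 clients and 8-dimensional updates (hypothetical values).
rng = random.Random(0)
raw_updates = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(3)]
processed = [top_k_sparsify(clip_and_add_noise(u, 1.0, 0.1, rng), k=3)
             for u in raw_updates]
masked = mask_updates([encode(u) for u in processed], pairwise_masks(3, 8, rng))
aggregate = decode(aggregate_masked(masked))
plain_sum = [sum(column) for column in zip(*processed)]
```

In this toy round, `aggregate` matches `plain_sum` up to fixed-point rounding even though the server only handles masked vectors. The masks here are dense, so the sketch does not show how sparsity is preserved under masking; the abstract does not say how the framework reconciles the two.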

