Applied Sciences (Switzerland), vol.16, no.2, 2026 (SCI-Expanded, Scopus)
Federated learning (FL) enables privacy-preserving collaboration for Network Intrusion Detection Systems (NIDSs), but its effectiveness under heterogeneous traffic, severe class imbalance, and domain shift remains insufficiently characterized. We evaluate FL in two settings: (i) single-domain training on CICIDS-2017, InSDN/OVS, and 5G-NIDD with cross-domain testing, and (ii) multi-domain training that learns a unified model across enterprise and Software-Defined Network (SDN) traffic. Using consistent preprocessing and controlled ablations over balancing strategy, loss function, and client sampling, we find that dataset structure (class separability) largely determines single-domain FL gains: on datasets with lower separability, FL with Per-Client Synthetic Minority Over-sampling Technique (SMOTE) substantially improves Macro-F1 over centralized baselines, whereas well-separated datasets show limited benefit. However, single-domain models degrade sharply when transferred across domains. To mitigate this, we combine multi-domain FL with AutoEncoder pretraining and achieve 77% Macro-F1 across environments, demonstrating that FL can learn domain-invariant representations when trained on diverse traffic sources. Overall, our results indicate that Per-Client SMOTE is the preferred balancing strategy for federated NIDS, and that multi-domain training is often necessary when deployment environments differ from the training data.
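The Per-Client SMOTE strategy named above can be illustrated with a minimal sketch: each client rebalances its own local partition by synthesizing minority-class samples via nearest-neighbor interpolation before any model updates leave the device, so raw data never crosses client boundaries. The function name, parameters, and the hand-rolled SMOTE below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def per_client_smote(X, y, minority_label, k=3, seed=0):
    """Illustrative sketch (not the paper's code): oversample the
    minority class of ONE client's local data until it matches the
    majority count, using SMOTE-style linear interpolation between
    a minority sample and one of its k nearest minority neighbors."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority_label]
    n_needed = int((y != minority_label).sum()) - len(X_min)
    synthetic = []
    for _ in range(n_needed):
        i = rng.integers(len(X_min))
        # Euclidean distances from sample i to all minority samples.
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbors = np.argsort(d)[1:k + 1]  # k nearest, excluding self
        j = rng.choice(neighbors)
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    X_aug = np.vstack([X, np.asarray(synthetic)])
    y_aug = np.concatenate([y, np.full(n_needed, minority_label)])
    return X_aug, y_aug
```

In a federated round, each client would call this on its own partition before local training; only the resulting model updates are aggregated, preserving the privacy property the abstract describes.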