ADAPTIVE CDN NODE SELECTION IN DYNAMIC ICT SYSTEMS USING ONLINE CONTROLLED EXPERIMENTS AND CHANGE-DETECTION MULTI-ARMED BANDIT ALGORITHMS

2025: 171-186

1, 2, 3 Lviv Polytechnic National University

Modern information and communication technology (ICT) infrastructures such as content-delivery networks (CDNs) must continuously tune low-level parameters to deliver high performance under variable and non-stationary network conditions. This paper investigates how online controlled experiments, including classical A/B tests and adaptive multi-armed bandit (MAB) algorithms, can be used to optimise CDN node selection. We formalise the problem as minimising average request latency, one of the key metrics of network performance. After reviewing prior work on A/B testing and MABs, we propose a Change-Detection Upper-Confidence-Bound (CD-UCB) algorithm that couples the classical UCB arm-selection rule with a cumulative-sum (CUSUM) change-detection statistic. The CD-UCB algorithm rapidly resets its estimates when performance shifts, enabling faster adaptation to non-stationary environments. A simulation of CDN node selection with three nodes having different latency distributions is used to compare four approaches: simple A/B testing, sequential A/B testing with early stopping, standard UCB, and the proposed CD-UCB. Each algorithm is evaluated on cumulative regret, rolling average latency, rolling throughput, and the percentage of requests routed to the optimal node. In a stationary setting, all methods eventually identify the best node, but the MAB-based approaches converge faster and incur lower regret. When the environment changes abruptly, simple and sequential A/B tests fail to adapt and accumulate high regret, while standard UCB adapts only slowly. CD-UCB detects changes quickly and nearly matches the instantaneous optimal policy, achieving the lowest cumulative regret and closely tracking the true optimal latency. The results demonstrate that adaptive MAB algorithms with change detection are better suited than static A/B tests for optimising dynamic ICT infrastructures. The study concludes with recommendations for applying MAB-based online experiments to infrastructure optimisation and suggests future work on multi-objective optimisation, contextual bandits, and evaluation on real-world testbeds.
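For reference, the cumulative regret used as the headline metric above can be written in the standard dynamic-regret form shown below. This is the conventional MAB formulation for a non-stationary, latency-minimisation setting; the paper's exact notation may differ.

% Dynamic cumulative regret over horizon T, where \ell_t(a) is the expected
% latency of node a at step t and a_t is the node chosen by the policy; a
% per-step optimum is used because the environment is non-stationary.
R_T \;=\; \sum_{t=1}^{T} \Bigl( \ell_t(a_t) \;-\; \min_{a}\, \ell_t(a) \Bigr)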
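To make the mechanism concrete, the following minimal Python sketch combines UCB1 arm selection with a two-sided CUSUM detector that resets the learner's statistics when an alarm fires. It is illustrative only: the class name, parameter values, and the Gaussian latency model are assumptions made for this sketch, not the paper's implementation.

import math
import random

class CDUCB:
    """Illustrative UCB1 policy with a per-arm two-sided CUSUM change detector.

    Rewards are negative latencies, so maximising reward minimises latency.
    All parameter values are assumptions for this sketch, not the paper's.
    """

    def __init__(self, n_arms, cusum_threshold=50.0, drift=2.5, exploration=2.0):
        self.n_arms = n_arms
        self.h = cusum_threshold      # CUSUM alarm threshold
        self.eps = drift              # CUSUM drift (slack) term
        self.c = exploration          # UCB exploration coefficient
        self.reset_all()

    def reset_all(self):
        """Forget all estimates, e.g. after a detected change."""
        self.t = 0
        self.counts = [0] * self.n_arms
        self.means = [0.0] * self.n_arms
        self.g_pos = [0.0] * self.n_arms  # CUSUM statistic for upward shifts
        self.g_neg = [0.0] * self.n_arms  # CUSUM statistic for downward shifts

    def select_arm(self):
        self.t += 1
        for a in range(self.n_arms):      # play every arm once before using UCB
            if self.counts[a] == 0:
                return a
        return max(
            range(self.n_arms),
            key=lambda a: self.means[a]
            + math.sqrt(self.c * math.log(self.t) / self.counts[a]),
        )

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]
        # Two-sided CUSUM on the deviation of the new sample from the mean.
        dev = reward - self.means[arm]
        self.g_pos[arm] = max(0.0, self.g_pos[arm] + dev - self.eps)
        self.g_neg[arm] = max(0.0, self.g_neg[arm] - dev - self.eps)
        if self.g_pos[arm] > self.h or self.g_neg[arm] > self.h:
            self.reset_all()              # change detected: discard stale estimates

# Toy simulation: three CDN nodes whose mean latencies swap mid-run.
random.seed(42)
means_before, means_after = [50, 70, 90], [90, 70, 50]  # ms, illustrative
policy = CDUCB(n_arms=3)
regret = 0.0
for step in range(10_000):
    means = means_before if step < 5_000 else means_after
    arm = policy.select_arm()
    latency = random.gauss(means[arm], 5.0)
    policy.update(arm, -latency)          # reward = negative latency
    regret += means[arm] - min(means)     # expected per-step latency regret
print(f"cumulative regret: {regret:.0f} ms")

Because the node that was best before the shift keeps being played by UCB, its post-shift samples feed the CUSUM statistic almost immediately, which is why a reset-on-alarm policy can track the new optimum much faster than plain UCB.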
