The continuously growing number of users, and of the requests they send to the server, demands substantial resources to ensure fast responses without delays. However, server load is inherently distributed unevenly over the day, week, or month. Accurately predicting the required resources and dynamically managing their allocation is therefore crucial: it can yield significant savings in server maintenance costs without compromising the user experience. This study investigates the influence of the choice of activation function on the forecasting accuracy of Long Short-Term Memory (LSTM) neural networks applied to real-world server request data. A dataset of incoming server requests was collected and aggregated into 20-minute intervals over 16 consecutive days. Several activation functions, including ReLU, Swish, and Softplus, were evaluated using mean squared error (MSE) as the primary performance metric. Each model configuration was trained six times to ensure statistical reliability, and the reported results were taken from one of the most stable runs. The experiments demonstrate that the choice of activation function has a significant impact on prediction accuracy: Swish and ReLU achieved the lowest MSE values, reducing the error by 6.6–12.3% and 10.5–16.3%, respectively, relative to the baseline. Although the sigmoid function yielded the lowest test loss, further analysis revealed that this result was misleading: the model systematically underestimated peak loads, which produced low error values but poor fidelity to the actual server load dynamics. These findings support the hypothesis that the choice of activation function is a critical factor in optimizing LSTM-based forecasting models for server load prediction.
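As a reproduction aid, the minimal PyTorch sketch below illustrates the kind of experiment the study describes: an LSTM forecaster whose activation function is a swappable hyperparameter, trained and scored with MSE. The placement of the compared activation (here, between the last LSTM output and the regression head), the hidden size, the 12-interval look-back window, and the synthetic stand-in series are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

# Candidate activations under comparison. The exact set and where each is
# applied inside the network are assumptions for illustration; the study
# compares ReLU, Swish (SiLU), Softplus, and sigmoid against a baseline.
ACTIVATIONS = {
    "tanh": nn.Tanh(),
    "relu": nn.ReLU(),
    "swish": nn.SiLU(),        # Swish with beta = 1
    "softplus": nn.Softplus(),
    "sigmoid": nn.Sigmoid(),
}


class LSTMForecaster(nn.Module):
    """One-step-ahead load forecaster: LSTM encoder plus a dense head.

    The compared activation is applied between the final LSTM output and
    the regression head; this placement is an assumption of the sketch.
    """

    def __init__(self, activation: str, input_size: int = 1, hidden_size: int = 64):
        super().__init__()
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        self.act = ACTIVATIONS[activation]
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x):                  # x: (batch, seq_len, 1)
        out, _ = self.lstm(x)              # (batch, seq_len, hidden)
        last = out[:, -1, :]               # last time step only
        return self.head(self.act(last))   # predict the next interval


def train_once(model, x, y, epochs: int = 100, lr: float = 1e-3) -> float:
    """Full-batch training against MSE, the study's primary metric."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    return loss.item()


if __name__ == "__main__":
    # Synthetic stand-in for the request-count series: 20-minute intervals
    # over 16 days gives 72 points per day, 1152 in total.
    t = torch.arange(0, 72 * 16, dtype=torch.float32)
    series = torch.sin(2 * torch.pi * t / 72) + 0.1 * torch.randn_like(t)

    window = 12  # look back 12 intervals (4 hours); an assumed value
    x = torch.stack([series[i:i + window] for i in range(len(series) - window)])
    x = x.unsqueeze(-1)                    # (samples, window, 1)
    y = series[window:].unsqueeze(-1)      # (samples, 1)

    for name in ACTIVATIONS:
        mse = train_once(LSTMForecaster(name), x, y)
        print(f"{name:9s} -> train MSE {mse:.4f}")
```

Under the study's protocol, each configuration would be trained six times and one of the most stable runs reported; the loop above can be wrapped with repeated seeds to mimic that, and the sigmoid caveat from the abstract suggests inspecting predicted peaks rather than relying on the loss value alone.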