ML Infrastructure

DATA PREPARATION STRATEGIES IN KUBEFLOW FOR CLOUD-NATIVE AI SYSTEMS

This article presents the main findings of a study of data preparation strategies with Kubeflow in cloud-native AI systems deployed on Azure Kubernetes Service (AKS). The results indicate that integrating Kubeflow Pipelines with Azure-native tools enables scalable, automated processing of large datasets, which in turn improves training efficiency and model accuracy. TensorFlow Data Validation (TFDV) proved effective at detecting schema anomalies and data drift, improving data reliability across iterative ML workflows.
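
To make the two validation checks named above concrete, the following is a minimal, library-free Python sketch of what "schema anomaly" and "data drift" detection amount to. It is not the TensorFlow Data Validation (TFDV) API and not the pipeline used in the study: the function names, the sample rows, and the 0.1 threshold are all hypothetical, chosen only for illustration. Drift is measured here with the L-infinity distance between category frequencies, which is the measure TFDV's documentation describes for categorical features.

```python
from collections import Counter


def infer_schema(rows):
    """Record, per column, the set of Python type names seen in baseline rows."""
    schema = {}
    for row in rows:
        for name, value in row.items():
            schema.setdefault(name, set()).add(type(value).__name__)
    return schema


def find_schema_anomalies(rows, schema):
    """Return (row_index, column, reason) for every deviation from the schema."""
    anomalies = []
    for i, row in enumerate(rows):
        # Columns the schema expects but this row does not carry.
        for name in sorted(schema.keys() - row.keys()):
            anomalies.append((i, name, "missing column"))
        for name, value in row.items():
            if name not in schema:
                anomalies.append((i, name, "unexpected column"))
            elif type(value).__name__ not in schema[name]:
                anomalies.append((i, name, "unexpected type"))
    return anomalies


def linf_drift(baseline_values, current_values):
    """L-infinity distance between the two normalized category frequencies."""
    if not baseline_values or not current_values:
        raise ValueError("both value lists must be non-empty")
    base, cur = Counter(baseline_values), Counter(current_values)
    n_base, n_cur = len(baseline_values), len(current_values)
    return max(
        abs(base[key] / n_base - cur[key] / n_cur)
        for key in base.keys() | cur.keys()
    )


# Hypothetical sample data: a baseline batch and a later batch to validate.
baseline = [
    {"region": "eu", "amount": 10.0},
    {"region": "us", "amount": 12.5},
    {"region": "eu", "amount": 7.0},
    {"region": "us", "amount": 3.0},
]
current = [
    {"region": "eu", "amount": "9.0"},              # amount arrives as a string
    {"region": "eu"},                               # amount is missing
    {"region": "eu", "amount": 4.0, "note": "x"},   # extra column
    {"region": "us", "amount": 1.0},
]

DRIFT_THRESHOLD = 0.1  # illustrative value, not taken from the study

schema = infer_schema(baseline)
for anomaly in find_schema_anomalies(current, schema):
    print("schema anomaly:", anomaly)

distance = linf_drift(
    [row["region"] for row in baseline],
    [row["region"] for row in current],
)
print("region drift:", distance, "exceeds threshold:", distance > DRIFT_THRESHOLD)
```

In a Kubeflow pipeline, checks of this kind would typically run as their own step ahead of training, so that a batch with anomalies or excessive drift can stop the run before it reaches the model.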