Generative AI for Performance Engineering: Tailoring Llama-3 for Bottleneck Classification and Optimization Recommendations
This paper presents a novel approach to software performance analysis that combines traditional profiling techniques with a fine-tuned large language model (LLM) based on Llama-3. To address the challenges of manual profiling, such as overwhelming data volumes and the high level of expertise required to interpret performance metrics, the study introduces a lightweight AI-powered profiler trained on structured JSON profiling logs and code samples. The model is fine-tuned using parameter-efficient methods (LoRA and QLoRA) to classify performance bottlenecks (