AUTOMATED DETECTION AND COMPARATIVE ANALYSIS OF SQL ANTIPATTERNS IN TEXT-TO-SQL DATASETS
This study examines the quality of SQL queries in Text-to-SQL datasets, focusing on antipatterns in the gold-standard queries used to train and evaluate large language models (LLMs). Although such queries may be syntactically valid and executable, they can still contain structural flaws that affect correctness, portability, and benchmark reliability. Because these flaws do not prevent successful execution, conventional validation methods often fail to detect them, a situation we call the correctness paradox.
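As an illustration of how a query can execute successfully and still be flawed, the following minimal sketch uses Python's built-in `sqlite3` module. The schema, the data, and the choice of antipattern (`NOT IN` over a subquery that can return NULL) are assumptions made for this example and are not taken from the datasets studied.

```python
import sqlite3

# Hypothetical schema and data, chosen only to illustrate the idea.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bob'), (3, 'Cy');
INSERT INTO orders VALUES (10, 1), (11, NULL);
""")

# Intended question: which customers have no orders?
# Antipattern: NOT IN over a subquery that can yield NULL. The query runs
# without error, but the NULL makes every comparison evaluate to unknown.
flawed = conn.execute(
    "SELECT name FROM customers "
    "WHERE id NOT IN (SELECT customer_id FROM orders) "
    "ORDER BY name"
).fetchall()

# NULL-safe formulation of the same question using NOT EXISTS.
safe = conn.execute(
    "SELECT name FROM customers c "
    "WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id) "
    "ORDER BY name"
).fetchall()

print(flawed)  # []
print(safe)    # [('Bob',), ('Cy',)]
```

Both statements parse and execute, so a check based only on successful execution would accept the first query even though it returns no rows for a question whose answer is two customers.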