SQL Query Explainability using Natural Language Generation
This work is rooted in a larger project aimed at developing a dialogue system that helps increase transparency of database query outputs for non-expert SQL users. Previously, I’ve discussed processes for building a training set using few-shot prompting and a hand-annotated set of commented queries. Additionally, I’ve discussed test set results from LLMs (such as ChatGPT and Llama). This presentation will shift focus to the content of the natural language.
I’ll discuss the development of comment guidelines and the need for guidelines in standardizing the evaluation. Comment guidelines should ideally provide transparency in what constitutes a “good” comment. Comments should also 1) reflect certain properties of the relational database structure, 2) prioritize semantic fidelity to the query and 3) align with the user language wherever appropriate. The comment guidelines use these core elements to outline how generated natural language can increase explainability of database queries. Our methods will be compared to approaches that leverage templated or rule-based systems of explainability.