Clippers 2/4: Sam Stevens on DSPy, compiling prompts, and similar work

As language models have evolved, the complexity of prompt engineering has grown alongside them. My talk examines the core insights of DSPy through the lens of plib, a minimalist implementation that highlights key principles often overlooked in current LLM research. I argue that automated few-shot example selection can match or exceed carefully crafted zero-shot prompts, challenging the conventional wisdom of prompt engineering. This framing also suggests a new perspective on compute scaling in language models, positioning “prompt compilation” as a fourth axis alongside pre-training, post-training, and inference-time computation. By treating prompt optimization as a reinforcement learning problem with verifiable rewards, plib offers a systematic approach to example selection. This style of thinking further enables complex language tasks to be decomposed into modular sub-programs, a capability that is difficult to achieve with traditional prompting methods. I will illustrate how many contemporary developments in LLM applications are natural extensions of principles already present in DSPy’s design, and argue for a renewed examination of these foundational ideas in the context of modern language model development.
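
To make the “prompt compilation” idea concrete, here is a minimal sketch of few-shot example selection as search against a verifiable reward. This is not plib’s or DSPy’s actual API: the names (`compile_prompt`, `llm`, `verify`), the prompt format, and the use of simple random search as a stand-in for the reinforcement-learning formulation are all my own assumptions.

```python
import random
from typing import Callable

def compile_prompt(
    instruction: str,
    pool: list[tuple[str, str]],         # candidate (input, output) demonstrations
    devset: list[tuple[str, str]],       # held-out (input, gold answer) pairs
    llm: Callable[[str], str],           # any text-in, text-out model call (assumed)
    verify: Callable[[str, str], bool],  # verifiable reward: does prediction match gold?
    k: int = 4,
    trials: int = 16,
    seed: int = 0,
) -> list[tuple[str, str]]:
    """Search over k-shot demonstration subsets and keep the best-scoring one.

    Random search is a deliberately simple stand-in for whatever optimizer
    a real compiler would use; only the loop structure matters here.
    """
    rng = random.Random(seed)
    best_demos: list[tuple[str, str]] = []
    best_score = -1.0
    for _ in range(trials):
        demos = rng.sample(pool, k)
        prefix = instruction + "\n\n" + "".join(
            f"Input: {x}\nOutput: {y}\n\n" for x, y in demos
        )
        # Score this candidate prompt by its verified accuracy on the dev set.
        score = sum(
            verify(llm(prefix + f"Input: {x}\nOutput:"), gold)
            for x, gold in devset
        ) / len(devset)
        if score > best_score:
            best_demos, best_score = demos, score
    return best_demos
```

The only requirements are a text-in, text-out model and a checkable reward, which is what makes the selection loop systematic rather than artisanal. The decomposition claim then follows naturally: each sub-program can carry its own instruction and its own compiled demonstrations, and sub-programs compose by ordinary function composition. Again, a sketch under assumed names rather than plib’s interface:

```python
class Module:
    """One sub-program: an instruction plus its compiled demonstrations."""

    def __init__(self, instruction: str, demos: list[tuple[str, str]], llm):
        self.instruction, self.demos, self.llm = instruction, demos, llm

    def __call__(self, x: str) -> str:
        prefix = self.instruction + "\n\n" + "".join(
            f"Input: {i}\nOutput: {o}\n\n" for i, o in self.demos
        )
        return self.llm(prefix + f"Input: {x}\nOutput:")

# Hypothetical composition: each stage is compiled independently, then chained.
# summarize = Module("Summarize the passage.", compiled_summary_demos, llm)
# classify = Module("Label the summary's topic.", compiled_topic_demos, llm)
# label = classify(summarize(passage))
```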