
LLM in a Loop: Improving outputs with evals

An overview with example (Python) code

11 min read · Apr 15, 2025

This is the 4th article in a series on AI agents. We typically improve an LLM's performance on a task through prompt engineering, but this approach has a core limitation: it relies on the model's ability to execute the task accurately in a single shot. Here, I'll discuss another way to improve an LLM system: feedback loops. I'll start with an overview of key concepts and then walk through an example with Python code.


AI systems like ChatGPT are powerful for two main reasons. First, they can execute arbitrary tasks based on natural language inputs. Second, users can provide feedback in real time to refine model outputs.

For example, you might have ChatGPT rewrite your resume. If its initial response isn’t what you want, you can give it specific feedback and have it try again.

After a few turns of conversation, you eventually get an improved resume that’ll help you land your dream job.
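To make this concrete, here's a minimal sketch of what putting an LLM in a loop might look like in Python. It uses the OpenAI SDK (and assumes an API key is set in your environment); the task string, the gpt-4o-mini model choice, and the word-count check inside evaluate() are placeholder assumptions for illustration, not the implementation from this series. In a real system, the eval could be a rule-based check, an LLM judge, or a human reviewer.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def generate(task: str, prior_output: str | None = None, feedback: str | None = None) -> str:
    """One model call; if a prior attempt and feedback exist, ask for a revision."""
    messages = [{"role": "user", "content": task}]
    if prior_output and feedback:
        messages.append({"role": "assistant", "content": prior_output})
        messages.append({"role": "user", "content": f"Please revise based on this feedback: {feedback}"})
    response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return response.choices[0].message.content


def evaluate(output: str) -> tuple[bool, str]:
    """Hypothetical eval: a simple word-count proxy for 'fits on one page'.
    Swap in a rule-based check, an LLM judge, or a human review as needed."""
    if len(output.split()) > 400:
        return False, "The resume is too long. Tighten it to fit on a single page."
    return True, ""


task = "Rewrite this resume to emphasize data science experience: ..."  # placeholder input

output = generate(task)
for _ in range(3):  # cap retries so a failing eval can't loop forever
    passed, feedback = evaluate(output)
    if passed:
        break
    output = generate(task, prior_output=output, feedback=feedback)

print(output)
```

The structure mirrors the resume conversation above, except the eval plays the role of the human reviewer, and the retry cap keeps the loop from running indefinitely when the eval never passes.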

The Power of Iteration
