Self-Learning Agent

A self-learning agent improves its own skills over time — automatically. It runs a skill, evaluates the output against a goal, rewrites the skill to do better, and repeats. Combined with QMD memory, it also accumulates knowledge across sessions so each run starts smarter than the last.

The problem

Skills drift. A skill that worked well three months ago may produce mediocre output today because your standards have changed, new tools are available, or the skill was never quite right to begin with. Manually refining skills is slow. A self-learning loop automates the iteration.

What you need

The Self-Improving Agent skill installed
A skill you want to improve (or start fresh with any prompt-based workflow)
Optionally: QMD memory for cross-session learning

Install the Self-Improving Agent skill if you haven't already:

npx clawhub@latest install self-improving-agent

How the loop works

The self-improving agent follows a tight cycle:

Run — executes your skill with the test input
Evaluate — scores the output against your stated goal
Rewrite — edits the skill file to address weaknesses
Repeat — runs again with the improved skill

Once it has completed the number of cycles you set as max iterations, it keeps the best-performing version and shows you what changed.
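
The cycle above can be sketched in Python. This is a minimal illustration under stated assumptions, not the skill's actual implementation: `run_skill`, `score_output`, and `rewrite_skill` are hypothetical callables standing in for whatever your Claw does internally, and counting the baseline run separately from the iterations is also an assumption.

```python
from typing import Callable, Tuple


def improve_skill(
    skill: str,
    test_input: str,
    run_skill: Callable[[str, str], str],            # hypothetical: (skill, input) -> output
    score_output: Callable[[str], float],            # hypothetical: higher is better
    rewrite_skill: Callable[[str, str, float], str], # hypothetical: returns new skill text
    max_iterations: int = 3,
) -> Tuple[str, float]:
    """Run, evaluate, rewrite, repeat; return the best-scoring skill and its score."""
    current = skill
    output = run_skill(current, test_input)   # Run the starting version as a baseline
    score = score_output(output)              # Evaluate it against the goal
    best_skill, best_score = current, score

    for _ in range(max_iterations):
        current = rewrite_skill(current, output, score)  # Rewrite to address weaknesses
        output = run_skill(current, test_input)          # Repeat: run the new version
        score = score_output(output)
        if score > best_score:                           # Keep only strict improvements
            best_skill, best_score = current, score

    return best_skill, best_score
```

Tracking the best version separately from the current one is what lets a loop like this recover when a later rewrite makes things worse.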

Steps

Pick a skill to improve

Choose any skill that produces output you can evaluate — a research brief, a code review checklist, a morning summary format. The clearer your goal, the better the loop performs.

For this recipe we'll use the competitor-brief skill as an example, but the pattern applies to any skill.

Define your goal

Your goal needs to be concrete enough that your Claw can score output against it. Vague goals produce mediocre improvements.

Weak goal	Strong goal
"Make it better"	"All key information in under 300 words with no filler sentences"
"More accurate"	"Every factual claim should be sourced; flag anything unverified"
"More useful"	"Output should include a recommended action and a risk to watch"

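One way to test whether a goal is concrete enough is to ask whether a simple program could check it. The sketch below does that for two of the strong goals in the table. It is only an illustration: your Claw does its own scoring, and `check_goal`, `GoalCheck`, and the phrase-matching approach are hypothetical names and choices made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class GoalCheck:
    word_count: int
    within_limit: bool
    missing_phrases: List[str] = field(default_factory=list)

    @property
    def passed(self) -> bool:
        return self.within_limit and not self.missing_phrases


def check_goal(
    output: str,
    max_words: int = 300,
    required_phrases: Sequence[str] = (),
) -> GoalCheck:
    """Check an output against a word limit and a list of phrases it must contain."""
    word_count = len(output.split())          # whitespace-separated words
    lowered = output.lower()
    missing = [p for p in required_phrases if p.lower() not in lowered]
    # "Under 300 words" is read as strictly fewer than max_words.
    return GoalCheck(word_count, word_count < max_words, missing)
```

For the "under 300 words" goal, `check_goal(brief).within_limit` is enough. For the "recommended action and a risk to watch" goal, you could pass `required_phrases=("recommended action", "risk to watch")`. A goal like "Make it better" gives a checker nothing to measure, which is why it produces weak results.
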
Run the self-improving loop

Use the self-improving-agent skill on my competitor-brief skill.
Goal: the brief should be more concise — all key information in under 300 words.
Test input: research Notion as a competitor.
Max iterations: 3.

Your Claw will run the loop and report what it changed after each iteration.

Save the improved skill

After the loop completes, your Claw will have updated the skill file in place. Confirm the result looks right:

Show me the current version of my competitor-brief skill.

If you're happy with it, it's already saved. If not, tell your Claw what to adjust manually.

(Optional) Save learning to QMD memory

If you have QMD set up, save what the loop learned so future improvement runs start from a stronger baseline:

Save a summary of what the self-improving agent changed in the competitor-brief skill to the workspace collection. Include the goal, the before and after versions, and what was adjusted.

What to expect

After a 3-iteration loop, a well-defined goal typically produces a noticeably tighter skill. You'll see the word count drop, structure improve, or output format tighten — depending on what you asked for.

The loop won't always converge perfectly. Sometimes it overshoots (strips too much) or stalls (makes the same tweak repeatedly). When that happens, add one specific constraint to the goal and re-run.
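
If you review iterations by hand, a rough heuristic can help tell the two failure modes apart. The sketch below is an assumption-heavy illustration, not part of the skill: `diagnose_iteration` is a hypothetical helper, the 0.98 similarity threshold is arbitrary, and treating a score drop as an overshoot is a simplification.

```python
import difflib


def diagnose_iteration(
    prev_skill: str,
    new_skill: str,
    prev_score: float,
    new_score: float,
    similarity_threshold: float = 0.98,  # arbitrary cutoff chosen for illustration
) -> str:
    """Classify one rewrite as 'stalled', 'overshot', or 'progressing'."""
    similarity = difflib.SequenceMatcher(None, prev_skill, new_skill).ratio()
    if similarity >= similarity_threshold:
        return "stalled"      # the rewrite barely changed the skill text
    if new_score < prev_score:
        return "overshot"     # the text changed but the output scored worse
    return "progressing"
```

Either of the first two results is a cue to add one specific constraint to the goal before re-running.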

Patterns for ongoing self-learning

Weekly skill audit:

Run the self-improving-agent on my morning-brief skill.
Goal: the brief should surface the 3 most time-sensitive items first, before the full summary.
Test input: use yesterday's actual brief data.
Max iterations: 2.

After collecting feedback:

I've noticed the competitor briefs are too promotional. Run the self-improving-agent.
Goal: "Our differentiation" section should be honest — acknowledge weaknesses, not just strengths.
Test input: research Linear as a competitor.
Max iterations: 2.

New skill from scratch:

Create a new skill called "pr-review-checklist" that reviews a GitHub PR and outputs a structured checklist.
Then run the self-improving-agent on it.
Goal: the checklist should cover security, performance, and test coverage — one line each.
Test input: [paste a real PR diff].
Max iterations: 3.

Tips

Keep max-iterations low (2–3) at first. Each iteration costs tokens. Start with 2, review the result, then run again if you want further refinement.

One goal per run. Don't ask the loop to improve conciseness, accuracy, and formatting at the same time. Pick the most important problem and fix that first.

Back up your skill before iterating. The loop rewrites the skill file in place. Copy it first if you want a fallback:

Copy my competitor-brief skill to ~/skills/competitor-brief-backup.md before running the loop.
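
If you prefer to make the backup yourself, a plain file copy is enough. This is a minimal sketch: the backup path comes from the prompt above, but the location of the original skill file is an assumption, so adjust it to wherever your skills actually live.

```python
import shutil
from pathlib import Path


def back_up_skill(skill_path: Path, backup_path: Path) -> Path:
    """Copy a skill file to a backup location, creating the folder if needed."""
    backup_path.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(skill_path, backup_path)  # copy2 also preserves timestamps
    return backup_path


# Example call (the source path is hypothetical):
# back_up_skill(
#     Path("~/skills/competitor-brief.md").expanduser(),
#     Path("~/skills/competitor-brief-backup.md").expanduser(),
# )
```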

The self-learning pattern works especially well for skills you run on a schedule. A morning brief skill that runs daily and self-improves weekly gets roughly four refinement passes in a month, while one that never changes stays exactly as it was.

See the Self-Improving Agent skill page for full documentation on evaluation criteria and iteration options.
