Ralph Wiggum Integration
Iterative AI loops that self-correct until the task is actually complete.
The Problem
Single-shot AI attempts often fail on complex tasks. Ask Claude to implement a feature in one go, and you might get something that's 80% right. But that 20% means broken tests, missed edge cases, or subtle bugs.
You end up going back and forth, manually iterating until it works. That defeats the purpose of autonomous development.
The Solution
Ralph Wiggum feeds the same prompt repeatedly. Each iteration:
- Claude sees its previous work in the files
- Runs tests and sees what's failing
- Fixes issues and improves the implementation
- Repeats until the task is actually complete
Think of it like automated code review cycles. Junior dev writes code, tests fail, they fix it, tests pass, PR merged. Ralph does this loop automatically.
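The loop described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Sugar's implementation: `ralph_loop`, `run_agent`, and `fake_agent` are hypothetical names, and the agent is modeled as a plain function that takes the prompt and returns its text output.

```python
from typing import Callable


def ralph_loop(
    prompt: str,
    run_agent: Callable[[str], str],  # hypothetical stand-in for one agent run
    completion_promise: str = "DONE",
    max_iterations: int = 10,
) -> dict:
    """Feed the same prompt repeatedly until the promise appears or the limit is hit."""
    marker = f"<promise>{completion_promise}</promise>"
    for iteration in range(1, max_iterations + 1):
        # The prompt never changes; progress lives in the files the agent edits.
        output = run_agent(prompt)
        if marker in output:
            return {"complete": True, "iterations": iteration}
    return {"complete": False, "iterations": max_iterations}


# Usage with a fake agent that "finishes" on its third attempt.
attempts = []


def fake_agent(prompt: str) -> str:
    attempts.append(prompt)
    return "<promise>DONE</promise>" if len(attempts) == 3 else "tests still failing"


result = ralph_loop("Implement rate limiting", fake_agent)
```

The iteration cap is what keeps the loop safe when the promise never shows up.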
Configuration
Enable Ralph in your .sugar/config.yaml:
sugar:
  ralph:
    enabled: true
    # Safety: max iterations before giving up
    max_iterations: 10
    iteration_timeout: 300 # 5 min per iteration
    # How Claude signals "I'm done"
    completion_promise: "DONE"
    # IMPORTANT: Require explicit completion criteria
    require_completion_criteria: true
    # Run quality gates between iterations
    quality_gates_enabled: true
    stop_on_gate_failure: false # Keep trying even if tests fail
Usage
Via CLI
# Add a task with Ralph enabled
sugar add "Implement rate limiting" --ralph --max-iterations 10
# Sugar will iterate until tests pass
Via Task Queue
# Set ralph_enabled in task context
{
  "type": "feature",
  "title": "Implement rate limiting",
  "context": {
    "ralph_enabled": true,
    "max_iterations": 15
  }
}
Intelligent Triage (New in v3.3)
Not sure if your task needs Ralph? Use --triage to let Sugar decide:
# Let Sugar analyze and decide
sugar add "Refactor auth to repository pattern" --triage
# Sugar analyzes:
# - Task complexity (keywords, scope, risk factors)
# - Codebase capabilities (test frameworks, linters, CI)
# - Generates completion criteria automatically
# Output:
# 🔍 Triage: Ralph recommended (78% confidence)
# Completion: "All tests pass, no lint errors"
# ✓ Task added with Ralph mode enabled
How Triage Works
Triage analyzes your task across multiple dimensions:
- Complexity scoring - Keywords like "refactor", "migrate", "integrate" suggest higher complexity
- Scope detection - Multi-file changes, database migrations, API changes increase score
- Risk factors - Security, authentication, payment systems get flagged
- Codebase scanning - Detects pytest, jest, eslint, mypy, GitHub Actions, etc.
If confidence is 60%+, Ralph mode is auto-enabled. Simple tasks like typo fixes run single-pass.
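As a rough illustration of keyword-based scoring with a 60% cutoff, here is a minimal sketch. It is not Sugar's triage logic: the `triage` function, the two weights, and the reduction of "risk factors" to three substrings are all assumptions made for the example, so it will not reproduce the percentages Sugar prints, and it ignores scope detection and codebase scanning entirely.

```python
# Keywords taken from the dimensions listed above.
COMPLEXITY_KEYWORDS = ("refactor", "migrate", "integrate")
RISK_KEYWORDS = ("security", "auth", "payment")  # "auth" also matches "authentication"

# Assumed weights in percentage points; purely illustrative.
COMPLEXITY_WEIGHT = 60
RISK_WEIGHT = 20
RALPH_THRESHOLD = 60  # "60%+" auto-enables Ralph mode


def triage(title: str) -> dict:
    """Score a task title and decide between Ralph mode and single-pass."""
    text = title.lower()
    confidence = 0
    if any(keyword in text for keyword in COMPLEXITY_KEYWORDS):
        confidence += COMPLEXITY_WEIGHT
    if any(keyword in text for keyword in RISK_KEYWORDS):
        confidence += RISK_WEIGHT
    return {"confidence": confidence, "use_ralph": confidence >= RALPH_THRESHOLD}


simple = triage("Fix typo in README")
complex_task = triage("Refactor auth to repository pattern")
```

Plain substring matching is crude (for example, "auth" also matches "author"); a real scorer would need word boundaries and far more signals.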
Triage Examples
# Simple task - triage recommends single-pass
sugar add "Fix typo in README" --triage
# 🔍 Triage: Single-pass recommended (low complexity)
# Complex refactor - triage enables Ralph
sugar add "Migrate from REST to GraphQL" --type refactor --triage
# 🔍 Triage: Ralph recommended (85% confidence)
# ✓ Task added with Ralph mode enabled
# Bug with tests - triage detects test framework
sugar add "Fix race condition in worker pool" --type bug_fix --triage
# 🔍 Triage: Ralph recommended (72% confidence)
# Detected: pytest, mypy
# Completion: "All tests pass, type checks clean"
Prompt Format
For Ralph to know when a task is complete, your prompt needs completion criteria:
Implement rate limiting for the API.
Requirements:
- 100 requests/minute per IP
- Redis-backed for distributed support
- 429 response when exceeded
When complete:
- All rate limit tests pass
- Integration test verifies Redis
- Output: <promise>DONE</promise>
The <promise> tag is how Claude signals completion. Without it, Ralph doesn't know when to stop.
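One plausible way to detect the tag in an agent's output is a regular-expression search. The sketch below assumes the promise text must match the configured `completion_promise` exactly; `find_promise` and `is_complete` are hypothetical helper names, not part of Sugar's API.

```python
import re
from typing import Optional

# Non-greedy match so only the first <promise>...</promise> pair is captured.
PROMISE_RE = re.compile(r"<promise>(.*?)</promise>", re.DOTALL)


def find_promise(output: str) -> Optional[str]:
    """Return the text inside the first <promise> tag, or None if there is none."""
    match = PROMISE_RE.search(output)
    return match.group(1).strip() if match else None


def is_complete(output: str, completion_promise: str = "DONE") -> bool:
    """True only when the output contains the expected promise text."""
    return find_promise(output) == completion_promise


finished = is_complete("All rate limit tests pass.\n<promise>DONE</promise>")
unfinished = is_complete("2 tests still failing")
```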
How It Works
When to Use Ralph
| Task Type | Without Ralph | With Ralph |
|---|---|---|
| Simple bug fix | Works fine | Overkill |
| New feature | Hit or miss | Iterates until working |
| Complex refactor | Often breaks things | Self-corrects |
| TDD implementation | Tests often fail | Keeps going until green |
| Flaky test debugging | Might give up | Tries multiple approaches |
Safety Features
Completion Criteria Required
Sugar validates prompts BEFORE starting Ralph loops. Tasks must include:
- A <promise> tag (completion signal), OR
- A --max-iterations flag (safety limit)
Without valid criteria, the task is rejected. This prevents infinite loops.
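The either/or rule above can be expressed as a small pre-flight check. This is a sketch of the rule as stated, not the behavior of Sugar's own validator; `has_completion_criteria` is a hypothetical name, and treating a non-positive limit as "no limit" is an assumption.

```python
import re
from typing import Optional


def has_completion_criteria(prompt: str, max_iterations: Optional[int] = None) -> bool:
    """Accept a task if it has a non-empty <promise> tag OR an explicit iteration limit."""
    has_promise = re.search(r"<promise>.+?</promise>", prompt, re.DOTALL) is not None
    # Assumption: zero or negative limits do not count as a safety limit.
    has_limit = max_iterations is not None and max_iterations > 0
    return has_promise or has_limit


accepted = has_completion_criteria("Fix the leak. Output: <promise>DONE</promise>")
rejected = has_completion_criteria("Fix the leak.")
```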
Max Iterations
The default safety limit is 10 iterations. Raise it for complex tasks:
sugar add "Complex refactor" --ralph --max-iterations 25
Stuck Detection
Ralph detects when tasks are stuck and stops automatically if Claude outputs phrases like:
- "cannot proceed"
- "blocked by"
- "need human intervention"
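Phrase-based stuck detection like this can be approximated with a case-insensitive substring check. The sketch below uses only the three phrases listed above; `is_stuck` is a hypothetical helper, and Sugar's actual phrase list and matching rules may differ.

```python
# Phrases from the list above; matching is case-insensitive.
STUCK_PHRASES = ("cannot proceed", "blocked by", "need human intervention")


def is_stuck(output: str) -> bool:
    """Return True if the agent's output contains any known 'stuck' phrase."""
    text = output.lower()
    return any(phrase in text for phrase in STUCK_PHRASES)


stuck = is_stuck("I cannot proceed without database credentials.")
working = is_stuck("Fixed two failing tests, three remain.")
```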
Example Prompts
Bug Fix
Debug and fix the memory leak in the WebSocket handler.
Context:
- Memory grows unbounded after ~1000 connections
- Suspected: event listeners not cleaned up
When complete:
- Memory stable under 1-hour load test
- All WebSocket tests pass
- Output: <promise>MEMORY LEAK FIXED</promise>
TDD Feature
Implement user authentication using TDD.
Requirements:
- JWT-based auth
- Refresh token support
- Rate limiting on login
Approach:
- Write tests FIRST
- Implement until tests pass
- Refactor for clarity
When complete:
- All auth tests pass
- Security audit passes
- Output: <promise>AUTH COMPLETE</promise>
Refactoring
Refactor UserService to use repository pattern.
Goals:
- Extract data access to UserRepository
- UserService depends on interface only
- Maintain 100% test coverage
When complete:
- All existing tests pass
- New repository tests added
- No direct DB calls in UserService
- Output: <promise>REFACTOR DONE</promise>
API Reference
RalphConfig
from sugar.ralph import RalphConfig
config = RalphConfig(
    max_iterations=10, # Safety limit
    completion_promise="DONE", # Promise text to detect
    require_completion_criteria=True,
    min_confidence=0.8,
    iteration_timeout=300,
    quality_gates_enabled=True,
    stop_on_gate_failure=False,
)
CompletionCriteriaValidator
from sugar.ralph import CompletionCriteriaValidator
validator = CompletionCriteriaValidator(strict=True)
result = validator.validate(prompt)
if not result.is_valid:
    print(result.errors)
    print(result.suggestions)
RalphWiggumProfile
from sugar.ralph import RalphWiggumProfile, RalphConfig
# `config` is the RalphConfig instance from the RalphConfig example;
# `prompt` and `agent` are supplied by the caller.
# The `await` calls below must run inside an async function.
profile = RalphWiggumProfile(ralph_config=config)
# Validate and process input
result = await profile.process_input({"prompt": prompt})
# Check if should continue iterating
while profile.should_continue():
    output = await agent.execute(prompt)
    result = await profile.process_output(output)
    if result["complete"]:
        break
# Get final stats
stats = profile.get_iteration_stats()
Learn More
- Original Ralph Wiggum technique by Geoffrey Huntley
- Ralph Orchestrator - Standalone implementation