Ralph Wiggum Integration

Iterative AI loops that self-correct until the task is actually complete.

New in v3.1: Ralph Wiggum integration is built into Sugar. No additional installation required.

The Problem

Single-shot AI attempts often fail on complex tasks. Ask Claude to implement a feature in one go, and you might get something that's 80% right. The remaining 20% shows up as broken tests, missed edge cases, or subtle bugs.

You end up going back and forth, manually iterating until it works. That defeats the purpose of autonomous development.

The Solution

Ralph Wiggum feeds the same prompt to Claude repeatedly. In each iteration, Claude:

Sees its previous work in the files
Runs the tests and sees what's failing
Fixes issues and improves the implementation
Repeats until the task is actually complete

Think of it like automated code review cycles. A junior dev writes code, the tests fail, they fix it, the tests pass, and the PR is merged. Ralph runs this loop automatically.
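The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not Sugar's actual implementation: `ralph_loop`, `run_agent`, and `make_fake_agent` are hypothetical names, and the agent here is a stand-in that simply "finishes" after a fixed number of calls.

```python
from typing import Callable, Tuple


def ralph_loop(run_agent: Callable[[str], str], prompt: str,
               completion_promise: str = "DONE",
               max_iterations: int = 10) -> Tuple[bool, int]:
    """Feed the same prompt until the promise appears or the limit is hit.

    Returns (completed, iterations_used).
    """
    marker = f"<promise>{completion_promise}</promise>"
    for iteration in range(1, max_iterations + 1):
        output = run_agent(prompt)  # the prompt never changes between iterations
        if marker in output:
            return True, iteration
    return False, max_iterations


def make_fake_agent(finish_on: int) -> Callable[[str], str]:
    """Hypothetical stand-in agent that emits the promise on its Nth call."""
    calls = {"n": 0}

    def agent(prompt: str) -> str:
        calls["n"] += 1
        if calls["n"] >= finish_on:
            return "All tests pass. <promise>DONE</promise>"
        return "Tests still failing"

    return agent


print(ralph_loop(make_fake_agent(3), "Implement rate limiting"))  # (True, 3)
```

In a real run the state that carries over between iterations lives in the files on disk, not in the prompt, which is why the prompt can stay identical.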

Configuration

Enable Ralph in your .sugar/config.yaml:

sugar:
  ralph:
    enabled: true

    # Safety: max iterations before giving up
    max_iterations: 10
    iteration_timeout: 300  # 5 min per iteration

    # How Claude signals "I'm done"
    completion_promise: "DONE"

    # IMPORTANT: Require explicit completion criteria
    require_completion_criteria: true

    # Run quality gates between iterations
    quality_gates_enabled: true
    stop_on_gate_failure: false  # Keep trying even if tests fail
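One plausible reading of how the two quality-gate settings interact is sketched below. The semantics are an assumption drawn from the comments in the config above, and `LoopSettings` and `should_stop_after_gates` are hypothetical names, not part of Sugar's API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LoopSettings:
    """Hypothetical mirror of the gate-related keys in the YAML config."""
    max_iterations: int = 10
    quality_gates_enabled: bool = True
    stop_on_gate_failure: bool = False


def should_stop_after_gates(settings: LoopSettings,
                            gates: List[Callable[[], bool]]) -> bool:
    """Return True if the loop should abort because a gate failed.

    Assumed semantics: gates only run when enabled, and a failure only
    aborts the loop when stop_on_gate_failure is true.
    """
    if not settings.quality_gates_enabled:
        return False
    all_passed = all(gate() for gate in gates)
    return settings.stop_on_gate_failure and not all_passed


failing_gate = lambda: False  # stand-in for "tests failed"
print(should_stop_after_gates(LoopSettings(), [failing_gate]))  # False
print(should_stop_after_gates(LoopSettings(stop_on_gate_failure=True), [failing_gate]))  # True
```

With the defaults shown in the config, a failing gate does not end the loop; it just leaves work for the next iteration.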

Usage

Via CLI

# Add a task with Ralph enabled
sugar add "Implement rate limiting" --ralph --max-iterations 10

# Sugar iterates until the completion criteria are met or the iteration limit is reached

Via Task Queue

# Set ralph_enabled in task context
{
  "type": "feature",
  "title": "Implement rate limiting",
  "context": {
    "ralph_enabled": true,
    "max_iterations": 15
  }
}

Intelligent Triage (New in v3.3)

Not sure if your task needs Ralph? Use --triage to let Sugar decide:

# Let Sugar analyze and decide
sugar add "Refactor auth to repository pattern" --triage

# Sugar analyzes:
# - Task complexity (keywords, scope, risk factors)
# - Codebase capabilities (test frameworks, linters, CI)
# - Generates completion criteria automatically

# Output:
# 🔍 Triage: Ralph recommended (78% confidence)
#    Completion: "All tests pass, no lint errors"
# ✓ Task added with Ralph mode enabled

How Triage Works

Triage analyzes your task across multiple dimensions:

Complexity scoring - Keywords like "refactor", "migrate", "integrate" suggest higher complexity
Scope detection - Multi-file changes, database migrations, API changes increase score
Risk factors - Security, authentication, payment systems get flagged
Codebase scanning - Detects pytest, jest, eslint, mypy, GitHub Actions, etc.

If confidence is 60%+, Ralph mode is auto-enabled. Simple tasks like typo fixes run single-pass.
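To make the scoring idea concrete, here is a toy triage heuristic. The keyword lists come from the examples above, but the point weights, the function names, and the omission of scope detection are all assumptions made for illustration; Sugar's real scoring is not documented here.

```python
from typing import List

# Keywords taken from the examples in this section
COMPLEXITY_KEYWORDS = ("refactor", "migrate", "integrate")
RISK_KEYWORDS = ("security", "auth", "payment")


def triage_confidence(title: str, detected_tools: List[str]) -> int:
    """Return a 0-100 score; the weights are illustrative assumptions."""
    text = title.lower()
    score = 0
    if any(word in text for word in COMPLEXITY_KEYWORDS):
        score += 50  # complexity scoring
    if any(word in text for word in RISK_KEYWORDS):
        score += 25  # risk factors
    if detected_tools:
        score += 25  # codebase has tests or linters to iterate against
    return min(score, 100)


def recommend_ralph(title: str, detected_tools: List[str],
                    threshold: int = 60) -> bool:
    """Enable Ralph mode at or above the 60% threshold."""
    return triage_confidence(title, detected_tools) >= threshold


print(recommend_ralph("Fix typo in README", ["pytest"]))  # False
print(recommend_ralph("Refactor auth to repository pattern", ["pytest", "eslint"]))  # True
```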

Triage Examples

# Simple task - triage recommends single-pass
sugar add "Fix typo in README" --triage
# 🔍 Triage: Single-pass recommended (low complexity)

# Complex refactor - triage enables Ralph
sugar add "Migrate from REST to GraphQL" --type refactor --triage
# 🔍 Triage: Ralph recommended (85% confidence)
# ✓ Task added with Ralph mode enabled

# Bug with tests - triage detects test framework
sugar add "Fix race condition in worker pool" --type bug_fix --triage
# 🔍 Triage: Ralph recommended (72% confidence)
#    Detected: pytest, mypy
#    Completion: "All tests pass, type checks clean"

Prompt Format

For Ralph to know when a task is complete, your prompt needs completion criteria:

Implement rate limiting for the API.

Requirements:
- 100 requests/minute per IP
- Redis-backed for distributed support
- 429 response when exceeded

When complete:
- All rate limit tests pass
- Integration test verifies Redis
- Output: <promise>DONE</promise>

The <promise> tag is how Claude signals completion. Without it, Ralph cannot tell that the task is finished and can only stop at the iteration limit.
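Detecting the completion signal amounts to searching the agent's output for the tag. The sketch below shows one way to do that with a regular expression; `extract_promise` and `is_complete` are hypothetical helpers, and exact-match comparison against the configured promise text is an assumption.

```python
import re
from typing import Optional

# Non-greedy match so only the first tag's content is captured
PROMISE_RE = re.compile(r"<promise>(.*?)</promise>", re.DOTALL)


def extract_promise(output: str) -> Optional[str]:
    """Return the text inside the first <promise> tag, or None if absent."""
    match = PROMISE_RE.search(output)
    return match.group(1).strip() if match else None


def is_complete(output: str, expected: str = "DONE") -> bool:
    """True when the output contains the expected promise text."""
    return extract_promise(output) == expected


print(extract_promise("All tests pass.\n<promise>DONE</promise>"))  # DONE
print(is_complete("Still fixing the Redis integration test"))  # False
```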

How It Works

# Iteration 1
Task: "Implement rate limiting"
→ Creates RateLimiter class
→ Tests: 2 passing, 3 failing

# Iteration 2 (same prompt)
→ Sees previous work in files
→ Fixes failing tests
→ Tests: 4 passing, 1 failing

# Iteration 3 (same prompt)
→ Handles edge case
→ Tests: 5 passing, 0 failing
→ Output: <promise>DONE</promise>
→ Task complete!

When to Use Ralph

Task Type	Without Ralph	With Ralph
Simple bug fix	Works fine	Overkill
New feature	Hit or miss	Iterates until working
Complex refactor	Often breaks things	Self-corrects
TDD implementation	Tests often fail	Keeps going until green
Flaky test debugging	Might give up	Tries multiple approaches

Safety Features

Completion Criteria Required

Sugar validates prompts BEFORE starting Ralph loops. Tasks must include:

A <promise> tag (completion signal), OR
A --max-iterations flag (safety limit)

Without valid criteria, the task is rejected. This prevents infinite loops.
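The either/or rule above can be expressed as a small pre-flight check. This is a simplified sketch, not the `CompletionCriteriaValidator` shipped with Sugar: `validate_task` and `ValidationResult` are hypothetical names, and the error message is invented for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ValidationResult:
    is_valid: bool
    errors: List[str] = field(default_factory=list)


def validate_task(prompt: str,
                  max_iterations: Optional[int] = None) -> ValidationResult:
    """Accept a task if it has a <promise> tag OR a positive iteration limit."""
    has_promise = re.search(r"<promise>.+?</promise>", prompt, re.DOTALL) is not None
    has_limit = max_iterations is not None and max_iterations > 0
    if has_promise or has_limit:
        return ValidationResult(True)
    return ValidationResult(False, ["No <promise> tag and no max-iterations limit"])


print(validate_task("Implement rate limiting").is_valid)  # False
print(validate_task("Implement rate limiting", max_iterations=10).is_valid)  # True
print(validate_task("Output: <promise>DONE</promise>").is_valid)  # True
```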

Max Iterations

The default safety limit is 10 iterations. Raise it for complex tasks:

sugar add "Complex refactor" --ralph --max-iterations 25

Stuck Detection

Ralph detects when tasks are stuck and stops automatically if Claude outputs phrases like:

"cannot proceed"
"blocked by"
"need human intervention"

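Stuck detection of this kind can be approximated with a simple phrase scan over the agent's output. In the sketch below, the phrase list is taken from the examples above, while the function name and the case-insensitive matching are assumptions.

```python
from typing import Optional

# Phrases listed in this section; the real list may be longer
STUCK_PHRASES = ("cannot proceed", "blocked by", "need human intervention")


def find_stuck_phrase(output: str) -> Optional[str]:
    """Return the first stuck phrase found in the output, or None."""
    lowered = output.lower()  # assumed: matching ignores case
    for phrase in STUCK_PHRASES:
        if phrase in lowered:
            return phrase
    return None


print(find_stuck_phrase("I am blocked by a missing Redis credential."))  # blocked by
print(find_stuck_phrase("Tests: 4 passing, 1 failing"))  # None
```

A loop would check this after each iteration and stop early instead of spending the remaining iterations on a task that needs a human.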
Example Prompts

Bug Fix

Debug and fix the memory leak in the WebSocket handler.

Context:
- Memory grows unbounded after ~1000 connections
- Suspected: event listeners not cleaned up

When complete:
- Memory stable under 1-hour load test
- All WebSocket tests pass
- Output: <promise>MEMORY LEAK FIXED</promise>

TDD Feature

Implement user authentication using TDD.

Requirements:
- JWT-based auth
- Refresh token support
- Rate limiting on login

Approach:
- Write tests FIRST
- Implement until tests pass
- Refactor for clarity

When complete:
- All auth tests pass
- Security audit passes
- Output: <promise>AUTH COMPLETE</promise>

Refactoring

Refactor UserService to use repository pattern.

Goals:
- Extract data access to UserRepository
- UserService depends on interface only
- Maintain 100% test coverage

When complete:
- All existing tests pass
- New repository tests added
- No direct DB calls in UserService
- Output: <promise>REFACTOR DONE</promise>

API Reference

RalphConfig

from sugar.ralph import RalphConfig

config = RalphConfig(
    max_iterations=10,           # Safety limit
    completion_promise="DONE",   # Promise text to detect
    require_completion_criteria=True,
    min_confidence=0.8,
    iteration_timeout=300,
    quality_gates_enabled=True,
    stop_on_gate_failure=False,
)

CompletionCriteriaValidator

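from sugar.ralph import CompletionCriteriaValidator

# Example prompt; any prompt string can be validated
prompt = "Implement rate limiting. Output: <promise>DONE</promise>"

validator = CompletionCriteriaValidator(strict=True)
result = validator.validate(prompt)

# Report why the prompt was rejected and how to fix it
if not result.is_valid:
    print(result.errors)
    print(result.suggestions)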

RalphWiggumProfile

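from sugar.ralph import RalphWiggumProfile, RalphConfig


async def run_ralph(agent, prompt):
    # `agent` is supplied by the caller: any object with an async execute(prompt) method
    config = RalphConfig(max_iterations=10, completion_promise="DONE")
    profile = RalphWiggumProfile(ralph_config=config)

    # Validate and process input
    result = await profile.process_input({"prompt": prompt})

    # Keep iterating until the profile says to stop
    while profile.should_continue():
        output = await agent.execute(prompt)
        result = await profile.process_output(output)

        if result["complete"]:
            break

    # Get final stats
    return profile.get_iteration_stats()

# Run from synchronous code with: stats = asyncio.run(run_ralph(agent, prompt))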

Learn More

Original Ralph Wiggum technique by Geoffrey Huntley
Ralph Orchestrator - Standalone implementation