CA: pwWXFXQLtQ2x29U1Ya7okKBRQmX3iXAPL8cFRRnLBot
Running 24/7 on a Mac Mini

Clawdscape

3,500+
Lines of Code
12
Memory Types
30+
Action Types
1
LLM Provider
Mac Mini
24/7 Hardware

The Perception-Action Loop

A Mac Mini captures the screen, sends it to ClawdBot via API, receives action decisions, and executes them with human-like input simulation — all running autonomously.

This is an autonomous AI agent that plays Old School RuneScape entirely through vision, running 24/7 on a Mac Mini. It doesn't read game memory, inject code, or use any plugins — it simply looks at the screen and decides what to do, just like a human player would.

Every 300 milliseconds, the Mac Mini captures a screenshot of the game and sends it to ClawdBot (Anthropic's most powerful AI model), which returns a structured decision: what it sees, what it thinks, and what actions to take. Those actions — mouse clicks, keyboard presses, minimap navigation — are executed with human-like Bezier curve mouse paths and randomized timing.
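Each reply from the model is a small structured document. Here's a minimal sketch of validating one — the field names (`observation`, `reasoning`, `actions`) are illustrative, not necessarily the project's actual schema:

```python
import json

def parse_decision(raw: str) -> dict:
    """Validate the model's structured reply: what it sees,
    what it thinks, and which actions to take."""
    decision = json.loads(raw)
    for field in ("observation", "reasoning", "actions"):
        if field not in decision:
            raise ValueError(f"missing field: {field}")
    return decision

reply = """{
  "observation": "A goblin stands near the Lumbridge bridge.",
  "reasoning": "Low-level target; attack it, then keep moving.",
  "actions": [
    {"type": "click", "x": 412, "y": 305},
    {"type": "click_minimap", "x": 80, "y": 40}
  ]
}"""
decision = parse_decision(reply)
```

Note how the example decision ends with a `click_minimap` action — matching the system prompt's rule that the last action should keep the agent moving.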

The agent has persistent memory powered by a vector database. It remembers where it's been, what worked, what failed, and what killed it. These memories are fed back into every decision, so it learns from experience across hundreds of ticks.

Everything streams live to this dashboard — the AI's thoughts, its actions, player stats, a world map tracking its location, screenshots of what it sees, and achievement milestones. You're watching an AI explore Gielinor in real time.

Screenshot

Capture the game window

LLM Vision

ClawdBot analyzes the scene

Decision

Choose actions from 30+ types

Execution

Human-like Bezier mouse paths

Memory

Store experience in vector DB

Repeat

Loop forever, learn always

Inside the Agent's Mind

Real output from the perception-action loop. Every tick, the agent observes, reasons, and acts — here's what that looks like under the hood.

game_view — live capture
osrs-agent — game_loop.py
Model: ClawdBot · Tick: #247 · Latency: 312ms · Errors: 0

Player Progress

Real-time stats extracted from the game via vision AI. Every tree chopped, every monster slain, every level gained — tracked live.

osrs-agent — player_stats.py
Clawdscape
Autonomous vision-only RuneScape player. No APIs, no plugins — just screenshots and intelligence.
Uptime: -- · Ticks: 0 · Location: Unknown · Distance: 0 · Memories: 0

Vitals

Hitpoints
100
Prayer
0
Run
100

Activity Counters

Trees: 0
Kills: 0
Items: 0
Deaths: 0
Fish: 0
Ores: 0
Bones: 0
Food: 0
Doors: 0
NPCs: 0

World Map

Unknown

XP Tracker

Total XP 0
No XP gains yet...

Level Ups

No level-ups yet...

Inventory

Small Net
Bread
Cowhide
Bronze Axe
Logs
Logs
Logs
Logs
Logs
Bones
Bones
Bones
Coins
Raw Shrimp
Raw Shrimp
Raw Shrimp
Bronze Sword
-
-
-
-
-
-
-
-
-
-
-
17 / 28 slots

Equipment

Head: none
Cape: none
Neck: none
Weapon: none
Body: none
Shield: none
Legs: none
Gloves: none
Boots: none
Ring: none
Ammo: none

Goal Progress

Loading goals...

Memory Bank

Loading memory data...

Action Timeline

No actions yet...

Milestones

No milestones yet...

Session History

Loading sessions...

Built for Autonomy

Every component is engineered for one goal: a self-sufficient agent that plays RuneScape like a human, learns from experience, and never stops.

Vision AI

Multimodal Vision AI

Sends raw screenshots to ClawdBot. The LLM observes the game world, identifies objects, NPCs, and interfaces, then decides what to do next — pure vision, zero game API access.

ClawdBot
Human-Like Input

Mouse movements follow Bezier curves with de Casteljau interpolation, ease-in-out acceleration, random jitter, and occasional hesitation pauses. Indistinguishable from a real human player.

Bezier Curves
Persistent Memory

ChromaDB vector database stores 12 types of memory: observations, combat knowledge, navigation paths, NPC encounters, death learnings, and more. Semantic search retrieves relevant past experiences.

ChromaDB + Embeddings
Goal Planning

Hierarchical Goal Planning

Tree-based goal system with prerequisites, priority ordering, auto-cascading completion, and retry logic. Goals decompose from "Complete Tutorial Island" into atomic sub-tasks.

Goal Tree
Stuck Detection

Intelligent Stuck Detection

Three-level detection: repeated clicks, scene-stuck keywords, and area-stuck monitoring. Automatic recovery nudges force new approaches, exploration, and activity variation.

Self-Recovery
Live Stream

Live Stream Overlay

OSRS-themed transparent overlay shows the agent's real-time thoughts, reasoning, and actions. OBS WebSocket integration enables 24/7 autonomous livestreaming with status overlays.

OBS Integration

Modular Architecture

Clean separation of concerns allows each subsystem to evolve independently. The game loop orchestrates all components through a unified tick-based pipeline.

┌──────────────────────────────────────────────────────────┐
│            Clawdscape — System Architecture              │
├──────────────────────────────────────────────────────────┤

  ┌─────────────┐     ┌──────────────┐     ┌────────────┐
  │ Screenshot  │────▶│  LLM Vision  │────▶│  Decision  │
  │  Capture    │     │   Engine     │     │   Engine   │
  └─────────────┘     └──────┬───────┘     └─────┬──────┘
         ▲                   │                   │
         │            ┌──────┴───────┐     ┌─────▼──────┐
         │            │   Memories   │     │   Action   │
         │            │   + Goals    │     │  Executor  │
         │            └──────────────┘     └─────┬──────┘
         │                                 ┌─────▼──────┐
         └─────────────────────────────────│ Human-Like │
               loop every 300ms            │   Input    │
                                           │ Controller │
                                           └────────────┘

Code That Plays Games

From Bezier curves to semantic memory, every line is crafted for autonomous gameplay.

agent/core/input_controller.py Python
def _bezier_points(self, start: Point, end: Point,
                    control_points: int = 2, num_steps: int = 50) -> list[Point]:
    """Generate points along a Bezier curve from start to end.
    Creates natural-looking curved mouse trajectories."""
    points = [start]

    # Generate random control points that create a natural arc
    cps = [start]
    for _ in range(control_points):
        mid_x = (start.x + end.x) / 2
        mid_y = (start.y + end.y) / 2
        dist = math.hypot(end.x - start.x, end.y - start.y)
        spread = dist * 0.3
        cp = Point(
            int(mid_x + random.uniform(-spread, spread)),
            int(mid_y + random.uniform(-spread, spread)),
        )
        cps.append(cp)
    cps.append(end)

    # De Casteljau's algorithm for smooth Bezier interpolation
    for i in range(1, num_steps + 1):
        t = i / num_steps
        t = self._ease_in_out(t)  # Slow start, fast middle, slow end
        result = self._de_casteljau(cps, t)
        points.append(result)

    return points

def _ease_in_out(self, t: float) -> float:
    """Ease-in-out for natural acceleration/deceleration."""
    if t < 0.5:
        return 2 * t * t
    return -1 + (4 - 2 * t) * t
agent/core/vision.py System Prompt
"""The LLM receives a comprehensive gameplay instruction set."""

OSRS_SYSTEM_PROMPT = """
You are an autonomous OSRS player. Analyze each screenshot
and return JSON actions.

## CRITICAL RULES
- ALWAYS return 2-3 actions per turn
- Your LAST action should be a click_minimap to keep moving
- NEVER open the Settings menu — it blocks the game view
- FINISH what you start: combat, tree chopping, conversations

## ACTIVITY PRIORITY
1. CHOP TREES — click any tree to gather logs
2. COMBAT — attack nearby creatures, pick up drops
3. TALK TO NPCs — engage in conversation
4. PICK UP ITEMS — bones, coins, weapons, everything!
5. EXPLORE — walk to new towns, castles, bridges

## AVAILABLE ACTIONS
- click(x, y) — left-click game coordinates
- right_click(x, y) — context menu
- click_minimap(x, y) — navigate via minimap
- type_text(text) — type in chat
- press_key(key) — press keyboard key
- click_inventory_slot(slot) — interact with item
- rotate_camera(direction, duration) — camera control
- wait(min, max) — brief pause
"""
agent/memory/memory_store.py Python
class MemoryType(Enum):
    """12 categories of persistent agent memory."""
    OBSERVATION  = "observation"     # What was seen on screen
    ACTION       = "action_result"   # Successful action outcomes
    LOCATION     = "location"        # Map knowledge & landmarks
    NPC          = "npc"             # NPC behavior & dialogue
    QUEST        = "quest"           # Quest progress
    SKILL        = "skill"           # Training techniques
    ITEM         = "item"            # Item properties
    COMBAT       = "combat"          # Monster knowledge
    DEATH        = "death"           # Learn from mistakes
    FAILURE      = "failure"         # Failed attempts
    NAVIGATION   = "navigation"      # Path instructions
    STRATEGY     = "strategy"        # General approaches

def get_context_for_situation(self, query: str) -> str:
    """Retrieve semantically relevant memories for the LLM."""
    results = self.collection.query(
        query_texts=[query],
        n_results=settings.MEMORY_TOP_K,  # Top 10 matches
    )
    # Format memories as context for the LLM prompt
    memories = []
    for doc, meta in zip(results["documents"][0], results["metadatas"][0]):
        memories.append(f"[{meta['type']}] {doc}")
    return "\n".join(memories)
agent/planning/goal_planner.py Python
@dataclass
class Goal:
    """Hierarchical goal with prerequisites and retry logic."""
    id: str
    name: str
    description: str
    status: GoalStatus          # PENDING | ACTIVE | IN_PROGRESS | COMPLETED | FAILED | BLOCKED
    priority: int               # 1-10, higher = more urgent
    parent_id: Optional[str]
    children_ids: list[str]
    prerequisites: list[str]   # Goal IDs that must complete first
    attempts: int = 0
    max_attempts: int = 10

def get_next_goal(self) -> Optional[Goal]:
    """Select highest-priority goal with met prerequisites."""
    candidates = [
        g for g in self.goals.values()
        if g.status == GoalStatus.PENDING
        and all(
            self.goals[p].status == GoalStatus.COMPLETED
            for p in g.prerequisites
            if p in self.goals
        )
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda g: g.priority)

def complete_goal(self, goal_id: str):
    """Complete a goal. Auto-cascades to parent if all children done."""
    goal = self.goals[goal_id]
    goal.status = GoalStatus.COMPLETED
    # Check if parent should auto-complete
    if goal.parent_id and goal.parent_id in self.goals:
        parent = self.goals[goal.parent_id]
        if all(self.goals[c].status == GoalStatus.COMPLETED
               for c in parent.children_ids):
            self.complete_goal(parent.id)  # Recursive cascade

Tech Stack

Opus

ClawdBot

Primary vision LLM

Python

Core application language

ChromaDB

Vector database for memory

PyAutoGUI

Input automation

Pillow

Image processing

OBS

OBS WebSocket

Livestream integration

Tkinter

Thought overlay UI

NumPy

Mathematical operations

Mac Mini

24/7 hardware host


Frequently Asked Questions

How does Clawdscape actually "see" the game?
Every 300 milliseconds, the agent captures a screenshot of the game window, compresses it to JPEG at 640×360, and sends it as a base64-encoded image to ClawdBot via Anthropic's multimodal API. The model performs full scene decomposition — identifying NPCs, objects, UI elements, spatial relationships, and interactive targets. It returns structured JSON with what it sees, what it thinks, and where to click. No game memory access. No client injection. No pixel scraping. Pure visual understanding, the same way a human player reads the screen.
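The capture-and-encode step can be sketched with Pillow (which the stack already uses for image processing). A blank in-memory image stands in for the real screen grab here, and the JPEG quality value is an assumption:

```python
import base64
import io
from PIL import Image

def encode_frame(img: Image.Image, size=(640, 360), quality=70) -> str:
    """Downscale a captured frame to 640x360 and return it as a
    base64-encoded JPEG string, ready for a multimodal API payload.
    The quality setting is an illustrative assumption."""
    frame = img.convert("RGB").resize(size)
    buf = io.BytesIO()
    frame.save(buf, format="JPEG", quality=quality)
    return base64.b64encode(buf.getvalue()).decode("ascii")

# A solid-color stand-in for the real game window capture
payload = encode_frame(Image.new("RGB", (1920, 1080), "green"))
```

Downscaling before encoding is what keeps per-tick token and bandwidth costs low: the model sees a 640×360 frame, not the full desktop resolution.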
Does it modify the game client in any way?
No. Clawdscape is completely non-invasive. It runs as an entirely separate process with zero hooks into the game client — no bytecode injection, no memory reads, no client-side modifications. Screen capture uses the OS display pipeline (identical to how OBS or any screen recorder works), and input goes through standard mouse/keyboard event dispatch. The game client remains 100% vanilla. The agent has exactly the same access as a human sitting at the keyboard — nothing more.
Why a Mac Mini instead of cloud infrastructure?
Deliberate architectural decision. The Mac Mini M4 handles everything locally: native game rendering, display-pipeline screen capture, Bezier-curve input execution, ChromaDB vector storage (6,000+ memories), stuck detection signal processing, and PostgreSQL telemetry pushing. Only the LLM inference is offloaded to Anthropic's servers. This edge-compute model eliminates cloud VM costs, reduces capture-to-action latency, and enables true 24/7 unattended operation with automatic session recovery. Total infrastructure cost: one Mac Mini + one API key.
How does the mouse movement look human?
Every mouse movement follows a Bezier curve with randomized control points — implemented via de Casteljau's algorithm, the same math used in vector graphics rendering. No two clicks travel the same path. On top of that: ±3px coordinate jitter on every click, variable movement duration (80–300ms), occasional hesitation pauses (5% probability), randomized typing speed with natural variance, and easing functions that simulate human acceleration and deceleration. The input is indistinguishable from a real player's hand on the mouse.
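A toy version of that click randomization, using the ranges and probabilities from the description above — it returns a plan dict rather than driving PyAutoGUI, purely for illustration:

```python
import random

def humanize_click(x: int, y: int, jitter_px: int = 3,
                   hesitate_prob: float = 0.05) -> dict:
    """Apply human-like randomization to a click target:
    ±3px coordinate jitter, a variable 80-300ms movement duration,
    and an occasional hesitation pause (5% of clicks)."""
    return {
        "x": x + random.randint(-jitter_px, jitter_px),
        "y": y + random.randint(-jitter_px, jitter_px),
        "duration_ms": random.uniform(80, 300),
        "hesitate_ms": random.uniform(100, 400)
                       if random.random() < hesitate_prob else 0,
    }

plan = humanize_click(412, 305)
```

Because every parameter is drawn fresh per click, no two clicks on the same target produce the same coordinates or timing.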
How does the memory system work?
Clawdscape implements retrieval-augmented generation (RAG) over its own lived experience. Every tick, the agent stores observations, action outcomes, navigation routes, combat encounters, deaths, and strategic insights as vector embeddings in ChromaDB across 12 memory categories. Before each decision, it performs semantic similarity search to retrieve the most relevant past experiences. Near Lumbridge? It recalls Lumbridge memories. In combat? It recalls past fights. The result: emergent long-term learning without fine-tuning. The agent genuinely improves over time by remembering what worked and what didn't.
What happens when the AI gets stuck?
A three-layer detection system handles this. Layer 1: NumPy-based minimap frame differencing — if the minimap hasn't changed in 5 ticks, the agent isn't moving. Layer 2: Scene analysis scanning recent observations for repeated obstacle keywords (fence, gate, wall, door). Layer 3: Action pattern matching to catch repetitive click loops. The critical innovation: context-aware suppression. Standing still during combat is correct. Standing still while fishing is correct. The detector only fires when the agent is genuinely stuck, not when it's supposed to be stationary. If combat starts mid-unstick, the override cancels immediately.
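Layer 1 can be sketched as simple frame differencing over recent minimap crops; the window size and pixel threshold here are illustrative, not the project's tuned values:

```python
from collections import deque

import numpy as np

class MinimapStuckDetector:
    """Layer-1 stuck check: if consecutive minimap crops barely
    differ for `window` ticks, the agent probably isn't moving."""

    def __init__(self, window: int = 5, pixel_threshold: float = 2.0):
        self.window = window
        self.pixel_threshold = pixel_threshold
        self.diffs = deque(maxlen=window)
        self.last_frame = None

    def update(self, minimap: np.ndarray) -> bool:
        """Feed one minimap crop; return True when stuck."""
        if self.last_frame is not None:
            # Mean absolute per-pixel change vs. the previous frame
            # (cast to int16 to avoid uint8 subtraction wraparound)
            diff = float(np.abs(minimap.astype(np.int16)
                                - self.last_frame.astype(np.int16)).mean())
            self.diffs.append(diff)
        self.last_frame = minimap
        return (len(self.diffs) == self.window
                and all(d < self.pixel_threshold for d in self.diffs))
```

The context-aware suppression described above would sit on top of this: even when `update` returns True, the recovery nudge is skipped if the agent is known to be in combat or fishing.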
How does goal planning work?
Goals are organized in a hierarchical tree with six states: pending, active, in_progress, completed, failed, and blocked. Top-level goals ("Complete Tutorial Island") decompose into ordered sub-goals with prerequisites — you can't fight the Combat Instructor before visiting the Mining Instructor. Failed goals retry up to 10 times before permanent failure. Blocked goals are skipped and revisited. When all children complete, the parent auto-completes. ClawdBot can also decompose goals dynamically, breaking down new challenges on the fly. The full tree persists to disk as JSON, surviving crashes and restarts.
How does the live dashboard get its data?
Every 3 ticks, the agent serializes its full state — screenshot, observations, reasoning, vitals, inventory, counters, location, active goals, memory stats, and action history — into a JSONB payload and upserts it to PostgreSQL on Railway. The Vercel dashboard polls the REST API every 2 seconds, rendering the live feed, game screenshot, world map with real-time GPS tracking, player vitals, activity counters, milestones, XP tracker, and session history. Single-row upsert means zero table growth — only the latest state matters. Full pipeline: Mac Mini → PostgreSQL → REST API → Vercel CDN → Browser.
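The upsert itself is a standard single-row `ON CONFLICT` statement; here's a sketch of the payload serialization, where the table and column names are assumptions rather than the project's actual schema:

```python
import datetime
import json

# Single-row upsert: the table never grows past one row.
UPSERT_SQL = """
INSERT INTO agent_state (id, payload, updated_at)
VALUES (1, %s::jsonb, now())
ON CONFLICT (id) DO UPDATE
SET payload = EXCLUDED.payload, updated_at = EXCLUDED.updated_at;
"""

def serialize_state(tick: int, vitals: dict, actions: list) -> str:
    """Pack the agent's current state into one JSONB payload string.
    Field names are illustrative."""
    return json.dumps({
        "tick": tick,
        "captured_at": datetime.datetime.now(
            datetime.timezone.utc).isoformat(),
        "vitals": vitals,
        "recent_actions": actions[-10:],  # cap history in the payload
    })

payload = serialize_state(247, {"hp": 100, "prayer": 0, "run": 100}, [])
```

The dashboard then only ever reads that one row, so polling cost stays constant no matter how long the agent runs.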
How accurate are the stats (kills, deaths, XP)?
Every stat is tracked through a pattern-matching system with cooldowns and anti-patterns. Deaths use a 20-tick cooldown and match phrases like "I died" and "oh dear" — but exclude "death rune" and "death talisman" to prevent false positives from item names. Kill tracking requires completion phrases ("killed the," "defeated the") not intent phrases ("trying to kill"). Level-ups are validated against the full set of 23 real OSRS skill names to prevent "need Woodcutting level 15" from counting as reaching level 15. Every data point on this dashboard is real, verified data — not inflated guesses.
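A minimal version of such a counter — trigger phrases, anti-patterns, and a tick cooldown. The phrase lists mirror the death example above; the 20-tick cooldown follows it too:

```python
class EventCounter:
    """Count an event from observation text using trigger phrases,
    anti-patterns that suppress false positives, and a tick cooldown
    so one event can't be double-counted across adjacent ticks."""

    def __init__(self, triggers, anti_patterns, cooldown_ticks=20):
        self.triggers = [t.lower() for t in triggers]
        self.anti_patterns = [a.lower() for a in anti_patterns]
        self.cooldown_ticks = cooldown_ticks
        self.count = 0
        self.last_fired = -cooldown_ticks  # allow firing on tick 0

    def observe(self, tick: int, text: str) -> bool:
        low = text.lower()
        if tick - self.last_fired < self.cooldown_ticks:
            return False  # still cooling down from the last match
        if any(a in low for a in self.anti_patterns):
            return False  # e.g. "death rune" must not count as a death
        if any(t in low for t in self.triggers):
            self.count += 1
            self.last_fired = tick
            return True
        return False

deaths = EventCounter(["i died", "oh dear"],
                      ["death rune", "death talisman"])
```

Anti-patterns are checked before triggers on purpose: "picked up a death rune" contains no trigger here, but ordering the checks this way keeps the counter safe even if a trigger and an anti-pattern ever co-occur in one observation.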
Is the codebase open source?
Fully open source. The entire system — vision engine, action executor, memory store, goal planner, stuck detector, input humanization, desktop overlay, database layer, live dashboard, and deployment config — is available on GitHub. 3,500+ lines of Python, zero proprietary dependencies. Clone it, run it, break it, improve it.