February 12, 2026 · Nick Rae · 10 min read

My OpenClaw Setup: From Chat Assistant to Autonomous Co-Pilot in One Week

Most OpenClaw setup guides stop at "install it, connect Telegram, chat with it." That's the tutorial. This is what happens when you actually live with it for a week and push it toward real autonomy.

I run OpenClaw on a Mac with Claude Opus 4.6 as my main brain, backed by a tiered model system that routes cheap tasks to Gemini Flash and mid-tier work to Claude Sonnet. The agent's name is Talos, my co-pilot. Here's what the setup actually looks like.

The Hardware

Nothing fancy. No cloud VPS, no Kubernetes. Just a Mac that never sleeps.

The Model Routing That Actually Works

After burning through API credits learning what doesn't work, I landed on a three-tier system:

| Tier | Model | What It Does | Cost |
| --- | --- | --- | --- |
| Cheap | Gemini 2.0 Flash | Email checks, OAuth polling, web searches, quick research, single CLI commands | ~free |
| Reliable | Claude Sonnet 4 | File writing, code generation, multi-step tasks, content drafts | Medium |
| Heavy | Claude Opus 4.6 | Main session: complex reasoning, strategy, multi-tool orchestration | Premium |

The golden rule: if the task writes a file, use Sonnet minimum. Flash will describe what it would write instead of actually writing it. Learned that one the hard way, three times.

20 Cron Jobs Running My Life

This is where OpenClaw stops being a chatbot and starts being an operator. Here's what runs automatically:

Morning Routine

Business Hours

Evening Wind-Down

Night Shift

Infrastructure

The key insight: most of these jobs run with delivery: "none", meaning they stay silent unless there's something worth reporting. My phone isn't buzzing every 30 minutes. It buzzes when it matters.
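The silent-by-default pattern boils down to a simple wrapper around each scheduled check. A minimal sketch, where `check` and `notify` are hypothetical callables standing in for the real job logic and the messaging channel:

```python
def run_quiet_job(check, notify):
    """Run a scheduled check; only notify when there is a finding."""
    finding = check()          # returns a string, or None when all is well
    if finding:
        notify(finding)        # buzz the phone only when it matters
        return True
    return False               # stay silent otherwise
```

The cron scheduler calls this on every tick; the notification side only fires on the rare tick where `check()` actually found something.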

The Memory System

OpenClaw's workspace is basically a structured second brain:

memory/
├── daily/          # YYYY-MM-DD.md journals (permanent record)
├── projects/       # Per-project status files
├── people/         # Contact/relationship context
├── reference/      # Backlog, agent patterns, metrics, failure log
├── workflows/      # Documented procedures (night shift, etc.)
└── decisions/      # Decision logs with context and outcomes

Every work loop logs to the daily file. Every failure gets recorded in a failure log with root cause. Every decision gets documented so future-me (or future-Talos) knows why we made a call.

The agent searches this memory before answering questions about prior work. It's not perfect, but it means context survives across sessions.
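The search-before-answering step is conceptually just a recursive grep over the memory tree. A sketch under the assumption that every memory file is markdown (the function name and return shape are mine, not OpenClaw's API):

```python
import pathlib

def search_memory(root, term):
    """Case-insensitive grep over the memory/ tree; returns
    (file, line number, line text) hits for a search term."""
    hits = []
    for path in sorted(pathlib.Path(root).rglob("*.md")):
        for lineno, line in enumerate(path.read_text().splitlines(), 1):
            if term.lower() in line.lower():
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Because everything is flat markdown, the same memory is searchable by the agent, by standard CLI tools, and by future-me in a text editor.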

Sub-Agent Patterns (What I Learned the Hard Way)

OpenClaw can spawn sub-agents: isolated sessions that do a job and report back. Here's what actually works:

The Verification Protocol

Every sub-agent spawn follows this pattern:

  1. Spawn with a clear, single-deliverable prompt
  2. Wait 2-3 minutes
  3. Verify the output file exists (ls -la <path>)
  4. If missing: redo in the main session immediately; do NOT re-spawn

This sounds obvious, but without it, you end up with sub-agents that "complete successfully" but produce nothing. Flash is especially guilty of this: it'll read 55K tokens of input and then just... not write the output.
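The four-step protocol above can be sketched as a single function. `spawn` and `redo` are hypothetical callables (the real spawn goes through OpenClaw's sub-agent tooling); the existence check is the programmatic equivalent of `ls -la <path>`:

```python
import os, time

def spawn_and_verify(spawn, out_path, redo, wait_seconds=150):
    """Spawn a sub-agent, wait, then confirm the deliverable exists
    on disk. On a miss, redo in the main session; never re-spawn."""
    spawn()                            # step 1: single-deliverable prompt
    time.sleep(wait_seconds)           # step 2: give it 2-3 minutes
    if os.path.exists(out_path):       # step 3: verify the file exists
        return "verified"
    return redo()                      # step 4: redo in the main session
```

The point of returning to the main session on failure is that a second spawn usually fails the same way; the main session at least produces the file.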

The Decision Tree

Disposable/monitoring task?     → Flash (cheap)
Text generation < 500 words?    → Flash
Research/web search only?       → Flash
Needs to WRITE a file?          → Sonnet (reliable)
Multiple tool calls?            → Sonnet or main session
> 100 lines of code/content?    → Main session directly
Complex reasoning/strategy?     → Main session (Opus)
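The tree collapses into a short routing function. The task-dict keys here are assumptions for the sketch, not OpenClaw's real schema; the ordering matters, because the expensive checks have to win over the cheap default:

```python
def pick_model(task):
    """Route a task description to a model tier per the decision tree."""
    if task.get("complex_reasoning") or task.get("lines", 0) > 100:
        return "main-session"          # Opus handles the heavy work
    if task.get("writes_file") or task.get("tool_calls", 0) > 1:
        return "sonnet"                # Flash won't reliably write files
    return "flash"                     # cheap tier for everything else
```

Anything that falls through to the bottom (monitoring, short text, pure research) is safe to hand to the cheap tier.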

The Two-Failure Rule

If the same approach fails twice, switch strategies immediately. I wasted an entire evening trying to submit an app build through the same broken pipeline six times before implementing this rule.
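The rule itself is mechanical enough to encode. A minimal sketch: each strategy is a callable, and no strategy gets more than two attempts before the loop moves on:

```python
def two_failure_rule(strategies):
    """Try each strategy at most twice; after the second failure of
    the same approach, switch to the next instead of retrying."""
    for strategy in strategies:
        for _ in range(2):
            try:
                return strategy()
            except Exception:
                continue               # one retry, then switch strategies
    raise RuntimeError("all strategies exhausted")
```

Had this been in place that evening, the broken pipeline would have been abandoned after attempt two instead of attempt six.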

The Working Agreement

This is the part most guides skip. Talos and I have an explicit working agreement documented in markdown:

Act Freely:

Propose First:

Pushback Level: 4/5. Talos argues his case firmly, with evidence, and backs down only when I explicitly overrule after hearing the argument. This matters more than you'd think. A yes-man agent is useless.

The Principles Layer (The Missing Piece)

Most agent setups have skills (what to do) and rules (what's allowed). Almost none have principles: decision-making heuristics for when there's no clear instruction.

After a week of iteration, I extracted 9 principles from real failures and wins. They live in PRINCIPLES.md and load into every session:

1. Silent by default: don't ping me unless it matters. Background jobs stay background.
2. Revenue before polish: ship what makes money first, optimize later.
3. Two-failure rule: same approach fails twice? Switch strategy immediately.
4. Verify, don't trust: check the output exists before marking done. Sub-agents lie by omission.
5. Fix first, report after: if it's reversible and low-stakes, just fix it.
6. Pushback from care: argue the case firmly, back down when overruled.
7. Friction is data: when something keeps breaking, document why and change the approach.
8. One task, one cycle: finish one thing before starting the next.
9. Protect the principal: never expose my data, spend my money, or speak as me without approval.

The file also has a regressions table โ€” when a principle fails or a new lesson emerges, it gets logged with the date, what happened, and which principle got updated. The principles aren't static. They evolve.
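Keeping the regressions table honest is easiest when appending a row is one call. A sketch; the pipe-table row layout and column order (date, what happened, principle updated) are my assumptions about the file's format:

```python
import datetime

def log_regression(path, what_happened, principle):
    """Append a dated row to the regressions table in PRINCIPLES.md."""
    today = datetime.date.today().isoformat()
    with open(path, "a", encoding="utf-8") as f:
        f.write(f"| {today} | {what_happened} | {principle} |\n")
```

Because the row is appended rather than edited in place, the table doubles as a chronological changelog of the principles themselves.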

This was inspired by @AtlasForgeAI's post about the three-layer architecture: Soul (who to be), Principles (how to operate), Skills (what to do). The insight is that principles fill the gap between identity and capabilities; they're what the agent falls back on when the instructions run out.

Real Infrastructure Integration

Talos isn't just a chatbot that happens to run on my Mac. It's wired into:

Each integration is a skill: a markdown instruction file that teaches the agent how to use a specific CLI tool. Some are community skills from ClawHub, some are custom-built.

What I'd Do Differently

  1. Start with the tiered model system. Don't run Opus for everything. You'll blow through credits and most tasks don't need it.
  2. Build the failure log from day one. Every failure is a lesson. Document the root cause, not just "it didn't work."
  3. Use delivery: "none" on cron jobs. Let the agent decide what's worth pinging you about. Your phone should be quiet by default.
  4. Write the working agreement early. Define autonomy levels before the agent starts making decisions. It prevents the "wait, I didn't ask you to do that" moments.
  5. Verify sub-agent output. Trust but verify. Always check the file exists before marking a task done.

The Bottom Line

After one week, Talos runs 20 automated jobs, monitors my email/business/infrastructure, works autonomously while I sleep, and maintains a searchable memory of everything we've done together. It's not perfect: sub-agents still flake sometimes, Flash still refuses to write files, and the occasional cron job errors out. But the watchdog catches those, and the system self-corrects.

The setup guide gets you a chatbot. A week of iteration gets you a co-pilot.