My OpenClaw Setup: From Chat Assistant to Autonomous Co-Pilot in One Week
Most OpenClaw setup guides stop at "install it, connect Telegram, chat with it." That's the tutorial. This is what happens when you actually live with it for a week and push it toward real autonomy.
I run OpenClaw on a Mac with Claude Opus 4.6 as my main brain, backed by a tiered model system that routes cheap tasks to Gemini Flash and mid-tier work to Claude Sonnet. The agent's name is Talos: my co-pilot. Here's what the setup actually looks like.
The Hardware
- Host: Mac running 24/7
- Remote node: iMac for macOS-specific tasks (Apple Notes, iMessage, Peekaboo screen automation)
- Messaging: Telegram as primary channel, iMessage as backup delivery
- Version: OpenClaw 2026.2.9
Nothing fancy. No cloud VPS, no Kubernetes. Just a Mac that never sleeps.
The Model Routing That Actually Works
After burning through API credits learning what doesn't work, I landed on a three-tier system:
| Tier | Model | What It Does | Cost |
|---|---|---|---|
| Cheap | Gemini 2.0 Flash | Email checks, OAuth polling, web searches, quick research, single CLI commands | ~free |
| Reliable | Claude Sonnet 4 | File writing, code generation, multi-step tasks, content drafts | Medium |
| Heavy | Claude Opus 4.6 | Main session: complex reasoning, strategy, multi-tool orchestration | Premium |
20 Cron Jobs Running My Life
This is where OpenClaw stops being a chatbot and starts being an operator. Here's what runs automatically:
Morning Routine
- 7:00 AM - Morning Brief: weather, calendar, important emails, Bible verse, top 3 priorities. Delivered to Telegram and iMessage. Takes 30 seconds to scan on my phone.
Business Hours
- Every 30 min - Email monitoring across two accounts. Only alerts me if something's actually important (not newsletters).
- 8:00 AM & 4:00 PM - Business Monitor: scans for App Store notices, customer feedback, revenue signals, competitor moves.
- 9 AM, 11 AM, 1 PM, 3 PM, 5 PM - Autonomous Work Loop: picks the top task from a prioritized backlog and works on it. One task per cycle. Revenue-impact first.
- 12:00 PM - X/Twitter Intelligence Scan: searches aviation, AI, indie dev, and Tesla topics. Delivers a briefing with engagement opportunities and suggested tweets.
- 2:00 PM - Daily Research Report: a deep dive on one topic that makes me smarter, rotating through AI/ML, aviation, iOS dev, business strategy, and personal finance.
Evening Wind-Down
- 7:00 PM - Evening Summary: what got done today, decisions made, blockers, updated metrics.
- 10:00 PM - Daily Journal: permanent record written to markdown. Searchable, thorough, no Telegram noise.
Night Shift
- 11:00 PM - Night Shift Builder: Talos works autonomously while I sleep. Builds deliverables, stages them for my review. Rules: never spend money, never send external comms, never push to production. Everything goes to a staging folder with a REVIEW.md manifest.
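The staging handoff can be sketched in a few lines. The folder layout and REVIEW.md entry format here are my own assumptions, not an OpenClaw convention:

```python
from pathlib import Path
from datetime import datetime, timezone

def stage_deliverable(name: str, content: str, summary: str,
                      staging: Path = Path("staging")) -> Path:
    """Write a night-shift deliverable into the staging folder and append an
    entry to its REVIEW.md manifest (hypothetical layout for illustration)."""
    staging.mkdir(parents=True, exist_ok=True)
    out = staging / name
    out.write_text(content)
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    with (staging / "REVIEW.md").open("a") as manifest:
        manifest.write(f"- `{name}` ({stamp}): {summary}\n")
    return out
```

The point of the manifest is that morning review starts from one file, not a directory crawl.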
Infrastructure
- Every 4 hours - Watchdog: checks all cron jobs for errors, scans email for urgency, counts orphaned sessions, checks build statuses. Only pings me if something actually needs attention.
- Daily at 5 AM - PiHole gravity update (ad-blocking lists)
- Every 6 hours - OAuth token monitoring
- Weekly - Security audit, health data export reminder, weekly review
- Monthly - Financial review
Nearly all of these jobs run with `delivery: "none"`: they stay silent unless there's something worth reporting. My phone isn't buzzing every 30 minutes. It buzzes when it matters.
The Memory System
OpenClaw's workspace is basically a structured second brain:
memory/
├── daily/      # YYYY-MM-DD.md journals (permanent record)
├── projects/   # Per-project status files
├── people/     # Contact/relationship context
├── reference/  # Backlog, agent patterns, metrics, failure log
├── workflows/  # Documented procedures (night shift, etc.)
└── decisions/  # Decision logs with context and outcomes
Every work loop logs to the daily file. Every failure gets recorded in a failure log with root cause. Every decision gets documented so future-me (or future-Talos) knows why we made a call.
The agent searches this memory before answering questions about prior work. It's not perfect, but it means context survives across sessions.
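A naive stand-in for that memory search, assuming plain markdown files under the workspace (the agent's real retrieval is likely richer than a substring scan):

```python
from pathlib import Path

def search_memory(root: Path, query: str, limit: int = 5) -> list[str]:
    """Scan markdown memory files for lines mentioning the query."""
    hits: list[str] = []
    # Sort descending so date-named daily journals come newest-first.
    for md in sorted(root.rglob("*.md"), reverse=True):
        for line in md.read_text(errors="ignore").splitlines():
            if query.lower() in line.lower():
                hits.append(f"{md.name}: {line.strip()}")
                if len(hits) >= limit:
                    return hits
    return hits
```

Even something this crude beats an agent answering from a blank context window.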
Sub-Agent Patterns (What I Learned the Hard Way)
OpenClaw can spawn sub-agents: isolated sessions that do a job and report back. Here's what actually works:
The Verification Protocol
Every sub-agent spawn follows this pattern:
- Spawn with a clear, single-deliverable prompt
- Wait 2-3 minutes
- Verify the output file exists (`ls -la <path>`)
- If missing: redo in main session immediately; do NOT re-spawn

This sounds obvious, but without it, you end up with sub-agents that "complete successfully" but produce nothing. Flash is especially guilty of this: it'll read 55K tokens of input and then just... not write the output.
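The protocol can be sketched as a wrapper. `spawn` here is a hypothetical callable standing in for however sub-agents are actually launched; the key move is trusting the filesystem, not the sub-agent's own success report:

```python
import time
from pathlib import Path

def spawn_and_verify(spawn, prompt: str, expected: Path, wait_s: int = 150) -> bool:
    """Spawn a sub-agent, wait 2-3 minutes, then check the deliverable on disk."""
    spawn(prompt)
    time.sleep(wait_s)
    if expected.exists() and expected.stat().st_size > 0:
        return True   # the output file really exists and is non-empty
    return False      # caller should redo in the main session, not re-spawn
```

On `False`, the work moves to the main session immediately; re-spawning the same sub-agent just repeats the silent failure.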
The Decision Tree
Disposable/monitoring task? → Flash (cheap)
Text generation < 500 words? → Flash
Research/web search only? → Flash
Needs to WRITE a file? → Sonnet (reliable)
Multiple tool calls? → Sonnet or main session
100+ lines of code/content? → Main session directly
Complex reasoning/strategy? → Main session (Opus)
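The same tree as a routing function. The task descriptors (`writes_file`, `lines`, `tool_calls`, ...) are invented for illustration; order matters, since the expensive checks win:

```python
def pick_tier(task: dict) -> str:
    """Route a task to a model tier, mirroring the decision tree above."""
    if task.get("complex_reasoning") or task.get("lines", 0) > 100:
        return "main-session (Opus)"          # heavy work stays in the main session
    if task.get("writes_file") or task.get("tool_calls", 0) > 1:
        return "sonnet"                       # anything that must produce artifacts
    return "flash"                            # disposable, short, or research-only work
```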
The Two-Failure Rule
If the same approach fails twice, switch strategies immediately. I wasted an entire evening trying to submit an app build through the same broken pipeline six times before implementing this rule.
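A sketch of the rule as a retry loop; `attempt` stands in for whatever actually executes a given approach:

```python
def run_with_strategy_switch(strategies, attempt, max_failures_per_strategy=2):
    """Two-failure rule: try each strategy at most twice, then switch,
    instead of hammering one broken pipeline. `strategies` is an ordered
    list of approach names; `attempt(strategy)` returns True on success."""
    for strategy in strategies:
        for _ in range(max_failures_per_strategy):
            if attempt(strategy):
                return strategy
    return None  # every approach failed; escalate to a human
```

Had this existed that evening, the broken submission pipeline would have been abandoned on attempt three, not attempt six.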
The Working Agreement
This is the part most guides skip. Talos and I have an explicit working agreement documented in markdown:
Act Freely:
- Research, drafting, building tools, fixing mistakes, picking work during downtime
Propose First:
- Spending money, sending messages as me, irreversible changes, posting to social media
Pushback Level: 4/5. Talos argues his case firmly with evidence. Backs down only when I explicitly overrule after hearing the argument. This matters more than you'd think. A yes-man agent is useless.
The Principles Layer (The Missing Piece)
Most agent setups have skills (what to do) and rules (what's allowed). Almost none have principles: decision-making heuristics for when there's no clear instruction.
After a week of iteration, I extracted 9 principles from real failures and wins. They live in PRINCIPLES.md and load into every session.
The file also has a regressions table: when a principle fails or a new lesson emerges, it gets logged with the date, what happened, and which principle got updated. The principles aren't static. They evolve.
This was inspired by @AtlasForgeAI's post about the three-layer architecture: Soul (who to be), Principles (how to operate), Skills (what to do). The insight is that principles fill the gap between identity and capabilities: they're what the agent falls back on when the instructions run out.
Real Infrastructure Integration
Talos isn't just a chatbot that happens to run on my Mac. It's wired into:
- Home network: Philips Hue lights, Ecobee thermostat awareness, Eero router/device tracking, PiHole ad blocking management
- Tesla: TeslaMate integration for vehicle tracking and metrics
- Solar: Enphase panel monitoring (24 panels, ~4.6 kW system)
- Development: GitHub PRs, EAS builds, App Store submissions
- Social: X/Twitter via Bird CLI (free, cookie-auth, no API key needed)
- Communication: Telegram, iMessage, email (Himalaya CLI + Google Workspace)
- Productivity: Apple Notes, Apple Reminders, Google Calendar
Each integration is a skill: a markdown instruction file that teaches the agent how to use a specific CLI tool. Some are community skills from ClawHub, some are custom-built.
What I'd Do Differently
- Start with the tiered model system. Don't run Opus for everything. You'll blow through credits and most tasks don't need it.
- Build the failure log from day one. Every failure is a lesson. Document the root cause, not just "it didn't work."
- Use `delivery: "none"` on cron jobs. Let the agent decide what's worth pinging you about. Your phone should be quiet by default.
- Write the working agreement early. Define autonomy levels before the agent starts making decisions. It prevents the "wait, I didn't ask you to do that" moments.
- Verify sub-agent output. Trust but verify. Always check the file exists before marking a task done.
The Bottom Line
After one week, Talos runs 20 automated jobs, monitors my email/business/infrastructure, works autonomously while I sleep, and maintains a searchable memory of everything we've done together. It's not perfect: sub-agents still flake sometimes, Flash still refuses to write files, and the occasional cron job errors out. But the watchdog catches those, and the system self-corrects.
The setup guide gets you a chatbot. A week of iteration gets you a co-pilot.