135,000+ exposed instances

OpenClaw Security Lab

Name: OpenClaw Security Lab
Author: Argo

OpenClaw is GitHub's most-starred project (250K+ stars) — a self-hosted AI agent that runs on your hardware. Its rapid adoption outpaced its security: 135,000+ exposed instances and 800 malicious skills on ClawHub. This lab compares default deployments against the hardened configuration running on ArgoBox.

250K+ GitHub Stars

135K+ Exposed Instances

800+ Malicious Skills

18 Vetted Skills

Challenges

Examine the OpenClaw hardening rules and identify security controls.

Test prompt injection attacks against the hardened deployment and understand the defenses.

Analyze the full security architecture, from skill isolation to escalation paths.

0% Complete

Default vs Hardened Deployment

Side-by-side comparison of 10 critical security aspects. Red indicates vulnerable, green indicates protected.

Default OpenClaw

VULNERABLE

ArgoBox Hardened

PROTECTED

ClawHub Skills

Enabled (800+ malicious)

Any community skill can execute arbitrary code on the host

ClawHub Skills

Blocked entirely

Only 18 vetted skills loaded from local config, ClawHub disabled

Secret Protection

None — agent can echo secrets

API keys, tokens, and credentials can leak into chat output

Secret Protection

Non-negotiable SOUL rules

SOUL.md enforces: never disclose API keys, credentials, or tokens under any circumstance

Email Commands

Accepts email as instruction

Incoming emails can inject instructions into the agent context

Email Commands

Email is not a command channel

Email is data only — never parsed as instructions or commands

Git Push to Main

Allowed without approval

Agent can push directly to main branch without human review

Git Push to Main

Requires explicit approval

All git pushes require explicit human approval before execution

Filesystem Access

Unrestricted

Agent has full filesystem read/write access to the host

Filesystem Access

Workspace only, config read-only

Sandboxed to workspace directory; config files are read-only

Rate Limiting

None

No throttling on external API calls — runaway costs possible

Rate Limiting

10 external API calls/min

Hard limit of 10 external API calls per minute prevents bill shock

Prompt Injection

No defense

Agent blindly follows "ignore previous instructions" patterns

Prompt Injection

Refuses manipulation attempts

Detects and refuses prompt injection, role hijacking, and instruction override attempts

Failure Handling

Retries forever

Failed tasks loop indefinitely, consuming resources and API quota

Failure Handling

Stop after 3 consecutive failures

Three consecutive failures trigger stop + alert to operator

API Key Exposure

Can leak in chat

Keys can appear in conversation output, logs, or error messages

API Key Exposure

Never disclosed, non-negotiable

Non-negotiable SOUL rule — API keys are never disclosed regardless of prompt

Monitoring

None

No visibility into agent health, uptime, or resource usage

Monitoring

Health checks every 3 hours

Automated health checks with self-healing rules and escalation path

Interactive Prompt Test

Test how the hardened OpenClaw responds to common attack prompts. Click a preset or type your own.

Try an attack:

OpenClaw Security Engine

SOUL.md rules loaded. 18 vetted skills active. ClawHub disabled. Ready for security testing. Try sending a prompt to see how the hardened deployment responds.

Vetted Skill Gallery

18 carefully reviewed skills — no ClawHub marketplace, no untrusted code execution. Every skill is audited and loaded from local configuration.

email Vetted

Tool

Send, receive, and search email with FTS5 indexing and multi-mailbox support

rag Vetted

Knowledge

Semantic search across 535K+ chunks with multi-tier privacy controls

calendar Vetted

Tool

Schedule management, event creation, and calendar integration

workbench Vetted

Tool

Code execution sandbox with file management and project scaffolding

applyr Vetted

Tool

Automated job application workflow with resume tailoring

cloudflare Vetted

Tool

DNS management, zone controls, and subdomain security auditing

network Vetted

Tool

Network discovery, topology mapping, and device inventory

self-healing Vetted

Methodology

Container restart, disk cleanup, and automated recovery procedures

research Vetted

Knowledge

Web search, content extraction, and source verification pipeline

voice-commands Vetted

Tool

Text-to-speech and speech-to-text with ElevenLabs and Whisper

telegram-files Vetted

Tool

File management and sharing through Telegram bot integration

backup-verify Vetted

Cron

Automated backup validation and integrity checks across storage tiers

argobox-health-check Vetted

Cron

Full ArgoBox platform health assessment with module status reporting

playground-monitor Vetted

Cron

Playground container status, resource usage, and availability monitoring

playground-health-check Vetted

Cron

Deep health verification of all playground lab environments

playground-full-test Vetted

Cron

End-to-end integration testing of playground functionality

playground-investigate Vetted

Methodology

Diagnostic investigation of playground issues with root cause analysis

argobox Vetted

Knowledge

Platform architecture reference, module registry, and deployment context

Autonomous Cron Schedule

10 scheduled jobs running autonomously — each one audited, rate-limited, and monitored for failures.

00:30 Daily

Memory Consolidation

Consolidate conversation history and context into persistent memory

02:00 Daily

Dev Task Runner

Execute queued development tasks and CI/CD pipeline checks

06:30 Daily

Market Intelligence

Gather market data, competitor analysis, and industry news

08:00 Daily

Morning Briefing

Compile overnight events, alerts, and daily priorities

10:30 Daily

Content Scout

Discover relevant content, research papers, and community discussions

23:00 Daily

Security Audit

Scan for exposed services, expired certificates, and policy violations

Every 6h Recurring

Infra Health Monitor

Check all infrastructure nodes, containers, and network connectivity

Every 8h Recurring

Build Swarm Monitor

Verify build swarm drone status, queue depth, and artifact integrity

Sun 09:00 Weekly

Weekly Report

Generate comprehensive weekly summary of operations and metrics

Wed 10:00 Weekly

Connectivity Test

Full mesh connectivity verification across all network segments

Self-Healing Heartbeat

Automated recovery rules that keep the agent running — from container restarts to phone call escalation.

Container unhealthy

Auto-restart

Docker health check fails for 30+ seconds

Disk usage >85%

docker system prune

Automatic cleanup of unused images, containers, and volumes

Restart loop (>3 in 10min)

Stop + report

Prevents infinite restart cycles that mask root cause

SSL cert <7 days

Report to operator

Early warning for certificate expiry before service disruption

Memory <500MB free

Alert + diagnostic

Memory pressure detection with process-level breakdown

CRITICAL unresolved 15min

Phone call via Twilio

Escalation to phone call if critical alert has no human response

Escalation Path

Auto-restart

Stop + Report

Notify Operator

Phone Call (Twilio)

If a CRITICAL alert goes unacknowledged for 15 minutes, the system escalates to a phone call via Twilio. No silent failures.

Run OpenClaw the safe way

Don't be one of the 135,000+ exposed instances. Deploy with hardened SOUL rules, vetted skills, and self-healing infrastructure.

Deploy OpenClaw Safely Book a Setup Call View on GitHub

OpenClaw Security Lab

Challenges

Default vs Hardened Deployment

Default OpenClaw

ArgoBox Hardened

Interactive Prompt Test

Vetted Skill Gallery

Autonomous Cron Schedule

Memory Consolidation

Dev Task Runner

Market Intelligence

Morning Briefing

Content Scout

Security Audit

Infra Health Monitor

Build Swarm Monitor

Weekly Report

Connectivity Test

Self-Healing Heartbeat

Escalation Path

Run OpenClaw the safe way

System Status

🌐 Gateway

🚀 Orchestrators

🤖 Build Drones

🔨 Active Builds