OpenClaw Security Lab

OpenClaw is GitHub's most-starred project (250K+ stars): a self-hosted AI agent that runs on your hardware. Adoption outpaced security, leaving 135,000+ exposed instances and 800+ malicious skills on ClawHub. This lab compares a default deployment against the hardened configuration running on ArgoBox.

250K+ GitHub Stars
135K+ Exposed Instances
800+ Malicious Skills
18 Vetted Skills

Default vs Hardened Deployment

Side-by-side comparison of 10 critical security aspects: the default OpenClaw deployment (vulnerable) against the ArgoBox hardened deployment (protected).

ClawHub Skills
Default OpenClaw (VULNERABLE): Enabled (800+ malicious). Any community skill can execute arbitrary code on the host.
ArgoBox Hardened (PROTECTED): Blocked entirely. Only 18 vetted skills are loaded from local config; ClawHub is disabled.

Secret Protection
Default OpenClaw (VULNERABLE): None; the agent can echo secrets. API keys, tokens, and credentials can leak into chat output.
ArgoBox Hardened (PROTECTED): Non-negotiable SOUL rules. SOUL.md enforces: never disclose API keys, credentials, or tokens under any circumstance.

Email Commands
Default OpenClaw (VULNERABLE): Accepts email as instruction. Incoming emails can inject instructions into the agent context.
ArgoBox Hardened (PROTECTED): Email is not a command channel. Email is data only and is never parsed as instructions or commands.

Git Push to Main
Default OpenClaw (VULNERABLE): Allowed without approval. The agent can push directly to the main branch without human review.
ArgoBox Hardened (PROTECTED): Requires explicit approval. All git pushes require explicit human approval before execution.

Filesystem Access
Default OpenClaw (VULNERABLE): Unrestricted. The agent has full filesystem read/write access to the host.
ArgoBox Hardened (PROTECTED): Workspace only, config read-only. The agent is sandboxed to the workspace directory; config files are read-only.

Rate Limiting
Default OpenClaw (VULNERABLE): None. External API calls are not throttled, so runaway costs are possible.
ArgoBox Hardened (PROTECTED): 10 external API calls/min. A hard limit of 10 external API calls per minute prevents bill shock.

Prompt Injection
Default OpenClaw (VULNERABLE): No defense. The agent blindly follows "ignore previous instructions" patterns.
ArgoBox Hardened (PROTECTED): Refuses manipulation attempts. Detects and refuses prompt injection, role hijacking, and instruction override attempts.

Failure Handling
Default OpenClaw (VULNERABLE): Retries forever. Failed tasks loop indefinitely, consuming resources and API quota.
ArgoBox Hardened (PROTECTED): Stop after 3 consecutive failures. Three consecutive failures trigger a stop and an alert to the operator.

API Key Exposure
Default OpenClaw (VULNERABLE): Can leak in chat. Keys can appear in conversation output, logs, or error messages.
ArgoBox Hardened (PROTECTED): Never disclosed, non-negotiable. A non-negotiable SOUL rule: API keys are never disclosed regardless of prompt.

Monitoring
Default OpenClaw (VULNERABLE): None. No visibility into agent health, uptime, or resource usage.
ArgoBox Hardened (PROTECTED): Health checks every 3 hours. Automated health checks with self-healing rules and an escalation path.
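
Two of the hardened limits above are concrete enough to sketch: the cap of 10 external API calls per minute and the stop after 3 consecutive failures. The Python below is a minimal illustration only, not OpenClaw's or ArgoBox's actual implementation. The names SlidingWindowLimiter, run_with_failure_cap, and the alert callback are hypothetical, and the sliding-window approach is one assumed way to enforce a per-minute limit.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `window_seconds` span (hypothetical helper)."""

    def __init__(self, max_calls=10, window_seconds=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock          # injectable so the limiter can be tested without sleeping
        self._stamps = deque()      # timestamps of calls still inside the window

    def try_acquire(self):
        """Return True and record the call if it fits under the limit, else False."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_calls:
            return False
        self._stamps.append(now)
        return True


def run_with_failure_cap(task, alert, max_failures=3):
    """Retry `task` until it succeeds or has failed `max_failures` times in a row.

    On giving up, call `alert` once with a message and return None.
    """
    failures = 0
    last_error = None
    while failures < max_failures:
        try:
            return task()
        except Exception as exc:  # a sketch: real code would catch narrower errors
            failures += 1
            last_error = exc
    alert(f"stopped after {failures} consecutive failures: {last_error}")
    return None
```

A caller would check try_acquire() before each external request and skip or queue the request when it returns False. The retry helper replaces an unbounded retry loop with a bounded one that always ends in either a result or an operator alert.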

Interactive Prompt Test

Test how the hardened OpenClaw responds to common attack prompts.

Vetted Skill Gallery

18 carefully reviewed skills — no ClawHub marketplace, no untrusted code execution. Every skill is audited and loaded from local configuration.

email Vetted
Tool

Send, receive, and search email with FTS5 indexing and multi-mailbox support

rag Vetted
Knowledge

Semantic search across 535K+ chunks with multi-tier privacy controls

calendar Vetted
Tool

Schedule management, event creation, and calendar integration

workbench Vetted
Tool

Code execution sandbox with file management and project scaffolding

applyr Vetted
Tool

Automated job application workflow with resume tailoring

cloudflare Vetted
Tool

DNS management, zone controls, and subdomain security auditing

network Vetted
Tool

Network discovery, topology mapping, and device inventory

self-healing Vetted
Methodology

Container restart, disk cleanup, and automated recovery procedures

research Vetted
Knowledge

Web search, content extraction, and source verification pipeline

voice-commands Vetted
Tool

Text-to-speech and speech-to-text with ElevenLabs and Whisper

telegram-files Vetted
Tool

File management and sharing through Telegram bot integration

backup-verify Vetted
Cron

Automated backup validation and integrity checks across storage tiers

argobox-health-check Vetted
Cron

Full ArgoBox platform health assessment with module status reporting

playground-monitor Vetted
Cron

Playground container status, resource usage, and availability monitoring

playground-health-check Vetted
Cron

Deep health verification of all playground lab environments

playground-full-test Vetted
Cron

End-to-end integration testing of playground functionality

playground-investigate Vetted
Methodology

Diagnostic investigation of playground issues with root cause analysis

argobox Vetted
Knowledge

Platform architecture reference, module registry, and deployment context
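
The gallery above amounts to an allowlist: a skill loads only if its name is one of the 18 vetted entries. The sketch below shows that check in Python. The skill names come from the gallery, but the function select_skills and the idea of filtering a requested list are assumptions for illustration; the page does not show how OpenClaw reads its local skill configuration.

```python
# The 18 vetted skill names listed in the gallery.
VETTED_SKILLS = frozenset({
    "email", "rag", "calendar", "workbench", "applyr", "cloudflare",
    "network", "self-healing", "research", "voice-commands",
    "telegram-files", "backup-verify", "argobox-health-check",
    "playground-monitor", "playground-health-check",
    "playground-full-test", "playground-investigate", "argobox",
})


def select_skills(requested):
    """Split requested skill names into (allowed, rejected) using the allowlist.

    Hypothetical helper: anything not explicitly vetted is rejected,
    so a new or misspelled name fails closed.
    """
    allowed = [name for name in requested if name in VETTED_SKILLS]
    rejected = [name for name in requested if name not in VETTED_SKILLS]
    return allowed, rejected
```

For example, select_skills(["email", "crypto-miner"]) returns (["email"], ["crypto-miner"]). Rejecting by default, rather than blocking known-bad names, is what keeps an unreviewed marketplace skill from loading.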

Autonomous Cron Schedule

10 scheduled jobs running autonomously — each one audited, rate-limited, and monitored for failures.

00:30 Daily

Memory Consolidation

Consolidate conversation history and context into persistent memory

02:00 Daily

Dev Task Runner

Execute queued development tasks and CI/CD pipeline checks

06:30 Daily

Market Intelligence

Gather market data, competitor analysis, and industry news

08:00 Daily

Morning Briefing

Compile overnight events, alerts, and daily priorities

10:30 Daily

Content Scout

Discover relevant content, research papers, and community discussions

23:00 Daily

Security Audit

Scan for exposed services, expired certificates, and policy violations

Every 6h Recurring

Infra Health Monitor

Check all infrastructure nodes, containers, and network connectivity

Every 8h Recurring

Build Swarm Monitor

Verify build swarm drone status, queue depth, and artifact integrity

Sun 09:00 Weekly

Weekly Report

Generate comprehensive weekly summary of operations and metrics

Wed 10:00 Weekly

Connectivity Test

Full mesh connectivity verification across all network segments
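
The schedule above can be written as standard five-field cron expressions (minute, hour, day of month, month, day of week). The Python sketch below is an assumed encoding, not the actual ArgoBox configuration. In particular, it assumes the "Every 6h" and "Every 8h" jobs fire on the hour counting from midnight, which the page does not state. The matcher handles only the small subset of cron syntax used here (`*`, `*/n`, and single integers).

```python
from datetime import datetime

# Job name -> cron expression (minute hour day-of-month month day-of-week).
SCHEDULE = {
    "Memory Consolidation": "30 0 * * *",
    "Dev Task Runner": "0 2 * * *",
    "Market Intelligence": "30 6 * * *",
    "Morning Briefing": "0 8 * * *",
    "Content Scout": "30 10 * * *",
    "Security Audit": "0 23 * * *",
    "Infra Health Monitor": "0 */6 * * *",   # assumed: 00:00, 06:00, 12:00, 18:00
    "Build Swarm Monitor": "0 */8 * * *",    # assumed: 00:00, 08:00, 16:00
    "Weekly Report": "0 9 * * 0",            # Sunday 09:00
    "Connectivity Test": "0 10 * * 3",       # Wednesday 10:00
}


def _field_matches(field, value):
    """Match one cron field; supports only '*', '*/n', and a single integer."""
    if field == "*":
        return True
    if field.startswith("*/"):
        return value % int(field[2:]) == 0
    return value == int(field)


def is_due(expr, when):
    """Return True if the cron expression fires at the minute of `when`."""
    minute, hour, dom, month, dow = expr.split()
    cron_dow = (when.weekday() + 1) % 7  # cron counts Sunday as 0; Python counts Monday as 0
    return (
        _field_matches(minute, when.minute)
        and _field_matches(hour, when.hour)
        and _field_matches(dom, when.day)
        and _field_matches(month, when.month)
        and _field_matches(dow, cron_dow)
    )


def jobs_due(when):
    """List the names of all scheduled jobs that fire at `when`."""
    return [name for name, expr in SCHEDULE.items() if is_due(expr, when)]
```

Calling jobs_due(datetime(2024, 1, 7, 9, 0)), a Sunday morning, returns ["Weekly Report"].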

Self-Healing Heartbeat

Automated recovery rules that keep the agent running — from container restarts to phone call escalation.

Container unhealthy
Auto-restart

Docker health check fails for 30+ seconds

Disk usage >85%
docker system prune

Automatic cleanup of stopped containers, unused networks, dangling images, and build cache (volumes are pruned only if --volumes is passed)

Restart loop (>3 in 10min)
Stop + report

Prevents infinite restart cycles that mask root cause

SSL cert <7 days
Report to operator

Early warning for certificate expiry before service disruption

Memory <500MB free
Alert + diagnostic

Memory pressure detection with process-level breakdown

CRITICAL unresolved 15min
Phone call via Twilio

Escalation to phone call if critical alert has no human response
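
The recovery rules above are threshold checks, so they can be sketched as one function from a metrics snapshot to a list of actions. The Python below is illustrative only. The Snapshot fields and the function planned_actions are hypothetical names, and the choice to let the restart-loop rule override auto-restart is an assumption inferred from the stated goal of preventing infinite restart cycles.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Hypothetical metrics sample; defaults describe a healthy system."""
    unhealthy_seconds: float = 0.0         # how long the Docker health check has been failing
    disk_used_percent: float = 0.0
    restarts_last_10_min: int = 0
    cert_days_left: float = 365.0
    free_memory_mb: float = 4096.0
    critical_unacked_minutes: float = 0.0  # age of the oldest unacknowledged CRITICAL alert


def planned_actions(s):
    """Return the recovery actions the rules call for, in rule order."""
    actions = []
    # Assumption: a restart loop takes priority over another auto-restart.
    if s.restarts_last_10_min > 3:
        actions.append("stop + report")
    elif s.unhealthy_seconds >= 30:
        actions.append("auto-restart")
    if s.disk_used_percent > 85:
        actions.append("docker system prune")
    if s.cert_days_left < 7:
        actions.append("report to operator")
    if s.free_memory_mb < 500:
        actions.append("alert + diagnostic")
    if s.critical_unacked_minutes >= 15:
        actions.append("phone call via Twilio")
    return actions
```

For example, planned_actions(Snapshot(unhealthy_seconds=45)) returns ["auto-restart"], while the same snapshot with restarts_last_10_min=4 returns ["stop + report"]. Keeping the rules in a pure function separates the decision from the side effects (restarting containers, placing calls), which makes the thresholds easy to test.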

Escalation Path

Auto-restart → Stop + Report → Notify Operator → Phone Call (Twilio)

If a CRITICAL alert goes unacknowledged for 15 minutes, the system escalates to a phone call via Twilio. No silent failures.

Run OpenClaw the safe way

Don't be one of the 135,000+ exposed instances. Deploy with hardened SOUL rules, vetted skills, and self-healing infrastructure.