Five Projects Burning
The Setup
March 1, 11:41 PM. I kicked off a Gentoo @world update with 1,556 packages queued. Binary packages from a build swarm should make this fast — just let emerge run, it'll be done in an hour or two, go to bed, wake up to a fresh system.
(Narrator voice: this is where the story went wrong.)
But that's not the story of March 1st. That's the story of March 2nd and the 14 days that follow. Today, I have five different projects burning hot and they need babysitting.
EdgeMail: 6,500 Lines in One Session
I extracted ArgoBox's entire email system into a standalone npm package. That's 85+ API actions, 14 D1 tables, FTS5 search, threading, contacts, labels, filters, templates, scheduled send — all of it, isolated, framework-agnostic.
The session ran long. Context limits hit twice. I was sure I'd finish the core architecture and then hit a wall on the API routes. But something just... worked. The monorepo structure clicked (pnpm workspaces + Turborepo). The adapter pattern from api-credentials landed perfectly. The Hono router factory became elegant instead of messy.
By the end I had:
- @edgemail/core — 3,362 lines. Database class, receiver pipeline, sender with draft promotion, scheduler, auth (API key + JWT), migrations.
- @edgemail/api — 1,483 lines. 61 REST routes across 12 sub-routers. Zero duplication.
- @edgemail/client — 1,411 lines. 50+ typed async methods. Works in browser and Node.
- create-edgemail — 250 lines. Interactive CLI scaffolding.
All 4 packages compile clean. Zero type errors. The code is ready; it just needs README, tests, and a first git commit.
This is the part I love about extraction work — you're not inventing new patterns, you're discovering what the patterns want to be. ArgoBox's email code was tangled because it lived in a monolithic framework. Once I isolated it, the boundaries became obvious. The shape emerged.
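The router factory mentioned above isn't shown in the post, so here's a minimal, framework-agnostic sketch of the shape that pattern tends to take: one factory emits the standard CRUD routes for any resource, so each sub-router only declares what's unique to it. Everything here (names, route shapes, the in-memory store) is hypothetical, not the actual EdgeMail API.

```typescript
// Hypothetical sketch of a router-factory pattern, not EdgeMail's real code.
type Handler = (params: Record<string, string>, body?: unknown) => unknown;
type Route = { method: "GET" | "POST" | "DELETE"; path: string; handler: Handler };

interface Store<T> {
  list(): T[];
  get(id: string): T | undefined;
  create(data: unknown): T;
  remove(id: string): boolean;
}

// The factory: every resource gets the same four routes for free;
// anything resource-specific is appended via `extra`.
function crudRouter<T>(resource: string, store: Store<T>, extra: Route[] = []): Route[] {
  return [
    { method: "GET", path: `/${resource}`, handler: () => store.list() },
    { method: "GET", path: `/${resource}/:id`, handler: (p) => store.get(p.id) },
    { method: "POST", path: `/${resource}`, handler: (_p, body) => store.create(body) },
    { method: "DELETE", path: `/${resource}/:id`, handler: (p) => store.remove(p.id) },
    ...extra,
  ];
}

// Tiny in-memory store for demonstration only.
function memoryStore<T extends { id: string }>(): Store<T> {
  const items = new Map<string, T>();
  return {
    list: () => [...items.values()],
    get: (id) => items.get(id),
    create: (data) => { const item = data as T; items.set(item.id, item); return item; },
    remove: (id) => items.delete(id),
  };
}

const labels = crudRouter("labels", memoryStore<{ id: string; name: string }>());
const contacts = crudRouter("contacts", memoryStore<{ id: string; email: string }>(), [
  { method: "POST", path: "/contacts/import", handler: () => ({ ok: true }) },
]);
console.log(labels.length, contacts.length); // 4 standard routes, 4 + 1 extra
```

The "zero duplication" claim in the line counts above is exactly what this buys: twelve sub-routers, one factory.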
Argonaut-RAG: Standalone Package + Embedding Pipeline
Separate session, same energy. Pulled the RAG engine out of ArgoBox, built a standalone npm package with:
- CLI tools: ingest, embed, search, stats, collections, REPL
- Ollama batch embedding (upgraded from serial to batch — 70-80 chunks/sec now, was 25/sec)
- SQLite FTS5 + vector search fusion (BM25 + cosine similarity, 0.3/0.7 weighting)
- 11 databases across 3 tiers (Knowledge, Vaults, Blog)
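The fusion step can be sketched as a pure function: normalize each retriever's raw scores, then combine at the 0.3/0.7 weighting described above. The min-max normalization is my assumption; I don't know what scheme argonaut-rag actually uses to put BM25 and cosine scores on a common scale.

```typescript
// Sketch of BM25 + vector fusion at 0.3/0.7; normalization scheme is assumed.
interface Hit { id: string; score: number }

// Min-max normalize raw scores into [0, 1] so the two scales are comparable.
function normalize(hits: Hit[]): Map<string, number> {
  const scores = hits.map((h) => h.score);
  const min = Math.min(...scores);
  const span = Math.max(...scores) - min || 1; // guard uniform score sets
  return new Map(hits.map((h) => [h.id, (h.score - min) / span]));
}

// Weighted sum: documents found by only one retriever contribute 0 on the other side.
function fuse(bm25: Hit[], vector: Hit[], wLex = 0.3, wVec = 0.7): Hit[] {
  const lex = normalize(bm25);
  const vec = normalize(vector);
  const ids = new Set([...lex.keys(), ...vec.keys()]);
  return [...ids]
    .map((id) => ({ id, score: wLex * (lex.get(id) ?? 0) + wVec * (vec.get(id) ?? 0) }))
    .sort((a, b) => b.score - a.score);
}

const fused = fuse(
  [{ id: "a", score: 12.1 }, { id: "b", score: 3.4 }],
  [{ id: "b", score: 0.91 }, { id: "c", score: 0.72 }],
);
console.log(fused[0].id); // "b" wins: top vector score outweighs a's lexical-only hit
```

The 0.7 vector weight means a strong semantic match beats a strong keyword match, which is usually what you want when the keyword side is full of boilerplate statutory language or repeated email headers.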
The embedding pipeline is still running — about 294K chunks across all databases. I started it and just... let it go. It'll finish overnight.
The real work here was the API integration for ArgoBox's admin panel. The old code showed 2 hardcoded databases. Now it should enumerate 11 databases with per-model comparison UI. Not done yet, but the blocking question is finally answered: can we run multiple embedding models in parallel? Yes. We can. GPU stays busy, costs drop, you get better search results.
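The serial-to-batch upgrade is mostly a loop change: one request per batch instead of one per chunk. A minimal sketch, with the embedder injected so the batching logic stands alone; the batch size and function names are illustrative, not argonaut-rag's actual code (Ollama's newer /api/embed endpoint does accept an array of inputs, which is what makes this possible).

```typescript
// Sketch of batched embedding; embedder is injected, names are hypothetical.
async function embedAll(
  chunks: string[],
  embedBatch: (batch: string[]) => Promise<number[][]>,
  batchSize = 32,
): Promise<number[][]> {
  const vectors: number[][] = [];
  for (let i = 0; i < chunks.length; i += batchSize) {
    // One round trip per batch; the GPU sees a full batch, not single strings.
    const batch = chunks.slice(i, i + batchSize);
    vectors.push(...(await embedBatch(batch)));
  }
  return vectors;
}

// Mock embedder for demonstration: a 2-dim "vector" per input.
const mockEmbed = async (batch: string[]) => batch.map((s) => [s.length, 0]);
embedAll(["alpha", "be", "gamma"], mockEmbed, 2).then((v) => console.log(v.length)); // 3
```

At ~294K chunks, cutting per-chunk HTTP overhead is where the 25 → 70-80 chunks/sec jump plausibly comes from.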
Colorado Legal RAG: 28 Hours of Indexing
Different project, same complexity. I fixed the memory issue that was silently killing the build process, set up Titan CT 103 with torch + sentence-transformers + chromadb, and kicked off the indexing pipeline.
1,319 Colorado statutes → 82,130 chunks in about 2 hours.
Now it's working through case law downloads (614K chunks in 6+ hours) and federal statutes (queued, will run next). The index is growing autonomously. No babysitting needed, just SSH in and tail the log if I get curious.
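For scale: 1,319 statutes into 82,130 chunks is roughly 62 chunks per statute, which is what you get from a sliding-window chunker. A minimal sketch of that shape; the actual pipeline runs on Python (sentence-transformers + chromadb), and the size/overlap values here are guesses, not the real configuration.

```typescript
// Sliding-window chunker sketch; size and overlap are illustrative.
// Assumes size > overlap, otherwise the window never advances.
function chunkText(text: string, size = 400, overlap = 50): string[] {
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // final chunk reached the end
  }
  return chunks;
}

console.log(chunkText("x".repeat(1000)).length); // 3 chunks for 1,000 chars
```

The overlap exists so a sentence straddling a chunk boundary still lands whole in at least one chunk, which matters a lot for statute citations.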
This is the kind of work that feels like nothing for weeks and then suddenly: you have a product. Invisible infrastructure that just quietly builds itself.
Wompus Protocol & Token Efficiency Analysis
I also spent time looking at how Claude Code was burning tokens. Eighteen sessions. 35,922 messages. 4,316 tool calls. ~1.39M output tokens. The cache reads exceeded 1 billion tokens across sessions, which actually means the system is working — cache hits are cheap — but the pattern of sessions was inefficient.
So I built a protocol called Wompus (the word, not an acronym — I just liked how it sounded). The idea:
- Haiku for research, exploration, reading code
- Sonnet for small fixes and docs
- Opus for all production code changes
- End-of-session handoffs documented and archived so next session doesn't repeat work
- Model escalation (/model opus) instead of handoffs when you hit limits mid-session
The wompus keyword ends a session and triggers full documentation plus handoff creation. Every new session checks for pending handoffs automatically.
The math says this saves 60-70% of tokens, provided sessions stay under 200 messages and Haiku handles exploration: Haiku discovering code and Opus writing it is cheaper than Opus doing both, and Haiku's output tokens cost roughly 1/20th as much, even if the quality is… different.
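The protocol and the estimate both fit in a few lines. The routing is straight from the list above; the savings function is my own back-of-envelope model (all output tokens priced relative to Opus, Haiku at 1/20th), not real Anthropic pricing.

```typescript
// Wompus routing as a lookup, plus a rough cost model. Prices are placeholders.
type Task = "research" | "docs" | "small-fix" | "production";

function pickModel(task: Task): string {
  switch (task) {
    case "research": return "haiku";    // exploration, reading code
    case "docs":
    case "small-fix": return "sonnet";  // small fixes and docs
    case "production": return "opus";   // all production code changes
  }
}

// If `exploreShare` of output tokens move from opus to haiku
// (haikuRelCost = haiku price / opus price), how much do we save?
function savings(exploreShare: number, haikuRelCost = 1 / 20): number {
  const after = exploreShare * haikuRelCost + (1 - exploreShare);
  return 1 - after; // fraction saved vs. running everything on opus
}
console.log(savings(0.7)); // lands in the 60-70% band when exploration dominates
```

The model ignores Sonnet and input tokens entirely, so treat it as an upper bound on the shape of the argument, not the number.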
Deployment Speed + Cloudflare Purge
Someone asked why ArgoBox deployments are slow. I dug into it, didn't find a silver bullet, but did find that Cloudflare Pages had accumulated 150+ old deployments cluttering the dashboard. Used the CF API to bulk-delete them.
Small win. Not the bottleneck fix we need, but it cleared some noise.
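The purge logic itself is simple enough to sketch, with the Cloudflare calls injected so the pruning policy is testable on its own. The "keep the newest N" policy is my assumption about how the cleanup was done; against the real API, these two methods would wrap the Pages deployments GET and DELETE endpoints under /accounts/{account_id}/pages/projects/{project_name}/, which you'd verify against the current API docs.

```typescript
// Sketch of a "keep newest N, delete the rest" purge; client calls injected.
interface Deployment { id: string; created_on: string }

interface PagesClient {
  listDeployments(): Promise<Deployment[]>;
  deleteDeployment(id: string): Promise<void>;
}

// Returns the ids it deleted, newest-first among the stale set.
async function pruneDeployments(client: PagesClient, keep = 5): Promise<string[]> {
  const all = await client.listDeployments();
  const sorted = [...all].sort(
    (a, b) => Date.parse(b.created_on) - Date.parse(a.created_on),
  );
  const stale = sorted.slice(keep); // everything older than the newest `keep`
  for (const d of stale) await client.deleteDeployment(d.id);
  return stale.map((d) => d.id);
}
```

Deleting sequentially rather than in parallel is deliberate: 150+ concurrent DELETEs is a good way to meet the API rate limiter.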
The Confluence
Here's what's weird about March 1: I'm not stressed. All five of these projects are running in parallel. Three of them are basically done (EdgeMail code complete, Wompus protocol written, CF purge done). Two of them are long-running infrastructure that doesn't need human intervention (RAG embeddings, Colorado legal indexing).
The Gentoo @world update is sitting there in the background. I'm not thinking about it. The emerge is probably still running, or it finished, or it crashed — I won't know until morning.
This is what operational maturity feels like: you have enough processes that things run themselves while you build new things.
What's Next
EdgeMail needs README + tests + first commit tomorrow.
Colorado legal RAG needs the query API built once the indexing completes.
The ArgoBox admin RAG page needs to show all 11 databases with per-model comparison UI.
The Gentoo system will either be fine in the morning or it will not be fine, and I'll figure that out then.
I set the emerge running at 11:41 PM with zero expectations. In infrastructure work, you learn fast to be genuinely uncertain about how long things take. Sometimes the system surprises you and everything just works.
We'll see.