
Argo Knowledge RAG

The brain behind ArgoBox's AI assistant

Hybrid BM25 + vector search across 166K+ chunks — 100% local, zero API cost

166K+ Chunks Indexed
4 Collections
100% Local
$0 Per Query

What It Powers

Public AI Assistant

The “Ask Argonaut” chat on every page uses this RAG system to answer questions about ArgoBox with accurate, sourced responses.

Try it live →

Private Knowledge Engine

Admin-only tiers search across 10,000+ documents from Obsidian vaults, legal PDFs, and raw session transcripts — all running locally.

Claude Code Integration

Powers Claude Code's AI context system, letting it find relevant documentation and past decisions across the entire project history.


How It Works

Ingest

Reads MD, PDF, DOCX, RTF, EML, XLSX — extracts text and frontmatter

Multi-format
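
The ingest step's format routing can be sketched as a simple extension map; the helper name and routing labels here are illustrative assumptions, not the project's actual code:

```typescript
// Illustrative extension-to-parser routing; the labels and skip
// behavior are assumptions, not the project's actual ingest code.
const parsers: Record<string, string> = {
  ".md": "markdown",     // read directly; YAML frontmatter split off
  ".pdf": "subprocess",  // binary formats go to subprocess parsers
  ".docx": "subprocess",
  ".rtf": "subprocess",
  ".eml": "mail",        // headers become metadata, body becomes text
  ".xlsx": "subprocess",
};

function pickParser(filename: string): string {
  const ext = filename.slice(filename.lastIndexOf(".")).toLowerCase();
  return parsers[ext] ?? "skip"; // unknown formats are skipped, not fatal
}
```

Routing unknown extensions to "skip" rather than throwing is one way to keep a large ingest run from dying on a single odd file.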

Chunk

Paragraph-aware splitting (400 words, 80 overlap) preserves semantic units

Context-aware
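
The 400-word / 80-overlap split can be sketched roughly like this (a simplified take on paragraph-aware chunking; the real splitter may handle details such as oversized paragraphs differently):

```typescript
// Paragraph-aware chunker: packs whole paragraphs up to a word budget,
// then starts the next chunk with the last `overlap` words for context.
// Function and parameter names are illustrative, not the project's API.
function chunk(text: string, maxWords = 400, overlap = 80): string[] {
  const paras = text.split(/\n\s*\n/).map(p => p.trim()).filter(Boolean);
  const chunks: string[] = [];
  let current: string[] = []; // words in the chunk being built

  for (const para of paras) {
    const words = para.split(/\s+/);
    if (current.length + words.length > maxWords && current.length > 0) {
      chunks.push(current.join(" "));
      current = current.slice(-overlap); // carried-over context window
    }
    current.push(...words); // a single huge paragraph stays one chunk here
  }
  if (current.length > 0) chunks.push(current.join(" "));
  return chunks;
}
```

Packing whole paragraphs (instead of cutting at a fixed offset) is what keeps semantic units intact; the overlap means a sentence near a boundary is searchable from both neighboring chunks.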

Embed

GPU-accelerated via Ollama (qwen3-embedding or nomic-embed-text)

Local GPU
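
Locally, embedding a chunk is one HTTP call to Ollama. A minimal sketch of the request, assuming Ollama's default port and its /api/embeddings route (verify against your Ollama version; newer builds also expose /api/embed):

```typescript
// Builds the request for Ollama's local embeddings endpoint.
// URL and field names follow Ollama's /api/embeddings API as the
// author understands it; confirm against your installed version.
const OLLAMA_URL = "http://localhost:11434/api/embeddings";

function embedRequest(model: string, text: string) {
  return {
    url: OLLAMA_URL,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, prompt: text }),
    },
  };
}

// Usage (requires a running Ollama instance):
// const { url, init } = embedRequest("qwen3-embedding:0.6b", "Docker bridge networking");
// const { embedding } = await (await fetch(url, init)).json();
```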

Search

Hybrid FTS5 BM25 + vector cosine similarity with configurable weights

Hybrid

Four-Tier Architecture

Each tier adds scope and security. Data flows down — never up.

Public (874 chunks)
Blog posts, docs, project pages. Served from CDN via Cloudflare Pages. Powers the public AI assistant.
Embedding: OpenRouter text-embedding-3-small (1536d)

Knowledge (33K chunks)
Sanitized content safe for external AI. Runs through 148 regex patterns that strip real IPs, hostnames, and usernames.
Embedding: qwen3-embedding:0.6b (1024d)

Vaults (132K chunks)
Full Obsidian vaults with raw content. Local access only — never leaves the machine.
Embedding: nomic-embed-text (768d)

Private (166K chunks)
Everything including legal documents, personal notes, and raw transcripts. Maximum security.
Embedding: nomic-embed-text (768d)
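
The tier layout above can be captured as plain data; the field names here are hypothetical, while the chunk counts, models, and dimensions come from this page:

```typescript
// Hypothetical encoding of the four-tier layout described above.
// Field names are invented for illustration; the numbers are the
// page's own figures.
type Tier = {
  name: string;
  chunks: number;
  embedModel: string;
  dims: number;
  localOnly: boolean; // data flows down the tiers, never up
};

const tiers: Tier[] = [
  { name: "public",    chunks: 874,     embedModel: "text-embedding-3-small", dims: 1536, localOnly: false },
  { name: "knowledge", chunks: 33_000,  embedModel: "qwen3-embedding:0.6b",   dims: 1024, localOnly: true },
  { name: "vaults",    chunks: 132_000, embedModel: "nomic-embed-text",       dims: 768,  localOnly: true },
  { name: "private",   chunks: 166_000, embedModel: "nomic-embed-text",       dims: 768,  localOnly: true },
];
```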

Key Features

100% Local

All embedding runs on a local GPU. Data never leaves the machine. Zero per-query cost for local tiers.

Hybrid Search

FTS5 BM25 for keyword precision + cosine similarity for semantic understanding. Configurable 0.3/0.7 weighting.
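
A sketch of how the 0.3/0.7 blend might be computed, assuming raw keyword scores are min-max normalized onto the same scale as cosine similarity (function and parameter names are illustrative):

```typescript
// Illustrative fusion of the two signals. Assumes higher raw BM25 =
// more relevant; SQLite's bm25() actually returns lower-is-better
// (negative) values, so negate those before passing them in.
function hybridScores(
  bm25: Map<string, number>,   // chunkId -> raw keyword score
  cosine: Map<string, number>, // chunkId -> cosine similarity
  wKeyword = 0.3,
  wVector = 0.7,
): Map<string, number> {
  const vals = [...bm25.values()];
  const lo = Math.min(...vals), hi = Math.max(...vals);
  // Min-max normalize BM25 so both signals share a 0..1 scale.
  const norm = (s: number) => (hi === lo ? 1 : (s - lo) / (hi - lo));

  const fused = new Map<string, number>();
  for (const id of new Set([...bm25.keys(), ...cosine.keys()])) {
    const kw = bm25.has(id) ? norm(bm25.get(id)!) : 0;
    fused.set(id, wKeyword * kw + wVector * (cosine.get(id) ?? 0));
  }
  return fused;
}
```

Taking the union of both result sets means a chunk found by only one signal still gets ranked, just with the missing signal scored as zero.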

Multi-Format Parser

Ingests MD, PDF, DOCX, RTF, EML, MSG, XLSX via subprocess parsers. Handles corrupted files gracefully.

Content-Hash Dedup

SHA-256 hashing prevents duplicate ingestion. Re-run safely — only new or changed files are processed.
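
In miniature, the dedup check looks like this; the `seen` set stands in for the hash column the real pipeline would keep in SQLite:

```typescript
import { createHash } from "node:crypto";

// Content-hash dedup sketch: content is ingested only if its SHA-256
// digest has not been recorded before, so re-running ingestion on an
// unchanged corpus is a no-op.
const seen = new Set<string>();

function shouldIngest(content: string): boolean {
  const digest = createHash("sha256").update(content).digest("hex");
  if (seen.has(digest)) return false; // unchanged content: skip
  seen.add(digest);
  return true;
}
```

Hashing content rather than comparing paths or mtimes means a renamed-but-identical file is still skipped, while a one-character edit triggers re-ingestion.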

Identity Sanitization

148-pattern regex system strips real names, IPs, and credentials for safe external sharing.
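
A few rules in the spirit of that sanitizer, applied as an ordered find-and-replace pass (these three patterns are invented examples, not the actual 148-rule set):

```typescript
// Hypothetical sanitization rules; the real system's 148 patterns
// are not public. Order matters: IPs are redacted before the email
// rule runs, for example.
const rules: Array<[RegExp, string]> = [
  [/\b(?:\d{1,3}\.){3}\d{1,3}\b/g, "REDACTED_IP"],
  [/\b[\w.-]+@[\w.-]+\.\w{2,}\b/g, "REDACTED_EMAIL"],
  [/\bargo-[a-z0-9-]+\.internal\b/g, "REDACTED_HOST"], // invented hostname scheme
];

function sanitize(text: string): string {
  return rules.reduce((t, [pattern, repl]) => t.replace(pattern, repl), text);
}
```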

Streaming Vector Scan

Never loads all 166K chunks into memory. Streams through the database for constant memory usage at any scale.
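
The constant-memory idea can be sketched as a top-k scan over an iterator, scoring one row at a time; the real system would stream rows from SQLite rather than an in-memory array:

```typescript
// Streaming top-k sketch: memory stays O(k) regardless of corpus size
// because only the current best k candidates are retained.
function cosineSim(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

function topK(
  query: number[],
  rows: Iterable<{ id: string; vec: number[] }>, // e.g. a DB row iterator
  k = 5,
): Array<{ id: string; score: number }> {
  const best: Array<{ id: string; score: number }> = [];
  for (const row of rows) {          // one row in memory at a time
    best.push({ id: row.id, score: cosineSim(query, row.vec) });
    best.sort((x, y) => y.score - x.score);
    if (best.length > k) best.pop(); // drop the weakest candidate
  }
  return best;
}
```

With better-sqlite3, the `rows` argument could be a statement's `.iterate()` result, so chunks are pulled from disk one at a time instead of materializing all 166K.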

What It's Trained On

Blog Posts: Homelab, Docker, Linux, networking deep dives
Technical Docs: Architecture, modules, infrastructure guides
Public Docs: AI systems, playground guides, reference material
Journal Entries: Engineering diary, debugging sessions, decisions
Project Descriptions: TerraTracer, Tendril, Build Swarm, and more
Website Pages: Tour, about, resources, playground content

Performance

Public tier (874 chunks, CDN-served embeddings): <100ms
Knowledge tier (33K chunks, hybrid BM25 + vector): ~3 sec
Vaults tier (132K chunks, hybrid FTS5 + brute-force vector): ~23 sec
Private tier (166K chunks, full brute-force scan, no ANN index): ~48 sec
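
A back-of-envelope check on why the unindexed tiers take seconds rather than milliseconds: a brute-force scan does one cosine comparison per chunk, touching every vector dimension:

```typescript
// One cosine comparison per chunk; each touches every dimension once.
function multiplyAdds(chunks: number, dims: number): number {
  return chunks * dims;
}

// Private tier: 166K chunks x 768 dims ≈ 127.5M multiply-adds per
// query for the arithmetic alone, before any I/O or ranking; an ANN
// index would cut the number of comparisons rather than their cost.
```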

Quick Start

Terminal
$ git clone https://github.com/lazarusoftheshadows/argo-knowledge-rag.git
$ cd argo-knowledge-rag && npm install
$ npm run build   # Compile TypeScript
# Ingest your markdown files:
$ node dist/cli/index.js ingest ~/my-vault/ --db rag.db --collection my-notes
$ node dist/cli/index.js embed --db rag.db --model qwen3-embedding:0.6b
$ node dist/cli/index.js search "Docker bridge networking" --db rag.db

Built With

TypeScript · SQLite + FTS5 · better-sqlite3 · Ollama · qwen3-embedding · nomic-embed-text · OpenRouter · Node.js

Related