user@argobox:~/journal/2026-02-28-rag-multi-model-rebuild
$ cat entry.md

RAG Multi-Model: The Day I Discovered My Vector Database Was Broken


The Discovery

The RAG system has been running on a single model: nomic-embed-text, 768 dimensions. Works fine. But I wanted redundancy and choice.

Started rebuilding the databases to support three embedding models:

  • nomic-embed-text — 768 dimensions, lightweight, fast
  • qwen3-embedding:0.6b — 1024 dimensions, slightly better quality
  • qwen3-embedding:8b — 4096 dimensions, heavy hitter, best quality
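The model-to-dimension mapping above can be pinned down in a small config map. This is my own sketch, not the project's actual code; the model names match the Ollama tags listed above, and the 4-bytes-per-dimension math is what the byte counts later in this entry rely on:

```typescript
// Illustrative config: embedding model name -> output dimensions.
// Names are the Ollama tags from the list above; the shape is a sketch.
const EMBEDDING_MODELS: Record<string, number> = {
  "nomic-embed-text": 768,
  "qwen3-embedding:0.6b": 1024,
  "qwen3-embedding:8b": 4096,
};

// Expected BLOB size for a packed float32 vector: 4 bytes per dimension.
function expectedByteLength(model: string): number {
  const dims = EMBEDDING_MODELS[model];
  if (dims === undefined) throw new Error(`Unknown embedding model: ${model}`);
  return dims * 4;
}
```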

Plan: rebuild Knowledge tier (33K chunks) with all three models, then Vaults tier (132K chunks), then Private tier.

Kicked off the rebuild for qwen3-8b with new, expanded vault config. Left it running.

The Problem

Came back to find something weird: the database had 294,252 total chunks (correct), but only 162,521 of them had embeddings. The other 131,731 chunks had embedding IS NULL.

That shouldn't happen. The build system should embed all chunks or fail, not silently skip half the database.

Then I realized what happened: DurableRAGStore deduplicates by SHA-256 content hash. When you re-run the build, it doesn't touch chunks that already exist—just adds new ones and skips the rest.
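A minimal sketch of that dedup behavior (assuming SHA-256 over the chunk text as the key; DurableRAGStore's real implementation may differ):

```typescript
import { createHash } from "node:crypto";

// Content-addressed dedup: identical chunk text hashes to the same key.
function contentHash(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// Illustrative store: only inserts chunks whose hash is unseen.
// An existing chunk is skipped entirely -- its old embedding is kept as-is.
function upsertChunks(store: Map<string, string>, chunks: string[]): number {
  let added = 0;
  for (const chunk of chunks) {
    const key = contentHash(chunk);
    if (!store.has(key)) {
      store.set(key, chunk);
      added++;
    }
  }
  return added;
}
```

This is exactly how the old 768d rows survived the qwen3-8b rebuild: their content hashes matched, so they were treated as "already stored" and never re-embedded.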

The old chunks had 768d embeddings (from nomic). The new chunks had 4096d embeddings (from qwen3-8b). The database now had mixed dimensions.

Mixed dimensions = vector search breaks.
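To see why, here's a plain cosine similarity (my own sketch, not the project's query code). With vectors of different lengths the dot product has no sane definition, so the only safe behavior is to throw:

```typescript
// Cosine similarity over dense vectors; refuses mismatched dimensions.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

An implementation that silently truncates or zero-pads the shorter vector instead would "work" and return garbage scores, which is the failure mode this entry is about.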

The Bug Investigation

Queried the database:

SELECT LENGTH(embedding)/4 AS dims, COUNT(*) AS cnt
FROM chunks WHERE embedding IS NOT NULL AND deleted = 0
GROUP BY dims;
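The /4 in that query works because the embeddings are stored as packed float32, 4 bytes per dimension (an assumption the byte counts bear out). A quick sanity check of the math:

```typescript
// SQLite's LENGTH() on a BLOB column returns bytes; float32 = 4 bytes/dim.
function dimsFromByteLength(byteLength: number): number {
  if (byteLength % 4 !== 0) {
    throw new Error(`Not a float32 blob: ${byteLength} bytes`);
  }
  return byteLength / 4;
}

// e.g. a nomic-embed-text vector: 768 dims -> 3072 bytes on disk.
const vec = new Float32Array(768);
const blobBytes = vec.byteLength; // 3072
```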

Result:

768|33101
1024|33101
4096|131051

Three different dimensions in the same database. Cosine similarity calculation fails when vectors have different sizes. All vector searches were broken.

The Fix

Two approaches:

Option A: Rebuild from scratch with a single model and fresh data. Takes 8-12 hours, loses all old embeddings, but guarantees consistency.

Option B: NULL out all old embeddings, force re-embedding with the new model.

Went with Option B since I had new vault config data (v2) that I wanted to use anyway. The new content was worth rebuilding for.

Query to NULL out old embeddings:

UPDATE chunks SET embedding = NULL
WHERE embedding IS NOT NULL
AND deleted = 0
AND LENGTH(embedding) = 3072;  -- 768 * 4 bytes
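The same cleanup generalizes to any dimension count. A hypothetical helper (not in the repo) that builds the statement from the dims, using the same dims-times-4 byte math as the comment above:

```typescript
// Hypothetical helper: build the NULL-out statement for a given dim count.
// 4 bytes per float32 dimension, so 768d -> LENGTH(embedding) = 3072.
function nullOutStaleEmbeddingsSql(dims: number): string {
  const bytes = dims * 4;
  return [
    "UPDATE chunks SET embedding = NULL",
    "WHERE embedding IS NOT NULL",
    "  AND deleted = 0",
    `  AND LENGTH(embedding) = ${bytes};`,
  ].join("\n");
}
```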

Then re-ran the embedding pass. This time, the build script verified dimension consistency before committing:

// Collect every distinct dimension seen across embedded chunks
// (embedding is a float32 BLOB, so bytes / 4 = dimensions).
const dims = new Set();
for (const chunk of chunks) {
  if (chunk.embedding) {
    dims.add(chunk.embedding.length / 4);
  }
}
// More than one dimension means the database is mixed: abort.
if (dims.size > 1) {
  throw new Error(`Mixed dimensions in database: ${Array.from(dims).join(', ')}`);
}

Now it fails loudly instead of silently continuing.

The System Architecture

Current database status (as of end of Friday):

Complete (ready to use):

  • rag-store-blog.db — Knowledge tier, qwen3-8b, 4096d, 74,964 chunks ✓
  • rag-store-blog-nomic.db — Knowledge v1, nomic, 768d, 33,101 chunks ✓
  • rag-store-blog-qwen06b.db — Knowledge v1, qwen3-0.6b, 1024d, 33,101 chunks ✓
  • rag-store-vaults-nomic.db — Vaults v1, nomic, 768d, 132,151 chunks ✓
  • rag-store.db — Private tier, nomic, 768d, 166,183 chunks ✓

In progress:

  • rag-store-vaults.db — Vaults tier, qwen3-8b, 4096d, 294,252 chunks (55% embedded, ~5 hours to completion)

Queued (embeddings NULLed, ready to run):

  • rag-store-knowledge-nomic.db — Knowledge v2, nomic, 768d, 74,964 chunks
  • rag-store-knowledge-qwen06b.db — Knowledge v2, qwen3-0.6b, 1024d, 74,964 chunks
  • rag-store-vaults-nomic-v2.db — Vaults v2, nomic, 768d, 294,252 chunks
  • rag-store-vaults-qwen06b.db — Vaults v2, qwen3-0.6b, 1024d, 294,252 chunks

The Tooling

Created infrastructure to prevent this from happening again:

~/Scripts/rag-build-all.sh — Sequential build queue. Runs nomic + qwen06b batches one after another, grouped by model to minimize Ollama model swaps. ~28 hours total, runs unattended across 2-3 nights.

~/Scripts/rag-index.py — Generates a JSON index of all databases with coverage matrix:

{
  "knowledge": {
    "nomic": "74,964 chunks (100%)",
    "qwen3-0.6b": "74,964 chunks (100%)",
    "qwen3-8b": "74,964 chunks (100%)"
  },
  "vaults": {
    "nomic": "294,252 chunks (100%)",
    "qwen3-0.6b": "294,252 chunks (100%)",
    "qwen3-8b": "294,252 chunks (100%)"
  },
  "private": { ... }
}

packages/argonaut/scripts/reembed-rag.ts — Re-embed any database with any model. Validates dimension consistency before and after.

All three tools documented in ~/Vaults so the next person (or future me) knows how to handle RAG rebuilds.

The Architecture Decision

Three tiers of content:

  1. Knowledge — Public technical content, blog posts, safe to surface in public chat
  2. Vaults — Semi-private infrastructure docs, architecture notes, internal references
  3. Private — Sensitive data, personal notes, family context, requires authentication

Each tier has its own database set. Knowledge uses qwen3-8b (best quality for public responses). Vaults and Private scale down based on sensitivity.

AllShare (NTFS network mount) mirrors all databases for redundancy and load distribution. Cron job syncs them at 3 AM daily.

The Lesson

Mixed-dimension vectors are the kind of bug that doesn't throw errors. Everything runs fine. Searches just return garbage results. It's the worst kind of corruption because it's silent.

The fix: validate early and fail loud. Every embedding operation checks dimension consistency. Every build verifies the output. Every database has monitoring queries in the documentation.

Infrastructure that's not observable tends to rot. Give me databases that tell me what's wrong, and I can fix them. Give me silent corruption, and I'll spend hours debugging why search results are wrong.

What's Next

The vaults qwen3-8b build is still running. Once it finishes (55% complete, ~5 hours ETA), run the sequential build queue for nomic and qwen06b batches. That's ~28 hours of CPU time spread across 2-3 nights.

By end of week, should have full coverage: every tier, every model, 100% of chunks embedded. Then the admin workbench gets three choices for embedding model, users always get good search results, and future rebuilds have clear procedures.

The vector database is no longer a black box. It's instrumented, monitored, validated.


Status: Mixed-dimension bug identified and fixed. Knowledge tier rebuilt with expanded v2 data. Vaults qwen3-8b build in progress (55%). Sequential build queue created for remaining models. Full multi-model coverage live by end of week.