user@argobox:~/journal/2025-12-16-the-extent-tree-that-vanished
$ cat entry.md

The Extent Tree That Vanished

05:06 - The NAS is acting strange. Slow writes. Random I/O errors. Time to check the filesystem.

sudo btrfs check /dev/sda1

Waited. And waited. Then:

ERROR: could not find extent tree

That’s bad. Really bad.

05:30 - The extent tree is how Btrfs tracks which blocks are in use. Without it, the filesystem doesn’t know what’s data and what’s free space. It’s like losing the index to a library—the books are still there, but good luck finding anything.

First rule: don’t make it worse.

sudo mount -o remount,ro /dev/sda1

Read-only mode. No writes until I understand what happened.

06:15 - Tried mounting from the backup tree roots:

sudo mount -o ro,rescue=usebackuproot /dev/sda1 /mnt/recovery

It mounted. I can see my files. But I don’t trust this filesystem for anything permanent anymore.

07:00 - Started copying critical data to another drive:

rsync -avP /mnt/recovery/important/ /mnt/backup/important/

The extent tree corruption means even “successful” reads might be returning garbage. Verify everything copied correctly.
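A minimal sketch of that verification step, assuming GNU coreutils and findutils are available: hash every file on both sides and compare the manifests. `manifest` and `verify_copy` are names I made up for illustration, not standard tools. Note that this only proves the copy matches what the source returns right now, not that the source itself was correct.

```shell
#!/bin/sh
# Sketch: compare two directory trees by content hash.
# manifest and verify_copy are illustrative helper names.

# Print "hash  ./relative/path" for every regular file under $1, sorted by path.
manifest() {
    ( cd "$1" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum )
}

# Return 0 if both trees hold identical files, 1 if they differ,
# 2 if either side could not be read completely.
verify_copy() {
    src_manifest=$(manifest "$1") || return 2
    dst_manifest=$(manifest "$2") || return 2
    [ "$src_manifest" = "$dst_manifest" ]
}
```

Usage would look something like `verify_copy /mnt/recovery/important /mnt/backup/important && echo "copy verified"`. Holding both manifests in shell variables keeps the sketch short; for millions of files, writing them to temporary files and running `diff` on them would probably be kinder to memory.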

08:30 - Recovery options, in order of increasing desperation:

  1. btrfs check --readonly - Just look, don’t touch
  2. btrfs rescue super-recover - Try alternate superblocks
  3. btrfs check --repair - Attempt fixes (DANGEROUS)
  4. btrfs restore - Extract files to new location (last resort)

I went with option 4. The filesystem was too corrupted to trust repairs.

sudo btrfs restore /dev/sda1 /mnt/backup/

btrfs restore never mounts the filesystem. It reads the device from userspace and walks the file trees directly, so it doesn’t depend on the extent tree. Slow, but safer than trusting a corrupted extent tree.
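For next time, a hedged sketch of how I'd wrap that command: run a dry run first to list what would be recovered, then do the real pass. `restore_with_preview` and the `BTRFS` variable are hypothetical names for illustration. The flags are from the btrfs-restore man page as I understand it (`-D` dry run, `-i` ignore errors, `-m` restore owner, mode and times). As far as I know, the device should be unmounted when this runs.

```shell
#!/bin/sh
# Sketch: preview a btrfs restore before writing anything.
# BTRFS can be overridden, e.g. BTRFS="sudo btrfs"; defaults to plain btrfs.
BTRFS=${BTRFS:-btrfs}

# restore_with_preview DEVICE DESTINATION
restore_with_preview() {
    dev=$1
    dest=$2
    # Pass 1: dry run, only lists the files that would be restored.
    $BTRFS restore -D "$dev" "$dest" || return 1
    # Pass 2: real restore; keep going past unreadable extents (-i)
    # and restore ownership, permissions and timestamps (-m).
    $BTRFS restore -i -m "$dev" "$dest"
}
```

Something like `BTRFS="sudo btrfs" restore_with_preview /dev/sda1 /mnt/backup/` would then mirror the command above. The dry run is cheap insurance: if it lists nothing, the real pass would not have recovered anything either.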

12:00 - Data recovered. 2.3TB across 1.2 million files. Spot-checked critical directories—everything looks intact.

The Post-Mortem:

Checked dmesg from before the corruption:

BTRFS warning: csum failed root 5 ino 257 off 4096
BTRFS error: bdev /dev/sda1 errs: wr 0, rd 47, flush 0, corrupt 12, gen 0

Read errors and corruption warnings I’d been ignoring for weeks. The drive was dying. The extent tree corruption was just the final symptom.
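Those counters are easy to check mechanically. A small sketch, assuming the log line keeps the `errs: wr N, rd N, ...` shape shown above; `nonzero_errs` is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: read kernel log lines on stdin and print every non-zero
# Btrfs device error counter as name=value.
nonzero_errs() {
    sed -n 's/.*errs: //p' |   # keep only the counter list after "errs: "
        tr ',' '\n' |          # one "name value" pair per line
        awk '$2 != 0 { print $1 "=" $2 }'
}
```

Fed the second line above (for example via `dmesg | grep 'errs:' | nonzero_errs`), it should print `rd=47` and `corrupt=12`. Any output at all is a reason to look closer. I believe `btrfs device stats --check` offers a similar check straight from the filesystem, but I haven't wired that in here.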

The Lessons:

  1. Don’t ignore filesystem warnings. csum failed means checksums don’t match. That’s corruption, not a suggestion.

  2. Monitor drive health. smartctl -a /dev/sda would have shown the degradation before it became catastrophic.

  3. Btrfs extent tree loss is recoverable if you act fast and don’t write anything. Mount read-only, back up the data, then figure out the root cause.

  4. btrfs restore is underrated. When the filesystem metadata is corrupted, it can still copy files out by walking whatever trees remain readable, without mounting the filesystem.
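For lesson 2, a rough sketch of the kind of check I have in mind. It assumes an ATA drive and the usual ten-column attribute table that `smartctl -A` prints (raw value in the last column); NVMe output looks different. `smart_warnings` is a hypothetical helper name, and the three watched attributes are my own pick of the common "drive is dying" indicators.

```shell
#!/bin/sh
# Sketch: read `smartctl -A` output on stdin and print watched
# attributes whose raw value (column 10) is above zero.
smart_warnings() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable)$/ \
         && $10 + 0 > 0 { print $2 "=" $10 }'
}
```

Run from cron as something like `sudo smartctl -A /dev/sda | smart_warnings`, with any output mailed or logged somewhere I actually read. The smartd daemon that ships with smartmontools can do this properly; the sketch is only the smallest thing that would have caught this drive.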

14:00 - New drive ordered. Old drive marked as “do not trust.” Data safe.

The NAS will be rebuilt this weekend. This time with proper SMART monitoring.


The checksum warnings started three weeks ago. I saw them in dmesg. I thought “probably fine.” It was not fine.