ArgoBeat Generative Audio Sprint

Date: 2026-03-22
Duration: ~21 hours
Commits: 15+
Result: Core engine working; content layer needs higher-fidelity source material.

What I Was Trying to Build

A Brain.fm competitor. Generative focus music in the browser. Web Audio API, FM synthesis, entrainment modulation — the whole stack, from scratch.

I rebuilt the engine from an early prototype into something that actually runs. 5 mood modes, a 3-layer modulation chain, session randomization, lo-fi effects. 15+ commits pushed to both Gitea and GitHub. Deployed to CF Pages.

The singing bowls already point in the right direction. The rest of the session clarified where procedural synthesis works and where the content layer needs real samples or precomposed stems.

The Engine Architecture

This part is solid. I'm keeping it.

5-mood system: focus, deepWork, relax, meditate, sleep. Each with per-mood Hz target ranges.
3-layer invisible entrainment modulation chain: amplitude modulation + spectral modulation + stereo panning with drift.
Session randomization: random Hz within range, random key transposition, random seed. No two sessions identical.
SoundscapeManager for loading audio files with graceful fallback.
GenerativeMusicEngine with per-mood pattern generators.
Lo-fi effects chain: reverb, delay, tape saturation, vinyl crackle.
FM synthesis for Rhodes-like tones — body at 1:1 carrier-modulator ratio, bell partial at 14:1.
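
The session randomization item above can be sketched roughly like this. It is a minimal illustration, not the engine's actual code: the per-mood Hz ranges, the `randomizeSession` name, and the choice of the mulberry32 PRNG are all assumptions made for the example. The point is that one seed drives every random choice, so a session can be replayed from its seed.

```typescript
type Mood = "focus" | "deepWork" | "relax" | "meditate" | "sleep";

// Hypothetical entrainment ranges in Hz. The real per-mood values are not listed in this post.
const MOOD_HZ_RANGE: Record<Mood, [number, number]> = {
  focus: [14, 18],
  deepWork: [12, 16],
  relax: [8, 12],
  meditate: [4, 8],
  sleep: [1, 4],
};

// mulberry32: a small seedable PRNG returning floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

interface SessionParams {
  seed: number;
  entrainmentHz: number;      // random Hz within the mood's range
  transposeSemitones: number; // random key transposition, 0..11
}

function randomizeSession(mood: Mood, seed: number): SessionParams {
  const rand = mulberry32(seed);
  const [lo, hi] = MOOD_HZ_RANGE[mood];
  const entrainmentHz = lo + rand() * (hi - lo);
  const transposeSemitones = Math.floor(rand() * 12);
  return { seed, entrainmentHz, transposeSemitones };
}
```

Calling `randomizeSession("focus", Date.now())` would give a fresh session each time, while reusing a stored seed reproduces the same parameters.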

The UI works too. 5 mood cards, Soundscapes/Both/Music toggle, visualizer, progress ring, transport controls, session timer, fade in/out, completion chime.

What I Learned About FM Synthesis

The modulation index controls everything. This was the key tuning insight of the session.

Index 2.0-4.0 (DX7 keyboard territory) = harsh, metallic, ear-fatiguing. Fine for a synthesizer solo. Terrible for background focus music.
Index 0.7-1.5 = warm, gentle, actually pleasant. This is the range.
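
To make the index concrete, here is a minimal sketch of a two-operator FM voice as plain math, outside the Web Audio graph. The function names are made up for illustration, and the bandwidth estimate uses Carson's rule, which is a rule of thumb rather than an exact spectrum.

```typescript
// One sample of a two-operator FM voice (phase-modulation form):
//   y(t) = sin(2*pi*fc*t + index * sin(2*pi*fm*t)),  with fm = fc * ratio
function fmSample(t: number, carrierHz: number, ratio: number, index: number): number {
  const modHz = carrierHz * ratio;
  const modulator = Math.sin(2 * Math.PI * modHz * t);
  return Math.sin(2 * Math.PI * carrierHz * t + index * modulator);
}

// Carson's rule: significant sidebands extend roughly fm * (index + 1)
// on either side of the carrier. This returns the upper edge in Hz.
function approxHighestPartialHz(carrierHz: number, ratio: number, index: number): number {
  const modHz = carrierHz * ratio;
  return carrierHz + modHz * (index + 1);
}
```

For a 220 Hz note at the 1:1 body ratio, an index of 1.0 keeps significant energy below roughly 660 Hz, while an index of 4.0 pushes it to roughly 1320 Hz. That extra high-frequency content is the harshness. If the voice is built from oscillator nodes with the modulator's gain feeding the carrier's frequency parameter, that gain is the peak deviation in Hz, which is the index times the modulator frequency.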

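The 14:1 bell ratio creates a problem above 1500 Hz. At 1500 Hz the bell partial is already at 21,000 Hz, just under the 22,050 Hz Nyquist limit (assuming a 44.1 kHz sample rate). An 8000 Hz fundamental times 14 = 112,000 Hz. Browsers cap oscillator frequency at the Nyquist limit, so the partial can no longer sit at 14:1 and the oscillator just misbehaves. Had to clamp it.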
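
A minimal sketch of that clamp, assuming the fix is simply to cap the bell modulator's frequency below Nyquist. The function name and the 0.9 headroom factor are illustrative choices, not the values used in the engine.

```typescript
// Frequency for the bell partial's modulator, capped below the Nyquist limit.
// `headroom` (assumed 0.9) keeps the partial a little under Nyquist itself.
function bellModulatorHz(
  fundamentalHz: number,
  sampleRate: number,
  ratio: number = 14,
  headroom: number = 0.9,
): number {
  const nyquist = sampleRate / 2;
  const ceiling = nyquist * headroom;
  return Math.min(fundamentalHz * ratio, ceiling);
}
```

At 44.1 kHz a 100 Hz note keeps its true 14:1 partial at 1400 Hz, while an 8000 Hz note is held at the ceiling instead of asking for 112,000 Hz. Another option would be to fade the bell partial out as the fundamental rises, since a clamped partial is no longer at the 14:1 ratio.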

The Singing Bowls

Meditate mode. 5-partial inharmonic synthesis at ratios 1.0, 2.71, 5.04, 8.09, 11.79.

These sound genuinely good. Not "good for procedural generation", but actually good. Something about inharmonic partials at those specific ratios just works. The one clear sound-quality win of the session.
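
The five-partial recipe can be sketched as additive synthesis. The ratios are the ones listed above. The per-partial gains and decay times are assumptions for the example, not the values tuned in the engine.

```typescript
const BOWL_RATIOS: number[] = [1.0, 2.71, 5.04, 8.09, 11.79];

interface BowlPartial {
  hz: number;
  gain: number;         // assumed rolloff: higher partials are quieter
  decaySeconds: number; // assumed: higher partials die away faster
}

function bowlPartials(fundamentalHz: number, sampleRate: number = 44100): BowlPartial[] {
  const nyquist = sampleRate / 2;
  return BOWL_RATIOS
    .map((ratio, i) => ({
      hz: fundamentalHz * ratio,
      gain: 1 / (i + 1),
      decaySeconds: 8 / (i + 1),
    }))
    .filter((p) => p.hz < nyquist); // drop partials that would alias
}

// Sum of exponentially decaying sines, normalized so the output stays within [-1, 1].
function renderBowl(fundamentalHz: number, seconds: number, sampleRate: number = 44100): Float32Array {
  const partials = bowlPartials(fundamentalHz, sampleRate);
  const totalGain = partials.reduce((sum, p) => sum + p.gain, 0);
  const out = new Float32Array(Math.floor(seconds * sampleRate));
  for (let n = 0; n < out.length; n++) {
    const t = n / sampleRate;
    let sample = 0;
    for (const p of partials) {
      sample += p.gain * Math.exp(-t / p.decaySeconds) * Math.sin(2 * Math.PI * p.hz * t);
    }
    out[n] = sample / totalGain;
  }
  return out;
}
```

With a 200 Hz fundamental the partials land near 200, 542, 1008, 1618 and 2358 Hz. None of them is an integer multiple of the fundamental, which is what gives the tone its bowl-like shimmer.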

Everything Else

Procedural noise synthesis for nature sounds: rain, ocean, forest, cafe. Filtered noise buffers are not enough for believable soundscapes. Real audio recordings are mandatory.

Oscillator-based drums also hit the limits of pure synthesis for this product. I have 18 CC0 drum and piano samples downloaded, 1.3 MB total, that should replace the procedural percussion layer.

The uniqueness problem is the real killer. 10 pentatonic notes in a 2-octave range. 3 chord progressions per mood. By minute 8 of a session, you've heard every possible combination. Shuffling the order of a small set is not the same as generating new content.
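
To show how small that pool is, here is a sketch of the note set. It assumes a major pentatonic scale and equal temperament at A4 = 440 Hz. The post does not say which pentatonic the engine uses, and the function names are made up for the example.

```typescript
// Semitone offsets of the major pentatonic scale (an assumption for this sketch).
const MAJOR_PENTATONIC: number[] = [0, 2, 4, 7, 9];

// Equal-tempered MIDI note number to frequency, with A4 (MIDI 69) = 440 Hz.
function midiToHz(midi: number): number {
  return 440 * Math.pow(2, (midi - 69) / 12);
}

// Every pentatonic note across `octaves` octaves, starting at `rootMidi`.
function pentatonicPool(rootMidi: number, octaves: number): number[] {
  const pool: number[] = [];
  for (let octave = 0; octave < octaves; octave++) {
    for (const step of MAJOR_PENTATONIC) {
      pool.push(midiToHz(rootMidi + 12 * octave + step));
    }
  }
  return pool;
}

const pool = pentatonicPool(57, 2); // A3 upward, two octaves
const progressionsPerMood = 3;
const notePlusProgressionPairs = pool.length * progressionsPerMood; // 10 * 3 = 30
```

Thirty note-and-progression pairings is a tiny space for a long session. Key transposition shifts all of them together, so it changes the pitch but not the shapes the listener hears.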

The Brain.fm Discovery

This changed everything. Or maybe it should have.

Brain.fm doesn't generate music procedurally. Their patent describes human composers writing the actual music, then applying amplitude modulation at the target brainwave frequency. The music quality comes from humans. The therapy comes from the modulation layer.

Their research is legit: a 2024 paper in Communications Biology (a Nature Portfolio journal) found that 16 Hz AM (beta range) actually helps with focus. Listeners with more ADHD symptoms got more benefit; the effect covaried with ASRS score.

My modulation chain, the 3-layer invisible AM + spectral + stereo panning, tracks the same broad architectural idea. That part is correct. The input material is where the product direction changed: higher-quality composed or sampled audio should feed the modulation layer.
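
The amplitude-modulation layer is the easiest of the three to show in isolation. This is a minimal sketch over a raw sample buffer, with made-up function names and an arbitrary depth. The spectral and panning layers are not shown, and the real chain presumably runs inside the Web Audio graph rather than over arrays.

```typescript
// Gain multiplier for one AM layer at time t (seconds).
// Swings between 1 (no attenuation) and 1 - depth, `rateHz` times per second.
function amEnvelope(t: number, rateHz: number, depth: number): number {
  return 1 - (depth / 2) * (1 - Math.cos(2 * Math.PI * rateHz * t));
}

// Apply the envelope to any input audio: composed stems, samples, or synthesis.
function applyAm(
  input: Float32Array,
  rateHz: number,
  depth: number,
  sampleRate: number,
): Float32Array {
  const out = new Float32Array(input.length);
  for (let n = 0; n < input.length; n++) {
    out[n] = input[n] * amEnvelope(n / sampleRate, rateHz, depth);
  }
  return out;
}
```

With `rateHz = 16` and `depth = 0.3`, the level dips to 70% sixteen times a second. The function never looks at where the input came from, which is the whole point: the modulation layer does not care whether the material is procedural or composed.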

The Volume Compounding Problem

Blend gain times pattern velocity times synthesis gain times effects chain gain. Four multiplicative stages. The signal either gets crushed to silence or boosted to harshness. Every time I fixed one stage, another was wrong. You have to think about the total gain path, not each stage independently.

This seems obvious written down. It was not obvious at 3 AM with 6 gain nodes open in Web Audio Inspector.
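
A small helper for seeing the whole gain path at once, sketched under the assumption that each stage is a plain linear multiplier. The stage values below are invented for illustration and are not measurements from the engine.

```typescript
interface GainStage {
  name: string;
  gain: number; // linear multiplier, 1.0 = unity
}

// Convert a linear gain to decibels.
function toDb(linear: number): number {
  return 20 * Math.log10(linear);
}

// Stages in series multiply, so the total is the product of every stage.
function totalGain(stages: GainStage[]): number {
  return stages.reduce((product, stage) => product * stage.gain, 1);
}

// Linear makeup gain needed to bring the whole path to a target level.
function makeupGain(stages: GainStage[], targetLinear: number): number {
  return targetLinear / totalGain(stages);
}

// Hypothetical values for the four stages named above.
const gainPath: GainStage[] = [
  { name: "blend", gain: 0.5 },
  { name: "patternVelocity", gain: 0.6 },
  { name: "synthesis", gain: 0.3 },
  { name: "effectsChain", gain: 0.8 },
];
```

Each of those values looks harmless alone, yet the product is 0.072, or about -22.9 dB. Tuning against `totalGain(gainPath)` rather than against individual nodes is the "total gain path" idea in code form.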

The Audio File Problem

572 MB of soundscape audio cannot go in a git repo. The 18 instrument samples at 1.3 MB can, and they are committed to /apps/web/public/audio/samples/. But the soundscapes (the nature sounds, the ambient textures) need Cloudflare R2 or equivalent CDN hosting.

Also sitting on my machine: an audio-raw/ directory at 642 MB that probably needs to be cleaned up or moved somewhere intentional.

And variations.ts — 2,153 lines of procedural soundscape variations, 92 total — is now completely unused since I switched to audio files. Same with synthesizer.ts at 35 KB. Dead code from a dead approach.

The Product Assessment

The feedback was clear: the engine architecture works, but pure procedural FM synthesis with a small progression set is not enough for a polished focus-music product.

The architecture is right. The modulation science is right. The audio quality is wrong.

Next steps are clear: use the 18 downloaded samples for instruments, expand to 20+ chord progressions per mood, add variable BPM, upload soundscape audio to R2. Probably need to look into pre-generating music stems with MusicGen or Suno offline and streaming them. Maybe Mubert API at $49/month for real-time generation.

Or just accept that Brain.fm solved this with human composers and a $7/month subscription.

21 hours. 15+ commits. Singing bowls sound good. The rest needs a stronger content-generation strategy.

I should probably sleep. It's been 21 hours.