The DSP, unredacted.
TuneLab publishes its methodology because real DSP holds up to scrutiny. Execution is 99% of the battle — revealing our techniques doesn’t give competitors our data, our training corpus, or our production infrastructure. It gives developers what they actually need: confidence the numbers aren’t scraped from a black box.
Deep Dives
Five ways we turn audio into data.
Every feature below is computed from raw audio by a published DSP method. No Spotify scraping. No cached metadata. No hidden heuristics.
BPM Detection
BiLSTM ensemble on spectrograms at three resolutions. Float-precision tempo (116.01, not 116). IOI (inter-onset interval) histogram peak-picking plus Viterbi phase analysis.
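To make the IOI histogram step concrete, here is a minimal sketch, not TuneLab's production code. It assumes onset times (in seconds) have already been detected; the function name `tempo_from_onsets`, the BPM range, and the 1 BPM bin width are illustrative choices. Averaging the intervals inside the winning bin is one simple way to get a fractional tempo rather than an integer bin label.

```python
from collections import defaultdict


def tempo_from_onsets(onsets, min_bpm=60.0, max_bpm=200.0, bin_width=1.0):
    """Estimate tempo in BPM from onset times (seconds) with an IOI histogram.

    Hypothetical helper: the BPM range and bin width are assumptions.
    """
    bins = defaultdict(list)
    # Compare each onset with every later onset, not only its neighbour,
    # so a missed onset does not break the vote for the true period.
    for i, start in enumerate(onsets):
        for later in onsets[i + 1:]:
            ioi = later - start
            if ioi <= 0:
                continue
            bpm = 60.0 / ioi
            if min_bpm <= bpm <= max_bpm:
                bins[round(bpm / bin_width)].append(ioi)
    if not bins:
        return None
    # Peak-picking: the bin with the most votes wins.
    peak = max(bins.values(), key=len)
    # Refine to float precision by averaging the intervals in that bin.
    return 60.0 / (sum(peak) / len(peak))
```

For example, onsets spaced 0.517 s apart yield roughly 116.05 BPM rather than a rounded 116.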
Key Detection
KeyNet CNN trained on CQT (constant-Q transform) spectrograms. 24-class output (12 keys × major/minor). 70 ms inference per track.
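A 24-class output has to be mapped back to a key name. The sketch below shows one way to do that. The class ordering (indices 0 to 11 for major keys, 12 to 23 for minor keys, tonic ascending from C) and the name `decode_key` are assumptions for illustration; the actual KeyNet label layout is not stated here.

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F",
                 "F#", "G", "G#", "A", "A#", "B"]


def decode_key(scores):
    """Turn 24 class scores into a label such as 'A minor'.

    Assumed layout (hypothetical): index = 12 * mode + tonic,
    with mode 0 = major and mode 1 = minor.
    """
    if len(scores) != 24:
        raise ValueError("expected 24 class scores (12 keys x major/minor)")
    best = max(range(24), key=lambda i: scores[i])
    mode, tonic = divmod(best, 12)
    return f"{PITCH_CLASSES[tonic]} {'minor' if mode else 'major'}"
```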
Mood Classification
MAEST transformer embeddings feeding six specialized MLP heads, one each for energy, danceability, happiness, acousticness, instrumentalness, and speechiness.
Song Structure
Chroma self-similarity matrix plus Foote novelty kernel segmentation. Detects intro, verse, chorus, drop, breakdown, and outro boundaries with timestamps.
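The segmentation idea can be sketched in a few lines. This is an illustrative toy, not the production pipeline: it assumes chroma frames are already computed, uses cosine similarity for the self-similarity matrix, and uses a plain (untapered) checkerboard kernel, whereas Foote's original formulation tapers the kernel with a Gaussian. All function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def self_similarity(frames):
    """Square matrix of pairwise similarities between frames."""
    return [[cosine(a, b) for b in frames] for a in frames]


def novelty_curve(ssm, half=2):
    """Slide a checkerboard kernel along the SSM diagonal.

    The kernel is +1 on the two within-segment quadrants and -1 on the
    two cross-segment quadrants, so it peaks where one homogeneous block
    ends and another begins. novelty[t] scores a boundary just before
    frame t; frames too close to the edges stay at 0.
    """
    n = len(ssm)
    size = 2 * half
    novelty = [0.0] * n
    for t in range(half, n - half + 1):
        total = 0.0
        for r in range(size):
            for c in range(size):
                sign = 1.0 if (r < half) == (c < half) else -1.0
                total += sign * ssm[t - half + r][t - half + c]
        novelty[t] = total
    return novelty
```

Peaks in the novelty curve are candidate section boundaries; converting a frame index to a timestamp only needs the hop size of the chroma analysis. Labeling sections as verse, chorus, and so on is a separate step that this sketch does not cover.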
Beat Grid
Frame-level beat and downbeat probabilities from the BiLSTM. Viterbi decoding over a DBN (dynamic Bayesian network) gives globally optimal beat positions with confidence scores and half/double alternatives.
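To show what "globally optimal" means in practice, here is a deliberately simplified stand-in. It is not the DBN described above: it is a classic dynamic-programming beat decoder with a fixed, known beat period, swapped in because it fits in a few lines. It shares the key property that the whole sequence is scored at once, so an isolated spurious peak cannot win. The name `decode_beats`, the `tightness` value, and the fixed `period` are assumptions.

```python
import math


def decode_beats(activation, period, tightness=100.0):
    """Pick the best-scoring beat sequence from frame-level activations.

    activation: per-frame beat strength (for example, probabilities).
    period: expected beat period in frames (assumed known and fixed).
    tightness: penalty weight for deviating from the expected period.
    """
    n = len(activation)
    score = [0.0] * n
    prev = [-1] * n
    for i in range(n):
        # 0.0 means "start a new beat sequence at this frame".
        best, best_j = 0.0, -1
        lo = max(0, i - 2 * period)
        hi = i - period // 2
        for j in range(lo, hi + 1):
            # Log-squared penalty: zero when the gap equals the period.
            penalty = tightness * math.log((i - j) / period) ** 2
            candidate = score[j] - penalty
            if candidate > best:
                best, best_j = candidate, j
        score[i] = activation[i] + best
        prev[i] = best_j
    # Backtrace from the best-scoring frame to recover the beat grid.
    i = max(range(n), key=lambda k: score[k])
    beats = []
    while i >= 0:
        beats.append(i)
        i = prev[i]
    return beats[::-1]
```

With peaks every 10 frames and one stray peak in between, the decoder returns the regular grid and skips the stray peak, because reaching it would require an interval far from the expected period.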
Accuracy first
Benchmarks we’re proud of.
| Task | Model | Accuracy | Dataset |
|---|---|---|---|
| BPM (±2% tolerance) | BiLSTM ensemble | 94.8% | GTZAN + Ballroom |
| Key (exact match) | KeyNet CNN | 82.1% | GiantSteps-MTG |
| Mode (major/minor) | KeyNet CNN | 91.3% | GiantSteps-MTG |
| Beat tracking (F1, 50 ms) | BiLSTM + Viterbi | 0.89 | SMC + Ballroom |
| Energy regression (Pearson r) | MAEST + MLP | 0.81 | Internal 10K holdout |
| Danceability regression (Pearson r) | MAEST + MLP | 0.78 | Internal 10K holdout |
The BPM, key, mode, and beat-tracking benchmarks are reported on standard academic datasets so you can compare us directly to published papers; the energy and danceability rows use an internal 10K holdout. Numbers degrade gracefully on edge genres (classical, ambient, atonal). See each deep-dive’s "Known limitations" section for specifics.
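If you want to reproduce this kind of number on your own tracks, the two metric types in the table can be sketched as follows. This is one plausible reading of the metrics, not a statement of TuneLab's exact evaluation script: the tolerance is taken relative to the reference tempo, and half/double tempo estimates are counted as misses. Both function names are hypothetical.

```python
import math


def tempo_accuracy(estimates, references, tolerance=0.02):
    """Fraction of tracks whose estimated BPM is within +/- tolerance
    (relative to the reference) of the reference BPM.

    Assumption: octave errors (half/double tempo) count as misses.
    """
    hits = sum(
        1 for est, ref in zip(estimates, references)
        if abs(est - ref) <= tolerance * ref
    )
    return hits / len(references)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)
```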