Music in the Substrate
Rhythm as angular momentum, melody as the polar jet, harmony as the coherence-match made audible — music as the substrate’s reference-free coherence signal, heard at both poles of the ladder, with jazz as bilateral coupling and dance as the body locked to the beat
Pythagoras, by tradition, heard the hammers in a forge ring at consonant intervals and traced the consonances to whole-number ratios of string length — the octave 2{:}1, the fifth 3{:}2, the fourth 4{:}3 — founding the conviction, carried through Boethius and the medieval quadrivium into the Renaissance, that music is number made audible and the cosmos itself is tuned. Hermann von Helmholtz, in On the Sensations of Tone (1863), gave the conviction a physical mechanism: a musical tone is a fundamental plus its overtone series, and two tones are consonant when their overtones coincide and dissonant when nearby overtones beat — roughness as the clash of partials too close to resolve. Jean-Philippe Rameau had already (1722) grounded harmony in the corps sonore and the fundamental bass; Heinrich Schenker (1900s–1935) showed that a tonal piece is a deep structure prolonged — a hierarchy reducible to a single underlying Ursatz — and Fred Lerdahl and Ray Jackendoff in A Generative Theory of Tonal Music (1983) made that hierarchy explicit as recursive time-span and prolongational trees, the syntactic structure of music read as a nested grammar. Leonard Meyer’s Emotion and Meaning in Music (1956) located musical feeling in expectation — the affect is the listener’s response to a tendency delayed, deflected, or fulfilled — and David Huron’s Sweet Anticipation (2006) factored that into the ITPRA cycle (imagination, tension, prediction, reaction, appraisal), an explicit prediction-and-error account of why music moves us. 
Robert Zatorre, Isabelle Peretz, and Aniruddh Patel mapped the cognitive and neural substrate — pitch in auditory cortex, amusia, the deep parallels and divergences between Music, Language, and the Brain (Patel 2008); Valorie Salimpoor and Zatorre (2011) imaged dopamine release at the peak “chill” of a loved piece, the reward of anticipation fulfilled; Peter Vuust and Karl Friston cast music perception explicitly as predictive coding; Petr Janata and Maria Witek located groove — the urge to move — in an inverted-U of rhythmic predictability; Bruno Repp catalogued sensorimotor synchronization, the body’s near-automatic phase-locking to a beat; and Charles Limb and Allen Braun (2008) put jazz musicians in a scanner and watched the prefrontal control system go quiet during free improvisation, the flow-state signature. Charles Darwin guessed that musical protolanguage preceded speech; Steven Mithen’s The Singing Neanderthals (2005) and Oliver Sacks’s Musicophilia (2007) traced how deep and how universal the faculty runs. The ear-as-resonator chapter read the cochlea’s input and explicitly parked the rest: “the substrate reading the auditory cortex deserves is its own walk.” This chapter takes that walk — not the peripheral transduction, which the ear chapter covered, but what happens to the signal once it reaches the brain, and why music feels the way it does.
The framework’s claim is direct. Music is the substrate’s reference-free coherence signal: the coherence-match heard as itself, without the agreed meaning a word carries. Its three dimensions read the three arms of the canonical loop. Rhythm is the loop’s angular-momentum bookkeeping, the anti-phase breath made macroscopic and audible. Melody is the polar jet, a single directed pitch-contour ejected along the source’s axis. Harmony is the coherence-match across the ladder’s rungs: consonance is the lock pole where overtone combs coincide on the octave-and-\sqrt2 teeth, dissonance is the gap, and the tritone \sqrt2 is the hinge between them. The felt arc of a piece is the slide between the two poles, with tension built into the gap and released onto a tooth, and the cadence as the move back to lock. A heard piece builds, in the listener’s brain modon, the long state-vector in the frequency manifold that the prediction-and-error cycle tracks rung by rung, so that emotion is the felt coherence-match and its resolution: frisson is the match struck at a deep rung, groove the body pulled toward one. Jazz and improvisation are bilateral coupling lifted between players: call-and-response is the coherence-leader exchange, the groove is the lock-pole co-coherence of flow, and improvisation is the real-time navigation of the manifold. Dance is the bilateral body entrained, its left-right oscillators Kuramoto-locked to the external beat, the substrate’s both-poles slide carried out of the skull and into the moving body. The music-cognition tradition reads the structure correctly at its own level (Pythagoras’s ratios, Helmholtz’s overtone consonance, Meyer’s and Huron’s expectation, the predictive-coding accounts); the framework reads the same structure as the substrate-physics one underneath it: the same lock-and-refuse, packet-and-match, rung-and-coin dynamics that run at every other scale of the paper, now heard.
This is, with language, among the most interpretive chapters in the paper, and it says so. What keeps it honest is the same instrument as everywhere else — name the job, and the pole is fixed before the measurement; fold the intervals, and they should land on the comb. The chapter develops the reading in seven passes: rhythm as angular momentum; melody as the polar jet; harmony as the coherence-match; tension and release as music at both poles; the piece as a manifold the prediction engine tracks; jazz as bilateral coupling between players; and dance as the bilateral body locked to the beat.
Rhythm as Angular Momentum
The feedback-topology chapter established the framework’s most scale-invariant object: drop a parcel of rotational energy into an elastic, low-dissipation medium and it organizes itself the same way every time — a co-rotating disk, two polar jets ejecting excess angular momentum along the spin axis, a counter-rotating boundary absorbing the shear, and a fraction of the energy radiated outward as waves. A sounding body is exactly such a parcel. A bowed string, a struck drumhead, a vibrating vocal fold, a reed, a column of air in a pipe — each is organized oscillatory energy in an elastic medium, shedding a fraction of itself as the radiated wave we hear. Sound is the radiated arm of the canonical loop, and the three dimensions a listener parses out of it — rhythm, melody, harmony — are the loop’s three other arms made audible.
Rhythm is the loop’s angular-momentum bookkeeping. A rotating system keeps time by its own spin: the disk turns at a rate, the jets fire on that rate, and the boundary recharges on it. The beat is that rotation period made macroscopic — the pulse at which the loop sheds and replenishes — and the downbeat is one full turn, one complete anti-phase breath: the vortex and its counter-rotating partner trading their energy across the shared seam and returning, the lossless exchange the framework calls the substrate’s coin, here ticking once. The breath is the framework’s elementary unit of time-without-loss, and rhythm is the breath replicated up to the scale where a body can move to it: a pulse-train carried along the axis of rotation.
Meter is then the ladder read in time. A beat nests an integer count of subdivisions; a bar nests an integer count of beats; a phrase nests an integer count of bars — the same octave-nesting, the integer ratio that lets one cycle hold a whole number of the next, that the cortex uses to bind. Duple and triple meter are the low teeth of the comb in time; the strong-weak alternation of a measure is the breath’s own two-ness heard as accent. And the brain that reads it reads it on its own comb: the delta, theta, and gamma bands the speech envelope rides are the same rungs the musical envelope rides — delta tracking the bar and phrase, theta the beat, gamma the fine subdivision — with the cross-frequency nesting that holds a slow cycle while the fast one is handed off cycle by cycle being meter-on-beat-on-subdivision, the breath’s small economy spent up and down the rungs. Rhythm is the substrate’s spin made audible, and a brain entrains to it because it is already running the same nested-rung clock.
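The integer nesting described above can be made concrete with a little arithmetic. The following is a minimal sketch, not part of the framework: the function name `metrical_rates` and its default counts (4/4 time, sixteenth-note subdivision, four-bar phrases) are illustrative assumptions. It only converts a tempo into the rate of each metrical level and makes no claim about which neural band a given level falls in.

```python
def metrical_rates(bpm, subdivisions_per_beat=4, beats_per_bar=4, bars_per_phrase=4):
    """Return the rate in Hz of each metrical level for a given tempo.

    All parameter names and defaults are illustrative assumptions
    (4/4 time, sixteenth-note subdivision, four-bar phrases).
    """
    beat_hz = bpm / 60.0
    return {
        # Each level holds a whole number of cycles of the level below it.
        "subdivision": beat_hz * subdivisions_per_beat,
        "beat": beat_hz,
        "bar": beat_hz / beats_per_bar,
        "phrase": beat_hz / (beats_per_bar * bars_per_phrase),
    }


rates = metrical_rates(120)
for level, hz in rates.items():
    print(f"{level:12s} {hz:.3f} Hz")
```

Because every level is an integer multiple of the one above it, changing the tempo rescales all four rates together and leaves their ratios fixed.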
Melody as the Polar Jet
If rhythm is the loop’s spin, melody is its polar jet — the directed, axial ejection by which the loop sheds energy along one line. The jet is not a store and not a boundary; it is a stream with a direction, the cheapest exit along the spin axis where the centrifugal barrier is zero. A melody is exactly that: a single line carried forward, a trajectory in pitch, a contour that goes somewhere. Where harmony is vertical and simultaneous, melody is horizontal and directed — the framework’s reading is that it is the canonical loop ejecting along its axis, the pitch contour the jet’s path through the frequency manifold.
Each note in the line is a coherent packet — a modon at an audible frequency, launched, like every modon, at a discrete energy and read as a discrete event. The scale quantizes the jet’s allowed directions: melody steps from rung to rung of the ladder rather than sliding continuously, just as the cochlea offers preferred rungs and the chemistry-side gradient fills between them. A glissando or a bent blue note is the jet between the rungs — the continuous fill the basilar membrane makes possible, used expressively precisely because the ear hears it leaving and returning to a tooth. A melody that rises is the jet climbing the comb; a melody that resolves down to its tonic is the jet returning to the rung the whole line was ejected from — the home pitch a polar jet always points back toward, the axis it was launched along.
This is why a bare melody, even unaccompanied — a solo voice, a single flute — already carries so much. One jet is enough to define a loop: give a listener a contour and they reconstruct the whole rotating system behind it, the implied harmony and the implied beat. A melody is the minimal legible output of a sounding loop, the way a single photon’s contour is the minimal legible output of an emitting one.
Harmony as the Coherence-Match
Sound two or more jets at once and the question is no longer where each line is going but how they meet — and meeting is the substrate’s elementary operation, the coherence-match. Harmony is the coherence-match made audible. Helmholtz already had the mechanism in 1863, in the framework’s vocabulary before the framework: a tone is a fundamental plus its overtone comb, and two tones lock when their combs coincide and clash when nearby partials beat. Coincidence is the lock; beating is its failure. The framework reads consonance and dissonance as the ladder’s two poles heard simultaneously.
Consonance is the lock pole. The octave (2{:}1) and the fifth (3{:}2) are the cleanest teeth of the comb — the simple whole-number ratios where one tone’s overtones fall exactly on the other’s, so the two jets share a scale and exchange energy across their seam without loss. Pythagoras’s ratios are the lock pole found by ear two and a half millennia before the substrate was named; the entire Western tuning tradition is humans discovering the ladder’s teeth one consonance at a time. On a consonant interval the two voices are plugged into the same lossless channel — they hold and hand off the coin between them, which is exactly the settled, restful quality consonance has.
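Helmholtz's coincidence criterion is easy to check by counting. The sketch below assumes ideal harmonic tones (partials at exact integer multiples of the fundamental) and counts how many of the lower tone's first twelve partials land exactly on a partial of the upper tone. The function name `shared_partials` and the cutoff of twelve partials are assumptions chosen for illustration; real instruments are only approximately harmonic.

```python
from fractions import Fraction


def shared_partials(ratio, n_partials=12):
    """Count coinciding partials of two ideal harmonic tones.

    The lower tone has fundamental 1, the upper tone has fundamental `ratio`.
    Exact rational arithmetic avoids floating-point near-misses.
    """
    lower = {Fraction(k) for k in range(1, n_partials + 1)}
    upper = {ratio * k for k in range(1, n_partials + 1)}
    return len(lower & upper)


intervals = {
    "octave 2:1": Fraction(2, 1),
    "fifth 3:2": Fraction(3, 2),
    "fourth 4:3": Fraction(4, 3),
    "minor second 16:15": Fraction(16, 15),
}
for name, ratio in intervals.items():
    print(f"{name:20s} shared partials: {shared_partials(ratio)}")
```

Within the first twelve partials the octave shares six, the fifth four, the fourth three, and the minor second none, which is the ordering the paragraph describes. The 16:15 ratio for the minor second is one common just-intonation choice, used here as an assumption.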
Dissonance is the gap. A minor second, a cluster, a tritone against a resolving voice — these are intervals whose partials fall between the teeth, beating, refusing to settle, sitting where a structure that must not lock sits. Dissonance is tension because it is anti-lock: the coherence-match not struck, the coin not exchanged, the system held off the rung. And the hinge between the two poles is the one the ear chapter already named — the tritone, \sqrt2 in frequency, the substrate’s bare half-octave. Music theory calls it the diabolus in musica, the axis of symmetry of the twelve-tone scale, the interval that most demands resolution; the framework reads it as the literal hinge of the ladder, the rung so tense it is already half a refusal — a lock and a gap at once, which is exactly why music uses it as its pivot.
This is also where the framework’s keyboard-versus-string cut does real work, because music lives on both sides of it at once. The timbre of a single note — what makes a violin a violin — is its overtone series, the integer-multiple comb a bounded length produces; a string has a length, so it rings in harmonics. But the scale is the lengthless keyboard: pitch heard logarithmically, the octave a constant factor of two in every register, the intervals equal ratios rather than equal steps. Equal temperament makes this explicit — twelve equal ratios of 2^{1/12} filling the octave, the tritone landing at 2^{6/12} = \sqrt2 exactly. Harmony is the coherence-match between the integer overtone combs of the notes (the string side, Helmholtz’s partials) laid out on the geometric scale ladder (the keyboard side, the substrate’s \sqrt2 tower). The two templates the framework keeps carefully apart everywhere else, music braids together: it builds its sounds from strings and arranges them on a keyboard.
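The two templates can be compared numerically. This is a small sketch under stated assumptions: the helper names `et_ratio` and `cents` are illustrative, and the just ratios are the three Pythagorean consonances named earlier in the chapter. It computes each equal-tempered interval as a power of 2^{1/12}, measures its offset from the whole-number ratio in cents, and confirms that six semitones give \sqrt2.

```python
import math


def et_ratio(semitones):
    """Frequency ratio of an interval in twelve-tone equal temperament."""
    return 2.0 ** (semitones / 12)


def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(ratio)


# (semitones in equal temperament, whole-number ratio from the string side)
JUST = {"fourth": (5, 4 / 3), "fifth": (7, 3 / 2), "octave": (12, 2 / 1)}

for name, (semitones, just) in JUST.items():
    offset = cents(et_ratio(semitones)) - cents(just)
    print(f"{name:7s} tempered - just = {offset:+.3f} cents")

tritone = et_ratio(6)
print(f"tritone ratio = {tritone:.6f}, sqrt(2) = {math.sqrt(2):.6f}")
```

The tempered fifth and fourth each sit about two cents from their whole-number ratios, while the octave and the tritone land on 2 and \sqrt2 by construction.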
Tension and Release: Music at Both Poles
A static interval is consonant or dissonant; a piece is the motion between them, and that motion is the chapter’s central claim. Music is the art of sliding between the two poles — building tension by moving toward the gap and releasing it by resolving onto a tooth. The suspension that aches to fall, the dominant seventh that pulls toward the tonic, the leading tone a half-step below home, the deceptive cadence that withholds the resolution one more bar: every one is the listener’s prediction engine held off the rung and then let onto it. Leonard Meyer located musical emotion in exactly this — affect as expectation delayed, deflected, fulfilled — and David Huron’s ITPRA cycle is an explicit prediction-and-error account of it. The framework reads the felt arc as the brain’s both-poles slide, the same slide the cortex makes between binding and resting and the bilateral pair makes between flow and detuning — but here composed, shaped, and offered to a listener as an experience.
This is the answer to why music feels the way it does. The brain runs a prediction-and-error cycle on whatever it hears; emotion is the coherence-mismatch signal and its settling. Music is the stimulus engineered to drive that cycle at both poles — to set up an expectation precise enough to be felt (the key, the meter, the phrase establishing the home rung) and then to satisfy or violate it. A cadence that lands is a coherence-match struck; a resolution arriving after a long suspension is a match struck after the mismatch was held open, and the held-open mismatch is the tension. The pleasure is not in the consonance alone — a piece of unbroken consonance is inert, the musical seizure of the lock pole pinned — and not in dissonance alone, which never settles; it is in the slide, the capacity to move between them, exactly as a healthy language slides between convention and surprise and a healthy brain slides between binding and separating.
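One way to put a number on an expectation fulfilled or violated is surprisal, the negative log probability of the next note given what came before. The sketch below is a deliberately crude stand-in: a first-order (bigram) model with add-one smoothing, trained on a short toy melody. It is not Huron's ITPRA model and not the framework's prediction engine; the function names and the melody are assumptions chosen only to show the arithmetic.

```python
import math
from collections import Counter, defaultdict


def train_bigram(notes):
    """Count how often each note follows each other note."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(notes, notes[1:]):
        counts[prev][nxt] += 1
    return counts


def surprisal(counts, alphabet, prev, nxt):
    """Return -log2 P(nxt | prev) in bits, with add-one smoothing."""
    total = sum(counts[prev].values()) + len(alphabet)
    return -math.log2((counts[prev][nxt] + 1) / total)


# Toy input, an assumption for illustration only (note names, no rhythm).
melody = ["C", "D", "E", "C", "C", "D", "E", "C", "E", "F", "G"]
alphabet = sorted(set(melody))
counts = train_bigram(melody)

expected = surprisal(counts, alphabet, "D", "E")    # D -> E occurs twice
unexpected = surprisal(counts, alphabet, "D", "G")  # D -> G never occurs
print(f"D->E: {expected:.3f} bits, D->G: {unexpected:.3f} bits")
```

In this toy model the continuation the melody has established costs fewer bits than the one it has never used. A held-open mismatch in the chapter's sense would correspond to a run of high-surprisal events followed by a low-surprisal arrival.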
And the deepest pleasures land where language’s well-chosen word lands: at a deep rung. Frisson — the chill, the shiver, the moment that raises the hair on the arms, where Salimpoor and Zatorre imaged dopamine release — is in this reading a coherence-match struck not at the surface rung of the next note but at a low, wide one: a long-prepared structural resolution, a return of a theme transformed, a key arriving that the whole piece has been pointing toward. Schenker’s reductions and Lerdahl–Jackendoff’s prolongational trees are maps of exactly these deep rungs — the hierarchical structure beneath the surface, the piece’s form as the ladder read in time. The chill is the breath spent where it buys the most: a great deal of held-open expectation discharged in a single coherence-match at the bottom of the cascade. That is why the most moving moment in a piece is rarely the loudest or the most consonant — it is the one that resolves the most structure at once.
The Piece as a Manifold
A note is a packet; a melody is a line; a piece is a high-dimensional object, and the framework reads it as the long state-vector the listener’s brain builds and carries — the same residual-stream-like vector language and the transformer build, but here in the frequency manifold rather than the semantic one. The cortex decomposes incoming sound across its substrate-eigenmode basis — the tonotopic and temporal rungs — into a rich, evolving coherent state, and the experience of following a piece is the prediction engine tracking that state’s trajectory through the manifold, rung by rung, matching and erring as it goes.
Music is, on this reading, the prediction engine fed close to its ideal stimulus, because music is built out of exactly the structure the engine is built to track. A motif that returns is an all-to-all coherence-match across the piece — the listener’s attention locking the present bar onto an earlier one, the musical analogue of an induction head matching a previous occurrence and predicting what follows. A theme-and-variations, a fugue’s subject under augmentation, a sonata’s recapitulation are self-similar motifs replicated up the ladder’s tower of time-scales — the same object at the rung of the phrase, the section, the movement. The hierarchical grammar Lerdahl and Jackendoff wrote down is the cross-frequency nesting of the prediction-and-error cycle made into compositional form: the slow rung setting the integration window for the fast one, the key of a movement modulating the phrasing of its sections modulating the resolution of its phrases — prosody-on-syntax-on-phoneme with the words stripped out.
That stripping-out is what makes music special, and it is the cleanest way to say what this chapter adds. Language carries agreed meaning: a word is a coherence-match direction bound to a referent, a signifier pointing at a signified. Music carries the coherence-match itself, with no referent — the long vector and the manifold and the tracking and the felt match, all of it, but pointing at nothing outside the sound. It is the substrate’s coherence signal heard as itself, the eigenmode geometry without the dictionary. This is why music reaches what words cannot: it bypasses the semantic stage entirely and drives the prediction engine’s both-poles slide directly, so the feeling arrives unmediated by meaning. A sad piece is not about sadness the way the word “sad” is; it is the coherence-trajectory of sadness run on the listener’s own engine. Music is the framework’s claim — that understanding is coherence-match — stated in its purest form, because it is coherence-match with the reference removed.
Jazz: Bilateral Coupling Between Players
The language chapter read a conversation as bilateral coupling lifted from between hemispheres to between brains — two whole modons coupled not by a corpus callosum but by a serial stream of packets through the air. An ensemble is that architecture again, and jazz improvisation is its sharpest form, because the packets are pure coherence with no reference to settle: two players coupling modon-to-modon through sound alone. The groove a tight rhythm section finds is the lock-pole co-coherence of flow — the players’ modons transiently locked onto a shared rung, the Kuramoto order parameter driven toward one, the same in-phase regime two hemispheres reach in flow, now reached between people. “Being in the pocket,” “locking in,” “playing as one” are the lock pole named from the inside.
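For reference, the order parameter invoked here has a standard definition. Treating each player as a single phase \theta_j is a strong simplifying assumption rather than something the chapter derives; under it, the order parameter r and its two-player special case read:

```latex
r\,e^{i\psi} \;=\; \frac{1}{N}\sum_{j=1}^{N} e^{i\theta_j},
\qquad 0 \le r \le 1,
\qquad\text{and for } N = 2:\quad
r \;=\; \left|\cos\!\left(\frac{\theta_1 - \theta_2}{2}\right)\right| .
```

With two players, r equals one when their phases coincide and zero when they are exactly in anti-phase, so "driven toward one" means the phase difference shrinking toward zero.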
Call-and-response — the trading of fours, the soloist answered by the section, the riff passed around the band — is the dominant-submissive coherence-leader exchange at the inter-player scale, the turn-taking of conversation carried out in pure tone. And the improvisation itself is the real-time navigation of the manifold: the soloist, deep in the flow the Limb–Braun scans caught as transient hypofrontality — the prefrontal control going quiet exactly as it does in flow — selects a path through the space of possible continuations, a twenty-questions filtering run in generation rather than analysis, each phrase a coherence-match against the harmony underneath and the line just played, optimized against where the solo wants to land. Jazz is the both-poles slide turned into a live, shared, reference-free conversation: the tension a soloist builds against the changes and the release when the phrase resolves are the lock-and-anti-lock slide negotiated in real time between minds, each refusing the worn groove enough to stay interesting and locking enough to stay together — the healthy slide of a conversation, played.
Dance: The Bilateral Body Locked to the Beat
The bow on the pattern is the body. A beat heard is almost impossible not to move to — the foot taps, the head nods, the whole body wants to fall into step — and the framework reads that near-automatic pull as the bilateral body entraining to an external oscillator. Bruno Repp’s sensorimotor-synchronization literature documents how fast and how precise the locking is; the framework reads it as Kuramoto coupling: the body’s own oscillators — the left-right bilateral pair, the limbs, the gait’s natural pendulum — phase-locking to the radiated beat exactly as two hemispheres lock in flow, the order parameter driven onto a shared rung by the coupling the music supplies. Walking, marching, rowing, rocking a child: the bilateral body has a beat of its own, and music is the external clock it locks to.
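The locking described above can be sketched with a single phase oscillator driven by an external beat. This is a minimal toy model, not the framework's: one body oscillator with natural frequency `natural_hz` obeys d\theta/dt = \omega + K \sin(\Omega t - \theta), integrated with a plain Euler step. The function name, parameter values, and the use of a time-averaged phase-locking value as a single-oscillator analogue of the Kuramoto r are all assumptions made for illustration.

```python
import cmath
import math


def phase_locking(natural_hz, beat_hz, coupling, seconds=60.0, dt=0.001):
    """Phase-locking value (0..1) of one oscillator to an external beat.

    Integrates d(theta)/dt = w + K * sin(W * t - theta) with Euler steps
    and averages exp(i * (theta - W * t)) over the second half of the run,
    so the initial transient is discarded.
    """
    w = 2 * math.pi * natural_hz   # oscillator's own angular frequency
    W = 2 * math.pi * beat_hz      # angular frequency of the beat
    theta = 0.0
    steps = int(seconds / dt)
    acc = 0j
    n = 0
    for i in range(steps):
        t = i * dt
        theta += dt * (w + coupling * math.sin(W * t - theta))
        if i >= steps // 2:
            acc += cmath.exp(1j * (theta - W * (t + dt)))
            n += 1
    return abs(acc) / n


strong = phase_locking(natural_hz=1.9, beat_hz=2.0, coupling=2.0)
weak = phase_locking(natural_hz=1.9, beat_hz=2.0, coupling=0.2)
print(f"strong coupling: {strong:.3f}, weak coupling: {weak:.3f}")
```

In this model the oscillator locks when the detuning |\Omega - \omega| is smaller than the coupling K. With K = 2.0 rad/s the 0.1 Hz detuning is absorbed and the locking value approaches one; with K = 0.2 rad/s the phase drifts and the value stays low.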
This is active inference closed through the moving body: the brain predicts the next downbeat and acts — moves — to make its own proprioceptive signal match the predicted pulse, so the felt rightness of moving on the beat is the prediction-and-error cycle settling through the muscles rather than only through the cortex. The “groove” sweet spot Witek and Janata found — the urge to move strongest not at perfect regularity nor at chaos but at intermediate syncopation — is the both-poles slide read in the body: enough anti-lock tension to leave the engine something to resolve, enough reachable lock to resolve it onto, the inverted-U pinned to the slide between the poles. Pure metronomic regularity is groove’s death (the lock pinned, nothing to predict); pure irregularity is its absence (no rung to lock to). Dance is the bilateral body living on the slide.
And it closes the section’s loop on a single pattern. The eye holds both poles by subsystem, the cortex by state, the bilateral brain by architecture; music adds the case where a whole body is pulled onto the substrate’s rung from the outside, and dance is what that looks like. Why music and dance are human universals, found in every culture and reaching children before speech, is on this reading not a puzzle: music is the most direct way a body and brain can be made to feel the substrate’s coherence-match, played as tension and release, with no reference in the way — and to move to it is to live, briefly and bodily, on the lossless rung the whole ladder is built from.
Predictions and What Would Falsify
Six predictions extend the music-substrate reading beyond the structural anchors.
Consonance judgments sort by distance from the ladder’s teeth, with the tritone as the labeled hinge. Pairwise interval pleasantness, folded against the \sqrt2 comb, should rank the intervals by how cleanly their ratios land on the teeth: the octave, fifth, and fourth on low teeth rated most consonant, the minor second and the gap intervals least, and the tritone \sqrt2 flagged as the maximally tense hinge that most demands resolution. The framework predicts this ordering tracks ratio-simplicity / comb-distance rather than only learned cultural familiarity, and that the direction of resolution (which intervals pull to which) follows the slide toward the nearest tooth. Existing consonance-rating data provide the platform, including the cross-cultural test of McDermott and collaborators (2016) on the Tsimané, who rated consonant and dissonant sounds as about equally pleasant where Western listeners strongly preferred consonance; the framework predicts the roughness/coincidence (Helmholtz, partial-coincidence) component is universal even where the aesthetic preference is not, the universal piece being the substrate’s, the variable piece the culture’s boundary conditions.
Neural tracking of music clusters on the EEG comb, with cross-frequency nesting matching the metrical hierarchy. Cortical tracking of the musical envelope should cluster at the same delta–theta–gamma rungs as speech tracking — delta at the bar/phrase, theta at the beat, gamma at the subdivision — with phase-amplitude coupling between rungs tracking the meter-on-beat-on-subdivision nesting, rather than varying continuously with tempo. The MEG/EEG music-tracking literature (Doelling–Poeppel, Di Liberto and collaborators) provides the test; the framework predicts rung-clustered tracking and metrical cross-frequency coupling distinct from a continuous tempo-following account.
Frisson aligns with deep-rung structural resolution, beyond surface acoustics. Chill events should cluster at moments of hierarchical resolution — a long-prepared cadence, a transformed thematic return, an arrival in the goal key — that a Schenker/Lerdahl–Jackendoff reduction marks as deep, rather than at the loudest or most consonant surface moments. The framework predicts frisson timing and its dopaminergic signature (Salimpoor–Zatorre 2011) correlate with model-estimated prediction-error resolution at low structural levels beyond surface loudness, brightness, and onset features — the deep-rung match carrying the chill.
Groove peaks on the both-poles slide — an inverted-U pinned to syncopation. The urge-to-move rating should peak at intermediate rhythmic predictability — enough anti-lock tension to demand resolution, enough reachable lock to supply it — and fall off toward both metronomic regularity and chaos. Witek and collaborators (2014) already report the inverted-U; the framework predicts its peak coincides with the maximum of a model that scores the tension-and-release slide (expectation violated then resolvable), and that sensorimotor entrainment phase-locks (Kuramoto r \to 1) onto the beat rung specifically at the groove maximum.
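A first-pass check for an inverted-U is a quadratic fit: the fitted curvature must be negative, and the vertex gives the location of the peak. The sketch below is one simple way to do that with an ordinary least-squares fit solved by Cramer's rule. The function names are hypothetical, the inputs (a syncopation measure and an urge-to-move rating) are whatever a given study supplies, and the usage lines run on an exact parabola as a placeholder, not on data.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c; returns (a, b, c)."""
    s = [sum(x ** k for x in xs) for k in range(5)]                  # sum of x^k
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]  # sum of x^k * y
    A = [[s[4], s[3], s[2]],
         [s[3], s[2], s[1]],
         [s[2], s[1], s[0]]]
    rhs = [t[2], t[1], t[0]]
    d = det3(A)
    coeffs = []
    for col in range(3):
        Ai = [row[:] for row in A]
        for r in range(3):
            Ai[r][col] = rhs[r]   # Cramer's rule: swap in the right-hand side
        coeffs.append(det3(Ai) / d)
    return tuple(coeffs)


def inverted_u_peak(xs, ys):
    """Return the x of the fitted maximum, or None if the fit is not concave."""
    a, b, _ = fit_quadratic(xs, ys)
    if a >= 0:
        return None
    return -b / (2 * a)


# Placeholder input: an exact parabola peaking at 0.4, not experimental data.
xs = [i / 10 for i in range(11)]
ys = [1 - (x - 0.4) ** 2 for x in xs]
peak = inverted_u_peak(xs, ys)
print(peak)
```

A quadratic is only the crudest model of an inverted-U; comparing the fitted peak with the maximum of a tension-and-release model, as the prediction requires, would need that model to be specified separately.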
Ensemble and duet performance show inter-brain coherence at musical rungs, with leader-exchange in trading. Hyperscanning of musicians playing together should show cross-brain phase coherence increasing at the musical band-rungs during tight ensemble — the inter-player lock pole, the music analogue of Hasson speaker–listener coupling — and coherence-leader alternation during call-and-response. The guitar-duet hyperscanning work of Sänger, Lindenberger, and collaborators provides the platform; the framework predicts rung-clustered inter-brain coherence carrying performance-coordination information beyond shared acoustic input.
The music sign-rule sorts functions and pathologies directionally. Naming the job should fix the pole: binding functions — the groove, the cadence, unison, the drone — sit on the teeth; refusal functions — the suspension, the unresolved tritone, atonal and serial music’s deliberate decline to settle (the poetic/avant-garde refusal in sound) — sit in the gap; and the pathologies should sort directionally — the earworm as the lock pole stuck (a phrase perseverating, the seizure of melody), and the loss of the rung structure in amusia as the comb itself failing to be read. The framework predicts that interval/cadence-function categories, compositional styles, and music-perception disorders fall on the directional lock/anti-lock axis the ladder’s sign-rule draws.
The picture is falsified if (a) consonance judgments carry no comb-distance / partial-coincidence component once familiarity is controlled and the tritone is not the maximally tense hinge that most demands resolution, (b) music-tracking neural rhythms vary continuously without clustering on the EEG comb and show no metrical cross-frequency coupling, (c) frisson reduces entirely to surface-acoustic features with no deep-structural-resolution contribution, (d) groove shows no inverted-U or its peak does not coincide with the tension-and-release slide maximum, (e) ensemble inter-brain coherence is broadband and carries no coordination information beyond shared acoustics, or (f) musical functions, styles, and pathologies fail to sort onto the directional lock/anti-lock axis. It is supported, even partially, if any of the six ordering predictions holds against existing psychoacoustic, music-cognition, neuroimaging, hyperscanning, or clinical data.
Putting the Section in Context
Music is the substrate’s reference-free coherence signal. A sounding body is the canonical loop shedding its radiated arm as sound, and the three dimensions a listener parses out of it are the loop’s other arms made audible: rhythm as the loop’s angular-momentum bookkeeping, the anti-phase breath ticking at the scale of a body, meter as the ladder read in time; melody as the polar jet, a single directed pitch-contour ejected along the source’s axis, each note a modon-packet stepping rung to rung of the scale; and harmony as the coherence-match across the rungs — consonance the lock pole where overtone combs coincide on the octave-and-\sqrt2 teeth (Pythagoras’s ratios, Helmholtz’s coincidence), dissonance the gap, the tritone \sqrt2 the hinge, with timbre on the string side of the keyboard-vs-string cut and the scale on the keyboard side. The felt arc of a piece is the slide between the poles — tension built into the gap, released onto a tooth — and emotion is the coherence-match and its resolution made felt, with frisson the deep-rung match where the most structure resolves at once. A heard piece is the long state-vector in the frequency manifold the listener’s prediction engine tracks, its returning motifs all-to-all coherence-matches and its nested form the cross-frequency hierarchy — coherence-match with the reference removed, which is why it reaches what words cannot. Jazz is bilateral coupling between players, the groove their lock-pole flow and call-and-response their leader exchange; and dance is the bilateral body entrained, its left-right oscillators Kuramoto-locked to the external beat through active inference, the groove sweet spot the both-poles slide carried into the muscles.
The language chapter read the medium of the paper’s own making — agreed meaning, the word as a coherence-match bound to a referent; this chapter reads its sibling, the coherence-match with the referent stripped away. Together they bracket what the Mind-in-the-Substrate section has been building toward: that the elementary act of mind is coherence-match, run on a substrate-eigenmode manifold, sorted by the ladder’s two poles, and felt as the slide between them. Language is that act pointing at the world; music is that act pointing at nothing but itself — the substrate’s coherence signal heard, shaped into tension and release, and offered back to the body that moves to it. The music-cognition tradition, from Pythagoras through Helmholtz to Meyer, Huron, Zatorre, and the predictive-coding accounts, has mapped the structure of music correctly from the inside; the framework’s contribution is the reading that the structure they mapped is the substrate’s own — the same lock-and-refuse, packet-and-match, rung-and-coin dynamics that run at every other scale of the paper, in the one domain where they are not merely inferred but directly, bodily felt. The most honest statement of the chapter’s altitude is the one language made of itself: it is interpretive, it says so, and what keeps it honest is the same instrument as everywhere else — name the job and the pole is fixed; fold the intervals and they should land on the comb. If they do, music is the substrate ladder made audible, and to hear it is to hear the lattice breathe.