A single writing session produces more than 100 signals across six measurement families. Here is how.

Three systems work together: a native signal engine measures the temporal and structural shape of your writing, a reconstruction adversary tests which dimensions of that shape are irreducible, and a semantic pipeline captures the linguistic content. This page explains the architecture and the corrections that shaped it.

The session flow

Every writing session follows the same path, from keystroke to stored measurement. The synchronous transaction guarantees the session is saved before any derived computation begins. The fire-and-forget pipeline runs six independent signal families; if one fails, the others still complete.

1 Capture

Two channels, recorded simultaneously

Process channel

Every key-down and key-up event with millisecond timestamps. Character identity, cursor position, deletion events. The full temporal microstructure of how you type.

{c, d, u} per keystroke · [offset, cursor, deleted, inserted] per edit event
Content channel

The final submitted text, word count, session duration, and the question that prompted the response. What you wrote, not just how you typed it.

POST /api/respond
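The process-channel record can be sketched as a small Rust type. This is a minimal illustration, not the engine's actual schema: it assumes that `c` is the character, `d` the key-down timestamp, and `u` the key-up timestamp (both in milliseconds), and it uses the common conventions that hold time is key-down to key-up, flight time is key-up to the next key-down, and the inter-keystroke interval (IKI) is key-down to the next key-down. The type and function names are hypothetical.

```rust
/// One process-channel record. Field meanings are assumptions:
/// `c` = character, `d` = key-down time (ms), `u` = key-up time (ms).
#[derive(Debug, Clone, PartialEq)]
pub struct Keystroke {
    pub c: char,
    pub d: f64,
    pub u: f64,
}

/// Hold time: how long each key stayed pressed.
pub fn hold_times(keys: &[Keystroke]) -> Vec<f64> {
    keys.iter().map(|k| k.u - k.d).collect()
}

/// Inter-keystroke interval (IKI): key-down to the next key-down.
pub fn ikis(keys: &[Keystroke]) -> Vec<f64> {
    keys.windows(2).map(|w| w[1].d - w[0].d).collect()
}

/// Flight time: key-up to the next key-down.
/// Negative when the next key goes down before the previous one is released.
pub fn flight_times(keys: &[Keystroke]) -> Vec<f64> {
    keys.windows(2).map(|w| w[1].d - w[0].u).collect()
}
```

A negative flight time is the rollover case: the next key is pressed before the previous one comes up.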
2 Synchronous transaction

Atomic write: all or nothing

The session is persisted in a single database transaction. If any write fails, the entire session rolls back. The pipeline that follows runs after the transaction commits, so pipeline failures never affect session persistence.

tb_responses           Raw text, question reference
tb_session_summaries   100+ computed fields from keystroke data
tb_session_events      Keystroke stream + event log (JSON)
tb_burst_sequences     P-bursts (text produced between pauses)
Transaction committed
3 Fire-and-forget pipeline

Six families, computed independently

Each signal family runs in isolation. A failure in one family does not prevent the others from completing. If the Rust engine is unavailable, dynamical, motor, and process signals return null and the session saves without them. The health endpoint surfaces this state.
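The isolation property can be sketched in a few lines of Rust. This is an illustration under assumptions, not the pipeline's code: `Family`, `run_families`, and the two example computations are hypothetical names, and the sketch shows only the failure isolation (one family's error becomes a null result), not the asynchronous scheduling.

```rust
/// Outcome of one signal family: `Some(values)` on success, `None` on failure.
pub type FamilyResult = Option<Vec<f64>>;

/// A hypothetical family: a name plus a fallible computation over the IKI series.
pub struct Family {
    pub name: &'static str,
    pub compute: fn(&[f64]) -> Result<Vec<f64>, String>,
}

/// Run every family. An error in one is recorded as `None`
/// and does not stop the others from completing.
pub fn run_families(families: &[Family], ikis: &[f64]) -> Vec<(&'static str, FamilyResult)> {
    families
        .iter()
        .map(|f| (f.name, (f.compute)(ikis).ok()))
        .collect()
}

/// Example computation that succeeds on non-empty input.
pub fn mean_iki(ikis: &[f64]) -> Result<Vec<f64>, String> {
    if ikis.is_empty() {
        return Err("empty series".into());
    }
    Ok(vec![ikis.iter().sum::<f64>() / ikis.len() as f64])
}

/// Example computation that always fails, standing in for an unavailable engine.
pub fn always_fails(_: &[f64]) -> Result<Vec<f64>, String> {
    Err("engine unavailable".into())
}
```

Mapping each error to `None` at the family boundary is what lets the session save with null signals instead of failing as a whole.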

Dynamical · Rust · 51 signals

Nonlinear dynamics of the IKI series. Treats keystroke timing as output of a complex adaptive system. Organized into nine theoretical sub-families spanning complexity, structure, causality, and mode decomposition.

PE spectrum (orders 3-7) · DFA alpha · MF-DFA (spectrum width, asymmetry, peak alpha) · Temporal irreversibility · Spectral analysis (PSD slope, respiratory peak, peak frequency, LF/HF ratio) · Ordinal statistics (statistical complexity, forbidden patterns, weighted PE, LZC) · OPTN (transition entropy, forbidden transitions) · RQA (determinism, laminarity, trapping time, recurrence rate, recurrence time entropy) · Recurrence networks (transitivity, path length, clustering, assortativity) · Causal emergence (EI, CEI, PID synergy/redundancy) · Criticality (branching ratio, avalanche exponent) · DMD (dominant frequency, decay rate, mode count, spectral entropy) · Pause mixture (component count, motor proportion, cognitive load index) · Transfer entropy (KSG) · TE dominance
tb_dynamical_signals
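As one concrete member of this family, normalized permutation entropy (PE) at a single order can be sketched as follows. This is a textbook formulation under assumptions, not the engine's implementation: ties are broken by position via a stable sort, the result is normalized by ln(order!), and a plain sum is used for brevity. The function names are hypothetical.

```rust
use std::cmp::Ordering;
use std::collections::BTreeMap;

/// Ordinal pattern of a window: the index order that sorts its values.
fn ordinal_pattern(window: &[f64]) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..window.len()).collect();
    // Stable sort, so tied values keep their original order.
    idx.sort_by(|&a, &b| window[a].partial_cmp(&window[b]).unwrap_or(Ordering::Equal));
    idx
}

/// Normalized permutation entropy in [0, 1] for a given order (window length).
/// Returns `None` when the series is shorter than one window.
pub fn permutation_entropy(series: &[f64], order: usize) -> Option<f64> {
    if order < 2 || series.len() < order {
        return None;
    }
    // BTreeMap iterates in key order, so the accumulation order is deterministic.
    let mut counts: BTreeMap<Vec<usize>, usize> = BTreeMap::new();
    for w in series.windows(order) {
        *counts.entry(ordinal_pattern(w)).or_insert(0) += 1;
    }
    let total = (series.len() - order + 1) as f64;
    let h: f64 = counts
        .values()
        .map(|&c| {
            let p = c as f64 / total;
            -p * p.ln()
        })
        .sum();
    // Maximum entropy is ln(order!), the log of the number of possible patterns.
    let max_h: f64 = (1..=order).map(|k| (k as f64).ln()).sum();
    Some(h / max_h)
}
```

A strictly increasing series contains a single ordinal pattern and scores 0; a series that visits every pattern equally often scores 1.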
Motor · Rust · 17 signals

Statistical shape of keystroke timing distributions. Motor control, rhythmic consistency, and neuromuscular execution.

Sample entropy · MSE series / complexity index · Ex-Gaussian (mu, sigma, tau, tau proportion via MLE/EM) · Ex-Gaussian Fisher trace · Autocorrelation (lags 1-5) · Motor jerk · Lapse rate · Tempo drift · Compression ratio · Digraph latency · Hold-flight coupling · Hold-flight rank correlation
tb_motor_signals
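The autocorrelation signals can be illustrated with the standard sample estimator. This is a sketch under assumptions: it uses the biased estimator normalized by the lag-0 sum of squares, which may differ from the estimator the engine uses, and the function name is hypothetical.

```rust
/// Sample autocorrelation of `series` at `lag`
/// (biased estimator, normalized by the lag-0 sum of squared deviations).
/// Returns `None` when the lag is too large or the series has no variance.
pub fn autocorrelation(series: &[f64], lag: usize) -> Option<f64> {
    let n = series.len();
    if lag >= n {
        return None;
    }
    let mean = series.iter().sum::<f64>() / n as f64;
    let denom: f64 = series.iter().map(|x| (x - mean).powi(2)).sum();
    if denom == 0.0 {
        return None;
    }
    let num: f64 = (0..n - lag)
        .map(|i| (series[i] - mean) * (series[i + lag] - mean))
        .sum();
    Some(num / denom)
}
```

Calling it for lags 1 through 5 on the IKI series would yield five values of the kind listed above.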
Process · Rust · 9 signals

Writing process mechanics. Text reconstruction from the event log to classify pause locations, burst types, revision behavior, and strategy shifts.

Pause location (3 types) · R-bursts · I-bursts · Abandoned thoughts · Vocab expansion rate · Phase transition · Strategy shifts
tb_process_signals
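Burst analysis starts from segmenting the keystroke stream at pauses. The sketch below splits key-down timestamps into P-bursts. It is an illustration only: the pause threshold is passed in as a parameter because this page does not state the value the engine uses (2000 ms is a common choice in writing research), and the function name is hypothetical.

```rust
/// Split a keystroke stream into P-bursts: runs of keystrokes separated by
/// pauses of at least `pause_ms`. Input is key-down timestamps in ms, ascending.
/// Returns the number of keystrokes in each burst.
pub fn p_burst_lengths(key_down_ms: &[f64], pause_ms: f64) -> Vec<usize> {
    let mut bursts = Vec::new();
    if key_down_ms.is_empty() {
        return bursts;
    }
    let mut current = 1usize;
    for w in key_down_ms.windows(2) {
        if w[1] - w[0] >= pause_ms {
            // The pause closes the current burst and opens a new one.
            bursts.push(current);
            current = 1;
        } else {
            current += 1;
        }
    }
    bursts.push(current);
    bursts
}
```

With timestamps 0, 100, 250, 3000, 3100, 9000 and a 2000 ms threshold, the stream splits into bursts of 3, 2, and 1 keystrokes.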
Semantic · Rust · 14 signals

Linguistic content analysis via deterministic word-list methods. What you wrote, measured as density metrics and discourse structure.

Idea density · Lexical sophistication · Epistemic stance · Integrative complexity · Deep cohesion · Referential cohesion · Emotional valence arc · Text compression ratio · Discourse coherence (global) · Discourse coherence (local) · Global/local coherence ratio · Coherence decay slope
tb_semantic_signals
Cross-session · Rust · 11 signals

Longitudinal consistency metrics. How your current session relates to your own history.

Self-perplexity · Motor self-perplexity · NCD (4 lags) · Vocab recurrence decay · Digraph stability · Text network density · Text network communities · Bridging ratio
tb_cross_session_signals
Signals persisted
4 Downstream computation

From signals to understanding

After all signal families complete, downstream systems synthesize the measurements into longitudinal context.

Embeddings

Response text is embedded via self-hosted Qwen3-Embedding-0.6B (512-dim, L2-normalized). The model weights are identified by their SHA-256 hash, so stored embeddings can be reproduced.

Semantic baselines

Rolling z-scores per semantic dimension against your own history. Detects meaningful deviations from personal baseline.
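A personal-baseline z-score can be sketched as below. This is a minimal illustration under assumptions: the caller passes the rolling window of prior values for one dimension, the sample standard deviation (n - 1 denominator) is used, and a `min_n` parameter withholds the score when history is too short. The function and parameter names are hypothetical.

```rust
/// z-score of `current` against the values in `history` (the rolling window).
/// Returns `None` when there are fewer than `min_n` prior values (at least 2
/// are always required) or when the history has zero variance.
pub fn baseline_z(history: &[f64], current: f64, min_n: usize) -> Option<f64> {
    let n = history.len();
    if n < min_n.max(2) {
        return None;
    }
    let mean = history.iter().sum::<f64>() / n as f64;
    let var = history.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n as f64 - 1.0);
    let sd = var.sqrt();
    if sd == 0.0 {
        return None;
    }
    Some((current - mean) / sd)
}
```

Returning `None` below a minimum history length keeps a deviation from being reported before the baseline can support it.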

Personal profile

Rolling aggregate of all signal dimensions. Your accumulated behavioral fingerprint. Updated after every session.

Reconstruction residual

Profile-predicted signals vs. actual signals. The gap between what your history predicts and what you actually produced.

Question generation

An operator-run, out-of-band corpus refresh against a user-agnostic prompt set. The LLM is invoked only to populate a shared question corpus and never receives or analyzes any subject's response data, signals, or profile. Subjects pull from the shared corpus once their personal seeds run out.

The reconstruction adversary

A measurement is only meaningful if you can say what it would look like without the thing you are measuring. Alice's ghost is a reconstruction adversary: it generates a synthetic writing session from your statistical profile alone, then runs the same signal engine on both streams. The residual between real and ghost measurements is what your profile cannot explain.

Real session

Input: Your actual keystroke stream from today's writing session

Engine: Rust signal engine computes more than 100 measurements

Output: Real signal values

Ghost session

Input: Your accumulated statistical profile (ex-Gaussian params, digraph latencies, burst structure, revision rates)

Engine: Avatar engine generates synthetic keystrokes, then the same Rust signal engine computes more than 100 measurements

Output: Ghost signal values

Residual

What your profile can reconstruct vs. what requires the actual person
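The comparison between real and ghost outputs can be sketched as a per-dimension difference. This is an assumption about the simplest possible form: the page does not specify how the residual is defined or scaled, so the sketch takes real minus ghost for every dimension present in both. The function name and the dimension names in the usage note are hypothetical.

```rust
use std::collections::BTreeMap;

/// Per-dimension residual: real value minus ghost value,
/// for every dimension that both measurements contain.
pub fn residuals(
    real: &BTreeMap<String, f64>,
    ghost: &BTreeMap<String, f64>,
) -> BTreeMap<String, f64> {
    real.iter()
        .filter_map(|(name, r)| ghost.get(name).map(|g| (name.clone(), r - g)))
        .collect()
}
```

For example, a real value of 1.5 against a ghost value of 0.25 on a dimension named `mfdfa_width` leaves a residual of 1.25; dimensions missing from either side are skipped rather than treated as zero.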

Five adversary variants

Each variant adds one modeling improvement, isolating which dimension of behavior carries the most signal. Comparing reconstruction residuals across variants reveals what a statistical profile can reproduce and what it cannot.

1 Baseline

Order-2 Markov text generation with independent ex-Gaussian IKI sampling and fixed hold times. The simplest adversary: text statistics plus independent timing.

+ AR(1) IKI correlation
2 Conditional timing

Adds serial dependence to inter-keystroke intervals. Tests whether IKI autocorrelation structure carries information beyond the marginal distribution.

+ Gaussian copula hold-flight coupling
3 Copula motor

Adds hold-flight time coupling via rank correlation. Tests whether motor execution coordination (the relationship between how long you press and how long you travel) is informative.

+ PPM variable-order text prediction
4 PPM text

Replaces order-2 Markov with prediction by partial matching. Tests whether better text modeling reduces the residual, or whether the text channel is already well-captured.

+ all of the above combined
5 Full adversary

PPM text, AR(1) correlated IKI, Gaussian copula motor coupling. The strongest reconstruction within the measurement space. What remains in the residual after this variant is what the profile genuinely cannot explain.
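The serial dependence that variant 2 introduces can be sketched as a plain AR(1) recursion. This is an illustration under assumptions: how the avatar engine combines AR(1) correlation with ex-Gaussian marginals is not described on this page, so the sketch applies the recursion to deviations from a mean and takes the innovations as input, which keeps it independent of any particular PRNG. The function name is hypothetical.

```rust
/// Generate an AR(1) series around `mean`:
///   x_t = mean + phi * (x_{t-1} - mean) + e_t
/// `innovations` supplies e_t, so any PRNG or a fixed test vector can drive it.
/// The series starts from the mean (zero initial deviation).
pub fn ar1_series(mean: f64, phi: f64, innovations: &[f64]) -> Vec<f64> {
    let mut prev_dev = 0.0;
    innovations
        .iter()
        .map(|&e| {
            let dev = phi * prev_dev + e;
            prev_dev = dev;
            mean + dev
        })
        .collect()
}
```

With `phi = 0` each interval depends only on its own innovation, which corresponds to the independent sampling of the baseline variant; with `phi = 0.5` a single shock of 10 ms decays as 110, 105, 102.5 around a mean of 100.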

Reproducibility

Every ghost session stores the PRNG seed, the exact profile snapshot, the corpus hash, and the topic string. Given these inputs, the avatar engine produces bit-identical output across rebuilds. The seed is a 64-bit integer; SplitMix64 expands it into the generator state, which xoshiro128+ then advances. Reproducibility is verified on production data and enforced in CI.
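The seeding scheme can be sketched with the published SplitMix64 and xoshiro128+ algorithms. The two step functions follow their reference definitions; the way the 64-bit seed is laid out into the 128-bit state (two SplitMix64 outputs, each split into low and high halves) is an assumption, and the avatar engine may do it differently.

```rust
/// SplitMix64 step: advances `state` and returns the next 64-bit output.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// xoshiro128+ generator: 128 bits of state, 32-bit outputs.
pub struct Xoshiro128Plus {
    s: [u32; 4],
}

impl Xoshiro128Plus {
    /// Expand a 64-bit seed into the 128-bit state.
    /// Assumed layout: two SplitMix64 outputs, each split into two u32 halves.
    pub fn from_seed(seed: u64) -> Self {
        let mut sm = seed;
        let a = splitmix64(&mut sm);
        let b = splitmix64(&mut sm);
        Self {
            s: [a as u32, (a >> 32) as u32, b as u32, (b >> 32) as u32],
        }
    }

    /// Advance the state and return the next 32-bit output.
    pub fn next_u32(&mut self) -> u32 {
        let result = self.s[0].wrapping_add(self.s[3]);
        let t = self.s[1] << 9;
        self.s[2] ^= self.s[0];
        self.s[3] ^= self.s[1];
        self.s[1] ^= self.s[2];
        self.s[0] ^= self.s[3];
        self.s[2] ^= t;
        self.s[3] = self.s[3].rotate_left(11);
        result
    }
}
```

Because every operation is integer arithmetic with explicit wrapping, the output sequence depends only on the seed, not on the compiler or platform.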

How the system got here

A measurement instrument earns its rigor through correction, not assertion. This timeline shows the methodological incidents that shaped Alice's signal engine. Each entry documents what was wrong, how it was discovered, and what changed. The full provenance log is maintained in the codebase.

INC-013 2026-04-24
Architecture correction

Signal pipeline boundary failure + extended ghost residuals

Wrapper functions silently stripped 38 new signal columns; ghost residuals expanded from 13 to 41 dimensions

Two TypeScript wrapper functions in libSignalsNative.ts explicitly constructed return objects with only the original 14 fields, silently discarding everything the Rust engine produced after the Phase 1-5 expansion. Three edits to one file fixed it. All new signal columns now populate through the live pipeline. Backfill completed: 31 dynamical + 31 motor rows re-inserted with all new columns populated. Additionally, reconstruction residual comparison expanded from 13 to 41 behavioral dimensions stored in extended_residuals_json (JSONB), organized into seven theoretical families. Calibration guard added to the last unguarded aggregate pipeline function.

Expanding residuals to 41 dimensions revealed that ghost reproduction fidelity varies by over 100x across theoretical families. MF-DFA spectrum width is the most ghost-resistant dimension by 4x, because the reconstruction adversary generates from a single stochastic process and cannot reproduce multifractal structure. Ordinal statistics are nearly perfectly reproducible. This decomposition was invisible at 13 dimensions.

INC-010 2026-04-23
Architecture correction

Embedding sovereignty

Replaced vendor API with self-hosted archivable weights

VoyageAI's voyage-3-lite failed three of four constitutional requirements: not archivable (no SHA-256 identifier), not deterministic across vendor changes, no lifecycle control. Migrated to Qwen3-Embedding-0.6B via self-hosted TEI, CPU-only build for bit-reproducibility. Metal FP32 on Apple Silicon is not bit-reproducible due to batch-dependent reduction patterns. All 10 non-calibration sessions re-embedded; semantic baselines regenerated from scratch.

The semantic channel now has the same reproducibility guarantee as the behavioral channel. Embeddings computed today can be regenerated from archived weights indefinitely, independent of any vendor. Baselines remain valid against future sessions without methodological drift.

INC-009 2026-04-22
Methods correction

Construct validity

Stripped unvalidated interpretive labels from signal surfaces

Audit identified two classes of failure: interpretive labels presented as instrument readings (e.g., "rigid"/"malleable" for attractor force without validation), and statistical notation without adequate sample-size context (sigma notation without baseline entry counts). Wave 1 removed four sets of unvalidated labels. Wave 2 added honest framing for low-n signals: dots-only for n<5, explicit sample sizes on deviation callouts, minimum run length 4 for trend detection.

Every readout the instrument now surfaces is either a raw measurement or an honestly gated statistical claim. The observatory distinguishes between what it knows and what it is still learning, visible in the interface rather than hidden behind premature interpretation.

INC-008 2026-04-22
Methods correction

Statistical rigor for discovery badges

Dynamic critical-r gate replaced hardcoded thresholds

Coupling correlations were displayed as "strong" or "moderate" based on hardcoded thresholds, without significance testing. A correlation of r=0.55 from n=10 was displayed as "strong" despite p>0.05. The thresholds were replaced with a dynamic max(criticalR(n), 0.3) gate using a Cornish-Fisher approximation. Badges now have two states: "established" requires both significance and stability; "provisional" marks couplings that are significant but not yet validated.

Coupling discoveries now scale honestly with data depth. At n=10, almost nothing qualifies. At n=50, genuine structure emerges. The instrument's confidence grows with the dataset rather than with arbitrary thresholds, which means early data accumulation cannot produce false discoveries that later need to be retracted.
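The gate can be sketched as follows. This is one plausible reading, not the instrument's code: it assumes a two-tailed test at alpha = 0.05, approximates the Student-t quantile with the Cornish-Fisher expansion around the normal quantile (three correction terms), and converts it to a critical correlation via r = t / sqrt(t^2 + df) with df = n - 2. The names `critical_r` and `passes_gate` are hypothetical, as is the choice to return no value below n = 4.

```rust
/// Standard-normal quantile for a two-tailed test at alpha = 0.05 (assumed level).
const Z_975: f64 = 1.959963984540054;

/// Cornish-Fisher expansion of the Student-t quantile around the normal quantile `z`.
fn t_quantile_cf(z: f64, df: f64) -> f64 {
    let z3 = z.powi(3);
    let z5 = z.powi(5);
    let z7 = z.powi(7);
    z + (z3 + z) / (4.0 * df)
        + (5.0 * z5 + 16.0 * z3 + 3.0 * z) / (96.0 * df.powi(2))
        + (3.0 * z7 + 19.0 * z5 + 17.0 * z3 - 15.0 * z) / (384.0 * df.powi(3))
}

/// Smallest |r| that is significant for `n` paired samples.
/// Returns `None` for n < 4, where the expansion is too coarse to trust.
pub fn critical_r(n: usize) -> Option<f64> {
    if n < 4 {
        return None;
    }
    let df = (n - 2) as f64;
    let t = t_quantile_cf(Z_975, df);
    Some(t / (t * t + df).sqrt())
}

/// Gate: the correlation must clear both the critical r and a 0.3 floor.
pub fn passes_gate(r: f64, n: usize) -> bool {
    match critical_r(n) {
        Some(rc) => r.abs() >= rc.max(0.3),
        None => false,
    }
}
```

Under these assumptions the critical value at n=10 is about 0.63, so r=0.55 does not pass. At n=50 the critical value falls to about 0.28, below the floor, so 0.3 becomes the binding threshold.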

INC-006 2026-04-22
Architecture correction

Reconstruction residual reproducibility

Residuals now store exact inputs for regeneration

Every ghost session stores the PRNG seed, profile snapshot (3.1KB measured), corpus SHA-256 hash, and topic string. Given these inputs, regeneration produces bit-identical dynamical and motor signals. Verified 10/10 on production data. Semantic residuals remain excluded (depend on external embedding model).

Every residual is now an auditable claim. Any future version of the instrument can verify whether a past ghost comparison would produce the same result, making reconstruction validity a permanent property of the dataset rather than a snapshot assertion.

INC-005 2026-04-22
Infrastructure

CI reproducibility enforcement

Two-clean-build reproducibility check on every PR touching Rust

Automated CI workflow builds the signal engine twice from clean state and diffs the output JSON on a fixture session. Any bit-level divergence fails the PR. Golden signal values documented for the 100-keystroke fixture.

The signal engine cannot silently drift. Any code change that alters a measurement, whether intentionally or through compiler optimization differences, is caught before it enters the codebase. This is the enforcement mechanism that makes the reproducibility guarantee operational rather than aspirational.

INC-002 2026-04-21
Data incident

Floating-point summation order

Naive summation sensitive to compiler auto-vectorization; replaced with Neumaier compensated summation

Discovered that naive floating-point sums across 17 accumulation sites were sensitive to LLVM auto-vectorization and loop unrolling order. Replaced all with Neumaier compensated summation (error bound O(epsilon) independent of n). Also fixed HashMap iteration nondeterminism in permutation entropy (converted to BTreeMap) and pinned the Rust toolchain to 1.95.0 + LLVM 22.1.2 on aarch64-apple-darwin.

The same keystroke stream now produces the same signal values on any build, on any machine, indefinitely. This is the numerical foundation that every downstream guarantee depends on: reproducible residuals, stable baselines, and meaningful longitudinal comparisons all require that the measurement itself does not move.
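Neumaier summation itself is short. The sketch below is the standard algorithm, written as a standalone function for illustration; the function name is hypothetical and the engine's 17 accumulation sites are not shown here.

```rust
/// Neumaier compensated summation: carries a running correction term `c`
/// that recovers the low-order bits lost at each addition.
pub fn neumaier_sum(xs: &[f64]) -> f64 {
    let mut sum = 0.0_f64;
    let mut c = 0.0_f64;
    for &x in xs {
        let t = sum + x;
        if sum.abs() >= x.abs() {
            // Low-order bits of `x` were lost in the addition.
            c += (sum - t) + x;
        } else {
            // Low-order bits of `sum` were lost in the addition.
            c += (x - t) + sum;
        }
        sum = t;
    }
    sum + c
}
```

The classic test case is the sequence 1, 1e100, 1, -1e100: a naive left-to-right sum returns 0, while the compensated sum returns the exact answer 2.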

INC-001 2026-04-21
Data incident

Hold-flight vector misalignment

27/27 sessions affected, transfer entropy values shifted 130%+

Hold times and flight times were filtered independently, causing misalignment when rollover typing produced valid holds with invalid flights. There were 6,589 misaligned events in total across all sessions. Transfer entropy values shifted by over 130% on average, with 5 sign flips. Fixed by paired filtering: both hold and flight are kept or dropped together for the same keystroke event. Original data preserved as a snapshot table.

Transfer entropy now measures what it claims: directional information flow between motor execution (hold) and cognitive planning (flight). Before the fix, the coupling was computed between misaligned series, meaning the causal direction estimates were unreliable. The entire hold-flight analysis framework depends on this alignment being correct.
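The difference between independent and paired filtering can be sketched in a few lines. The valid ranges below are illustrative assumptions, not the engine's thresholds, and the function names are hypothetical.

```rust
/// True when `x` lies inside the inclusive range `(lo, hi)`.
fn in_range(x: f64, (lo, hi): (f64, f64)) -> bool {
    x >= lo && x <= hi
}

/// Independent filtering (the original bug): each series drops its own
/// invalid values, so the two outputs can end up with different lengths.
pub fn filter_one(xs: &[f64], range: (f64, f64)) -> Vec<f64> {
    xs.iter().copied().filter(|x| in_range(*x, range)).collect()
}

/// Paired filtering (the fix): a keystroke event is kept only when BOTH its
/// hold and its flight are valid, so index i refers to the same event in each output.
pub fn paired_filter(
    holds: &[f64],
    flights: &[f64],
    hold_range: (f64, f64),
    flight_range: (f64, f64),
) -> (Vec<f64>, Vec<f64>) {
    holds
        .iter()
        .zip(flights.iter())
        .filter(|(h, f)| in_range(**h, hold_range) && in_range(**f, flight_range))
        .map(|(h, f)| (*h, *f))
        .unzip()
}
```

With holds 80, 90, 70, 85 and flights 60, -15, 55, 40 (the negative flight is a rollover), independent filtering leaves four holds and three flights, so the second hold is compared against the third event's flight. Paired filtering drops the second event from both series and keeps the remaining three aligned.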

Eight incidents across four days. Each one made the instrument more honest. The full provenance log, including four additional incidents (INC-003, INC-004, INC-007, INC-011) and four deferred design decisions, is maintained in the codebase and available for review.

Three systems, one instrument. The signal engine measures. The ghost tests whether those measurements are irreducible. The semantic pipeline captures what was said, not just how it was typed. Together, they produce a longitudinal record of cognitive process that no single system could provide alone.