A running measurement system, not a whitepaper.

Alice captures keystroke-level temporal data and submitted text from daily writing sessions. Six signal families measure two orthogonal cognitive channels: motor/dynamical execution (via a native Rust engine) and semantic/propositional output (via deterministic text analysis). The architecture enforces six validity constraints by design, not by policy.

What is captured

Every writing session produces two data channels, recorded simultaneously:

Process channel: keystroke stream

Key-down and key-up timestamps (millisecond precision)
Character identity for each event
Full temporal microstructure: inter-key intervals, hold durations, flight times
Deletion events and revision sequences
Pause architecture (location, duration, context)
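
The timing features listed above can be derived from raw key-down and key-up timestamps. The following is a minimal sketch, not the engine's actual code: the `KeyPress` and `Timing` types and the `derive_timing` function are hypothetical names, and flight time is assumed to mean key-up of one key to key-down of the next.

```rust
/// One physical key press, timestamps in milliseconds.
/// Field names are illustrative, not the engine's actual schema.
#[derive(Debug, Clone, PartialEq)]
pub struct KeyPress {
    pub key: char,
    pub down_ms: f64,
    pub up_ms: f64,
}

/// Timing features derived from a sequence of key presses.
#[derive(Debug, Default, PartialEq)]
pub struct Timing {
    /// Key-down to key-up of the same key.
    pub hold: Vec<f64>,
    /// Key-up of one key to key-down of the next (negative when keys overlap).
    pub flight: Vec<f64>,
    /// Key-down to key-down of consecutive keys.
    pub inter_key: Vec<f64>,
}

/// Compute hold durations, flight times, and inter-key intervals.
pub fn derive_timing(presses: &[KeyPress]) -> Timing {
    let mut timing = Timing::default();
    for press in presses {
        timing.hold.push(press.up_ms - press.down_ms);
    }
    for pair in presses.windows(2) {
        timing.flight.push(pair[1].down_ms - pair[0].up_ms);
        timing.inter_key.push(pair[1].down_ms - pair[0].down_ms);
    }
    timing
}
```

A negative flight time simply records that the next key went down before the previous one came up, which is common in fluent typing.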

Content channel: submitted text

Final response text
Word count and session duration
Burst sequences (text produced between pauses)
Revision density and deletion patterns
The question that prompted the response (for context-aware analysis)

Neither channel alone is sufficient. A decline in lexical diversity with stable production fluency means vocabulary contraction. The same decline with slowed production means retrieval difficulty under cognitive load. You need both to distinguish them.

The signal engine

Every signal family — behavioral and semantic, including cross-session comparisons — is computed by a single native Rust engine compiled to a platform-specific binary. Rust is the single source of truth. There is no fallback implementation, no TypeScript shadow path, and no LLM anywhere in the signal pipeline. A measurement instrument cannot have two sources of truth.

Dynamical signals (Rust)

Complexity and structure of the temporal process. How ordered or disordered your keystroke timing is, and at what scales.

Permutation entropy spectrum (orders 3-7)
Detrended fluctuation analysis
MF-DFA (multifractal spectrum)
Spectral analysis (PSD slope, LF/HF ratio)
Ordinal statistics (statistical complexity, forbidden patterns, LZC)
Recurrence quantification (6 metrics)
Recurrence networks (transitivity, clustering, assortativity)
Causal emergence (EI, CEI, PID)
Dynamic mode decomposition
Pause mixture model
Transfer entropy (KSG estimator)
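
As an illustration of the first item, permutation entropy at a single order can be sketched as follows. This is a textbook formulation under stated assumptions (embedding delay of 1, ties broken by position, normalization by ln(order!)), not the engine's implementation; the function name `permutation_entropy` is hypothetical.

```rust
use std::collections::HashMap;

/// Normalized permutation entropy of order `order` with embedding delay 1.
/// Returns None when the series is too short to form a single pattern.
pub fn permutation_entropy(series: &[f64], order: usize) -> Option<f64> {
    if order < 2 || series.len() < order {
        return None;
    }
    let mut counts: HashMap<Vec<usize>, usize> = HashMap::new();
    for window in series.windows(order) {
        // Ordinal pattern: positions in the window sorted by value.
        // The sort is stable, so equal values keep their original order.
        let mut pattern: Vec<usize> = (0..order).collect();
        pattern.sort_by(|&a, &b| window[a].total_cmp(&window[b]));
        *counts.entry(pattern).or_insert(0) += 1;
    }
    let total = (series.len() - order + 1) as f64;
    let entropy: f64 = counts
        .values()
        .map(|&count| {
            let p = count as f64 / total;
            -p * p.ln()
        })
        .sum();
    // Normalize by ln(order!) so the result lies in [0, 1].
    let max_entropy: f64 = (1..=order).map(|k| (k as f64).ln()).sum();
    Some(entropy / max_entropy)
}
```

A spectrum over orders 3 to 7 would call this once per order on the same inter-key interval series. A perfectly monotone series yields 0; a series that visits every ordinal pattern equally often yields 1.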

Motor signals (Rust)

The statistical shape of your keystroke timing distribution. How your motor system executes the physical act of typing.

Ex-Gaussian parameters (mu, sigma, tau via MLE)
Ex-Gaussian Fisher trace
Sample entropy
Multiscale entropy / complexity index
Autocorrelation structure
Compression ratio (Lempel-Ziv)
Hold-flight rank correlation
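
To make the three ex-Gaussian parameters concrete, the sketch below estimates them by the method of moments. This is a deliberately simpler technique than the MLE fit named above, and is the kind of estimate often used as a starting point for MLE; the type and function names are hypothetical.

```rust
/// Ex-Gaussian parameters: Gaussian mean and standard deviation,
/// plus the mean of the exponential tail.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ExGaussian {
    pub mu: f64,
    pub sigma: f64,
    pub tau: f64,
}

/// Method-of-moments estimate (not MLE). Uses the identities
///   mean = mu + tau
///   variance = sigma^2 + tau^2
///   third central moment = 2 * tau^3
/// Returns None when the sample is too small or not right-skewed
/// enough for the identities to give a valid fit.
pub fn exgauss_moments(x: &[f64]) -> Option<ExGaussian> {
    if x.len() < 3 {
        return None;
    }
    let n = x.len() as f64;
    let mean = x.iter().sum::<f64>() / n;
    let m2 = x.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n;
    let m3 = x.iter().map(|v| (v - mean).powi(3)).sum::<f64>() / n;
    if m3 <= 0.0 {
        return None; // no right tail to attribute to tau
    }
    let tau = (m3 / 2.0).cbrt();
    let sigma_sq = m2 - tau * tau;
    if sigma_sq <= 0.0 {
        return None; // tail estimate exceeds the total variance
    }
    Some(ExGaussian {
        mu: mean - tau,
        sigma: sigma_sq.sqrt(),
        tau,
    })
}
```

Here mu and sigma describe the Gaussian core of the timing distribution, while tau captures the slow right tail of occasional long intervals.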

Process signals (Rust)

The structural shape of how you wrote. Pause patterns, burst dynamics, revision behavior.

Pause-burst segmentation
I-burst detection and counting
Deletion density and revision patterns
Text reconstruction from keystroke stream
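
Pause-burst segmentation can be sketched as a single pass over key-down timestamps. The threshold is passed in as a parameter because the value the engine uses is not stated here; `Burst` and `segment_bursts` are hypothetical names.

```rust
/// A burst: a run of keystrokes with no pause at or above the threshold.
#[derive(Debug, PartialEq)]
pub struct Burst {
    /// Index of the first keystroke in the burst.
    pub start_index: usize,
    /// Number of keystrokes in the burst.
    pub keystrokes: usize,
}

/// Split a keystroke stream into bursts, given key-down timestamps in ms.
/// A gap of at least `pause_ms` between consecutive keys ends a burst.
pub fn segment_bursts(down_ms: &[f64], pause_ms: f64) -> Vec<Burst> {
    let mut bursts = Vec::new();
    if down_ms.is_empty() {
        return bursts;
    }
    let mut start = 0;
    for i in 1..down_ms.len() {
        if down_ms[i] - down_ms[i - 1] >= pause_ms {
            bursts.push(Burst { start_index: start, keystrokes: i - start });
            start = i;
        }
    }
    // Close the final burst, which has no trailing pause.
    bursts.push(Burst { start_index: start, keystrokes: down_ms.len() - start });
    bursts
}
```

With timestamps `[0, 100, 200, 2500, 2600, 6000]` and a 2000 ms threshold, this yields three bursts of 3, 2, and 1 keystrokes.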

Semantic signals (Rust)

What you wrote, measured as density metrics. Propositional content, lexical choice, epistemic stance, cohesion structure.

Idea density
Lexical sophistication
Epistemic stance
Integrative complexity
Deep and referential cohesion
Emotional valence arc
Discourse coherence (global, local, ratio, decay)

Cross-session signals (Rust)

How your current session relates to your own history. Longitudinal consistency and drift.

Self-perplexity
Motor self-perplexity
Normalized compression distance (4 lags)
Vocabulary recurrence decay
Digraph stability
Text network density
Text network communities
Bridging ratio
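
Normalized compression distance follows the standard formula NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is a compressed size. The sketch below takes the compressor as a parameter, since the one the engine uses is not stated; `rle_len` is a toy run-length stand-in included only so the example is self-contained, and both function names are hypothetical.

```rust
/// Normalized compression distance between two byte strings.
/// `compressed_len` is any function returning a compressed size in bytes.
/// Returns None when both inputs compress to zero bytes.
pub fn ncd<F>(x: &[u8], y: &[u8], compressed_len: F) -> Option<f64>
where
    F: Fn(&[u8]) -> usize,
{
    let cx = compressed_len(x);
    let cy = compressed_len(y);
    let mut joined = x.to_vec();
    joined.extend_from_slice(y);
    let cxy = compressed_len(&joined);
    let larger = cx.max(cy);
    if larger == 0 {
        return None;
    }
    let smaller = cx.min(cy);
    Some((cxy as f64 - smaller as f64) / larger as f64)
}

/// Toy stand-in compressor: size of a run-length encoding at 2 bytes per run.
/// A real measurement would use a general-purpose compressor instead.
pub fn rle_len(data: &[u8]) -> usize {
    let mut runs = 0;
    let mut prev: Option<u8> = None;
    for &byte in data {
        if prev != Some(byte) {
            runs += 1;
            prev = Some(byte);
        }
    }
    runs * 2
}
```

Values near 0 mean the two texts share most of their structure; values near 1 mean the second text adds almost entirely new information. Computing it at several lags would compare the current session against sessions progressively further back in the record.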

Behavioral state (Rust)

A seven-dimensional state vector derived from session summaries: fluency, deliberation, revision, commitment, volatility, thermal, and presence. Convergence is a derived composite: the Euclidean distance from your personal center in that seven-dimensional space.

PersDyn dynamics (baseline, variability, attractor force)
Cross-dimension coupling matrix
Trajectory deviation from personal baseline
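
The convergence composite described above can be sketched as a distance computation. Two assumptions are made here that the text does not confirm: the personal center is taken to be the per-dimension mean of past sessions, and the dimensions are assumed to be on comparable scales already. All names are hypothetical.

```rust
/// Number of behavioral state dimensions.
pub const DIMS: usize = 7;

/// One session's state: fluency, deliberation, revision, commitment,
/// volatility, thermal, presence (in that order).
pub type State = [f64; DIMS];

/// Per-dimension mean across past sessions (assumed definition of the center).
pub fn personal_center(history: &[State]) -> Option<State> {
    if history.is_empty() {
        return None;
    }
    let mut center = [0.0; DIMS];
    for state in history {
        for d in 0..DIMS {
            center[d] += state[d];
        }
    }
    for value in center.iter_mut() {
        *value /= history.len() as f64;
    }
    Some(center)
}

/// Euclidean distance between the current state and the personal center.
pub fn convergence(current: &State, center: &State) -> f64 {
    current
        .iter()
        .zip(center.iter())
        .map(|(a, b)| (a - b).powi(2))
        .sum::<f64>()
        .sqrt()
}
```

A small distance means the session sits close to your usual behavioral profile; a large one means at least one dimension is far from it.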

Each signal family has defined failure modes. When a session produces insufficient data for reliable estimation, the signal returns null with a typed error variant (InsufficientData, ZeroVariance, DegenerateValue). A missing measurement is better than a wrong one.
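
One way to express this contract in Rust is a `Result` whose error type carries the three variants named above. The sketch below applies it to a lag-1 autocorrelation; the `MIN_SAMPLES` floor and the function name are hypothetical, and the engine's real thresholds are not stated here.

```rust
/// Typed reasons a signal could not be estimated.
#[derive(Debug, PartialEq)]
pub enum SignalError {
    InsufficientData,
    ZeroVariance,
    DegenerateValue,
}

/// Hypothetical minimum sample count for a reliable estimate.
pub const MIN_SAMPLES: usize = 30;

/// Lag-1 autocorrelation of a timing series, or a typed error.
pub fn lag1_autocorrelation(x: &[f64]) -> Result<f64, SignalError> {
    if x.len() < MIN_SAMPLES {
        return Err(SignalError::InsufficientData);
    }
    if x.iter().any(|v| !v.is_finite()) {
        return Err(SignalError::DegenerateValue);
    }
    let n = x.len() as f64;
    let mean = x.iter().sum::<f64>() / n;
    let sum_sq: f64 = x.iter().map(|v| (v - mean).powi(2)).sum();
    if sum_sq == 0.0 {
        return Err(SignalError::ZeroVariance);
    }
    let sum_cross: f64 = x
        .windows(2)
        .map(|w| (w[0] - mean) * (w[1] - mean))
        .sum();
    let r = sum_cross / sum_sq;
    if r.is_finite() {
        Ok(r)
    } else {
        Err(SignalError::DegenerateValue)
    }
}
```

A caller that stores signals can map any `Err` to a null column and log the variant, so a short or constant session never produces a fabricated number.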

Calibration

A single writing session cannot distinguish a bad day from a trend. Alice uses within-person calibration to separate transient state (fatigue, illness, distraction) from trajectory shifts.

Same-day baseline

Brief calibration task before the main writing session. Establishes today's motor baseline: are your fingers fast or slow right now, independent of what you're thinking about?

Historical self-reference

Deviation is measured against your own accumulated history, not a population average. Your "normal" is defined by your data, not a cohort mean.

Trajectory detection

A single outlier is noise. A consistent drift over weeks is signal. The longitudinal record makes this distinction possible in ways that cross-sectional designs cannot.
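
The difference between an outlier and a drift can be illustrated with an ordinary least-squares slope over session index. This is only a sketch of the idea, not the detection method Alice uses, which is not specified here; `drift_slope` is a hypothetical name.

```rust
/// Least-squares slope of a signal against session index 0, 1, 2, ...
/// Units are "signal units per session". Returns None with fewer than
/// two sessions, since a slope is undefined.
pub fn drift_slope(values: &[f64]) -> Option<f64> {
    let n = values.len();
    if n < 2 {
        return None;
    }
    let nf = n as f64;
    let mean_x = (nf - 1.0) / 2.0;
    let mean_y = values.iter().sum::<f64>() / nf;
    let mut sxy = 0.0;
    let mut sxx = 0.0;
    for (i, y) in values.iter().enumerate() {
        let dx = i as f64 - mean_x;
        sxy += dx * (y - mean_y);
        sxx += dx * dx;
    }
    Some(sxy / sxx)
}
```

A series such as `[10, 12, 14, 16]` gives a slope of 2 per session, while a flat series with one spike in the middle gives a slope of approximately 0: the spike changes the mean but not the trend.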

Design constraints enforced by architecture

These are not guidelines. They are properties of the system that cannot be violated without changing the code.

Unmediated input

The writing interface has no autocomplete, no predictive text, no suggestions. Every character in the keystroke stream corresponds to a deliberate motor and cognitive act.

Participant-blind measurement

Signal values are never displayed to the user. No dashboard, no trend line, no "your processing speed today." The instrument is invisible. The user sees only the question and the writing space.

No gamification

No streaks, no scores, no achievements, no progress bars. These are measurement artifacts that confound the signal. Retention comes from question quality, not dopamine mechanics.

Single source of truth

Every signal is computed by exactly one implementation (the Rust engine). If the engine is unavailable, the measurement does not happen. The session saves; signals are absent. A silent wrong answer is worse than no answer.

Entries do not come back out

Response text is stored but never displayed back to the user. The system learns from what you write. It does not show you what it learned. The black box stays black.

No AI in the writing path

AI generates questions (from accumulated history) and powers observation. It never touches the writing process. The generative AI is on the instrument side, never the participant side.

Implementation

Runtime: Astro (SSR, Node adapter)

Signal engine: Rust, compiled via napi-rs to a native Node addon

Database: PostgreSQL 17 + pgvector (embeddings)

AI: Claude API (user-agnostic question generation only; never invoked on user response data)

Embeddings: Qwen3-Embedding-0.6B, self-hosted (512-dim vectors, HNSW indexed)

Signal families: all families (Dynamical, Motor, Process, Semantic, Cross-session, Behavioral State) computed in Rust

Status: running daily at n=1 since April 2026

This is not a concept. It is a running system producing measurements every day. The gap between "running" and "validated" is the work that remains.
