The dataset does not exist yet. We are building it.
Alice is accumulating a longitudinal keystroke-cognition dataset that no other instrument produces. The signal pipeline is running. The architecture satisfies the six validity constraints. What it needs now is participants, research partners, and time.
What the longitudinal record enables
No existing dataset combines daily ecological writing, keystroke-level process capture, linguistic content analysis, and intra-individual baselines over months or years. Once this data exists at scale, it can answer questions that the field currently cannot ask:
What is the natural within-person trajectory of processing speed, lexical retrieval, and executive function over years in healthy adults? When does "normal variability" become "early drift"?
Which keystroke-derived signals have sufficient test-retest reliability to serve as biomarkers? What is the minimum observation window for stable personal baselines?
Does routine AI mediation alter the cognitive processes visible in unmediated writing? Is the "creative scar" (Zhou and Liu 2025) detectable in keystroke dynamics?
How do pause distributions, burst structure, and revision patterns relate to linguistic output quality in naturalistic (non-lab) writing over time?
Do Molenaar's non-ergodicity results hold for keystroke dynamics? How far do within-person dynamics diverge from between-person averages in this modality?
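To make "pause distributions" and "burst structure" concrete, here is a minimal sketch of how a single session's keystroke timestamps could be segmented into bursts. It is illustrative only and is not Alice's actual signal pipeline: the function names, the millisecond input format, and the 2000 ms pause threshold are all assumptions chosen for the example.

```python
from statistics import median


def segment_bursts(timestamps_ms, pause_threshold_ms=2000):
    """Split keystroke timestamps into bursts separated by long pauses.

    timestamps_ms: ascending key-press times in milliseconds (assumed format).
    pause_threshold_ms: gap that ends a burst (assumed value, not Alice's).
    Returns (bursts, pauses): a list of timestamp lists and a list of gap lengths.
    """
    if not timestamps_ms:
        return [], []
    bursts = [[timestamps_ms[0]]]
    pauses = []
    for prev, curr in zip(timestamps_ms, timestamps_ms[1:]):
        gap = curr - prev
        if gap >= pause_threshold_ms:
            # A long gap closes the current burst and opens a new one.
            pauses.append(gap)
            bursts.append([curr])
        else:
            bursts[-1].append(curr)
    return bursts, pauses


def summarize_session(timestamps_ms, pause_threshold_ms=2000):
    """Reduce one session to a few per-session burst and pause statistics."""
    bursts, pauses = segment_bursts(timestamps_ms, pause_threshold_ms)
    lengths = [len(b) for b in bursts]
    return {
        "burst_count": len(bursts),
        "median_burst_length": median(lengths) if lengths else 0,
        "pause_count": len(pauses),
        "median_pause_ms": median(pauses) if pauses else 0,
    }


if __name__ == "__main__":
    # Hypothetical session: three quick keys, a pause, two keys, a pause, one key.
    print(summarize_session([0, 150, 300, 2800, 2950, 6000]))
```

Per-session summaries of this kind, collected daily, are the sort of series against which a personal baseline and its drift could be estimated.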
How to participate
If you work in cognitive aging, digital biomarkers, psycholinguistics, or AI-cognition interaction and want access to longitudinal process-level writing data, reach out. We are interested in research partnerships where Alice serves as the data collection instrument for studies designed by domain experts.
We can discuss: anonymized data sharing agreements, custom signal extraction for specific research questions, joint publications, and protocol adaptations for specific populations.
Alice is currently at n=1. Multi-user architecture is in development. If you want to use Alice as a daily writing practice and contribute your longitudinal data to the research corpus, we will announce when enrollment opens.
Participation means: writing daily, unmediated, in response to one question. Your entries remain private. Signal computation is invisible. You experience a writing practice. The research happens underneath.
If you represent a university research group, digital health organization, or clinical neuroscience lab interested in adopting the Alice protocol for longitudinal cognitive measurement, we are open to institutional partnerships.
The signal pipeline (Rust engine, PostgreSQL schema, embedding infrastructure) can be deployed independently. The protocol constraints are open. The implementation is the reference.
This research requires sustained development time, infrastructure, and eventually multi-site deployment. If you fund cognitive health research, AI safety research, or longitudinal behavioral science, the instrument gap described in our papers is the problem we are solving.
We are not a startup seeking growth capital. We are building a research instrument, and we are seeking the resources to validate it at the scale required to test its claims.
Current state
Get in touch
If any of the above applies to you, reach out directly.
Or leave your email to be notified when enrollment opens.