Embedding Methods Specification
This document is the methodological artifact for Alice's embedding pipeline. Paper One's methods section cites this file directly. Its purpose is long-horizon reproducibility: given the weights identified here by SHA-256 and the inference environment described below, an external researcher decades from now should be able to retrieve the exact weights and reproduce the exact vectors.
Current Embedding Model
Model: Qwen3-Embedding-0.6B
Source: Qwen/Qwen3-Embedding-0.6B on Hugging Face
Architecture: Qwen3ForCausalLM (causal LM fine-tuned for embeddings)
Parameters: 0.6 billion
Pooling: Last-token
Native output dimension: 1024
Weights Identification
SHA-256 (model.safetensors):
0437e45c94563b09e13cb7a64478fc406947a93cb34a7e05870fc8dcd48e23fd
Hugging Face commit: 97b0c614be4d77ee51c0cef4e5f07c00f9eb65b3
Archival location on disk:
~/.cache/huggingface/hub/models--Qwen--Qwen3-Embedding-0.6B/blobs/0437e45c94563b09e13cb7a64478fc406947a93cb34a7e05870fc8dcd48e23fd
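A minimal sketch of how a reader might check a local copy of the weights against the digest recorded above. It assumes a Node.js runtime; the helper names (`sha256Hex`, `sha256OfFile`, `verifyWeights`) are hypothetical and not part of Alice's codebase.

```typescript
import { createHash } from "node:crypto";
import { createReadStream } from "node:fs";
import { homedir } from "node:os";
import { join } from "node:path";

// Digest recorded in this specification for model.safetensors.
const EXPECTED_SHA256 =
  "0437e45c94563b09e13cb7a64478fc406947a93cb34a7e05870fc8dcd48e23fd";

// Hash an in-memory value; useful for sanity-checking the helper itself.
function sha256Hex(data: string | Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Stream the file so the weights are never held in memory all at once.
function sha256OfFile(path: string): Promise<string> {
  return new Promise((resolve, reject) => {
    const hash = createHash("sha256");
    createReadStream(path)
      .on("error", reject)
      .on("data", (chunk) => hash.update(chunk))
      .on("end", () => resolve(hash.digest("hex")));
  });
}

// Hypothetical entry point: resolves to true when the cached blob matches.
// The path mirrors the archival location listed above, with ~ expanded.
async function verifyWeights(): Promise<boolean> {
  const blobPath = join(
    homedir(),
    ".cache",
    "huggingface",
    "hub",
    "models--Qwen--Qwen3-Embedding-0.6B",
    "blobs",
    EXPECTED_SHA256,
  );
  return (await sha256OfFile(blobPath)) === EXPECTED_SHA256;
}
```

Calling `verifyWeights()` before starting the server would turn a silent weight mismatch into an explicit failure.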
Inference Environment
| Property | Value |
|---|---|
| Serving layer | Hugging Face Text Embeddings Inference (TEI) |
| TEI version | 1.9.3 |
| Build | Source build with cargo install --path router -F candle (CPU-only, no Metal) |
| Binary location | ~/.cargo/bin/text-embeddings-router |
| Backend | candle (CPU) |
| Precision | float32 (--dtype float32) |
| Platform | darwin-arm64 (Apple Silicon) |
| Matryoshka dimension | 512 (client-side truncation + L2 renormalization) |
| Deterministic | Yes (verified: cosine 1.0, max element diff 0 across successive calls) |
Startup command:
~/.cargo/bin/text-embeddings-router --model-id Qwen/Qwen3-Embedding-0.6B --dtype float32 --port 8090
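Once the server is running, a client can request native 1024-dimensional vectors from TEI's `/embed` endpoint. The sketch below assumes the server is reachable on localhost at the port given in the startup command and that the runtime provides a global `fetch` (Node.js 18 or later). The function names are hypothetical and do not describe the actual contents of src/lib/libEmbeddings.ts.

```typescript
// Assumption: TEI listens on localhost; the port comes from the startup command.
const TEI_BASE_URL = "http://localhost:8090";

interface EmbedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the POST /embed request; TEI accepts a JSON body with an "inputs" field.
function buildEmbedRequest(texts: string[], baseUrl = TEI_BASE_URL): EmbedRequest {
  return {
    url: `${baseUrl}/embed`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ inputs: texts }),
  };
}

// Validate that the response is one numeric vector of the expected size per input.
function parseEmbedResponse(payload: unknown, expectedDim = 1024): number[][] {
  if (!Array.isArray(payload)) {
    throw new Error("expected a JSON array of vectors");
  }
  return payload.map((row, i) => {
    const ok =
      Array.isArray(row) &&
      row.length === expectedDim &&
      row.every((x) => typeof x === "number");
    if (!ok) {
      throw new Error(`vector ${i} is not a ${expectedDim}-dim numeric array`);
    }
    return row as number[];
  });
}

// Send the request and return the native (untruncated) vectors.
async function embed(texts: string[]): Promise<number[][]> {
  const req = buildEmbedRequest(texts);
  const res = await fetch(req.url, {
    method: req.method,
    headers: req.headers,
    body: req.body,
  });
  if (!res.ok) {
    throw new Error(`TEI returned HTTP ${res.status}`);
  }
  return parseEmbedResponse(await res.json());
}
```

Checking the dimension at the boundary means a misconfigured server or a different model fails loudly instead of writing malformed vectors downstream.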
Matryoshka Truncation
The model outputs 1024-dimensional vectors natively. Alice truncates them to 512 dimensions, relying on the model's Matryoshka Representation Learning (MRL) training:
- Take the first 512 components of the 1024-dim vector
- L2-renormalize the truncated vector
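This preserves the existing vector(512) schema and HNSW index. The 512-dimensional output is a prefix of the model's native output, and Matryoshka training optimizes such prefixes to serve as embeddings in their own right. Quality loss relative to a 512-native model is therefore expected to be small, though this document does not quantify it.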
Truncation is performed client-side in src/lib/libEmbeddings.ts.
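The two truncation steps described in this section can be sketched as follows. This is an illustration under stated assumptions, not the actual code in src/lib/libEmbeddings.ts: the function name `truncateAndRenormalize` is hypothetical, and the arithmetic is done in JavaScript's native float64.

```typescript
// Keep the first `dim` components of a native vector and rescale to unit L2 norm.
function truncateAndRenormalize(vec: readonly number[], dim = 512): number[] {
  if (!Number.isInteger(dim) || dim <= 0 || dim > vec.length) {
    throw new RangeError(`dim must be an integer in 1..${vec.length}, got ${dim}`);
  }
  // Step 1: take the leading prefix of the vector.
  const head = vec.slice(0, dim);
  // Step 2: L2-renormalize so cosine and dot-product search stay consistent.
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  if (norm === 0) {
    throw new Error("cannot renormalize a zero vector");
  }
  return head.map((x) => x / norm);
}
```

Renormalizing matters because a prefix of a unit vector generally has norm below 1, so storing it unscaled would shrink inner-product scores.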
Database Reference
Model version table: tb_embedding_model_versions
Row: embedding_model_version_id = 1
Active from: 2026-04-23
Prior Model (Invalidated)
Model: voyage-3-lite (VoyageAI API)
Status: All 10 embeddings were soft-invalidated on 2026-04-23 via the invalidated_at timestamp. The rows are preserved in tb_embeddings as an audit trail.
Reason for replacement: As an API-based model, voyage-3-lite fails Alice's constitutional precondition of archivable weights and deterministic inference, and Alice has no control over the vendor's model version lifecycle or deprecation.
Licensing
License: Apache 2.0
Known concern: GitHub issue QwenLM/Qwen3-Embedding#166 raises a question about whether the non-commercial use clause in the MS MARCO training data license affects the Apache 2.0 release of the trained weights. The concern is accepted for Paper One research use and is flagged for review before Phase Two commercial deployment.
Bit-Reproducibility Verification
Verified 2026-04-23. Two successive calls to TEI with identical input text produced:
- Cosine similarity: 1.0
- Max element difference: 0
- Vectors bit-identical at both 1024-dim (native) and 512-dim (Matryoshka)
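With the CPU-only candle backend and FP32, inference is deterministic for the same binary on the same hardware: IEEE 754 arithmetic yields identical results when the same operations run in the same order. Bit-identical output across different builds or platforms is not claimed here.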
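The three criteria used in this verification (cosine similarity, maximum element difference, bit identity) can be computed for any pair of vectors with a helper like the one below. It is a sketch, not the script that produced the result above, and `compareVectors` is a hypothetical name. It compares float64 bit patterns, which is enough for JSON-decoded float32 values because every float32 is exactly representable as a float64.

```typescript
interface VectorComparison {
  cosine: number;
  maxAbsDiff: number;
  bitIdentical: boolean;
}

// Compare two embeddings returned by successive calls with identical input.
function compareVectors(a: readonly number[], b: readonly number[]): VectorComparison {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  let maxAbsDiff = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
    maxAbsDiff = Math.max(maxAbsDiff, Math.abs(a[i] - b[i]));
  }
  // Byte-level comparison also catches differences that subtraction hides,
  // such as +0 versus -0.
  const bytesA = new Uint8Array(Float64Array.from(a).buffer);
  const bytesB = new Uint8Array(Float64Array.from(b).buffer);
  const bitIdentical = bytesA.every((byte, i) => byte === bytesB[i]);
  return { cosine: dot / Math.sqrt(normA * normB), maxAbsDiff, bitIdentical };
}
```

Bit identity is the strictest of the three checks. Because the computed cosine is itself subject to floating-point rounding, the byte comparison is the more reliable signal when checking reproducibility.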