KAIA (Knowledge Architecture for Intelligent Agents)
| Other names | Knowledge Architecture for Intelligent Agents |
|---|---|
| Developer(s) | Tiffney Bare (independent researcher) |
| Initial release | April 2026 |
| Written in | Python |
| Type | Artificial intelligence, Natural language processing |
| Website | https://geometriccontextmodeling.com/ |
KAIA (Knowledge Architecture for Intelligent Agents) is an independent AI research project and experimental architecture introduced by Tiffney Bare in April 2026. It is the first implementation within a field Bare named Geometric Context Modeling, which proposes representing meaning as position in a structured geometric space rather than as a statistical probability distribution over token sequences.
The project completed 27 language-track experiments between April and May 2026, all conducted on a standard consumer CPU with no GPU hardware. KAIA encodes meaning across a 13-dimensional space defined by semantic oppositions; maintains conversational context in a fixed 52-byte state vector regardless of sequence length; and performs semantic reasoning through geometric operations such as vector arithmetic and coordinate inversion.
The research is framed by Bare as a direct response to a structural equity problem in modern AI: the dependence on GPU hardware and large memory footprints that concentrates meaningful participation in the field among those with sufficient economic resources. Five research papers have been published as Zenodo preprints.
Background
Contemporary large language models are built on transformer architectures optimized for GPU hardware. A central constraint of this design is the key-value (KV) cache, a memory structure that grows with every token processed. At 10,000 tokens, a BERT-base model requires approximately 703 MB of memory to hold context; larger models require multiple gigabytes. Because this memory must be read from RAM at every generation step, GPU hardware becomes a practical necessity for any meaningful inference speed.
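The 703 MB figure can be reproduced with a short back-of-the-envelope calculation. The sketch below assumes BERT-base dimensions (12 layers, hidden size 768), 32-bit floats, and one cached key vector and one cached value vector per layer per token. The article does not state these assumptions, and the function name `kv_cache_bytes` is illustrative only.

```python
# Rough KV-cache size for a BERT-base-sized transformer.
# Assumptions (not stated in the article): float32 storage, one key and one
# value vector per layer per token, a single sequence (no batching).

def kv_cache_bytes(tokens: int, layers: int = 12, hidden: int = 768,
                   bytes_per_value: int = 4) -> int:
    """Bytes needed to cache keys and values for `tokens` tokens."""
    per_token = 2 * layers * hidden * bytes_per_value  # 2 = key + value
    return tokens * per_token

size = kv_cache_bytes(10_000)
print(f"{size} bytes = {size / 2**20:.1f} MiB")  # 737280000 bytes = 703.1 MiB
```

Under these assumptions the cache grows by 73,728 bytes for every additional token, which is the linear growth the paragraph above describes.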
Bare's central research question was whether a geometrically grounded alternative could decouple meaningful AI capability from expensive hardware, not by optimizing transformers but by replacing the transformer computational primitive with a different one. KAIA represents that alternative as a working experimental system.
The research operates without institutional affiliation, grant funding, or GPU hardware. All 27 experiments were conducted on a consumer laptop CPU at no operating cost, and the full research, including limitations and negative results, is published openly.
Architecture
Geometric meaning representation
Rather than representing words as high-dimensional statistical vectors learned from token co-occurrence, KAIA assigns each word coordinates across 13 semantically defined axes. Each axis is defined by an antonym pair (for example, hot-cold; fast-slow; love-hate) and each word receives a ternary value of -1, 0, or +1 on that axis. The resulting coordinate is a compact address in a space of 3^13 = 1,594,323 positions.
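One way to picture the address space is to pack a ternary coordinate into a single base-3 integer. The following is a minimal sketch, not KAIA's published encoding: the digit mapping (-1, 0, +1 to 0, 1, 2) and the helper names `to_address` and `from_address` are assumptions made for illustration.

```python
from typing import List, Sequence

N_AXES = 13  # one axis per antonym pair

def to_address(coord: Sequence[int]) -> int:
    """Pack a ternary coordinate (-1, 0, +1 per axis) into a base-3 integer."""
    if len(coord) != N_AXES:
        raise ValueError(f"expected {N_AXES} axis values, got {len(coord)}")
    address = 0
    for value in coord:
        if value not in (-1, 0, 1):
            raise ValueError(f"axis values must be -1, 0 or +1, got {value}")
        address = address * 3 + (value + 1)  # map -1, 0, +1 to digits 0, 1, 2
    return address

def from_address(address: int) -> List[int]:
    """Unpack a base-3 integer back into its ternary coordinate."""
    digits = []
    for _ in range(N_AXES):
        address, digit = divmod(address, 3)
        digits.append(digit - 1)
    return digits[::-1]  # digits were extracted least-significant first

print(3 ** N_AXES)                 # 1594323 possible addresses
print(to_address([1] * N_AXES))    # 1594322, the highest address
```

Every coordinate maps to exactly one integer between 0 and 3^13 - 1, so the count of distinct addresses matches the figure quoted above.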
Semantic relationships emerge directly from geometry. Finding an antonym inverts the coordinate; finding a semantic midpoint averages two positions; analogy resolution uses vector arithmetic. Classic analogies of the form "king minus man plus woman equals queen" are resolved through these geometric operations.
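The three operations can be illustrated on a toy lexicon. The sketch below uses only three axes instead of 13, and both the axis meanings (hot-cold, royal-common, male-female) and the word coordinates are invented for the example; they are not taken from KAIA's vocabulary.

```python
import numpy as np

# Hypothetical 3-axis toy lexicon: (hot-cold, royal-common, male-female).
LEXICON = {
    "hot":   np.array([1, 0, 0]),
    "cold":  np.array([-1, 0, 0]),
    "tepid": np.array([0, 0, 0]),
    "king":  np.array([0, 1, 1]),
    "queen": np.array([0, 1, -1]),
    "man":   np.array([0, 0, 1]),
    "woman": np.array([0, 0, -1]),
}

def nearest(point, exclude=()):
    """Return the lexicon word closest to `point` (Euclidean distance)."""
    candidates = [w for w in LEXICON if w not in exclude]
    return min(candidates,
               key=lambda w: float(np.linalg.norm(LEXICON[w] - point)))

def antonym(word):
    """Antonym by coordinate inversion: negate every axis value."""
    return nearest(-LEXICON[word], exclude=(word,))

def midpoint(a, b):
    """Semantic midpoint: average the two positions."""
    return nearest((LEXICON[a] + LEXICON[b]) / 2, exclude=(a, b))

def analogy(a, b, c):
    """Solve 'a is to b as c is to ?' with vector arithmetic: b - a + c."""
    return nearest(LEXICON[b] - LEXICON[a] + LEXICON[c], exclude=(a, b, c))

print(antonym("hot"))                  # cold
print(midpoint("hot", "cold"))         # tepid
print(analogy("man", "king", "woman")) # queen
```

In a real vocabulary the computed point rarely coincides with a stored word, which is why each operation ends with a nearest-neighbour step.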
The original proposal suggested five dimensions might be sufficient. Experiments showed that 3^5 = 243 addresses cannot adequately separate a real vocabulary (WordNet contains approximately 147,000 entries), and the architecture was revised to 13 dimensions.
Fixed-size context state
Context is tracked through a State Space Model (SSM) that maintains a 52-byte fixed-size state vector, updated as new words are processed. Unlike transformer attention, which requires reading an ever-growing KV cache, the SSM state remains constant in size regardless of conversation length. A comparable transformer requires approximately 14 gigabytes of memory for context. This difference is the primary architectural reason KAIA can run on a standard CPU.
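The constant-memory property can be shown with a very small recurrence. The article does not describe the update rule or how the 52 bytes are laid out; the sketch below assumes 13 float32 values (13 × 4 = 52 bytes) and substitutes a simple exponential moving average for the actual SSM update. The `decay` parameter and the function names are hypothetical.

```python
import numpy as np

N_AXES = 13  # assumed layout: one float32 per axis, 13 * 4 = 52 bytes

def new_state() -> np.ndarray:
    """Create an empty context state."""
    return np.zeros(N_AXES, dtype=np.float32)

def update_state(state: np.ndarray, word_coord, decay: float = 0.9) -> np.ndarray:
    """Blend one word's coordinate into the state (stand-in recurrence)."""
    coord = np.asarray(word_coord, dtype=np.float32)
    blended = decay * state + (1.0 - decay) * coord
    return blended.astype(np.float32)  # keep the state at float32

# Feed 10,000 random ternary coordinates through the state.
rng = np.random.default_rng(0)
state = new_state()
for coord in rng.integers(-1, 2, size=(10_000, N_AXES)):
    state = update_state(state, coord)

print(state.nbytes)  # 52
```

However many words are processed, the state occupies the same 52 bytes, in contrast to a cache that gains an entry per token.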
Dimensional database retrieval
KAIA stores vocabulary items in a dimensional database where each word's semantic coordinates serve as its storage address. Retrieval requires no search; a lookup takes 0.065 microseconds regardless of vocabulary size. Benchmarks demonstrated a 24,354x speed advantage over search-based retrieval at a 5,000-word vocabulary.
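Address-based retrieval can be sketched as a direct index into a preallocated table. This is an illustrative data structure under stated assumptions, not KAIA's implementation: the class name `DimensionalStore`, the base-3 packing in `address`, and the overwrite-on-collision behaviour are all hypothetical, and the article does not say how words sharing a coordinate are handled.

```python
N_AXES = 13
TABLE_SIZE = 3 ** N_AXES  # 1,594,323 slots, one per ternary coordinate

def address(coord) -> int:
    """Pack a ternary coordinate (-1, 0, +1 per axis) into a table index."""
    index = 0
    for value in coord:
        index = index * 3 + (value + 1)
    return index

class DimensionalStore:
    """Table in which a word's coordinate is its storage address."""

    def __init__(self) -> None:
        self._slots = [None] * TABLE_SIZE

    def put(self, coord, word: str) -> None:
        # Sketch limitation: a second word at the same coordinate overwrites.
        self._slots[address(coord)] = word

    def get(self, coord):
        # One index computation and one array access; no scan over entries.
        return self._slots[address(coord)]

store = DimensionalStore()
hot = [1] + [0] * 12
store.put(hot, "hot")
print(store.get(hot))          # hot
print(store.get([0] * 13))     # None (empty slot)
```

The lookup cost depends on the number of axes, not on how many words are stored, which is the property the timing claim above relies on.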
Axis discovery via ICA
Early experiments used human-designed semantic axes. Later experiments applied Independent Component Analysis (ICA) to antonym difference vectors from GloVe embeddings, allowing the geometry of the embedding space to define the axes empirically. ICA-discovered axes proved 42 percent more orthogonal than human-designed axes, with the Gram matrix mean off-diagonal value improving from 0.210 to 0.123.
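The orthogonality metric mentioned here can be computed from the Gram matrix of the unit-normalised axis vectors. The sketch below assumes the metric is the mean absolute off-diagonal entry; the article does not say whether absolute values are taken, and the function name `mean_offdiag_gram` is illustrative.

```python
import numpy as np

def mean_offdiag_gram(axes: np.ndarray) -> float:
    """Mean absolute off-diagonal entry of the Gram matrix of unit axes.

    `axes` has shape (n_axes, embedding_dim). A value of 0 means the axes
    are mutually orthogonal; a value of 1 means they are all collinear.
    """
    unit = axes / np.linalg.norm(axes, axis=1, keepdims=True)
    gram = unit @ unit.T                      # pairwise cosine similarities
    off_diagonal = ~np.eye(len(axes), dtype=bool)
    return float(np.abs(gram[off_diagonal]).mean())

# Perfectly orthogonal axes score 0; random axes score somewhere above it.
print(mean_offdiag_gram(np.eye(13)))
rng = np.random.default_rng(0)
print(mean_offdiag_gram(rng.normal(size=(13, 50))))
```

Lower values indicate that each axis carries information the others do not, which is the sense in which the ICA-derived axes are reported as an improvement.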
The ICA process revealed that the natural geometric structure of distributional embeddings does not align with human conceptual categories. The discovered axes included specificity, causality, directness, and change, rather than the valence, moral, and truth dimensions a human designer would expect.
Experiments and benchmark results
The KAIA language track completed 27 experiments between April and May 2026. Experiments 1 through 9 established mathematical foundations and proved the core architecture. Experiments 10 through 17 built the 13-dimensional production space and validated it on GloVe's 400,000-word vocabulary, establishing a 4.0 percent next-word accuracy baseline on Wikipedia text. Experiments 18 through 27 focused on benchmarking and architecture refinement.
After Experiment 18, the project was reframed away from next-word prediction as the primary metric. A seven-benchmark suite was designed to measure agent-oriented semantic reasoning: the tasks agents actually perform rather than the statistical token generation that transformers are optimized for.
| Benchmark | Score | Method | Training required |
|---|---|---|---|
| B1: Intent classification | 70% | 13D axis encoding + linear classifier | Minimal (80 labeled examples) |
| B2: Context relevance ranking | 80% | Cosine similarity on de-meaned encoding | None |
| B3: Semantic similarity scoring | 80% | Tier separation | None |
| B4: Antonym detection (concrete axes) | 67% top-1 | Pole-swap on axis space | None |
| B5: Analogy completion | 65% top-3 | Vector arithmetic | None |
| B6: Memory retrieval | 70% at 897,000 queries/sec | Dimensional nearest-neighbor | None |
| B7: Agent routing | 85% | 13D axis encoding + linear classifier | Minimal (80 labeled examples) |
| B8b: Semantic midpoint detection | 100% | Geometric convexity | None |
B2, B3, B5, B6, and B8b use no trained components. B8b held at 100 percent across 15 consecutive experiments and multiple embedding types.
Key empirical findings
Physical universals versus cultural contingency
Five axes grounded in physical reality (temperature, speed, luminosity, and related dimensions) encode clean geometric opposition in GloVe's 50-dimensional embedding space. Eight axes corresponding to abstract human conceptual categories (moral, epistemic, social, and related dimensions) do not.
Experiment 20 confirmed this ceiling is not caused by embedding dimensionality; abstract axis top-1 accuracy was 0/14 at both 50 and 300 dimensions across two different 2024 GloVe corpora. Bare interprets this as a fundamental property of how abstract antonym pairs are distributed in text corpora, arising from cultural contingency rather than from any architectural limitation.
Geometric convexity
The semantic space is inhabited at every point between any two poles, regardless of which words define those poles, which corpus was used for training, or which axis design is applied. The midpoint between a love-cluster vector and a hate-cluster vector reliably falls on semantically coherent intermediate concepts. This result held at 100 percent across 15 consecutive experiments.
Polysemy and route discovery
Polysemous words can be resolved into separate geometric routes within the axis space. A route discovery algorithm identifies distinct meaning clusters for a word and separates them by centroid distance. Each route is a defined position in the 13-dimensional space with an auditable axis profile.
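The article does not specify the route discovery algorithm. As a rough sketch of the idea, the code below clusters the axis-space vectors of a word's contexts with a small k-means and reports the resulting centroids; the use of k-means, the farthest-point initialisation, and the name `discover_routes` are assumptions made for illustration.

```python
import numpy as np

def discover_routes(contexts: np.ndarray, k: int = 2, iters: int = 20):
    """Cluster context vectors into k candidate routes (plain k-means).

    `contexts` has shape (n_contexts, n_axes). Returns (centroids, labels).
    """
    contexts = np.asarray(contexts, dtype=float)
    # Deterministic farthest-point initialisation of the centroids.
    centroids = [contexts[0]]
    for _ in range(k - 1):
        gaps = np.min(
            [np.linalg.norm(contexts - c, axis=1) for c in centroids], axis=0
        )
        centroids.append(contexts[int(np.argmax(gaps))])
    centroids = np.array(centroids)

    labels = np.zeros(len(contexts), dtype=int)
    for _ in range(iters):
        # Assign each context to its nearest centroid, then recompute means.
        dists = np.linalg.norm(contexts[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = contexts[labels == j].mean(axis=0)
    return centroids, labels

# Two synthetic senses of one word, placed at opposite corners of a 13-axis space.
rng = np.random.default_rng(0)
sense_a = np.ones(13) + 0.05 * rng.normal(size=(5, 13))
sense_b = -np.ones(13) + 0.05 * rng.normal(size=(5, 13))
centroids, labels = discover_routes(np.vstack([sense_a, sense_b]))
print(labels)
print(np.linalg.norm(centroids[0] - centroids[1]))  # centroid distance
```

Each centroid is itself a point in the axis space, so its per-axis values can be read off directly, which corresponds to the auditable axis profile described above.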
ICA reveals natural axis structure
ICA-discovered axes achieved a Gram matrix mean off-diagonal value of 0.123 versus 0.210 for human-designed axes, a 42 percent improvement in orthogonality. The finding implies that geometric structure should be discovered from data rather than imposed by conceptual intuition.
Sycophancy as a geometric property
Analysis of the axis space revealed a cosine similarity of 0.744 between the moral and truth axes. This entanglement provides a geometric explanation for why large statistical language models trained on distributional text tend toward sycophancy; moral valence and factual accuracy are not separable in the geometry of distributional corpora.
Equity rationale
Bare frames the architectural choices in KAIA as a direct response to a structural access problem in AI. GPU hardware and cloud inference fees create an entry point of hundreds to thousands of dollars for meaningful AI experimentation, stratifying participation in the field by economic circumstance rather than by aptitude.
KAIA is designed so that all experiments, all benchmarks, and the full architecture can run on a standard consumer CPU at no operating cost, encoding at 44,000 to 97,000 tokens per second compared to 200 to 800 tokens per second for a 7-billion-parameter transformer on a high-end GPU. The working context state is 52 bytes fixed, compared to approximately 14 gigabytes for a comparable transformer.
Related and convergent work
After completing the core research independently, Bare identified three separate bodies of work that had converged on similar geometric conclusions from unrelated starting points.
Peter Gardenfors's work on Conceptual Spaces[1] proposed that meaning has geometric structure that does not require massive statistical exposure to learn. A January 2026 neuroscience study found that the human hippocampus organizes word meaning along stable geometric axes, a structure KAIA builds computationally. A 2025 LLM interpretability study found that large language models incidentally internalize geometric structure as a side effect of statistical training at scale.
The broader architectural lineage includes GloVe (Pennington, Socher, and Manning, 2014)[2] for pre-trained word embeddings; State Space Models and Mamba (Gu and Dao, 2024)[3] for fixed-size context tracking; and BERT (Devlin et al., 2019)[4] as a benchmark reference for transformer memory requirements.
Current status and roadmap
As of May 2026, the language track has completed 27 experiments and produced five Zenodo preprints. A mathematical track has four experiments defined and benchmarks ready, aimed at validating the architecture in the domain of mathematical reasoning.
Planned subsequent phases include training a minimal prediction layer on Wikipedia text; expanding vocabulary to the full WordNet 147,000-word set; multi-step geometric reasoning; and an open-source reference implementation. A geometry investigation track is also under way, examining the distortion between the GloVe embedding space and the KAIA axis space and evaluating alternative geometric spaces including hyperbolic geometry.
The research is committed to publication regardless of generation quality outcome, on the stated grounds that the architectural contributions are independent of next-word prediction performance.
References
- Gardenfors, P. (2000). Conceptual Spaces: The Geometry of Thought. MIT Press.
- Pennington, J.; Socher, R.; Manning, C. (2014). "GloVe: Global Vectors for Word Representation." EMNLP 2014.
- Gu, A.; Dao, T. (2024). "Mamba: Linear-Time Sequence Modeling with Selective State Spaces." COLM 2024.
- Devlin, J.; Chang, M.; Lee, K.; Toutanova, K. (2019). "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding." NAACL 2019.
Further reading
- Bare, T. (2026). KAIA Research Series, Papers 1 through 5. Zenodo preprints.
- Lakoff, G.; Johnson, M. (1980). Metaphors We Live By. University of Chicago Press.
