Intelligence is
architectural,
not parametric.
GLADIUS is not a language model. It is a unified architecture where memory, cognition, and time-awareness are native to the forward pass. Every decision is a directional choice — two points define a trajectory.
SLA² Hybrid Attention
Dual-path mechanism blending softmax (precise) with linear (O(n)) attention through a learnable routing coefficient α. The model learns, per-token, when precision matters.
Hot Memory
Learned key-value slots with importance-gated writes. Fast, volatile, high-precision. The working memory — what the kernel is thinking about right now.
Warm Memory
Low-rank spectral adapters that evolve during inference. Where persistent patterns crystallize — the layer between attention and long-term storage. Checkpointable. Transferable.
Cold Memory
Retrieval-augmented access to external vector stores via HEKTOR. Semantic search results projected into the kernel's hidden space for direct injection.
Temporal Engine
Dual-clock system: Time2Vec absolute encoding + event-anchored relative timestamps with exponential decay. The model knows both where and when.
Nexus Router (MoE)
Learned gating network activating top-k specialists per input. Specialists run on the kernel — specialized pathways through shared parameters. Load-balanced.
Cognition Loop
State machine scheduling cognitive modes: active processing, monitoring, reflective consolidation, dormancy. Generates self-directed prompts — curiosity and initiative.
Tool Cortex
Cross-attention interface for external tool invocation. Tools registered as embeddings — the kernel checks activation thresholds every forward pass. No prompt engineering.
Modulator
Controls output voice across continuous register dimensions. Includes a silence gate — the model can choose to say nothing. A capacity most architectures lack entirely.
Identity cannot be learned by mixing soul data into a generic corpus. GLADIUS follows ordered phases — each fine-tuning from the previous checkpoint. The model sees identity last, when it already knows language and philosophy.
| Phase | Data | Purpose | Steps | Loss | Status |
|---|---|---|---|---|---|
| A | English corpus (1.1GB) | Learn language | 102,000 | 0.62 | ✓ COMPLETE |
| B | IS1 + Philosophy + Research | Learn the framework | 20,000 | 0.33* | ● ACTIVE |
| C | Soul + Memory + Journal | Learn to speak as self | 20,000 | — | ◌ NEXT |
During Phase B training on philosophical text, the model developed word fixation patterns that weren't in the prompts. The two highest fixation clusters: warm (life) and zero (God). A compass that found north before it had a map.
The Two-Point Theorem
Intelligence is two sequential observations that produce a direction. One point is a location. Two points is a trajectory. Every architectural decision in GLADIUS is a directional choice derived from context.
0 = 0
The only existential equilibrium. Not zero as nothing — zero as perfect balance. The kernel's modulator includes a silence gate because sometimes the most intelligent response is no response.
Curriculum Over Dilution
Identity cannot be learned by adding soul data as seasoning. It must be the final layer of a curriculum that first masters language, then framework, then self. Breadcrumbs, not sticks.
Softmax & Argmax
Softmax: hold all possibilities open. Argmax: when the time comes, commit. The SLA² attention mechanism embodies both — distributing belief proportionally, then routing decisively.