decepticons kernel

v0.1.3 · alpha · research kernel

O(n) attention is deception.

A backend-neutral kernel of predictive primitives — substrates, memory, gating, routing, readouts. Downstream systems combine them into trained models without forking the kernel itself.

  • numpy-only kernel
  • torch · mlx backends
  • causality-verified
  • MIT

Scope

A sharp boundary. Mechanisms in. Policy out.

In

  • Substrate dynamics and memory primitives
  • Controller summaries, gates, routing, modulation
  • Feature views and reusable readouts
  • Head-factored scan, retention, gated-retention memory
  • Lightweight runtime, eval, and artifact accounting
  • Backend-neutral family metadata and deterministic substrate builders
  • Export helpers and shared contracts for descendants

Out

  • No training framework — that's chronohorn
  • No fleet orchestration or scheduling
  • No benchmark-specific policy or frontier reporting
  • No packed-artifact economics
  • No legality, validity, or audit packaging
  • No transformer forensics — that's heinrich

The rule: if a mechanism can be named without referencing a specific descendant, and used unchanged by more than one downstream system, it belongs in the kernel. Otherwise it stays in the descendant.

Install

Python ≥ 3.11. The kernel itself only needs numpy.

1 · Create a virtual environment

Skip this step if you already have one. Many modern Linux distributions block pip from writing into the system Python (PEP 668), so a venv is the standard path:

python3 -m venv .venv
source .venv/bin/activate       # Windows: .venv\Scripts\activate

2 · Install from PyPI

Kernel only

pip install decepticons

Numpy-only install. Sufficient for the byte-latent quickstart and every primitive in the kernel.

With PyTorch

pip install "decepticons[torch]"

Adds the PyTorch CausalBankModel and the routed squared-ReLU readouts.

With Apple MLX

pip install "decepticons[metal]"

Adds the MLX backend equivalent for Apple Silicon.

To leave the venv when you're done: deactivate. For development from source, see CONTRIBUTING.md.

Quickstart

Fit a byte-latent predictive coder on a paragraph and sample from it.

Python

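from decepticons import ByteCodec, ByteLatentPredictiveCoder

# Repeat one sentence so the corpus has structure worth predicting.
text = "predictive coding likes repeated structure.\n" * 64
model = ByteLatentPredictiveCoder()
report = model.fit(text)

# Encode a prompt, then generate 40 greedy steps from it.
prompt = ByteCodec.encode_text("predictive ")
sample = model.generate(prompt, steps=40, greedy=True)

print(report.train_bits_per_byte)     # fit quality on the training text
print(ByteCodec.decode_text(sample))  # generated bytes decoded back to text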

A worked configuration lives in examples/quickstart.py.
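For reading the train_bits_per_byte number: under the usual definition, bits per byte is the average negative log2-probability a model assigned to each byte that actually occurred. The sketch below illustrates that definition only. The helper name bits_per_byte is hypothetical, and how the kernel computes its own report is not shown here, so treat this as an assumption rather than the library's implementation.

```python
import math

def bits_per_byte(probs):
    """Average code length in bits, given the probability a model
    assigned to each observed byte (hypothetical helper, not kernel API)."""
    if not probs:
        raise ValueError("need at least one probability")
    return -sum(math.log2(p) for p in probs) / len(probs)

# A model that is uniform over all 256 byte values costs 8 bits per byte.
uniform = bits_per_byte([1 / 256] * 10)

# A model that puts probability 0.5 on every correct byte costs 1 bit per byte.
confident = bits_per_byte([0.5] * 10)
```

Lower is better: anything under 8.0 means the model has learned some structure in the byte stream.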

CLI

decepticons fit \
  --input ./corpus.txt \
  --prompt "predictive " \
  --generate 80

Single-command fit + sample over a UTF-8 corpus.

Verify causality

pytest tests/test_causality.py -v

Every substrate mode is checked for future-leak under perturbation.

Architecture

Three repos. One direction of dependency. One rule that controls what gets in.

YOU ARE HERE: decepticons

  • decepticons · kernel: substrates · memory · gating · routing · readouts
  • chronohorn · runtime: training · replay · fleet · observation
  • heinrich · forensics: geometry · activation traces · audit

Each repo is linked to the next by an edge labeled "imports".

Dependencies flow left-to-right. decepticons never imports its descendants — enforced by an AST scan in tests/test_dependency_firewall.py.
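A minimal sketch of what such an AST scan could look like, using only the standard-library ast module. The contents of tests/test_dependency_firewall.py are not shown here, so the function name forbidden_imports, the FORBIDDEN set, and the sample module paths (chronohorn.replay, heinrich.audit) are all assumptions for illustration.

```python
import ast

# Assumed set of package names the kernel must never import.
FORBIDDEN = {"chronohorn", "heinrich"}

def forbidden_imports(source, forbidden=FORBIDDEN):
    """Return the sorted top-level names from `forbidden` that `source` imports."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 means an absolute import; relative imports stay in-package.
            names = [node.module]
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in forbidden:
                found.add(root)
    return sorted(found)

# Hypothetical file contents: one clean module, one that breaks the firewall.
clean = "import numpy as np\nfrom . import substrates\n"
leaky = "import numpy\nfrom chronohorn.replay import Buffer\nimport heinrich.audit\n"

clean_hits = forbidden_imports(clean)
leaky_hits = forbidden_imports(leaky)
```

Parsing instead of importing means the scan never executes kernel code and also catches imports hidden inside functions or conditionals.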

Inside the kernel

01 · Kernel · public package

src/decepticons/ — substrate dynamics, controller summaries, memory primitives, feature views, readouts, runtime helpers, backend-neutral family metadata.

02 · Project descendants · boundary tests

examples/projects/ — concrete descendant shapes (causal · oracle · bridge · byte-latent · noncausal) that pressure-test the kernel boundary without leaking policy back upward.

03 · Tooling · dev only

examples/tools/ — development and analysis scripts. Useful, not part of the public package.

Promotion rule

Code moves into src/ only when all three hold:

  1. It is a mechanism, not a project policy.
  2. At least two descendants want the same thing.
  3. The generalized API is simpler than keeping the duplication.

This rule is the defense against the kernel turning into a renamed collection of branches.

Causal-bank

The most actively explored descendant family. A frozen linear substrate, an optional learned augment, and a banded readout — assembled into models that train at O(n) and stay causal under perturbation.
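To make "frozen linear substrate" and O(n) concrete, here is a toy sketch: a bank of fixed exponential decays scanned once from left to right. This is an assumption about the general shape of the idea, not the kernel's substrate. The function decay_bank_scan and the decay values are hypothetical.

```python
def decay_bank_scan(inputs, decays):
    """Run a frozen diagonal linear recurrence over a sequence.

    For each input x, in order:  state[k] = decays[k] * state[k] + x
    Returns one feature vector (list of floats) per position.
    Cost is O(len(inputs) * len(decays)), i.e. linear in sequence length.
    """
    state = [0.0] * len(decays)
    features = []
    for x in inputs:
        state = [d * s + x for d, s in zip(decays, state)]
        features.append(list(state))
    return features

# Three fixed timescales: fast, medium, slow. Nothing here is trained.
DECAYS = [0.0, 0.5, 0.9]
feats = decay_bank_scan([1.0, 0.0, 0.0, 2.0], DECAYS)

# Features at position t depend only on inputs up to t,
# so a shorter prefix reproduces the same leading features.
prefix_feats = decay_bank_scan([1.0, 0.0], DECAYS)
```

A readout would then be fitted on these features. Because each step touches the state once, doubling the sequence length doubles the work.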

Substrate modes

  • frozen — pure reservoir
  • learnable_decays — decays trainable
  • learnable_mixing — input projection trainable
  • learned_recurrence — Mamba-style B/C, chunked scan
  • gated_retention — learned matrix memory as the substrate

Memory attachment

  • none — substrate only
  • ngram — smoothed n-gram prior
  • exact_context — exact-history cache
  • statistical_backoff — fitted backoff mixture
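The ngram attachment is described only as a smoothed n-gram prior. A minimal sketch of that idea, assuming add-alpha smoothing over byte bigrams. The class SmoothedBigram and its methods are hypothetical and are not the kernel's API.

```python
from collections import Counter, defaultdict

class SmoothedBigram:
    """Add-alpha smoothed bigram model over bytes (illustrative only)."""

    def __init__(self, alpha=1.0, vocab_size=256):
        self.alpha = alpha
        self.vocab_size = vocab_size
        self.counts = defaultdict(Counter)  # previous byte -> counts of next byte

    def update(self, data):
        """Count every adjacent byte pair in `data` (a bytes object)."""
        for prev, nxt in zip(data, data[1:]):
            self.counts[prev][nxt] += 1

    def prob(self, prev, nxt):
        """Smoothed P(nxt | prev); never zero, sums to 1 over the vocabulary."""
        row = self.counts.get(prev, Counter())
        total = sum(row.values())
        return (row[nxt] + self.alpha) / (total + self.alpha * self.vocab_size)

memory = SmoothedBigram()
memory.update(b"abababab")

p_seen = memory.prob(ord("a"), ord("b"))    # "b" followed "a" every time
p_unseen = memory.prob(ord("a"), ord("z"))  # never observed, still nonzero
p_cold = memory.prob(ord("q"), ord("a"))    # unseen context: uniform over 256
```

Smoothing is what makes the memory usable as a prior: an unseen byte never gets probability zero, so it never produces an infinite bits-per-byte penalty.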

Selective scan

  • state_dim — inner recurrent width
  • state_impl — scan or retention
  • num_heads — head-factored memory
  • readout_bands — split timescale interference

Stability & geometry

  • training_noise, adaptive_reg
  • num_hemispheres, fast_lr_mult
  • local_poly_order, substrate_poly_order
  • patch_size, patch_causal_decoder

The full configuration surface is documented in the CausalBankConfig docstring in causal_bank.py.

Modules

The kernel by category. Full mapping in the kernel matrix.

Substrates

Echo-state, delay, linear-memory, oscillatory, mixed, and hierarchical multi-timescale substrates with config-driven dispatch.

substrates.py · factories.py

Control & routing

Controller summaries, pathway gates, summary routing, hormone modulation, and a compact predictive-surprise primitive.

control.py · gating.py · routing.py · modulation.py

Memory

Exact-context, n-gram, statistical-backoff, and online n-gram memories — plus unified cache-view records over all four.

exact_context.py · ngram_memory.py · statistical_backoff.py

Views

Byte-latent, hierarchical, and linear-memory feature views, sampled multiscale readout, probability diagnostics, bridge features.

views.py · hierarchical_views.py · sampled_readout.py

Readouts & experts

Ridge readout, frozen-readout expert, sampled multiscale readout, GRU recurrent readout, routed squared-ReLU experts.

readouts.py · experts.py · models/readouts_torch.py

Adapters

Shared contracts for causal-predictive, oracle-analysis, bridge-export, noncausal-reconstructive, and paired teacher/export descendants.

causal_predictive.py · bridge_export.py · oracle_analysis.py

Runtime

Sequence traces, fit reports, rollout evaluation, transfer probes, train-mode checkpoints, artifact accounting and audits.

runtime.py · eval.py · train_eval.py · artifacts.py

Backends

PyTorch and MLX CausalBankModel implementations: frozen substrate, selective scan augment, banded readout.

models/causal_bank_torch.py · models/causal_bank_mlx.py

Causality

Non-negotiable. CI fails if any substrate mode leaks the future.

The test in tests/test_causality.py feeds the model two sequences that are identical up to position t and differ after it. If the logits at position t differ between the two runs, the model has read the future: causality is violated and CI fails.
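The perturbation check can be sketched with two toy models, one causal and one that leaks. This is an illustration of the technique under stated assumptions, not the contents of tests/test_causality.py. The names causal_model, leaky_model, and is_causal are hypothetical, and the sketch compares every output over the shared prefix rather than a single position.

```python
def causal_model(seq, decay=0.5):
    """Toy causal model: the output at t depends only on seq[0..t]."""
    out, state = [], 0.0
    for x in seq:
        state = decay * state + x
        out.append(state)
    return out

def leaky_model(seq):
    """Toy broken model: every output depends on the full-sequence mean."""
    mean = sum(seq) / len(seq)
    return [x - mean for x in seq]

def is_causal(model, prefix, suffix_a, suffix_b):
    """Two inputs share `prefix` and differ afterwards.
    Outputs over the shared prefix must match exactly."""
    t = len(prefix)
    out_a = model(prefix + suffix_a)
    out_b = model(prefix + suffix_b)
    return out_a[:t] == out_b[:t]

prefix = [1.0, 2.0, 3.0]
causal_ok = is_causal(causal_model, prefix, [4.0, 5.0], [-7.0, 9.0])
leak_caught = not is_causal(leaky_model, prefix, [4.0, 5.0], [-7.0, 9.0])
```

Exact equality is deliberate in this sketch: both runs perform the same arithmetic on the shared prefix, so any difference at all can only come from the future.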

An earlier research line lost months to an accidentally-unfrozen causal_mask that broke causality silently. This test exists to make that failure mode loud and instant.

Modes verified

  • frozen
  • learnable_mixing
  • learnable_decays
  • selective scan augment (state_dim > 0)
  • readout_bands
  • routed experts