v0.1.3 · alpha · research kernel
O(n) attention is deception.
A backend-neutral kernel of predictive primitives — substrates, memory, gating, routing, readouts. Downstream systems combine them into trained models without forking the kernel itself.
- numpy-only kernel
- torch · mlx backends
- causality-verified
- MIT
Scope
A sharp boundary. Mechanisms in. Policy out.
In
- Substrate dynamics and memory primitives
- Controller summaries, gates, routing, modulation
- Feature views and reusable readouts
- Head-factored scan, retention, gated-retention memory
- Lightweight runtime, eval, and artifact accounting
- Backend-neutral family metadata and deterministic substrate builders
- Export helpers and shared contracts for descendants
Out
- No training framework — that's chronohorn
- No fleet orchestration or scheduling
- No benchmark-specific policy or frontier reporting
- No packed-artifact economics
- No legality, validity, or audit packaging
- No transformer forensics — that's heinrich
The rule: if a mechanism can be named without referencing a specific descendant, and used unchanged by more than one downstream system, it belongs in the kernel. Otherwise it stays in the descendant.
Install
Python ≥ 3.11. The kernel itself only needs numpy.
1 · Create a virtual environment
Skip this step if you already have a virtual environment. Modern Linux distributions block pip from writing into the system Python (PEP 668), so a venv is the standard path:
python3 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
2 · Install from PyPI
Kernel only
pip install decepticons
Numpy-only install. Sufficient for the byte-latent quickstart and every primitive in the kernel.
With PyTorch
pip install "decepticons[torch]"
Adds the PyTorch CausalBankModel and the routed squared-ReLU readouts.
With Apple MLX
pip install "decepticons[metal]"
Adds the MLX backend equivalent for Apple Silicon.
To leave the venv when you're done: deactivate.
For development from source, see CONTRIBUTING.md.
Quickstart
Fit a byte-latent predictive coder on a paragraph and sample from it.
Python
from decepticons import ByteCodec, ByteLatentPredictiveCoder
text = "predictive coding likes repeated structure.\n" * 64
model = ByteLatentPredictiveCoder()
report = model.fit(text)
prompt = ByteCodec.encode_text("predictive ")
sample = model.generate(prompt, steps=40, greedy=True)
print(report.train_bits_per_byte)
print(ByteCodec.decode_text(sample))
A worked configuration lives in examples/quickstart.py.
CLI
decepticons fit \
--input ./corpus.txt \
--prompt "predictive " \
--generate 80
Single-command fit + sample over a UTF-8 corpus.
Verify causality
pytest tests/test_causality.py -v
Every substrate mode is checked for future-leak under perturbation.
Architecture
Three repos. One direction of dependency. One rule that controls what gets in.
Dependencies flow left-to-right. decepticons never imports its descendants, and an AST scan in tests/test_dependency_firewall.py enforces that.
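A dependency firewall of this kind can be sketched with the standard-library ast module. This is a minimal illustration, not the contents of tests/test_dependency_firewall.py: the function name forbidden_imports is hypothetical, and the forbidden set assumes the two sibling repos named under Scope (chronohorn, heinrich) are the packages to block.

```python
import ast

# Assumption: the sibling repos named in Scope are the packages to forbid.
FORBIDDEN = {"chronohorn", "heinrich"}


def forbidden_imports(source: str, forbidden=FORBIDDEN) -> list[str]:
    """Return the forbidden top-level packages imported by `source`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import a.b, c` -> ["a.b", "c"]
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Absolute `from a.b import c` -> ["a.b"]; relative imports are skipped.
            names = [node.module]
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in forbidden:
                hits.append(root)
    return hits
```

A real check would presumably walk every file under src/decepticons/ and assert that the returned list is empty for each one.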
Inside the kernel
Kernel
src/decepticons/ — substrate dynamics, controller summaries, memory primitives, feature views, readouts, runtime helpers, backend-neutral family metadata.
Project descendants
examples/projects/ — concrete descendant shapes (causal · oracle · bridge · byte-latent · noncausal) that pressure-test the kernel boundary without leaking policy back upward.
Tooling
examples/tools/ — development and analysis scripts. Useful, not part of the public package.
Promotion rule
Code moves into src/ only when all three hold:
- It is a mechanism, not a project policy.
- At least two descendants want the same thing.
- The generalized API is simpler than keeping the duplication.
This rule is the defense against the kernel turning into a renamed collection of branches.
Causal-bank
The most actively explored descendant family. A frozen linear substrate, an optional learned augment, and a banded readout — assembled into models that train at O(n) and stay causal under perturbation.
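To make "frozen linear substrate" and "O(n)" concrete, here is a minimal numpy sketch of a linear recurrence with fixed decays and a fixed input projection. It is an illustration under stated assumptions, not the kernel's implementation: the function name frozen_linear_substrate, the shapes, and the choice of decays are all hypothetical.

```python
import numpy as np


def frozen_linear_substrate(x, decays, w_in):
    """Run h_t = decays * h_{t-1} + w_in @ x_t over a sequence.

    x: (T, d_in), decays: (d_state,), w_in: (d_state, d_in).
    Returns states of shape (T, d_state). The loop visits each
    position once, so the cost is linear in the sequence length T.
    """
    T = x.shape[0]
    h = np.zeros(decays.shape[0])
    states = np.empty((T, decays.shape[0]))
    for t in range(T):
        # Only h (the past) and x[t] (the present) enter the update.
        h = decays * h + w_in @ x[t]
        states[t] = h
    return states


rng = np.random.default_rng(0)
decays = np.linspace(0.5, 0.99, 8)   # fixed timescales, never trained
w_in = rng.standard_normal((8, 4))   # fixed random input projection
x = rng.standard_normal((32, 4))
states = frozen_linear_substrate(x, decays, w_in)
```

Because each state depends only on earlier states and the current input, perturbing x after some position leaves all earlier rows of states unchanged. A readout trained on top of these states would be the only learned part in this sketch.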
Substrate modes
- frozen: pure reservoir
- learnable_decays: decays trainable
- learnable_mixing: input projection trainable
- learned_recurrence: Mamba-style B/C, chunked scan
- gated_retention: learned matrix memory as the substrate
Memory attachment
- none: substrate only
- ngram: smoothed n-gram prior
- exact_context: exact-history cache
- statistical_backoff: fitted backoff mixture
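As an illustration of what a smoothed n-gram prior can look like, the sketch below builds an add-alpha smoothed bigram table over bytes. It assumes the simplest case (n = 2, additive smoothing); the kernel's ngram memory may use a different order or smoothing scheme, and the name bigram_prior is hypothetical.

```python
import numpy as np


def bigram_prior(data: bytes, alpha: float = 1.0) -> np.ndarray:
    """Return a (256, 256) table of P(next byte | previous byte).

    Counts are smoothed by adding `alpha` to every cell, so unseen
    transitions keep a small non-zero probability.
    """
    counts = np.zeros((256, 256))
    for prev, nxt in zip(data[:-1], data[1:]):
        counts[prev, nxt] += 1
    smoothed = counts + alpha
    # Normalize each row into a probability distribution.
    return smoothed / smoothed.sum(axis=1, keepdims=True)


prior = bigram_prior(b"abababab")
```

Each row of the table is a distribution over the next byte, so it can be mixed with a model's own prediction for the same position.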
Selective scan
- state_dim: inner recurrent width
- state_impl: scan or retention
- num_heads: head-factored memory
- readout_bands: split timescale interference
Stability & geometry
- training_noise, adaptive_reg
- num_hemispheres, fast_lr_mult
- local_poly_order, substrate_poly_order
- patch_size, patch_causal_decoder
The full configuration surface lives on the CausalBankConfig docstring in causal_bank.py.
Modules
The kernel by category. Full mapping in the kernel matrix.
Substrates
Echo-state, delay, linear-memory, oscillatory, mixed, and hierarchical multi-timescale substrates with config-driven dispatch.
substrates.py · factories.py
Control & routing
Controller summaries, pathway gates, summary routing, hormone modulation, and a compact predictive-surprise primitive.
control.py · gating.py · routing.py · modulation.py
Memory
Exact-context, n-gram, statistical-backoff, and online n-gram memories — plus unified cache-view records over all four.
exact_context.py · ngram_memory.py · statistical_backoff.py
Views
Byte-latent, hierarchical, and linear-memory feature views, sampled multiscale readout, probability diagnostics, bridge features.
views.py · hierarchical_views.py · sampled_readout.py
Readouts & experts
Ridge readout, frozen-readout expert, sampled multiscale readout, GRU recurrent readout, routed squared-ReLU experts.
readouts.py · experts.py · models/readouts_torch.py
Adapters
Shared contracts for causal-predictive, oracle-analysis, bridge-export, noncausal-reconstructive, and paired teacher/export descendants.
causal_predictive.py · bridge_export.py · oracle_analysis.py
Runtime
Sequence traces, fit reports, rollout evaluation, transfer probes, train-mode checkpoints, artifact accounting and audits.
runtime.py · eval.py · train_eval.py · artifacts.py
Backends
PyTorch and MLX CausalBankModel implementations: frozen substrate, selective scan augment, banded readout.
models/causal_bank_torch.py · models/causal_bank_mlx.py
Causality
Non-negotiable. CI fails if any substrate mode leaks the future.
The test in tests/test_causality.py feeds the model two sequences that are identical up to position t and differ after t. If the logits at position t differ between the two, causality is violated and CI fails.
An earlier research line lost months to an accidentally-unfrozen causal_mask that broke causality silently. This test exists to make that failure mode loud and instant.
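The perturbation check described above can be sketched in a few lines of numpy. This is not the code in tests/test_causality.py: the two toy models and the helper leaks_future are hypothetical stand-ins, used only to show the shape of the test.

```python
import numpy as np


def causal_model(x):
    """Toy causal model: running mean of the inputs seen so far."""
    return np.cumsum(x) / np.arange(1, len(x) + 1)


def leaky_model(x):
    """Toy non-causal model: every position sees the whole-sequence mean."""
    return x - x.mean()


def leaks_future(model, length=16, t=7, seed=0):
    """Return True if outputs up to position t change when only the future does."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal(length)
    b = a.copy()
    b[t + 1:] = rng.standard_normal(length - t - 1)  # perturb only positions after t
    return not np.allclose(model(a)[: t + 1], model(b)[: t + 1])
```

With these stand-ins, leaks_future(causal_model) is False and leaks_future(leaky_model) is True. A CI test would assert the former for every substrate mode.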
Modes verified
- frozen
- learnable_mixing
- learnable_decays
- selective scan augment (state_dim > 0)
- readout_bands
- routed experts
Documentation
Six docs cover the full kernel surface, from extraction rule to research anchors.
Architecture
Three-layer model, package map, and the promotion rule that controls what gets in.
Kernel matrix
Capability matrix — every kernel area mapped to its module.
Chronohorn boundary
Where the kernel ends and the runtime descendant begins. Mechanism vs policy.
Downstream patterns
Causal, noncausal, oracle, bridge, and byte-latent descendant patterns.
Related work
Research anchors and prior art across predictive coding, reservoir computing, and rate-distortion.
Ecosystem landscape
Where this kernel sits relative to the open ecosystem (March 2026 snapshot).