01 — What rote does
Memoization without the decorators
Picture an analyze.py that reads a CSV, builds some features, fits a
model, and saves a plot. A clean run takes a few minutes. You edit one line in the
plotting code and run the file again. Plain Python re-executes everything from the
top, including the parts that didn't change.
rote watches the first run. For each function call, it decides whether
the work was slow enough to be worth caching and whether the function is safe to
cache. When both are true, the return value is written to disk under
.rote/. On the next run, the cached values come straight from disk; the
only thing that actually executes is the function you edited (and anything downstream
of it).
There are no decorators to add. rote run analyze.py rewrites the AST of
your script in memory so each top-level function gets wrapped automatically, then
runs the rewritten code. The file on disk doesn't change.
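A minimal sketch of the in-memory rewrite described above, using only the standard-library `ast` module. The `memo` wrapper, the `WrapTopLevel` transformer, and `run_rewritten` are hypothetical names for illustration; rote's actual wrapper and rewrite pass are not shown in this page and may differ.

```python
import ast
import functools


def memo(func):
    """Hypothetical stand-in for a caching wrapper: in-memory, keyed on positional args."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    return wrapper


class WrapTopLevel(ast.NodeTransformer):
    """Add @memo to every function defined at module top level."""

    def visit_Module(self, node):
        for stmt in node.body:
            if isinstance(stmt, ast.FunctionDef):
                stmt.decorator_list.insert(0, ast.Name(id="memo", ctx=ast.Load()))
        return node  # no recursion: nested functions are left alone


def run_rewritten(source, filename="<script>"):
    """Parse, rewrite, compile, and execute; the source text itself is never modified."""
    tree = WrapTopLevel().visit(ast.parse(source, filename))
    ast.fix_missing_locations(tree)
    namespace = {"memo": memo}
    exec(compile(tree, filename, "exec"), namespace)
    return namespace


SCRIPT = """
calls = []

def slow_square(x):
    calls.append(x)
    return x * x

results = [slow_square(3), slow_square(3), slow_square(4)]
"""

ns = run_rewritten(SCRIPT)
print(ns["results"], ns["calls"])
```

The second `slow_square(3)` call is answered by the wrapper, so the function body runs only twice even though the script text contains no decorator.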
02 — How it works
What rote does on every run
Each of the widgets further down the page corresponds to one of these four steps.
- 01 Trace
While your script runs, rote uses sys.monitoring (PEP 669) to see every function call. It also uses audit hooks (PEP 578) to log file opens, network access, subprocess starts, and any use of exec or eval. Those events feed the next step.
- 02 Judge
rote memoizes a call only if it passes four checks. The call has to have run for at least a second; no impure I/O can have happened during it; the arguments must hash to the same value at exit as at entry; and the source code (along with everything the function transitively depends on) has to match what was cached. Any single failing check skips the cache write.
- 03 Store
The return value is serialized using whichever library makes sense for its type: PyArrow IPC for DataFrames, numpy.save for ndarrays, safetensors for tensors, msgpack for primitives, cloudpickle for anything that doesn’t fit those buckets. The bytes go into a content-addressed SQLite cache. Files that the function read are content-hashed too, so a backdated touch -r can’t trick the cache into reusing a stale value.
- 04 Replay
On the next run, rote checks the function’s identity along with its arguments and any tracked file inputs. If everything matches, the cached value comes back in microseconds. If anything has changed, the entry is invalidated and the function runs normally. The first run pays the write cost; every run after that reuses the result.
Functions that do impure work (network calls, anything that reads
time.time(), and so on) are never cached. rote logs why each call was
skipped in rote.stats()["invalidation_reasons"], so you can read entries
like calls impure stdlib: time.time and tell whether the cache misses
you're seeing match what you'd expect.
03 — Edit-rerun loop
Re-running a pipeline after one edit
A small pipeline with four stages: parse, aggregate, train, plot. Pick the stage you just edited. Plain Python re-runs all four. rote re-runs only the stages whose source or inputs changed. The numbers come from running each variant in a fresh Python process (the same shape as saving the script and re-typing the python command), where the plain rerun takes 1.83 s and the warm rote rerun takes 0.38 s.
You edited plot. Stages parse, aggregate, train are served from cache. Stage plot recomputes; downstream stages re-run because their inputs changed.
04 — Differences from the paper
Every place rote behaves differently, with the reason
Each row gives the paper's claim, rote's current measurement or behaviour, and one
sentence on what made the change possible. Most rows credit a library or a PEP that
shipped after 2011; that's the bulk of the gap. The full log lives in
site/DISCREPANCIES.md.
| Topic | Paper (2011) | rote (2026) | Why it differs |
|---|---|---|---|
| Edit-rerun speedup paper §4.2 · bench/results/cross_process_pipeline.json | ~10× on real workflows (fresh interpreter each run). | 4.8× cross-process on a paper-shaped pipeline (1.83 s → 0.38 s; joblib 0.19 s). | Roughly half the paper’s factor. Hardware has moved, and rote content-hashes file dependencies on every hit where the paper trusted (size, mtime). The validation costs cycles but closes a stale-result hole. |
| File-dependency identity paper §3.5 · tests/unit/test_file_hash_cache.py::test_size_preserving_mtime_backdated_edit_still_invalidates | Keyed on (size, mtime). | Indexed on (dev, ino); validated against (size, mtime_ns, ctime_ns) in a persistent SQLite table. | A touch -r rewinding mtime after a same-size overwrite would fool the paper’s scheme; ctime_ns moves anyway because the kernel updates it on every inode write and userspace can’t backdate it without root. |
| Source-change detection paper §3.2 · src/rote/identity.py enabled by: libcst (2019) | Coarse source-byte hashing — adding a comment busts the cache. | Canonical-AST hash via libcst (strips comments, docstrings, annotations; De Bruijn-renames bound variables). | libcst didn’t exist when the paper was written. The newer machinery means cosmetic edits no longer invalidate. |
| Serialization format paper Figure 6 · bench/results/serialize_microbench.json enabled by: PyArrow IPC (2016), safetensors (2023) | Pickle variants dominate the warm path (Figure 6). | Type-dispatched: PyArrow IPC → DataFrame, numpy.save → ndarray, safetensors → tensors, msgpack → primitives, cloudpickle as fallback. | PyArrow IPC, numpy’s zero-copy load path, and safetensors all post-date the paper. For DataFrames and ndarrays — the cases that matter to modern research — they beat pickle. For huge homogeneous Python containers pickle still wins; documented openly. |
| Interpreter compatibility paper §1 · pyproject.toml enabled by: PEP 578 audit hooks (2018), PEP 669 sys.monitoring (2023) | Required a CPython 2.6.3 patch — a custom interpreter binary that stopped tracking upstream years ago. | Pure-Python library on stock CPython 3.12+. | PEP 578 (audit hooks) and PEP 669 (sys.monitoring) let user code observe events the 2011 prototype needed an interpreter fork to see. |
| Concurrency tests/correctness/test_concurrency.py | Single-process: no shared-cache IPC story. | Multi-process safe via SQLite WAL + atomic blob rename. 16-process hammer test in tests/correctness/. | A modern research workflow runs notebooks and CLI jobs against the same cache. SQLite only gained WAL mode in version 3.7.0 (2010), and the audit-hook scaffolding to keep file dependencies honest under concurrency post-dates the paper too. |
| Argument-mutation detection src/rote/purity.py · tests/unit/test_purity.py | Not modelled — static analysis assumes pure-looking functions are pure. | Copy-on-call fingerprinting: hash arguments at entry, re-hash at exit; any drift disqualifies the call. | A pure-by-inspection function can still mutate a list argument in place. Modern researchers passing DataFrames around hit this constantly. |
| Coverage of pure long-running calls paper §4.3 · tests/integration/test_realistic_coverage.py | Reported high coverage on the original five-script corpus (fraction of pure calls memoized). | 100% of cold compute eliminated on the warm re-run across corpus/realistic/ (five multi-second scripts, ~26 s → 0 s). | Different denominator — work eliminated vs. pure-call fraction. Flagging the mismatch rather than asserting parity. |
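The Concurrency row above pairs a SQLite index in WAL mode with an atomic blob rename. A minimal sketch of that combination follows, assuming a hypothetical `BlobStore` class, a SHA-256 content address, and a two-column `entries` table; none of these names or choices are taken from rote's source.

```python
import hashlib
import os
import sqlite3
import tempfile


class BlobStore:
    """Hypothetical content-addressed store: blobs on disk, index in SQLite."""

    def __init__(self, root):
        self.blob_dir = os.path.join(root, "blobs")
        os.makedirs(self.blob_dir, exist_ok=True)
        self.db = sqlite3.connect(os.path.join(root, "index.sqlite"))
        # WAL lets readers proceed while another process is writing.
        self.db.execute("PRAGMA journal_mode=WAL")
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS entries "
            "(key TEXT PRIMARY KEY, digest TEXT NOT NULL)"
        )
        self.db.commit()

    def put(self, key, payload):
        digest = hashlib.sha256(payload).hexdigest()
        final = os.path.join(self.blob_dir, digest)
        if not os.path.exists(final):
            # Write to a temp file in the same directory, then rename atomically,
            # so a concurrent reader never sees a half-written blob.
            fd, tmp = tempfile.mkstemp(dir=self.blob_dir)
            with os.fdopen(fd, "wb") as fh:
                fh.write(payload)
            os.replace(tmp, final)
        with self.db:  # commits on success, rolls back on error
            self.db.execute(
                "INSERT OR REPLACE INTO entries VALUES (?, ?)", (key, digest)
            )
        return digest

    def get(self, key):
        row = self.db.execute(
            "SELECT digest FROM entries WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return None
        with open(os.path.join(self.blob_dir, row[0]), "rb") as fh:
            return fh.read()


store = BlobStore(tempfile.mkdtemp())
d1 = store.put("parse:abc", b"result bytes")
d2 = store.put("train:def", b"result bytes")
print(d1 == d2, store.get("parse:abc"), store.get("missing"))
```

Two keys that produce identical bytes share one blob on disk, which is the practical benefit of addressing by content.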
05 — Speedups
Two ways to measure how fast it is
The first reference point is the speedup the original paper reported in 2011. The second is joblib, which is the most common memoization library for Python research scripts today. Both sets of numbers come from bench/results/*.json. The toggle below picks which one to look at first.
| Comparison | Paper (2011) | rote (2026) | Source |
|---|---|---|---|
| Edit-rerun on a multi-stage script, fresh interpreter each run | ~10× | 4.9× (1.75 s → 353.4 ms) | cross_process_pipeline.json · paper §4.2 |
| Same pipeline, one interpreter, LRU pre-warmed | not measured separately | ~48× | paper_pipeline.json |
The cross-process row is the one that lines up with the paper's measurement. Landing at roughly half the paper's reported speedup is about what we'd expect after fifteen years of hardware progress, plus the cost of rote content-hashing every file dependency on every hit. The in-process number is the upper bound once interpreter startup is amortised; it's listed here as a second data point, not the headline.
06 — Serializers (paper Figure 6, updated)
Picking a serializer by what the function returns
The paper compared three pickle variants. rote uses different serializers depending on the return type. PyArrow IPC handles DataFrames, numpy.save handles ndarrays, safetensors handles tensors, msgpack handles primitives, and cloudpickle is the fallback for anything that doesn't fit those buckets. The chart below also shows the workloads where pickle still wins (large homogeneous Python containers), since those are the cases where the dispatch decision matters most.
| Payload | rote serializer | rote | pickle (HIGHEST) | ratio |
|---|---|---|---|---|
| numpy · 1 M float64 | numpy | 0.44 ms | 0.35 ms | 1.26× slower |
| numpy · 3 M float32 | numpy | 0.66 ms | 1.1 ms | 1.71× faster |
| arrow · 1 M-row table | arrow | 2.8 ms | 3.6 ms | 1.31× faster |
| dict · 100K items | msgpack | 47 ms | 11 ms | 4.27× slower |
| list · 1 M ints | msgpack | 362 ms | 11 ms | 32.52× slower |
Source: bench/results/serialize_microbench.json. Min of 5 trials per cell.
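The dispatch-by-return-type idea can be sketched with `functools.singledispatch`. To keep the example standard-library only, `json` stands in for msgpack and plain `pickle` stands in for cloudpickle; the tag names and the `encode`/`decode` functions are assumptions for illustration, and the Arrow, numpy, and safetensors branches are omitted.

```python
import json
import pickle
from functools import singledispatch


@singledispatch
def encode(value):
    # Fallback for any type without a specialised encoder.
    return "pickle", pickle.dumps(value, protocol=pickle.HIGHEST_PROTOCOL)


@encode.register(int)
@encode.register(float)
@encode.register(str)
@encode.register(type(None))
def _encode_primitive(value):
    # json is a stand-in here for a compact primitive format such as msgpack.
    return "json", json.dumps(value).encode("utf-8")


@encode.register(bytes)
def _encode_bytes(value):
    return "raw", value


DECODERS = {
    "pickle": pickle.loads,
    "json": lambda payload: json.loads(payload.decode("utf-8")),
    "raw": bytes,
}


def decode(tag, payload):
    """The tag is stored next to the payload so the right decoder is chosen on replay."""
    return DECODERS[tag](payload)


for sample in (42, "label", None, b"\x00\x01", {1, 2, 3}):
    tag, payload = encode(sample)
    print(type(sample).__name__, "->", tag, decode(tag, payload) == sample)
```

Storing the tag alongside the bytes is what lets the warm path pick a decoder without inspecting the payload.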
07 — Call graph
What happens when you edit one node
Click any node to edit it. rote rehashes the function's canonical AST, sees that the new value doesn't match what was cached, and marks the node as missed. Anything further down the pipeline that depended on its output is now stale too. Anything earlier in the pipeline is unaffected. This is the propagation rule from §3.4 of the paper, drawn live so you can watch it.
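The propagation rule can be sketched as a breadth-first walk over the call graph. The linear four-stage graph below (parse, aggregate, train, plot) is assumed from the pipeline described in section 03; `stale_after_edit` and `plan` are hypothetical helper names.

```python
from collections import deque

# Assumed shape: each stage feeds the next one.
EDGES = {
    "parse": ["aggregate"],
    "aggregate": ["train"],
    "train": ["plot"],
    "plot": [],
}


def stale_after_edit(edges, edited):
    """Return the edited node plus everything reachable downstream of it."""
    stale = {edited}
    queue = deque([edited])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in stale:
                stale.add(child)
                queue.append(child)
    return stale


def plan(edges, edited):
    """Map each node to 'rerun' (stale) or 'cache' (unaffected upstream work)."""
    stale = stale_after_edit(edges, edited)
    return {node: ("rerun" if node in stale else "cache") for node in edges}


print(plan(EDGES, "aggregate"))
print(plan(EDGES, "plot"))
```

Editing `plot`, the last stage, leaves the three earlier stages on the cached path, which matches the edit-rerun example in section 03.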
08 — Purity model
The four checks rote runs before caching anything
The paper has a purity model in §3.3 onward; this is a refresher on it. Each card
shows what the paper required and which specific signal rote actually checks. If any
single signal fails, the cache write is skipped and the reason gets logged in
rote.stats()["invalidation_reasons"].
- 01 §3.3.2 perf guard
Long enough for the cache to be worth it
duration_ns ≥ Config.min_duration_s (default 1 s)
Below the threshold, the cache write costs more than the recomputation would have. This is the perf guard from the paper. rote also tracks the per-call encode time in src/rote/purity.py, so a 1 GB return for a trivial call gets blacklisted even if the body itself ran for long enough.
- 02 §3.3 + §3.3.1
No impure I/O during the call
no audit-hook event for network / exec / subprocess; any file opened in "w" mode is closed before the call returns
PEP 578 audit hooks (2018) classify network access, subprocesses, exec, file appends, and writes that are still open when the call returns. A with open(...) write that closes inside the call is what paper §3.3.1 calls a self-contained write: still pure, and tracked as a write-dependency.
- 03 beyond the paper
Arguments weren’t mutated in place
arg fingerprints at entry == arg fingerprints at exit
rote fingerprints mutable arguments when the call starts and again when it returns. If anything moved, the cache write is skipped. The paper assumed pure-looking functions actually were pure; this catches in-place mutation of lists, dicts, and DataFrames that static analysis would miss.
- 04 §3.4 + §3.5
Source and dependencies still match
blake3(canonical_AST(func) ⊕ transitive_callee_ids ⊕ file_dep_hashes ⊕ global_dep_fingerprints) matches the cached key
The libcst canonical AST means cosmetic edits (a comment, a rename, a formatting change) don’t change the hash; the paper’s source-byte hash would have invalidated all of those. Transitive callee ids cover the §3.4 case. The file_dep_hashes term extends the paper’s §3.5 (size, mtime) with ctime_ns and a stream-hashed content digest, which is how rote catches backdated edits.
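Check 03 (arguments weren't mutated in place) can be sketched as a fingerprint taken before and after the call. Using `pickle` plus SHA-256 as the fingerprint is an assumption for this example, as are the `fingerprint` and `call_and_check` names; rote's own fingerprinting in src/rote/purity.py is not shown here and may work differently.

```python
import hashlib
import pickle


def fingerprint(args, kwargs):
    """Hash a serialized snapshot of the arguments (assumes they are picklable)."""
    blob = pickle.dumps((args, sorted(kwargs.items())), protocol=pickle.HIGHEST_PROTOCOL)
    return hashlib.sha256(blob).hexdigest()


def call_and_check(func, *args, **kwargs):
    """Run func and report whether its arguments look the same afterwards."""
    before = fingerprint(args, kwargs)
    result = func(*args, **kwargs)
    after = fingerprint(args, kwargs)
    return result, before == after


def total(values):
    return sum(values)


def total_and_clear(values):
    # Looks harmless, but empties the caller's list in place.
    result = sum(values)
    values.clear()
    return result


print(call_and_check(total, [1, 2, 3]))
print(call_and_check(total_and_clear, [1, 2, 3]))
```

Both functions return the same value, but only the first leaves its argument untouched, so only the first would be eligible for a cache write under this check.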
09 — Live editor
The hash, live as you type
Edit the function below. Cosmetic edits like adding a comment or renaming a local variable don't change the hash, because the canonicalisation strips them out before hashing. A semantic edit (a literal value, an operator) does change it. The paper hashed raw source bytes (§3.2), which would have invalidated any edit at all; the canonical AST form is what draws the distinction, and that's what libcst gives us.
A short JavaScript canonicalisation runs as soon as the page loads, so the editor responds immediately. Pyodide loads in the background, and once it's ready the same source goes through the real rote.identity.canonical_source function (libcst plus hashlib). Both hashes are displayed; if they disagree, it's the JS approximation that's wrong.
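A reduced version of the cosmetic-versus-semantic distinction can be reproduced with the standard-library `ast` module instead of libcst. This sketch drops comments and formatting (the parser discards them), function docstrings, and parameter and return annotations; it does not rename bound variables, so a local rename would still change the hash here. `canonical_hash` is a hypothetical name, and `hashlib.blake2b` is used only because it ships with Python.

```python
import ast
import hashlib


class _Strip(ast.NodeTransformer):
    """Remove function docstrings and annotations before hashing."""

    def visit_FunctionDef(self, node):
        self.generic_visit(node)
        node.returns = None
        body = node.body
        if (
            body
            and isinstance(body[0], ast.Expr)
            and isinstance(body[0].value, ast.Constant)
            and isinstance(body[0].value.value, str)
        ):
            node.body = body[1:] or [ast.Pass()]
        return node

    def visit_arg(self, node):
        node.annotation = None
        return node


def canonical_hash(source):
    tree = _Strip().visit(ast.parse(source))
    dumped = ast.dump(tree)  # line and column info is omitted by default
    return hashlib.blake2b(dumped.encode("utf-8"), digest_size=8).hexdigest()


PLAIN = '''
def f(x):
    return x + 1
'''

COSMETIC = '''
def f(x: int) -> int:
    """Add one."""
    # a comment
    return (x +
            1)
'''

SEMANTIC = '''
def f(x):
    return x + 2
'''

print(canonical_hash(PLAIN) == canonical_hash(COSMETIC))
print(canonical_hash(PLAIN) == canonical_hash(SEMANTIC))
```

The first comparison holds because every difference between `PLAIN` and `COSMETIC` is stripped before hashing; the second fails because the literal changed.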
08b · the file-dependency adversarial edit
Paper §3.5 keyed file deps on (size, mtime). rote also tracks ctime_ns and a content hash. Toggle the scenarios to see which signal catches each edit.
| signal | value | vs baseline | used by |
|---|---|---|---|
| size | 4096 B | unchanged | paper + rote |
| mtime_ns | 1716080400000000000 | unchanged | paper + rote |
| ctime_ns | 1716080400000000000 | unchanged | rote only |
| content_hash | 7f3a91bd4e2c8f4a | unchanged | rote only |
cache hits · Nothing has changed yet.
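The backdated-edit scenario can be reproduced with `os.stat`, `os.utime`, and a streamed hash. The `file_signals` and `changed_signals` helpers are hypothetical, SHA-256 stands in for whatever digest rote uses, and the example makes no claim about `ctime_ns` because its resolution depends on the platform and filesystem.

```python
import hashlib
import os
import tempfile


def file_signals(path):
    """Collect the four signals from the table above for one file."""
    st = os.stat(path)
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):  # stream in 1 MiB chunks
            digest.update(chunk)
    return {
        "size": st.st_size,
        "mtime_ns": st.st_mtime_ns,
        "ctime_ns": st.st_ctime_ns,
        "content_hash": digest.hexdigest()[:16],
    }


def changed_signals(before, after):
    return sorted(name for name in before if before[name] != after[name])


fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as fh:
    fh.write(b"threshold=0.50\n")
baseline = file_signals(path)

# Adversarial edit: same-size overwrite, then rewind mtime to the old value.
with open(path, "wb") as fh:
    fh.write(b"threshold=0.75\n")
os.utime(path, ns=(os.stat(path).st_atime_ns, baseline["mtime_ns"]))
edited = file_signals(path)
os.unlink(path)

changed = changed_signals(baseline, edited)
print(changed)
```

A check keyed only on (size, mtime) sees no difference between `baseline` and `edited`; the content hash is the signal that moves.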