implement event table subtyping for small events in value log

2025-11-14 12:15:52 +00:00
parent 7169a2158f
commit 29e175efb0
11 changed files with 2275 additions and 43 deletions


@@ -0,0 +1,187 @@
Reiser4 had *several* ideas that were too radical for Linux in the 2000s, but **would make a lot of sense today in a modern CoW (copy-on-write) filesystem**—especially one designed for immutable or content-addressed data.
Below is a distilled list of the Reiser4 concepts that *could* be successfully revived and integrated into a next-generation CoW filesystem, along with why they now make more sense and how they would fit.
---
# ✅ **1. Item/extent subtypes (structured metadata records)**
Reiser4 had “item types” that stored different structures within B-tree leaves (e.g., stat-data items, directory items, tail items).
Most filesystems today use coarse-grained extents and metadata blocks—but structured, typed leaf contents provide clear benefits:
### Why it makes sense today:
* CoW filesystems like **APFS**, **Btrfs**, and **ZFS** already use *typed records* internally (Btrfs, for example, has distinct extent and directory items).
* Typed leaf records allow:
  * Faster parsing
  * Future expansion of features
  * Better layout for small objects
  * Potential content-addressed leaves
A modern CoW filesystem could revive this idea by allowing different **record kinds** within leaf blocks, with stable, versioned formats.
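As an illustration of record kinds with a stable, versioned format, here is a minimal Go sketch. The `Record` layout (kind, version, length, payload), the kind names, and `DecodeLeaf` are all assumptions made for this example, not Reiser4's or any existing filesystem's on-disk format.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// RecordKind tags what a leaf record holds. These names are hypothetical.
type RecordKind uint8

const (
	KindStat RecordKind = iota + 1
	KindDirEntry
	KindInlineData
)

// Record is one typed, versioned entry inside a leaf block.
type Record struct {
	Kind    RecordKind
	Version uint8
	Payload []byte
}

// ErrTruncated is returned when a leaf ends in the middle of a record.
var ErrTruncated = errors.New("leaf: truncated record")

// Encode lays the record out as:
// kind (1 byte) | version (1 byte) | payload length (4 bytes, LE) | payload.
func (r Record) Encode() []byte {
	buf := make([]byte, 6+len(r.Payload))
	buf[0] = byte(r.Kind)
	buf[1] = r.Version
	binary.LittleEndian.PutUint32(buf[2:6], uint32(len(r.Payload)))
	copy(buf[6:], r.Payload)
	return buf
}

// DecodeLeaf walks a leaf block and returns its records in order.
// Records of unknown kinds are returned too, so a reader can skip them.
func DecodeLeaf(leaf []byte) ([]Record, error) {
	var out []Record
	for len(leaf) > 0 {
		if len(leaf) < 6 {
			return nil, ErrTruncated
		}
		n := int(binary.LittleEndian.Uint32(leaf[2:6]))
		if len(leaf)-6 < n {
			return nil, ErrTruncated
		}
		out = append(out, Record{
			Kind:    RecordKind(leaf[0]),
			Version: leaf[1],
			Payload: append([]byte(nil), leaf[6:6+n]...),
		})
		leaf = leaf[6+n:]
	}
	return out, nil
}

func main() {
	stat := Record{Kind: KindStat, Version: 1, Payload: []byte{0xA4, 0x81}}
	data := Record{Kind: KindInlineData, Version: 1, Payload: []byte("hello")}
	leaf := append(stat.Encode(), data.Encode()...)

	recs, err := DecodeLeaf(leaf)
	fmt.Println(len(recs), err)
}
```

Because every record carries its own length, a reader that does not recognise a kind or version can still step over it, which is what makes later format extensions cheap.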
---
# ✅ **2. Fine-grained small-file optimizations—but integrated with CoW**
Reiser4's small-file packing was too complicated for mutable trees, but in a CoW filesystem it fits perfectly:
### In CoW:
* Leaves are immutable once written.
* Small files can be stored **inline** inside a leaf, or as small extents.
* Deduplication is easier due to immutability.
* Crash consistency comes largely for free: a new leaf becomes visible only once the root that references it is committed.
### What makes sense to revive:
* Tail-packing / inline-data for files below a threshold
* Possibly grouping many tiny files into a single CoW extent tree page
* Using a “small-files leaf type” with fixed slots
This aligns closely with APFS's and Btrfs's inline extents, but CoW makes it safe to go further.
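The inline-versus-extent decision can be sketched in Go as follows. The 512-byte `InlineThreshold`, the `FileBody` and `Extent` types, and the append-only `Allocator` are assumptions for illustration; a real filesystem would derive the threshold from its leaf size.

```go
package main

import "fmt"

// InlineThreshold is an assumed cutoff in bytes, not a value from any real filesystem.
const InlineThreshold = 512

// Extent references a run of bytes outside the leaf.
type Extent struct {
	Start  uint64
	Length uint64
}

// FileBody is either inline bytes stored in the leaf or a reference to an extent.
type FileBody struct {
	Inline []byte  // set when the file fits inside the leaf
	Extent *Extent // set otherwise
}

// Allocator hands out space append-only, as a CoW design would:
// existing data is never overwritten in place.
type Allocator struct{ next uint64 }

// Alloc reserves n bytes at the current end of the allocated space.
func (a *Allocator) Alloc(n uint64) Extent {
	e := Extent{Start: a.next, Length: n}
	a.next += n
	return e
}

// Place decides where a file body lives based on its size.
func Place(a *Allocator, data []byte) FileBody {
	if len(data) <= InlineThreshold {
		// Copy so the stored leaf content cannot be mutated by the caller.
		return FileBody{Inline: append([]byte(nil), data...)}
	}
	e := a.Alloc(uint64(len(data)))
	return FileBody{Extent: &e}
}

func main() {
	a := &Allocator{}
	small := Place(a, []byte("tiny config file"))
	large := Place(a, make([]byte, 4096))
	fmt.Println(small.Extent == nil, large.Extent.Length)
}
```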
---
# ✅ **3. Semantic plugins *outside the kernel***
Reiser4's plugin system failed because it tried to put a framework *inside the kernel*.
But moving that logic **outside** (as user-space metadata layers or FUSE-like transforms) is realistic today.
### Possible modern implementation:
* A CoW filesystem exposes stable metadata + data primitives.
* User-space “semantic layers” provide:
  * per-directory views
  * virtual inodes
  * attribute-driven namespace merges
  * versioned or content-addressed overlays
### Why it makes sense:
* User-space is safer and maintainers accept it.
* CoW makes such layers more reliable and more composable.
* Many systems already do this:
  * OSTree
  * Git virtual filesystem
  * container overlayfs
  * CephFS metadata layers
The spirit of Reiser4's semantics *can* live on, just not in-kernel.
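To make the idea of a user-space semantic layer concrete, the following Go sketch builds an attribute-driven virtual namespace on top of a read-only snapshot. `Entry`, `View`, and `GroupByAttr` are hypothetical names; the sketch assumes the filesystem can hand user space a list of paths with their attributes, and it never modifies the underlying data.

```go
package main

import (
	"fmt"
	"sort"
)

// Entry is a hypothetical read-only record exposed by the filesystem.
type Entry struct {
	Path  string
	Attrs map[string]string
}

// View is a user-space semantic layer: it maps a snapshot to a virtual
// namespace (virtual directory name -> member paths).
type View func(snapshot []Entry) map[string][]string

// GroupByAttr builds a view with one virtual directory per distinct value
// of the given attribute. Entries lacking the attribute are left out.
func GroupByAttr(key string) View {
	return func(snapshot []Entry) map[string][]string {
		dirs := make(map[string][]string)
		for _, e := range snapshot {
			if v, ok := e.Attrs[key]; ok {
				dirs[v] = append(dirs[v], e.Path)
			}
		}
		// Sort members so the virtual listing is deterministic.
		for _, paths := range dirs {
			sort.Strings(paths)
		}
		return dirs
	}
}

func main() {
	snapshot := []Entry{
		{Path: "/src/main.go", Attrs: map[string]string{"lang": "go"}},
		{Path: "/docs/notes.md", Attrs: map[string]string{"lang": "markdown"}},
		{Path: "/src/db.go", Attrs: map[string]string{"lang": "go"}},
	}
	byLang := GroupByAttr("lang")
	fmt.Println(byLang(snapshot)["go"])
}
```

Because the view is a pure function of an immutable snapshot, it can be recomputed or cached per snapshot without any coordination with the kernel.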
---
# ✅ **4. Content-addressable objects + trees (Reiser4-like keys)**
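Reiser4 addressed every item in its tree by a key. Those keys were derived from object identity rather than from content, but the same keyed-tree layout maps naturally onto modern content-addressable storage strategies.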
A modern CoW FS could:
* Store leaf blocks by **hash of contents**
* Use stable keyed addressing for trees
* Deduplicate at leaf granularity
* Provide Git/OSTree-style guarantees natively
This is very powerful for immutable or append-only workloads.
### Why it's feasible now:
* Fast hashing hardware
* Widespread use of snapshots, clones, dedupe
* Object-based designs in modern systems (e.g., bcachefs, ZFS)
Reiser4 was ahead of its time here.
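Storing leaves by content hash and deduplicating at leaf granularity can be sketched like this. The in-memory `LeafStore` and the choice of SHA-256 are assumptions for the example; an actual design might pick a different hash and would persist blocks to disk.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// LeafID is the SHA-256 digest of a leaf's contents.
type LeafID [sha256.Size]byte

// LeafStore is a hypothetical in-memory content-addressed block store.
type LeafStore struct {
	blocks map[LeafID][]byte
}

// NewLeafStore returns an empty store.
func NewLeafStore() *LeafStore {
	return &LeafStore{blocks: make(map[LeafID][]byte)}
}

// Put stores a leaf under the hash of its contents. It reports whether the
// leaf was new; identical leaves are stored only once.
func (s *LeafStore) Put(leaf []byte) (LeafID, bool) {
	id := LeafID(sha256.Sum256(leaf))
	if _, ok := s.blocks[id]; ok {
		return id, false
	}
	s.blocks[id] = append([]byte(nil), leaf...)
	return id, true
}

// Get returns the leaf stored under id, if any.
func (s *LeafStore) Get(id LeafID) ([]byte, bool) {
	b, ok := s.blocks[id]
	return b, ok
}

// Len reports how many distinct leaves are stored.
func (s *LeafStore) Len() int { return len(s.blocks) }

func main() {
	s := NewLeafStore()
	id1, fresh1 := s.Put([]byte("same bytes"))
	id2, fresh2 := s.Put([]byte("same bytes"))
	fmt.Println(id1 == id2, fresh1, fresh2, s.Len())
}
```

Since a leaf's address is a function of its bytes, a leaf can never be modified in place, which is exactly the immutability that CoW already assumes.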
---
# ✅ **5. Rich directory structures (hash trees)**
Reiser4's directory semantics were much more flexible, including:
* Extensible directory entries
* Small-directory embedding
* Very fast operations on large directories
Most CoW FSes today use coarse directory structures.
A modern CoW FS could adopt:
* Fixed-format hashed directories for fast lookup
* Optional richer metadata per entry
* Inline storage of tiny directories
Essentially, a more flexible but POSIX-compliant version of Reiser4 directories.
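A fixed-format hashed directory might look like the Go sketch below: entries are kept sorted by name hash so that lookup is a binary search, and insertion returns a new directory instead of mutating the old one. `DirEntry`, `Dir`, and the use of FNV-1a are assumptions for illustration. The sketch does not handle re-inserting an existing name.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// DirEntry is one directory slot. The name is kept so hash collisions
// can be resolved by comparing names.
type DirEntry struct {
	Hash  uint64
	Name  string
	Inode uint64
}

// Dir holds entries sorted by Hash.
type Dir struct{ entries []DirEntry }

// nameHash hashes a file name with FNV-1a (an arbitrary choice here).
func nameHash(name string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(name))
	return h.Sum64()
}

// Insert returns a new Dir containing the entry and leaves d untouched,
// mirroring how a CoW filesystem would write a new leaf.
func (d Dir) Insert(name string, inode uint64) Dir {
	e := DirEntry{Hash: nameHash(name), Name: name, Inode: inode}
	i := sort.Search(len(d.entries), func(i int) bool {
		return d.entries[i].Hash >= e.Hash
	})
	next := make([]DirEntry, 0, len(d.entries)+1)
	next = append(next, d.entries[:i]...)
	next = append(next, e)
	next = append(next, d.entries[i:]...)
	return Dir{entries: next}
}

// Lookup binary-searches for the hash, then scans colliding entries by name.
func (d Dir) Lookup(name string) (uint64, bool) {
	h := nameHash(name)
	i := sort.Search(len(d.entries), func(i int) bool {
		return d.entries[i].Hash >= h
	})
	for ; i < len(d.entries) && d.entries[i].Hash == h; i++ {
		if d.entries[i].Name == name {
			return d.entries[i].Inode, true
		}
	}
	return 0, false
}

func main() {
	var empty Dir
	d := empty.Insert("README.md", 10).Insert("main.go", 11)
	ino, ok := d.Lookup("main.go")
	fmt.Println(ino, ok)
}
```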
---
# ✅ **6. Atomic multi-item updates via “transaction items”**
Reiser4 batched related updates into atomic units (its transaction manager called them “atoms”), an idea that could be simplified into:
* A single CoW commit representing a set of operations
* Versioned writes to multiple trees
This is similar to what APFS and Btrfs do, but can be made more explicit.
### Why it's relevant
Modern workloads (containers, datasets, package managers) rely heavily on atomic snapshots—rich commit semantics at the filesystem layer are a big win.
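A single commit that covers several trees can be sketched as follows. This is a simplified model under stated assumptions: the `Root`, `FS`, and `Commit` names are hypothetical, plain maps stand in for B-trees, and the whole map is copied per commit, whereas a real CoW filesystem would copy only the modified path from leaf to root.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrAbort is a sample error a commit callback can return to abandon its changes.
var ErrAbort = errors.New("commit aborted")

// Root is an immutable snapshot of several trees; maps stand in for B-trees.
type Root struct {
	Gen   uint64
	Files map[string]string
	Dirs  map[string]string
}

// FS publishes one current root at a time.
type FS struct {
	mu   sync.Mutex
	root *Root
}

// NewFS returns a filesystem with an empty generation-0 root.
func NewFS() *FS {
	return &FS{root: &Root{Files: map[string]string{}, Dirs: map[string]string{}}}
}

// Snapshot returns the current root. Callers must treat it as read-only.
func (fs *FS) Snapshot() *Root {
	fs.mu.Lock()
	defer fs.mu.Unlock()
	return fs.root
}

func clone(m map[string]string) map[string]string {
	c := make(map[string]string, len(m))
	for k, v := range m {
		c[k] = v
	}
	return c
}

// Commit applies fn to a private copy of the current root and publishes the
// copy with a single pointer update. If fn fails, nothing becomes visible.
func (fs *FS) Commit(fn func(next *Root) error) error {
	fs.mu.Lock()
	defer fs.mu.Unlock()
	next := &Root{
		Gen:   fs.root.Gen + 1,
		Files: clone(fs.root.Files),
		Dirs:  clone(fs.root.Dirs),
	}
	if err := fn(next); err != nil {
		return err
	}
	fs.root = next
	return nil
}

func main() {
	fs := NewFS()
	err := fs.Commit(func(r *Root) error {
		r.Dirs["/pkg"] = "dir"
		r.Files["/pkg/a"] = "v1"
		return nil
	})
	fmt.Println(err, fs.Snapshot().Gen)
}
```

Readers holding an older `Root` keep seeing a consistent view, which is the same property snapshots rely on.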
---
# 🧪 **7. Advanced multi-key indexing (“dancing tree” ideas, but simplified)**
Reiser4 used a flexible key scheme for ordering items in the unified tree.
While we don't want “dancing trees” again, a **multi-dimensional key tuple** is extremely useful for:
* Querying by filename + offset
* Efficiently supporting both sparse files and directories
* Custom ordering schemes without rebalancing everything
A CoW filesystem can use simple B-trees or B+trees with structured keys.
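A structured key can be as simple as a tuple compared field by field. In the Go sketch below, the `Key` fields (`ObjectID`, `Kind`, `Offset`), the two kind constants, and the sorted slice standing in for a B-tree are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Hypothetical item kinds used as the second key field.
const (
	KindDirEntry uint8 = 1
	KindExtent   uint8 = 2
)

// Key orders items by object first, then kind, then offset, so all items
// of one object sit next to each other in the tree.
type Key struct {
	ObjectID uint64
	Kind     uint8
	Offset   uint64
}

func cmpU64(a, b uint64) int {
	switch {
	case a < b:
		return -1
	case a > b:
		return 1
	}
	return 0
}

// Compare returns -1, 0, or 1 using lexicographic order over the fields.
func (k Key) Compare(o Key) int {
	if c := cmpU64(k.ObjectID, o.ObjectID); c != 0 {
		return c
	}
	if c := cmpU64(uint64(k.Kind), uint64(o.Kind)); c != 0 {
		return c
	}
	return cmpU64(k.Offset, o.Offset)
}

// SortKeys sorts keys in key order; the sorted slice stands in for a B-tree.
func SortKeys(keys []Key) {
	sort.Slice(keys, func(i, j int) bool { return keys[i].Compare(keys[j]) < 0 })
}

// Scan returns the keys of one object and kind in offset order.
// The input must already be sorted with SortKeys.
func Scan(sorted []Key, obj uint64, kind uint8) []Key {
	lo := sort.Search(len(sorted), func(i int) bool {
		return sorted[i].Compare(Key{obj, kind, 0}) >= 0
	})
	hi := lo
	for hi < len(sorted) && sorted[hi].ObjectID == obj && sorted[hi].Kind == kind {
		hi++
	}
	return sorted[lo:hi]
}

func main() {
	keys := []Key{
		{2, KindExtent, 0},
		{1, KindExtent, 4096},
		{1, KindDirEntry, 7},
		{1, KindExtent, 0},
	}
	SortKeys(keys)
	fmt.Println(Scan(keys, 1, KindExtent))
}
```

A sparse file is then just an object whose extent keys have gaps between offsets, and a directory is an object whose entries use the offset field for the name hash.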
---
# 🚫 Reiser4 ideas that *do NOT* make sense even in a CoW filesystem
To be clear, these should **not** be revived:
* In-kernel plugin architecture
* Files-as-directories breaking POSIX
* Dynamic per-file semantic behavior
* Arbitrary user-defined item types in kernel
* Complex item-level transformations at write time
* Dancing-tree hyper-rebalancing policies
These remain too brittle or too unpredictable for a production CoW filesystem.
---
# 🧠 **Summary: What Reiser4 ideas *should* come back in future CoW designs**
### ✔ Good ideas to revive:
1. **Record/item subtypes within leaf nodes** for efficiency
2. **Advanced small-file packing that works with CoW**
3. **Semantic “plugins” in user-space only**
4. **Content-addressable storage for leaves/objects**
5. **Rich, hashed, fast directories**
6. **Explicit atomic multi-tree commit semantics**
7. **Structured multi-field keys inside B-trees**
### ✘ Not worth reviving:
* Kernel plugin system
* Semantics-heavy namespaces
* Dancing trees
* Micro-level tree balancing complexity
* Deeply nested “item-in-item-in-item” structures
* Anything that breaks POSIX semantics
---