Reiser4 had *several* ideas that were too radical for Linux in the 2000s, but **would make a lot of sense today in a modern CoW (copy-on-write) filesystem**, especially one designed for immutable or content-addressed data.

Below is a distilled list of the Reiser4 concepts that *could* be revived and integrated into a next-generation CoW filesystem, along with why they now make more sense and how they would fit.

---

# ✅ **1. Item/extent subtypes (structured metadata records)**

Reiser4 had “item types” that stored different structures within B-tree leaves (e.g., stat-data items, directory items, tail items).
Most filesystems today use coarse-grained extents and metadata blocks, but structured, typed leaf contents provide clear benefits:

### Why it makes sense today:

* CoW filesystems like **APFS**, **Btrfs**, and **ZFS** already use *typed records* internally (for example, Btrfs extent items and directory items).
* Typed leaf records allow:

  * Faster parsing, since a type tag tells the reader which layout to expect
  * Future expansion of features
  * Better layout for small objects
  * Potential content-addressed leaves

A modern CoW filesystem could revive this idea by allowing different **record kinds** within leaf blocks, with stable, versioned formats.
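
To make "record kinds with stable, versioned formats" concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the kind tags, the 4-byte header layout, and the function names are hypothetical and do not describe Reiser4's or any real filesystem's on-disk format.

```python
import struct
from dataclasses import dataclass

# Hypothetical record kinds; the numeric tags are illustrative only.
KIND_STAT = 1
KIND_DIRENT = 2
KIND_INLINE_DATA = 3

# Assumed header: kind (u8), format version (u8), payload length (u16), little-endian.
HEADER = struct.Struct("<BBH")


@dataclass(frozen=True)
class Record:
    kind: int
    version: int
    payload: bytes


def pack_leaf(records):
    """Serialize records back to back, each prefixed by its typed header."""
    out = bytearray()
    for r in records:
        out += HEADER.pack(r.kind, r.version, len(r.payload))
        out += r.payload
    return bytes(out)


def unpack_leaf(buf):
    """Walk a leaf buffer and return every record, including unknown kinds."""
    records = []
    pos = 0
    while pos < len(buf):
        kind, version, length = HEADER.unpack_from(buf, pos)
        pos += HEADER.size
        records.append(Record(kind, version, buf[pos:pos + length]))
        pos += length
    return records


leaf = pack_leaf([
    Record(KIND_STAT, 1, struct.pack("<IQ", 0o644, 5)),  # mode, size
    Record(KIND_DIRENT, 1, b"hello.txt"),
    Record(KIND_INLINE_DATA, 1, b"hello"),
])
decoded = unpack_leaf(leaf)
```

Because each header carries its own length, a reader can step over a kind or version it does not understand, which is what leaves room for future expansion.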

---

# ✅ **2. Fine-grained small-file optimizations, but integrated with CoW**

Reiser4’s small-file packing was complicated to maintain in a tree that is updated in place, but it fits a CoW filesystem far more naturally:

### In CoW:

* Leaves are immutable once written.
* Small files can be stored **inline** inside a leaf, or as small extents.
* Deduplication is easier because an immutable block cannot change after it has been hashed.
* Crash consistency comes from the atomic commit: a packed leaf is either fully visible or not visible at all.

### What makes sense to revive:

* Tail-packing / inline data for files below a size threshold
* Possibly grouping many tiny files into a single CoW extent-tree page
* Using a “small-files leaf type” with fixed slots

This aligns closely with the inline-data features of existing CoW filesystems, such as Btrfs’s inline extents, but could go further, safely, because of CoW.
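
As a rough illustration of the "small-files leaf type with fixed slots" idea, the sketch below packs tiny files into an immutable leaf and returns a new leaf on every change. The threshold, slot size, slot count, and all names are assumptions chosen for the example, not values from any real filesystem.

```python
from dataclasses import dataclass

INLINE_THRESHOLD = 256  # assumed cutoff in bytes for inline storage
SLOT_SIZE = 256         # assumed fixed slot size in bytes
SLOTS_PER_LEAF = 16     # assumed number of slots per small-files leaf


@dataclass(frozen=True)
class SmallFilesLeaf:
    """Immutable leaf holding up to SLOTS_PER_LEAF tiny files."""
    slots: tuple = ()  # tuple of (name, data) pairs

    def with_file(self, name, data):
        """Return a NEW leaf containing the file; the old leaf is untouched."""
        if len(data) > SLOT_SIZE:
            raise ValueError("data does not fit in one slot")
        if len(self.slots) >= SLOTS_PER_LEAF:
            raise ValueError("leaf is full")
        return SmallFilesLeaf(self.slots + ((name, data),))


def place(data):
    """Decide whether a file body is stored inline or as a separate extent."""
    return "inline" if len(data) <= INLINE_THRESHOLD else "extent"


old = SmallFilesLeaf()
new = old.with_file("a.txt", b"tiny").with_file("b.txt", b"also tiny")
```

The old leaf stays valid after the update, so a snapshot that still references it sees exactly the contents it had when the snapshot was taken.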

---

# ✅ **3. Semantic plugins *outside the kernel***

Reiser4’s plugin system met resistance largely because it tried to put a general-purpose framework *inside the kernel*.
But moving that logic **outside** (as user-space metadata layers or FUSE-like transforms) is realistic today.

### Possible modern implementation:

* A CoW filesystem exposes stable metadata + data primitives.
* User-space “semantic layers” do:

  * per-directory views
  * virtual inodes
  * attribute-driven namespace merges
  * versioned or content-addressed overlays

### Why it makes sense:

* User-space code is safer (a bug crashes a process, not the kernel), and kernel maintainers are far more willing to accept it.
* CoW makes such layers more reliable and more composable, because each layer can build on an immutable snapshot.
* Many systems already layer semantics above a base filesystem:

  * OSTree
  * Git Virtual File System (VFS for Git)
  * overlayfs as used by containers (in-kernel, but stacked on top of the base filesystem)
  * CephFS metadata layers

The spirit of Reiser4’s semantics CAN live on, just not in-kernel.
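
A user-space semantic layer of the "attribute-driven namespace" kind can be sketched in a few lines. This is only an illustration: the `OBJECTS` table stands in for whatever metadata primitives the filesystem would expose, and `build_view` and `resolve` are hypothetical names, not an existing API.

```python
from collections import defaultdict

# Hypothetical metadata primitive: object id -> attributes exposed by the filesystem.
OBJECTS = {
    101: {"name": "report.pdf", "project": "alpha", "year": "2023"},
    102: {"name": "notes.txt", "project": "alpha", "year": "2024"},
    103: {"name": "budget.xls", "project": "beta", "year": "2024"},
}


def build_view(objects, attribute):
    """Group objects into virtual directories keyed by one attribute value."""
    view = defaultdict(dict)
    for oid, attrs in objects.items():
        if attribute in attrs:
            view[attrs[attribute]][attrs["name"]] = oid
    # Sort directories and entries so listings are deterministic.
    return {d: dict(sorted(entries.items())) for d, entries in sorted(view.items())}


def resolve(view, path):
    """Resolve 'directory/name' in a virtual view to an object id, or None."""
    directory, _, name = path.partition("/")
    return view.get(directory, {}).get(name)


by_project = build_view(OBJECTS, "project")
by_year = build_view(OBJECTS, "year")
```

The same objects appear under two different virtual namespaces without the kernel knowing anything about either view, which is the point of keeping this logic out of the filesystem proper.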

---

# ✅ **4. Content-addressable objects + trees (Reiser4-like keys)**

Reiser4 stored “keyed items” in a single tree. Its keys were not content hashes, but the uniform key-to-item lookup maps closely to modern content-addressable storage strategies.

A modern CoW FS could:

* Store leaf blocks by **hash of contents**
* Use stable keyed addressing for trees
* Deduplicate at leaf granularity
* Provide Git/OSTree-style integrity and deduplication guarantees natively

This is very powerful for immutable or append-only workloads.

### Why it's feasible now:

* Hardware-accelerated hashing (for example, SHA instruction-set extensions on current CPUs)
* Widespread use of snapshots, clones, dedupe
* Key-value and object-based designs in modern systems (e.g., bcachefs, ZFS)

Reiser4 was ahead of its time here.
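
The "store leaf blocks by hash of contents" idea can be illustrated with a toy in-memory store. This is a sketch under stated assumptions: SHA-256 is just one possible hash, and `ContentStore` and `put_tree` are hypothetical names with a deliberately simplified one-level tree.

```python
import hashlib


class ContentStore:
    """Toy content-addressed block store: address = SHA-256 of the contents."""

    def __init__(self):
        self.blocks = {}

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        # Identical content maps to the same key, so it is stored only once.
        self.blocks.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        data = self.blocks[digest]
        # Verify on read: the address doubles as a checksum.
        if hashlib.sha256(data).hexdigest() != digest:
            raise IOError("corrupt block")
        return data


def put_tree(store, leaves):
    """Store leaves, then one interior node listing their hashes; return the root hash."""
    child_hashes = [store.put(leaf) for leaf in leaves]
    node = "\n".join(child_hashes).encode()
    return store.put(node)


store = ContentStore()
root_v1 = put_tree(store, [b"leaf A", b"leaf B", b"leaf A"])
root_v2 = put_tree(store, [b"leaf A", b"leaf B changed", b"leaf A"])
```

The second version only adds the changed leaf and a new interior node; the unchanged leaf is shared between both roots, which is deduplication at leaf granularity.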

---

# ✅ **5. Rich directory structures (hash trees)**

Reiser4’s directory handling was more flexible than that of most contemporary filesystems, including:

* Extensible directory entries
* Small-directory embedding
* Very fast operations on large directories

Directory formats in most CoW FSes today are comparatively rigid.

A modern CoW FS could adopt:

* Fixed-format hashed directories for fast lookup
* Optional richer metadata per entry
* Inline storage of tiny directories

Essentially, a more flexible but POSIX-compliant version of Reiser4 directories.
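
A hashed directory with optional per-entry metadata might look roughly like the sketch below. The assumptions are loud ones: CRC32 is only a stand-in hash, the sorted lists stand in for B-tree leaves, and `HashedDirectory` is a hypothetical name rather than Reiser4's actual directory format.

```python
import bisect
import zlib


def name_hash(name: str) -> int:
    # CRC32 is a stand-in; a real design would choose its own hash function.
    return zlib.crc32(name.encode("utf-8"))


class HashedDirectory:
    """Entries kept sorted by (hash, name), as they would be inside a B-tree."""

    def __init__(self):
        self.keys = []    # sorted list of (hash, name)
        self.values = []  # parallel list of (inode, extra_metadata)

    def add(self, name, inode, extra=None):
        key = (name_hash(name), name)
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            raise FileExistsError(name)
        self.keys.insert(i, key)
        self.values.insert(i, (inode, extra or {}))

    def lookup(self, name):
        # Including the name in the key disambiguates hash collisions.
        key = (name_hash(name), name)
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.values[i]
        return None


d = HashedDirectory()
d.add("a.txt", 10)
d.add("b.txt", 11, {"mime": "text/plain"})
```

Lookup is a binary search on the hash, and the `extra` dictionary shows where richer per-entry metadata could live without changing the lookup path.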

---

# ✅ **6. Atomic multi-item updates via “transaction items”**

Reiser4 had advanced concepts for batching updates atomically, exposed via its plugin model, which could be simplified into:

* A single CoW commit representing a set of operations
* Versioned writes to multiple trees

This is similar to what APFS and Btrfs do, but can be made more explicit.

### Why it’s relevant

Modern workloads (containers, datasets, package managers) rely heavily on atomic snapshots, so rich commit semantics at the filesystem layer are a big win.
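
One way to picture "a single CoW commit representing a set of operations" is the sketch below, where several trees are updated together and become visible through one reference swap. `CowStore` and its methods are hypothetical, and the in-memory swap merely stands in for an atomic on-disk root update.

```python
from types import MappingProxyType


class CowStore:
    """Holds named trees; readers only ever see fully committed root sets."""

    def __init__(self):
        self._roots = MappingProxyType({})  # committed, read-only root set
        self.generation = 0

    def snapshot(self):
        return self._roots

    def commit(self, updates):
        """Apply updates to several trees at once, or to none of them.

        updates maps tree name -> function(old_tree_dict) -> new_tree_dict.
        """
        staged = dict(self._roots)
        for tree, fn in updates.items():
            old = staged.get(tree, MappingProxyType({}))
            staged[tree] = MappingProxyType(dict(fn(dict(old))))
        # A single reference swap stands in for the atomic root update.
        self._roots = MappingProxyType(staged)
        self.generation += 1


store = CowStore()
store.commit({
    "inodes": lambda t: {**t, 7: {"size": 5}},
    "dirents": lambda t: {**t, "hello.txt": 7},
})
before = store.snapshot()


def failing(tree):
    raise RuntimeError("simulated failure mid-commit")


try:
    store.commit({"inodes": lambda t: {**t, 8: {"size": 1}}, "dirents": failing})
except RuntimeError:
    pass
```

The failed commit leaves the previous root set in place, so readers never observe an inode without its matching directory entry.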

---

# 🧪 **7. Advanced multi-key indexing (“dancing tree” ideas, but simplified)**

Reiser4 used a flexible key scheme for ordering items in the unified tree.
While we don’t want "dancing trees" again, a **multi-dimensional key tuple** is extremely useful for:

* Looking up file data by object plus offset, or directory entries by directory plus name hash
* Efficiently supporting both sparse files and directories
* Custom ordering schemes without rebalancing everything

A CoW filesystem can use simple B-trees or B+trees with structured keys, much as Btrfs orders its items by (object ID, item type, offset).
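
Structured keys can be demonstrated with plain tuples, since tuple comparison already gives field-by-field ordering. In this sketch a sorted list stands in for the B+tree, and the item-type constants and the `KeyedTree` class are hypothetical names chosen for illustration.

```python
import bisect

# Hypothetical item types; their order controls how items cluster per object.
TYPE_STAT, TYPE_DIRENT, TYPE_EXTENT = 1, 2, 3


class KeyedTree:
    """Sorted list standing in for a B+tree; keys are (object_id, type, offset)."""

    def __init__(self):
        self.keys = []
        self.vals = []

    def insert(self, key, value):
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            self.vals[i] = value
        else:
            self.keys.insert(i, key)
            self.vals.insert(i, value)

    def scan(self, lo, hi):
        """Return (key, value) pairs with lo <= key < hi."""
        i = bisect.bisect_left(self.keys, lo)
        j = bisect.bisect_left(self.keys, hi)
        return list(zip(self.keys[i:j], self.vals[i:j]))


tree = KeyedTree()
tree.insert((5, TYPE_STAT, 0), "stat for object 5")
tree.insert((5, TYPE_EXTENT, 0), "blocks at offset 0")
tree.insert((5, TYPE_EXTENT, 1 << 20), "blocks at 1 MiB, after a sparse gap")
tree.insert((6, TYPE_STAT, 0), "stat for object 6")

extents_of_5 = tree.scan((5, TYPE_EXTENT, 0), (5, TYPE_EXTENT + 1, 0))
everything_of_5 = tree.scan((5, 0, 0), (6, 0, 0))
```

A sparse file simply has no key for the missing range, and everything belonging to one object sits in one contiguous key range, so both cases fall out of ordinary range scans.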

---

# 🚫 Reiser4 ideas that *do NOT* make sense even in a CoW filesystem

To be clear, these should **not** be revived:

* In-kernel plugin architecture
* Files-as-directories breaking POSIX
* Dynamic per-file semantic behavior
* Arbitrary user-defined item types in kernel
* Complex item-level transformations at write time
* Dancing-tree hyper-rebalancing policies

These remain too brittle or too unpredictable for a production CoW filesystem.

---

# 🧠 **Summary: What Reiser4 ideas *should* come back in future CoW designs**

### ✔ Good ideas to revive:

1. **Record/item subtypes within leaf nodes** for efficiency
2. **Advanced small-file packing that works with CoW**
3. **Semantic “plugins” in user-space only**
4. **Content-addressable storage for leaves/objects**
5. **Rich, hashed, fast directories**
6. **Explicit atomic multi-tree commit semantics**
7. **Structured multi-field keys inside B-trees**

### ✘ Not worth reviving:

* Kernel plugin system
* Semantics-heavy namespaces
* Dancing trees
* Micro-level tree balancing complexity
* Deeply nested “item-in-item-in-item” structures
* Anything that breaks POSIX semantics

---

If you want, I can sketch out a **concept design** for a hypothetical “Reiser5-like CoW filesystem” that incorporates the good parts but avoids the mistakes: something that could realistically compete with APFS, ZFS, and bcachefs today.