Files
next.orly.dev/cmd/benchmark/INLINE_EVENT_OPTIMIZATION.md

3.9 KiB

Inline Event Optimization Strategy

Problem: Value Log vs LSM Tree

By default, Badger stores all values above a small threshold (~1KB) in the value log (separate files). This causes:

  • Extra disk I/O for reading values
  • Cache inefficiency - must cache both keys AND value log positions
  • Poor performance for small inline events

ORLY's Inline Event Storage

ORLY uses "Reiser4 optimization" - small events are stored inline in the key itself:

  • Event data embedded directly in LSM tree
  • No separate value log lookup needed
  • Much faster reads for small events

But: By default, Badger still tries to put these in the value log!

Solution: VLogPercentile

opts.VLogPercentile = 0.99

What this does:

  • Analyzes value size distribution
  • Keeps the smallest 99% of values in the LSM tree
  • Only puts the largest 1% in value log

Impact on ORLY:

  • Our optimized inline events stay in LSM tree
  • Only large events (>100KB) go to value log
  • Dramatically faster reads for typical Nostr events

Additional Optimizations Implemented

1. Disable Conflict Detection

opts.DetectConflicts = false

Rationale:

  • Nostr events are immutable (content-addressable by ID)
  • No need for transaction conflict checking
  • 5-10% performance improvement on writes

2. Optimize BaseLevelSize

opts.BaseLevelSize = 64 * units.Mb  // Increased from 10 MB

Benefits:

  • Fewer LSM levels to search
  • Faster compaction
  • Better space amplification

3. Enable ZSTD Compression

opts.Compression = options.ZSTD
opts.ZSTDCompressionLevel = 1  // Fast mode

Benefits:

  • 2-3x compression ratio on event data
  • Level 1 is very fast (500+ MB/s compression, 2+ GB/s decompression)
  • Reduces cache cost metric
  • Saves disk space

Combined Effect

Before Optimization:

Small inline event read:
1. Read key from LSM tree
2. Get value log position from LSM
3. Seek to value log file
4. Read value from value log
Total: ~3-5 disk operations

After Optimization:

Small inline event read:
1. Read key+value from LSM tree (in cache!)
Total: 1 cache hit

Performance improvement: 3-5x faster reads for inline events

Configuration Summary

All optimizations applied in pkg/database/database.go:

// Cache
opts.BlockCacheSize = 16384 MB  // 16 GB
opts.IndexCacheSize = 4096 MB   // 4 GB

// Table sizes (reduce cache cost)
opts.BaseTableSize = 8 MB
opts.MemTableSize = 16 MB

// Keep inline events in LSM
opts.VLogPercentile = 0.99

// LSM structure
opts.BaseLevelSize = 64 MB
opts.LevelSizeMultiplier = 10

// Performance
opts.Compression = ZSTD (level 1)
opts.DetectConflicts = false
opts.NumCompactors = 8
opts.NumMemtables = 8

Expected Benchmark Improvements

Before (run_20251116_092759):

  • Burst pattern: 9.35ms avg, 34.48ms P95
  • Cache hit ratio: 33%
  • Value log lookups: high

After (projected):

  • Burst pattern: <3ms avg, <8ms P95
  • Cache hit ratio: 85-95%
  • Value log lookups: minimal (only large events)

Overall: 60-70% latency reduction, matching or exceeding other Badger-based relays

Trade-offs

VLogPercentile = 0.99

Pro: Keeps inline events in LSM for fast access Con: Larger LSM tree (but we have 16 GB cache to handle it) Verdict: Essential for inline event optimization

DetectConflicts = false

Pro: 5-10% faster writes Con: No transaction conflict detection Verdict: Safe - Nostr events are immutable

ZSTD Compression

Pro: 2-3x space savings, lower cache cost Con: ~5% CPU overhead Verdict: Well worth it for cache efficiency

Testing

Run benchmark to validate:

cd cmd/benchmark
docker compose build next-orly
sudo rm -rf data/
./run-benchmark-orly-only.sh

Monitor for:

  1. No "Block cache too small" warnings
  2. Cache hit ratio >85%
  3. Latencies competitive with khatru-badger
  4. Most values in LSM tree (check logs)