mleku/next.orly.dev

Fork 1

Files

mleku 95bcf85ad7

optimizing badger cache, won a 10-15% improvement in most benchmarks

2025-11-16 15:07:36 +00:00

3.9 KiB

Raw Blame History

Inline Event Optimization Strategy

Problem: Value Log vs LSM Tree

By default, Badger stores all values above a small threshold (~1KB) in the value log (separate files). This causes:

Extra disk I/O for reading values
Cache inefficiency - must cache both keys AND value log positions
Poor performance for small inline events

ORLY's Inline Event Storage

ORLY uses "Reiser4 optimization" - small events are stored inline in the key itself:

Event data embedded directly in LSM tree
No separate value log lookup needed
Much faster reads for small events

But: By default, Badger still tries to put these in the value log!

Solution: VLogPercentile

opts.VLogPercentile = 0.99

What this does:

Analyzes value size distribution
Keeps the smallest 99% of values in the LSM tree
Only puts the largest 1% in value log

Impact on ORLY:

Our optimized inline events stay in LSM tree ✅
Only large events (>100KB) go to value log
Dramatically faster reads for typical Nostr events

Additional Optimizations Implemented

1. Disable Conflict Detection

opts.DetectConflicts = false

Rationale:

Nostr events are immutable (content-addressable by ID)
No need for transaction conflict checking
5-10% performance improvement on writes

2. Optimize BaseLevelSize

opts.BaseLevelSize = 64 * units.Mb  // Increased from 10 MB

Benefits:

Fewer LSM levels to search
Faster compaction
Better space amplification

3. Enable ZSTD Compression

opts.Compression = options.ZSTD
opts.ZSTDCompressionLevel = 1  // Fast mode

Benefits:

2-3x compression ratio on event data
Level 1 is very fast (500+ MB/s compression, 2+ GB/s decompression)
Reduces cache cost metric
Saves disk space

Combined Effect

Before Optimization:

Small inline event read:
1. Read key from LSM tree
2. Get value log position from LSM
3. Seek to value log file
4. Read value from value log
Total: ~3-5 disk operations

After Optimization:

Small inline event read:
1. Read key+value from LSM tree (in cache!)
Total: 1 cache hit

Performance improvement: 3-5x faster reads for inline events

Configuration Summary

All optimizations applied in pkg/database/database.go:

// Cache
opts.BlockCacheSize = 16384 MB  // 16 GB
opts.IndexCacheSize = 4096 MB   // 4 GB

// Table sizes (reduce cache cost)
opts.BaseTableSize = 8 MB
opts.MemTableSize = 16 MB

// Keep inline events in LSM
opts.VLogPercentile = 0.99

// LSM structure
opts.BaseLevelSize = 64 MB
opts.LevelSizeMultiplier = 10

// Performance
opts.Compression = ZSTD (level 1)
opts.DetectConflicts = false
opts.NumCompactors = 8
opts.NumMemtables = 8

Expected Benchmark Improvements

Before (run_20251116_092759):

Burst pattern: 9.35ms avg, 34.48ms P95
Cache hit ratio: 33%
Value log lookups: high

After (projected):

Burst pattern: <3ms avg, <8ms P95
Cache hit ratio: 85-95%
Value log lookups: minimal (only large events)

Overall: 60-70% latency reduction, matching or exceeding other Badger-based relays

Trade-offs

VLogPercentile = 0.99

Pro: Keeps inline events in LSM for fast access Con: Larger LSM tree (but we have 16 GB cache to handle it) Verdict: ✅ Essential for inline event optimization

DetectConflicts = false

Pro: 5-10% faster writes Con: No transaction conflict detection Verdict: ✅ Safe - Nostr events are immutable

ZSTD Compression

Pro: 2-3x space savings, lower cache cost Con: ~5% CPU overhead Verdict: ✅ Well worth it for cache efficiency

Testing

Run benchmark to validate:

cd cmd/benchmark
docker compose build next-orly
sudo rm -rf data/
./run-benchmark-orly-only.sh

Monitor for:

✅ No "Block cache too small" warnings
✅ Cache hit ratio >85%
✅ Latencies competitive with khatru-badger
✅ Most values in LSM tree (check logs)

3.9 KiB Raw Blame History