Neo4j Database Backend for ORLY Relay
Overview
The Neo4j database backend provides a graph-native storage solution for the ORLY Nostr relay. Unlike traditional key-value or document stores, Neo4j is optimized for relationship-heavy queries, making it an ideal fit for Nostr's social graph and event reference patterns.
Architecture
Core Components
- Main Database File (pkg/neo4j/neo4j.go)
  - Implements the database.Database interface
  - Manages Neo4j driver connection and lifecycle
  - Uses Badger for metadata storage (markers, identity, subscriptions)
  - Registers with the database factory via init()
- Schema Management (pkg/neo4j/schema.go)
  - Defines Neo4j constraints and indexes using Cypher
  - Creates unique constraints on Event IDs and Author pubkeys
  - Indexes for optimal query performance (kind, created_at, tags)
- Query Engine (pkg/neo4j/query-events.go)
  - Translates Nostr REQ filters to Cypher queries
  - Leverages graph traversal for tag relationships
  - Supports prefix matching for IDs and pubkeys
  - Parameterized queries for security and performance
- Event Storage (pkg/neo4j/save-event.go)
  - Stores events as nodes with properties
  - Creates graph relationships:
    - AUTHORED_BY: Event → Author
    - REFERENCES: Event → Event (e-tags)
    - MENTIONS: Event → Author (p-tags)
    - TAGGED_WITH: Event → Tag
Graph Schema
Node Types
Event Node
(:Event {
id: string, // Hex-encoded event ID (32 bytes)
serial: int, // Sequential serial number
kind: int, // Event kind
created_at: int, // Unix timestamp
content: string, // Event content
sig: string, // Hex-encoded signature
pubkey: string, // Hex-encoded author pubkey
tags: string // JSON-encoded tags array
})
Author Node
(:Author {
pubkey: string // Hex-encoded pubkey (unique)
})
Tag Node
(:Tag {
type: string, // Tag type (e.g., "t", "d")
value: string // Tag value
})
Marker Node (for metadata)
(:Marker {
key: string, // Unique key
value: string // Hex-encoded value
})
Relationships
- (:Event)-[:AUTHORED_BY]->(:Author) - Event authorship
- (:Event)-[:REFERENCES]->(:Event) - Event references (e-tags)
- (:Event)-[:MENTIONS]->(:Author) - Author mentions (p-tags)
- (:Event)-[:TAGGED_WITH]->(:Tag) - Generic tag associations
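The mapping from an event's author and tags onto these four relationship types can be sketched in Go. This is an illustration only: the `Edge` type and `edgesForEvent` function are hypothetical names, not taken from save-event.go, and the sketch assumes e- and p-tags produce only their specialized relationship (the real save path may additionally create TAGGED_WITH edges for them).

```go
package main

import "fmt"

// Edge is a hypothetical description of one relationship to create for an event.
type Edge struct {
	Rel    string // relationship type, e.g. "REFERENCES"
	Target string // target node label: "Event", "Author", or "Tag"
	Key    string // tag type for Tag nodes, empty otherwise
	Value  string // event ID, pubkey, or tag value
}

// edgesForEvent maps an event's author and tags onto the four relationship types.
func edgesForEvent(pubkey string, tags [][]string) []Edge {
	edges := []Edge{{Rel: "AUTHORED_BY", Target: "Author", Value: pubkey}}
	for _, tag := range tags {
		if len(tag) < 2 {
			continue // a usable tag needs at least a type and a value
		}
		switch tag[0] {
		case "e":
			edges = append(edges, Edge{Rel: "REFERENCES", Target: "Event", Value: tag[1]})
		case "p":
			edges = append(edges, Edge{Rel: "MENTIONS", Target: "Author", Value: tag[1]})
		default:
			edges = append(edges, Edge{Rel: "TAGGED_WITH", Target: "Tag", Key: tag[0], Value: tag[1]})
		}
	}
	return edges
}

func main() {
	tags := [][]string{{"e", "aa11"}, {"p", "bb22"}, {"t", "bitcoin"}, {"broken"}}
	for _, e := range edgesForEvent("cc33", tags) {
		fmt.Printf("%s -> %s %s %s\n", e.Rel, e.Target, e.Key, e.Value)
	}
}
```

Each `Edge` would correspond to one MERGE of the target node plus one relationship in the write transaction.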
How Nostr REQ Messages Are Implemented
Filter to Cypher Translation
The query engine in query-events.go translates Nostr filters to Cypher queries:
1. ID Filters
{"ids": ["abc123..."]}
Becomes:
MATCH (e:Event)
WHERE e.id = $id_0
For prefix matching (partial IDs):
WHERE e.id STARTS WITH $id_0
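One way to choose between the exact and prefix forms is to look at the length of the supplied ID. The sketch below assumes a full ID is 64 hex characters (32 bytes) and that anything shorter is a prefix; `idCondition` is a hypothetical helper, not the function used in query-events.go.

```go
package main

import "fmt"

// fullIDHexLen is the length of a complete hex-encoded 32-byte event ID.
const fullIDHexLen = 64

// idCondition returns the WHERE fragment and parameter name for the i-th ID
// in a filter. Complete IDs use equality; shorter values are treated as prefixes.
func idCondition(i int, id string) (clause, param string) {
	param = fmt.Sprintf("id_%d", i)
	if len(id) == fullIDHexLen {
		return "e.id = $" + param, param
	}
	return "e.id STARTS WITH $" + param, param
}

func main() {
	clause, param := idCondition(0, "abc123")
	fmt.Println(clause, param)
}
```

The ID value itself is passed as a query parameter under the returned name, so it never gets interpolated into the Cypher text.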
2. Author Filters
{"authors": ["pubkey1...", "pubkey2..."]}
Becomes:
MATCH (e:Event)
WHERE e.pubkey IN $authors
3. Kind Filters
{"kinds": [1, 7]}
Becomes:
MATCH (e:Event)
WHERE e.kind IN $kinds
4. Time Range Filters
{"since": 1234567890, "until": 1234567900}
Becomes:
MATCH (e:Event)
WHERE e.created_at >= $since AND e.created_at <= $until
5. Tag Filters (Graph Advantage!)
{"#t": ["bitcoin", "nostr"]}
Becomes:
MATCH (e:Event)
OPTIONAL MATCH (e)-[:TAGGED_WITH]->(t0:Tag)
WHERE t0.type = $tagType_0 AND t0.value IN $tagValues_0
This leverages Neo4j's native graph traversal for efficient tag queries!
6. Combined Filters
{
"kinds": [1],
"authors": ["abc..."],
"#p": ["xyz..."],
"limit": 50
}
Becomes:
MATCH (e:Event)
OPTIONAL MATCH (e)-[:TAGGED_WITH]->(t0:Tag)
WHERE e.kind IN $kinds
AND e.pubkey IN $authors
AND t0.type = $tagType_0
AND t0.value IN $tagValues_0
RETURN e.id, e.kind, e.created_at, e.content, e.sig, e.pubkey, e.tags
ORDER BY e.created_at DESC
LIMIT $limit
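A minimal sketch of how such a query could be assembled in Go is shown below. The `Filter` struct and `buildQuery` function are simplified, hypothetical stand-ins for the real filter type and builder, and tag filters are left out to keep the example short. The point it illustrates is that only fixed clause text is concatenated, while every user-supplied value travels in the parameter map.

```go
package main

import (
	"fmt"
	"strings"
)

// Filter is a simplified, hypothetical stand-in for a Nostr REQ filter.
type Filter struct {
	Kinds   []int
	Authors []string
	Since   int64 // 0 means unset
	Until   int64 // 0 means unset
	Limit   int   // 0 means unset
}

// buildQuery turns a Filter into a parameterized Cypher string plus its parameters.
func buildQuery(f Filter) (string, map[string]any) {
	params := map[string]any{}
	var where []string
	if len(f.Kinds) > 0 {
		where = append(where, "e.kind IN $kinds")
		params["kinds"] = f.Kinds
	}
	if len(f.Authors) > 0 {
		where = append(where, "e.pubkey IN $authors")
		params["authors"] = f.Authors
	}
	if f.Since > 0 {
		where = append(where, "e.created_at >= $since")
		params["since"] = f.Since
	}
	if f.Until > 0 {
		where = append(where, "e.created_at <= $until")
		params["until"] = f.Until
	}

	var b strings.Builder
	b.WriteString("MATCH (e:Event)")
	if len(where) > 0 {
		b.WriteString("\nWHERE " + strings.Join(where, "\n  AND "))
	}
	b.WriteString("\nRETURN e.id, e.kind, e.created_at, e.content, e.sig, e.pubkey, e.tags")
	b.WriteString("\nORDER BY e.created_at DESC")
	if f.Limit > 0 {
		b.WriteString("\nLIMIT $limit")
		params["limit"] = f.Limit
	}
	return b.String(), params
}

func main() {
	q, p := buildQuery(Filter{Kinds: []int{1}, Authors: []string{"abc"}, Limit: 50})
	fmt.Println(q)
	fmt.Println(p)
}
```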
Query Execution Flow
- Parse Filter: Extract IDs, authors, kinds, times, tags
- Build Cypher: Construct parameterized query with MATCH/WHERE clauses
- Execute: Run via ExecuteRead() with read-only session
- Parse Results: Convert Neo4j records to Nostr events
- Return: Send events back to client
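The "Parse Results" step can be illustrated with a small Go sketch. It assumes each result row has already been flattened into a map keyed by the column names from the RETURN clause, that integer columns arrive as int64, and that e.tags holds the JSON-encoded tags array described in the schema. The `Event` struct and `eventFromRecord` function are hypothetical and simpler than the relay's real event type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event is a simplified, hypothetical event struct.
type Event struct {
	ID        string
	Kind      int64
	CreatedAt int64
	Content   string
	Sig       string
	Pubkey    string
	Tags      [][]string
}

// eventFromRecord converts one result row (column name -> value) into an Event.
func eventFromRecord(rec map[string]any) (Event, error) {
	var ev Event

	// String columns are copied straight into the matching field.
	strCols := map[string]*string{
		"e.id": &ev.ID, "e.content": &ev.Content, "e.sig": &ev.Sig, "e.pubkey": &ev.Pubkey,
	}
	for col, dst := range strCols {
		s, ok := rec[col].(string)
		if !ok {
			return Event{}, fmt.Errorf("column %q is missing or not a string", col)
		}
		*dst = s
	}

	// Integer columns are assumed to arrive as int64.
	intCols := map[string]*int64{"e.kind": &ev.Kind, "e.created_at": &ev.CreatedAt}
	for col, dst := range intCols {
		n, ok := rec[col].(int64)
		if !ok {
			return Event{}, fmt.Errorf("column %q is missing or not an int64", col)
		}
		*dst = n
	}

	// Tags are stored as a JSON string and decoded back into a nested slice.
	tagsJSON, ok := rec["e.tags"].(string)
	if !ok {
		return Event{}, fmt.Errorf("column %q is missing or not a string", "e.tags")
	}
	if err := json.Unmarshal([]byte(tagsJSON), &ev.Tags); err != nil {
		return Event{}, fmt.Errorf("decoding e.tags: %w", err)
	}
	return ev, nil
}

func main() {
	rec := map[string]any{
		"e.id": "aa", "e.kind": int64(1), "e.created_at": int64(1234567890),
		"e.content": "hello", "e.sig": "ff", "e.pubkey": "bb",
		"e.tags": `[["t","nostr"]]`,
	}
	ev, err := eventFromRecord(rec)
	fmt.Println(ev, err)
}
```

Rejecting rows with missing or mistyped columns keeps malformed data from reaching clients as partially filled events.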
Configuration
All configuration is centralized in app/config/config.go and visible via ./orly help.
Important: All environment variables must be defined in app/config/config.go. Do not use os.Getenv() directly in package code. Database backends receive configuration via the database.DatabaseConfig struct.
Environment Variables
# Neo4j Connection
ORLY_NEO4J_URI="bolt://localhost:7687"
ORLY_NEO4J_USER="neo4j"
ORLY_NEO4J_PASSWORD="password"
# Database Type Selection
ORLY_DB_TYPE="neo4j"
# Data Directory (for Badger metadata storage)
ORLY_DATA_DIR="~/.local/share/ORLY"
# Neo4j Driver Tuning (Memory Management)
ORLY_NEO4J_MAX_CONN_POOL=25 # Max connections (default: 25, driver default: 100)
ORLY_NEO4J_FETCH_SIZE=1000 # Records per fetch batch (default: 1000, -1=all)
ORLY_NEO4J_MAX_TX_RETRY_SEC=30 # Max transaction retry time in seconds
ORLY_NEO4J_QUERY_RESULT_LIMIT=10000 # Max results per query (0=unlimited)
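To make the defaults concrete, here is a hedged Go sketch of reading the four tuning variables with fallbacks. The `Neo4jTuning` struct, `intOr`, and `loadTuning` are hypothetical names; the actual field names in database.DatabaseConfig and the parsing in app/config/config.go may differ. The lookup function is injected so that, in line with the rule above, environment access stays in one place.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Neo4jTuning is a hypothetical struct holding the driver tuning values.
type Neo4jTuning struct {
	MaxConnPool      int
	FetchSize        int
	MaxTxRetrySec    int
	QueryResultLimit int
}

// intOr returns the integer value of key from lookup, or def when the
// variable is unset or not a valid integer.
func intOr(lookup func(string) (string, bool), key string, def int) int {
	raw, ok := lookup(key)
	if !ok {
		return def
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return def
	}
	return n
}

// loadTuning applies the documented defaults to any variable that is not set.
func loadTuning(lookup func(string) (string, bool)) Neo4jTuning {
	return Neo4jTuning{
		MaxConnPool:      intOr(lookup, "ORLY_NEO4J_MAX_CONN_POOL", 25),
		FetchSize:        intOr(lookup, "ORLY_NEO4J_FETCH_SIZE", 1000),
		MaxTxRetrySec:    intOr(lookup, "ORLY_NEO4J_MAX_TX_RETRY_SEC", 30),
		QueryResultLimit: intOr(lookup, "ORLY_NEO4J_QUERY_RESULT_LIMIT", 10000),
	}
}

func main() {
	// os.LookupEnv has the required signature, so it can be passed directly.
	fmt.Printf("%+v\n", loadTuning(os.LookupEnv))
}
```

Silently falling back to the default on an unparsable value is a choice made for brevity here; a real loader might prefer to fail at startup instead.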
Example Docker Compose Setup
version: '3.8'
services:
  neo4j:
    image: neo4j:5.15
    ports:
      - "7474:7474" # HTTP
      - "7687:7687" # Bolt
    environment:
      - NEO4J_AUTH=neo4j/password
      - NEO4J_PLUGINS=["apoc"]
      # Memory tuning for production
      - NEO4J_server_memory_heap_initial__size=512m
      - NEO4J_server_memory_heap_max__size=1g
      - NEO4J_server_memory_pagecache_size=512m
      # Transaction memory limits (prevent runaway queries)
      - NEO4J_dbms_memory_transaction_total__max=256m
      - NEO4J_db_memory_transaction_max=64m
      # Query timeout
      - NEO4J_dbms_transaction_timeout=30s
    volumes:
      - neo4j_data:/data
      - neo4j_logs:/logs
  orly:
    build: .
    ports:
      - "3334:3334"
    environment:
      - ORLY_DB_TYPE=neo4j
      - ORLY_NEO4J_URI=bolt://neo4j:7687
      - ORLY_NEO4J_USER=neo4j
      - ORLY_NEO4J_PASSWORD=password
      # Driver tuning for memory management
      - ORLY_NEO4J_MAX_CONN_POOL=25
      - ORLY_NEO4J_FETCH_SIZE=1000
      - ORLY_NEO4J_QUERY_RESULT_LIMIT=10000
    depends_on:
      - neo4j
volumes:
  neo4j_data:
  neo4j_logs:
Performance Considerations
Advantages Over Badger/DGraph
- Native Graph Queries: Tag relationships and social graph traversals are native operations
- Optimized Indexes: Automatic index usage for constrained properties
- Efficient Joins: Following a relationship from a node is a constant-time pointer hop (index-free adjacency) rather than an index lookup or table join
- Query Planner: Neo4j's query planner optimizes complex multi-filter queries
Tuning Recommendations
- Indexes: The schema creates indexes for:
  - Event ID (unique constraint + index)
  - Event kind
  - Event created_at
  - Composite: kind + created_at
  - Tag type + value
- Cache Configuration: Configure Neo4j's page cache and heap size (see Memory Tuning below)
- Query Limits: The relay automatically enforces ORLY_NEO4J_QUERY_RESULT_LIMIT (default: 10000) to prevent unbounded queries from exhausting memory
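The safety cap can be pictured as a small clamping function. The sketch below is an assumption about the behavior, not the code in query-events.go: a filter's own limit is kept when it is below the cap, a missing or larger limit is replaced by the cap, and a cap of 0 disables clamping. `effectiveLimit` is a hypothetical name.

```go
package main

import "fmt"

// effectiveLimit combines a filter's requested limit with the configured cap.
// A value of 0 means "not set" for requested and "unlimited" for maxResults.
// A return value of 0 means no LIMIT clause should be added.
func effectiveLimit(requested, maxResults int) int {
	switch {
	case maxResults <= 0:
		return requested // no cap configured
	case requested <= 0 || requested > maxResults:
		return maxResults // missing or oversized limit falls back to the cap
	default:
		return requested
	}
}

func main() {
	fmt.Println(effectiveLimit(50, 10000))
	fmt.Println(effectiveLimit(0, 10000))
	fmt.Println(effectiveLimit(20000, 10000))
}
```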
Memory Tuning
Neo4j runs as a separate process (typically in Docker), so memory management involves both the relay driver settings and Neo4j server configuration.
Understanding Memory Layers
- ORLY Relay Process (~35MB RSS typical)
  - Go driver connection pool
  - Query result buffering
  - Controlled by ORLY_NEO4J_* environment variables
- Neo4j Server Process (512MB-4GB+ depending on data)
  - JVM heap for Java objects
  - Page cache for graph data
  - Transaction memory for query execution
  - Controlled by NEO4J_* environment variables
Relay Driver Tuning (ORLY side)
| Variable | Default | Description |
|---|---|---|
| ORLY_NEO4J_MAX_CONN_POOL | 25 | Max connections in pool. Lower = less memory, but may bottleneck under high load. Driver default is 100. |
| ORLY_NEO4J_FETCH_SIZE | 1000 | Records fetched per batch. Lower = less memory per query, more round trips. Set to -1 for all (risky). |
| ORLY_NEO4J_MAX_TX_RETRY_SEC | 30 | Max seconds to retry failed transactions. |
| ORLY_NEO4J_QUERY_RESULT_LIMIT | 10000 | Hard cap on results per query. Prevents unbounded queries. Set to 0 for unlimited (not recommended). |
Recommended settings for memory-constrained environments:
ORLY_NEO4J_MAX_CONN_POOL=10
ORLY_NEO4J_FETCH_SIZE=500
ORLY_NEO4J_QUERY_RESULT_LIMIT=5000
Neo4j Server Tuning (Docker/neo4j.conf)
JVM Heap Memory - For Java objects and query processing:
# Docker environment variables
NEO4J_server_memory_heap_initial__size=512m
NEO4J_server_memory_heap_max__size=1g
# neo4j.conf equivalent
server.memory.heap.initial_size=512m
server.memory.heap.max_size=1g
Page Cache - For caching graph data from disk:
# Docker
NEO4J_server_memory_pagecache_size=512m
# neo4j.conf
server.memory.pagecache.size=512m
Transaction Memory Limits - Prevent runaway queries:
# Docker
NEO4J_dbms_memory_transaction_total__max=256m # Global limit across all transactions
NEO4J_db_memory_transaction_max=64m # Per-transaction limit
# neo4j.conf
dbms.memory.transaction.total.max=256m
db.memory.transaction.max=64m
Query Timeout - Kill long-running queries:
# Docker
NEO4J_dbms_transaction_timeout=30s
# neo4j.conf
dbms.transaction.timeout=30s
Memory Sizing Guidelines
| Deployment Size | Heap | Page Cache | Total Neo4j | ORLY Pool |
|---|---|---|---|---|
| Development | 512m | 256m | ~1GB | 10 |
| Small relay (<100k events) | 1g | 512m | ~2GB | 25 |
| Medium relay (<1M events) | 2g | 1g | ~4GB | 50 |
| Large relay (>1M events) | 4g | 2g | ~8GB | 100 |
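The event-count rows of the table can be expressed as a lookup, which may be handy in provisioning scripts. This is only a transcription of the table under two assumptions: the Development row is not modeled, and a relay with exactly one million events is treated as large, since the table leaves that boundary open. `Sizing` and `sizingFor` are hypothetical names.

```go
package main

import "fmt"

// Sizing holds one row of the sizing table.
type Sizing struct {
	Heap      string // JVM heap max size
	PageCache string // Neo4j page cache size
	Pool      int    // ORLY_NEO4J_MAX_CONN_POOL
}

// sizingFor returns the table row that matches the given event count.
func sizingFor(events int) Sizing {
	switch {
	case events < 100_000:
		return Sizing{Heap: "1g", PageCache: "512m", Pool: 25} // small relay
	case events < 1_000_000:
		return Sizing{Heap: "2g", PageCache: "1g", Pool: 50} // medium relay
	default:
		return Sizing{Heap: "4g", PageCache: "2g", Pool: 100} // large relay
	}
}

func main() {
	fmt.Printf("%+v\n", sizingFor(250_000))
}
```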
Formula for Page Cache:
Page Cache = Data Size on Disk × 1.2
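As a worked example of the formula, a store that occupies 10 GiB on disk would get a 12 GiB page cache. The Go sketch below performs the same arithmetic in integer form; `pageCacheBytes` is a hypothetical helper.

```go
package main

import "fmt"

const gib = int64(1) << 30

// pageCacheBytes applies Page Cache = Data Size on Disk × 1.2 using
// integer arithmetic (multiply by 12, then divide by 10).
func pageCacheBytes(dataBytes int64) int64 {
	return dataBytes * 12 / 10
}

func main() {
	data := 10 * gib
	fmt.Printf("data=%d GiB pagecache=%d GiB\n", data/gib, pageCacheBytes(data)/gib)
}
```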
Use neo4j-admin server memory-recommendation inside the container to get tailored recommendations.
Monitoring Memory Usage
Check the driver configuration in the relay logs:
# Driver config is logged at startup
grep "connecting to neo4j" /path/to/orly.log
# Output: connecting to neo4j at bolt://... (pool=25, fetch=1000, txRetry=30s)
Check Neo4j server memory:
# Inside Neo4j container
docker exec neo4j neo4j-admin server memory-recommendation
# Or query via Cypher
CALL dbms.listPools() YIELD pool, heapMemoryUsed, heapMemoryUsedBytes
RETURN pool, heapMemoryUsed
Monitor transaction memory:
SHOW TRANSACTIONS
YIELD transactionId, currentQuery, estimatedUsedHeapMemory
RETURN transactionId, currentQuery, estimatedUsedHeapMemory
ORDER BY estimatedUsedHeapMemory DESC
Implementation Details
Replaceable Events
Replaceable events (kinds 0, 3, 10000-19999) are handled in WouldReplaceEvent():
MATCH (e:Event {kind: $kind, pubkey: $pubkey})
WHERE e.created_at < $createdAt
RETURN e.serial, e.created_at
Older events are deleted before saving the new one.
Parameterized Replaceable Events
For kinds 30000-39999, we also match on the d-tag:
MATCH (e:Event {kind: $kind, pubkey: $pubkey})-[:TAGGED_WITH]->(t:Tag {type: 'd', value: $dValue})
WHERE e.created_at < $createdAt
RETURN e.serial
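The kind ranges and the created_at comparison used by both queries can be summarized in a short Go sketch. `KindClass`, `classify`, and `supersedes` are hypothetical names for illustration and are not claimed to match WouldReplaceEvent(). Every kind outside the two ranges, including ephemeral kinds, is simply reported as regular here.

```go
package main

import "fmt"

// KindClass describes how an event kind is treated on save.
type KindClass int

const (
	Regular KindClass = iota
	Replaceable
	ParameterizedReplaceable
)

// classify maps an event kind onto the ranges given in this section.
func classify(kind int) KindClass {
	switch {
	case kind == 0 || kind == 3 || (kind >= 10000 && kind <= 19999):
		return Replaceable
	case kind >= 30000 && kind <= 39999:
		return ParameterizedReplaceable
	default:
		return Regular
	}
}

// supersedes mirrors the Cypher condition e.created_at < $createdAt:
// only a strictly older stored event is replaced by the incoming one.
func supersedes(incomingCreatedAt, storedCreatedAt int64) bool {
	return storedCreatedAt < incomingCreatedAt
}

func main() {
	fmt.Println(classify(0), classify(1), classify(10002), classify(30023))
	fmt.Println(supersedes(1700000100, 1700000000))
}
```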
Event Deletion (NIP-09)
Delete events (kind 5) are processed via graph traversal:
MATCH (target:Event {id: $targetId})
MATCH (delete:Event {kind: 5})-[:REFERENCES]->(target)
WHERE delete.pubkey = $pubkey OR delete.pubkey IN $admins
RETURN delete.id
Only same-author or admin deletions are allowed.
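The authorization rule in the WHERE clause can be restated as a plain Go predicate. This sketch assumes that $pubkey in the query is the author of the target event and that $admins is the relay's configured admin list; `mayDelete` is a hypothetical name.

```go
package main

import "fmt"

// mayDelete reports whether a kind 5 event signed by deleter is allowed to
// remove an event written by targetAuthor.
func mayDelete(deleter, targetAuthor string, admins []string) bool {
	if deleter == targetAuthor {
		return true // authors may delete their own events
	}
	for _, admin := range admins {
		if admin == deleter {
			return true // admins may delete any event
		}
	}
	return false
}

func main() {
	admins := []string{"admin1"}
	fmt.Println(mayDelete("alice", "alice", admins))
	fmt.Println(mayDelete("admin1", "alice", admins))
	fmt.Println(mayDelete("mallory", "alice", admins))
}
```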
Comparison with Other Backends
| Feature | Badger | DGraph | Neo4j |
|---|---|---|---|
| Storage Type | Key-value | Graph (distributed) | Graph (native) |
| Query Language | Custom indexes | DQL | Cypher |
| Tag Queries | Index lookups | Graph traversal | Native relationships |
| Scaling | Single-node | Distributed | Cluster/Causal cluster |
| Memory Usage | Low | Medium | High |
| Setup Complexity | Minimal | Medium | Medium |
| Best For | Small relays | Large distributed | Relationship-heavy |
Development Guide
Adding New Indexes
- Update schema.go with new index definition
- Add to applySchema() function
- Restart relay to apply schema changes
Example:
CREATE FULLTEXT INDEX event_content_fulltext IF NOT EXISTS
FOR (e:Event) ON EACH [e.content]
OPTIONS {indexConfig: {`fulltext.analyzer`: 'english'}}
Custom Queries
To add custom query methods:
- Add method to query-events.go
- Build Cypher query with parameterization
- Use ExecuteRead() or ExecuteWrite() as appropriate
- Parse results with parseEventsFromResult()
Testing
Because the backend depends on Neo4j, the tests require a running Neo4j instance:
# Start Neo4j via Docker
docker run -d --name neo4j-test \
-p 7687:7687 \
-e NEO4J_AUTH=neo4j/testpassword \
neo4j:5.15
# Run tests (Neo4j 5 rejects passwords shorter than 8 characters by default)
ORLY_NEO4J_URI="bolt://localhost:7687" \
ORLY_NEO4J_USER="neo4j" \
ORLY_NEO4J_PASSWORD="testpassword" \
go test ./pkg/neo4j/...
# Cleanup
docker rm -f neo4j-test
Future Enhancements
- Full-text Search: Leverage Neo4j's full-text indexes for content search
- Graph Analytics: Implement social graph metrics (centrality, communities)
- Advanced Queries: Support NIP-50 search via Cypher full-text capabilities
- Clustering: Deploy Neo4j cluster for high availability
- APOC Procedures: Utilize APOC library for advanced graph algorithms
- Caching Layer: Implement query result caching similar to Badger backend
Troubleshooting
Connection Issues
# Test connectivity
cypher-shell -a bolt://localhost:7687 -u neo4j -p password
# Check Neo4j logs
docker logs neo4j
Performance Issues
// View query execution plan
EXPLAIN MATCH (e:Event) WHERE e.kind = 1 RETURN e LIMIT 10
// Profile query performance
PROFILE MATCH (e:Event)-[:AUTHORED_BY]->(a:Author) RETURN e, a LIMIT 10
Schema Issues
// List all constraints
SHOW CONSTRAINTS
// List all indexes
SHOW INDEXES
// Drop and recreate schema
DROP CONSTRAINT event_id_unique IF EXISTS
CREATE CONSTRAINT event_id_unique FOR (e:Event) REQUIRE e.id IS UNIQUE
References
- Neo4j Documentation
- Cypher Query Language
- Neo4j Go Driver
- Graph Database Patterns
- Nostr Protocol (NIP-01)
License
This Neo4j backend implementation follows the same license as the ORLY relay project.