mleku/next.orly.dev

mleku 86481a42e8

initial draft of neo4j database driver

2025-11-17 08:19:44 +00:00

Neo4j Database Backend for ORLY Relay

Overview

The Neo4j database backend provides graph-native storage for the ORLY Nostr relay. Unlike key-value or document stores, Neo4j is optimized for relationship-heavy queries, which makes it a natural fit for Nostr's social graph and event reference patterns.

Architecture

Core Components

Main Database File (pkg/neo4j/neo4j.go)
- Implements the database.Database interface
- Manages Neo4j driver connection and lifecycle
- Uses Badger for metadata storage (markers, identity, subscriptions)
- Registers with the database factory via init()
Schema Management (pkg/neo4j/schema.go)
- Defines Neo4j constraints and indexes using Cypher
- Creates unique constraints on Event IDs and Author pubkeys
- Indexes for optimal query performance (kind, created_at, tags)
Query Engine (pkg/neo4j/query-events.go)
- Translates Nostr REQ filters to Cypher queries
- Leverages graph traversal for tag relationships
- Supports prefix matching for IDs and pubkeys
- Parameterized queries for security and performance
Event Storage (pkg/neo4j/save-event.go)
- Stores events as nodes with properties
- Creates graph relationships:
  - AUTHORED_BY: Event → Author
  - REFERENCES: Event → Event (e-tags)
  - MENTIONS: Event → Author (p-tags)
  - TAGGED_WITH: Event → Tag
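
The tag-to-relationship mapping listed above can be sketched in Go as follows. This is a minimal illustration, not the code in save-event.go: the helper name relationshipFor is hypothetical, and whether e-tags and p-tags also receive Tag nodes is not stated here, so the sketch assumes each tag maps to exactly one relationship type.

```go
package main

import "fmt"

// relationshipFor returns the relationship type that an event tag maps to,
// following the list above. Hypothetical helper, not part of ORLY.
// AUTHORED_BY is not covered because it derives from the event's pubkey,
// not from a tag.
func relationshipFor(tag []string) (rel string, ok bool) {
	if len(tag) < 2 {
		return "", false // a usable tag needs a name and a value
	}
	switch tag[0] {
	case "e":
		return "REFERENCES", true // Event -> Event
	case "p":
		return "MENTIONS", true // Event -> Author
	default:
		return "TAGGED_WITH", true // Event -> Tag
	}
}

func main() {
	tags := [][]string{{"e", "abc"}, {"p", "def"}, {"t", "nostr"}, {"x"}}
	for _, tag := range tags {
		if rel, ok := relationshipFor(tag); ok {
			fmt.Printf("%s -> %s\n", tag[0], rel)
		}
	}
}
```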

Graph Schema

Node Types

Event Node

(:Event {
  id: string,           // Hex-encoded event ID (32 bytes)
  serial: int,          // Sequential serial number
  kind: int,            // Event kind
  created_at: int,      // Unix timestamp
  content: string,      // Event content
  sig: string,          // Hex-encoded signature
  pubkey: string,       // Hex-encoded author pubkey
  tags: string          // JSON-encoded tags array
})

Author Node

(:Author {
  pubkey: string        // Hex-encoded pubkey (unique)
})

Tag Node

(:Tag {
  type: string,         // Tag type (e.g., "t", "d")
  value: string         // Tag value
})

Marker Node (for metadata)

(:Marker {
  key: string,          // Unique key
  value: string         // Hex-encoded value
})

Relationships

(:Event)-[:AUTHORED_BY]->(:Author) - Event authorship
(:Event)-[:REFERENCES]->(:Event) - Event references (e-tags)
(:Event)-[:MENTIONS]->(:Author) - Author mentions (p-tags)
(:Event)-[:TAGGED_WITH]->(:Tag) - Generic tag associations
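
As a rough sketch, the constraints and indexes this document describes for these node types could be held as a list of idempotent Cypher statements in Go. Only the name event_id_unique appears in this document; every other constraint and index name below, and the schemaStatements helper itself, is an assumption made for illustration and may differ from schema.go.

```go
package main

import "fmt"

// schemaStatements returns idempotent Cypher DDL for the graph schema.
// Names other than event_id_unique are hypothetical.
func schemaStatements() []string {
	return []string{
		// Unique constraints on event IDs and author pubkeys.
		"CREATE CONSTRAINT event_id_unique IF NOT EXISTS FOR (e:Event) REQUIRE e.id IS UNIQUE",
		"CREATE CONSTRAINT author_pubkey_unique IF NOT EXISTS FOR (a:Author) REQUIRE a.pubkey IS UNIQUE",
		// Range indexes for the common filter fields.
		"CREATE INDEX event_kind IF NOT EXISTS FOR (e:Event) ON (e.kind)",
		"CREATE INDEX event_created_at IF NOT EXISTS FOR (e:Event) ON (e.created_at)",
		"CREATE INDEX event_kind_created_at IF NOT EXISTS FOR (e:Event) ON (e.kind, e.created_at)",
		"CREATE INDEX tag_type_value IF NOT EXISTS FOR (t:Tag) ON (t.type, t.value)",
	}
}

func main() {
	for _, stmt := range schemaStatements() {
		fmt.Println(stmt + ";")
	}
}
```

Because every statement uses IF NOT EXISTS, the list can be applied on each startup without failing on an existing schema.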

How Nostr REQ Messages Are Implemented

Filter to Cypher Translation

The query engine in query-events.go translates Nostr filters to Cypher queries:

1. ID Filters

{"ids": ["abc123..."]}

Becomes:

MATCH (e:Event)
WHERE e.id = $id_0

For prefix matching (partial IDs):

WHERE e.id STARTS WITH $id_0
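
One way to choose between the exact and prefix forms is to compare the value's length with that of a full hex-encoded ID (64 characters for 32 bytes). The sketch below assumes that rule; idCondition and fullIDLen are hypothetical names, and the real query engine may decide differently.

```go
package main

import (
	"fmt"
	"strings"
)

// fullIDLen is the length of a hex-encoded 32-byte event ID.
const fullIDLen = 64

// idCondition returns the Cypher condition for the n-th ID in a filter.
// Shorter values are treated as prefixes (an assumption of this sketch).
func idCondition(id string, n int) string {
	param := fmt.Sprintf("$id_%d", n)
	if len(id) < fullIDLen {
		return "e.id STARTS WITH " + param
	}
	return "e.id = " + param
}

func main() {
	fmt.Println(idCondition("abc123", 0))
	fmt.Println(idCondition(strings.Repeat("a", fullIDLen), 1))
}
```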

2. Author Filters

{"authors": ["pubkey1...", "pubkey2..."]}

Becomes:

MATCH (e:Event)
WHERE e.pubkey IN $authors

3. Kind Filters

{"kinds": [1, 7]}

Becomes:

MATCH (e:Event)
WHERE e.kind IN $kinds

4. Time Range Filters

{"since": 1234567890, "until": 1234567900}

Becomes:

MATCH (e:Event)
WHERE e.created_at >= $since AND e.created_at <= $until

5. Tag Filters (Graph Advantage!)

{"#t": ["bitcoin", "nostr"]}

Becomes:

MATCH (e:Event)
MATCH (e)-[:TAGGED_WITH]->(t0:Tag)
WHERE t0.type = $tagType_0 AND t0.value IN $tagValues_0

This leverages Neo4j's native graph traversal for efficient tag queries!
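
The per-tag clauses can be generated from the filter's "#x" keys. The following Go sketch is illustrative only: tagClauses is a hypothetical helper, the parameter names follow the tagType_N / tagValues_N pattern shown above, and the sketch deliberately emits a plain MATCH for each tag pattern so that events lacking the tag are excluded from the result.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// tagClauses builds one MATCH/WHERE pair per "#x" filter key and returns
// the clauses together with their query parameters. Hypothetical helper.
func tagClauses(tagFilters map[string][]string) (string, map[string]any) {
	keys := make([]string, 0, len(tagFilters))
	for k := range tagFilters {
		if strings.HasPrefix(k, "#") && len(k) > 1 {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys) // map order is random; sort for stable output

	params := map[string]any{}
	var b strings.Builder
	for i, k := range keys {
		fmt.Fprintf(&b, "MATCH (e)-[:TAGGED_WITH]->(t%d:Tag)\n", i)
		fmt.Fprintf(&b, "WHERE t%d.type = $tagType_%d AND t%d.value IN $tagValues_%d\n", i, i, i, i)
		params[fmt.Sprintf("tagType_%d", i)] = strings.TrimPrefix(k, "#")
		params[fmt.Sprintf("tagValues_%d", i)] = tagFilters[k]
	}
	return b.String(), params
}

func main() {
	clauses, params := tagClauses(map[string][]string{"#t": {"bitcoin", "nostr"}})
	fmt.Print(clauses)
	fmt.Println(params["tagType_0"], params["tagValues_0"])
}
```

An event that carries several of the requested values matches once per value, so a caller of this sketch would likely want RETURN DISTINCT or an equivalent deduplication step.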

6. Combined Filters

{
  "kinds": [1],
  "authors": ["abc..."],
  "#p": ["xyz..."],
  "limit": 50
}

Becomes:

MATCH (e:Event)
MATCH (e)-[:TAGGED_WITH]->(t0:Tag)
WHERE e.kind IN $kinds
  AND e.pubkey IN $authors
  AND t0.type = $tagType_0
  AND t0.value IN $tagValues_0
RETURN e.id, e.kind, e.created_at, e.content, e.sig, e.pubkey, e.tags
ORDER BY e.created_at DESC
LIMIT $limit
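
To show how the non-tag conditions might be assembled, here is a small Go sketch that joins them with AND and collects the parameters. The Filter struct and buildQuery function are assumptions made for this example, tag handling is left out for brevity, and the actual code in query-events.go may be structured differently.

```go
package main

import (
	"fmt"
	"strings"
)

// Filter is a simplified stand-in for a Nostr REQ filter (hypothetical type).
type Filter struct {
	Authors []string
	Kinds   []int
	Since   *int64
	Until   *int64
	Limit   int
}

// buildQuery turns a Filter into a parameterized Cypher query.
func buildQuery(f Filter) (string, map[string]any) {
	var conds []string
	params := map[string]any{}
	if len(f.Kinds) > 0 {
		conds = append(conds, "e.kind IN $kinds")
		params["kinds"] = f.Kinds
	}
	if len(f.Authors) > 0 {
		conds = append(conds, "e.pubkey IN $authors")
		params["authors"] = f.Authors
	}
	if f.Since != nil {
		conds = append(conds, "e.created_at >= $since")
		params["since"] = *f.Since
	}
	if f.Until != nil {
		conds = append(conds, "e.created_at <= $until")
		params["until"] = *f.Until
	}

	var b strings.Builder
	b.WriteString("MATCH (e:Event)\n")
	if len(conds) > 0 {
		b.WriteString("WHERE " + strings.Join(conds, "\n  AND ") + "\n")
	}
	b.WriteString("RETURN e.id, e.kind, e.created_at, e.content, e.sig, e.pubkey, e.tags\n")
	b.WriteString("ORDER BY e.created_at DESC")
	if f.Limit > 0 {
		b.WriteString("\nLIMIT $limit")
		params["limit"] = f.Limit
	}
	return b.String(), params
}

func main() {
	query, params := buildQuery(Filter{Kinds: []int{1}, Authors: []string{"abc"}, Limit: 50})
	fmt.Println(query)
	fmt.Println(params)
}
```

Values never appear in the query text; they travel only in the parameter map, which is what keeps the query safe from injection and lets Neo4j reuse the cached plan.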

Query Execution Flow

1. Parse Filter: Extract IDs, authors, kinds, times, tags
2. Build Cypher: Construct parameterized query with MATCH/WHERE clauses
3. Execute: Run via ExecuteRead() with read-only session
4. Parse Results: Convert Neo4j records to Nostr events
5. Return: Send events back to client
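
The "Parse Results" step can be illustrated with a sketch that converts one result row into an event. A plain map stands in for the driver's record type here, the Event struct and eventFromRecord are hypothetical names, and the column keys assume the RETURN clause shown earlier in this document. Integers are asserted as int64 because that is how the Neo4j Go driver surfaces Cypher integers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Event is a simplified Nostr event (hypothetical type for this sketch).
type Event struct {
	ID        string
	Kind      int64
	CreatedAt int64
	Content   string
	Sig       string
	Pubkey    string
	Tags      [][]string
}

// eventFromRecord converts one result row into an Event. The map is a
// stand-in for a driver record keyed by the returned column names.
func eventFromRecord(rec map[string]any) (Event, error) {
	id, ok1 := rec["e.id"].(string)
	kind, ok2 := rec["e.kind"].(int64)
	createdAt, ok3 := rec["e.created_at"].(int64)
	content, ok4 := rec["e.content"].(string)
	sig, ok5 := rec["e.sig"].(string)
	pubkey, ok6 := rec["e.pubkey"].(string)
	rawTags, ok7 := rec["e.tags"].(string)
	if !(ok1 && ok2 && ok3 && ok4 && ok5 && ok6 && ok7) {
		return Event{}, fmt.Errorf("record has a missing or mistyped column")
	}

	// The tags property holds the JSON-encoded tags array.
	var tags [][]string
	if err := json.Unmarshal([]byte(rawTags), &tags); err != nil {
		return Event{}, fmt.Errorf("decoding tags: %w", err)
	}
	return Event{ID: id, Kind: kind, CreatedAt: createdAt, Content: content,
		Sig: sig, Pubkey: pubkey, Tags: tags}, nil
}

func main() {
	rec := map[string]any{
		"e.id": "abc", "e.kind": int64(1), "e.created_at": int64(1700000000),
		"e.content": "hello", "e.sig": "sig", "e.pubkey": "pk",
		"e.tags": `[["t","nostr"]]`,
	}
	ev, err := eventFromRecord(rec)
	fmt.Println(ev, err)
}
```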

Configuration

Environment Variables

# Neo4j Connection
ORLY_NEO4J_URI="bolt://localhost:7687"
ORLY_NEO4J_USER="neo4j"
ORLY_NEO4J_PASSWORD="password"

# Database Type Selection
ORLY_DB_TYPE="neo4j"

# Data Directory (for Badger metadata storage)
ORLY_DATA_DIR="~/.local/share/ORLY"
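
A minimal Go sketch of reading these variables follows. The Config struct and loadConfig function are hypothetical, and using the example values above as fallbacks is an assumption of the sketch, not documented ORLY behavior. The lookup function is passed in so the logic can be exercised without touching the process environment.

```go
package main

import (
	"fmt"
	"os"
)

// Config holds the Neo4j-related settings (hypothetical type).
type Config struct {
	URI      string
	User     string
	Password string
	DBType   string
	DataDir  string
}

// loadConfig reads the ORLY_* variables through getenv. The fallback
// values mirror the examples above and are assumptions of this sketch.
func loadConfig(getenv func(string) string) Config {
	get := func(key, fallback string) string {
		if v := getenv(key); v != "" {
			return v
		}
		return fallback
	}
	return Config{
		URI:      get("ORLY_NEO4J_URI", "bolt://localhost:7687"),
		User:     get("ORLY_NEO4J_USER", "neo4j"),
		Password: get("ORLY_NEO4J_PASSWORD", ""), // no default password
		DBType:   get("ORLY_DB_TYPE", "neo4j"),
		DataDir:  get("ORLY_DATA_DIR", ""),
	}
}

func main() {
	cfg := loadConfig(os.Getenv)
	fmt.Println(cfg.URI, cfg.User, cfg.DBType)
}
```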

Example Docker Compose Setup

version: '3.8'
services:
  neo4j:
    image: neo4j:5.15
    ports:
      - "7474:7474"  # HTTP
      - "7687:7687"  # Bolt
    environment:
      - NEO4J_AUTH=neo4j/password
      - NEO4J_PLUGINS=["apoc"]
    volumes:
      - neo4j_data:/data
      - neo4j_logs:/logs

  orly:
    build: .
    ports:
      - "3334:3334"
    environment:
      - ORLY_DB_TYPE=neo4j
      - ORLY_NEO4J_URI=bolt://neo4j:7687
      - ORLY_NEO4J_USER=neo4j
      - ORLY_NEO4J_PASSWORD=password
    depends_on:
      - neo4j

volumes:
  neo4j_data:
  neo4j_logs:

Performance Considerations

Advantages Over Badger/DGraph

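- Native Graph Queries: Tag relationships and social graph traversals are native operations
- Optimized Indexes: The query planner automatically uses the indexes that back constrained properties
- Efficient Joins: Following a relationship is a constant-time pointer hop (index-free adjacency), so traversal cost scales with the relationships visited rather than with total graph size
- Query Planner: Neo4j's cost-based query planner optimizes complex multi-filter queries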

Tuning Recommendations

Indexes: The schema creates indexes for:
- Event ID (unique constraint + index)
- Event kind
- Event created_at
- Composite: kind + created_at
- Tag type + value
Cache Configuration: Configure Neo4j's page cache and heap size:

# neo4j.conf (Neo4j 5.x setting names; 4.x used the dbms.memory.* prefix)
server.memory.heap.initial_size=2G
server.memory.heap.max_size=4G
server.memory.pagecache.size=4G

Query Limits: Always use LIMIT in queries to prevent memory exhaustion
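
One simple way to guarantee that every query carries a bounded LIMIT is to normalize the client's requested value before building the query. The sketch below is illustrative: effectiveLimit is a hypothetical helper, and the default of 100 and ceiling of 500 used in main are made-up numbers, not ORLY settings.

```go
package main

import "fmt"

// effectiveLimit returns the LIMIT value to send to Neo4j: the default
// when the client gave none, and never more than the ceiling.
func effectiveLimit(requested, def, ceiling int) int {
	if requested <= 0 {
		return def
	}
	if requested > ceiling {
		return ceiling
	}
	return requested
}

func main() {
	for _, requested := range []int{0, 50, 10000} {
		fmt.Println(requested, "->", effectiveLimit(requested, 100, 500))
	}
}
```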

Implementation Details

Replaceable Events

Replaceable events (kinds 0, 3, 10000-19999) are handled in WouldReplaceEvent():

MATCH (e:Event {kind: $kind, pubkey: $pubkey})
WHERE e.created_at < $createdAt
RETURN e.serial, e.created_at

Older events are deleted before saving the new one.

Parameterized Replaceable Events

For kinds 30000-39999, we also match on the d-tag:

MATCH (e:Event {kind: $kind, pubkey: $pubkey})-[:TAGGED_WITH]->(t:Tag {type: 'd', value: $dValue})
WHERE e.created_at < $createdAt
RETURN e.serial
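
The kind ranges above translate directly into two predicates, plus a helper that extracts the d-tag value used in the second query. This is a sketch under the ranges stated in this document; the function names are hypothetical and do not refer to ORLY's own helpers.

```go
package main

import "fmt"

// isReplaceable reports whether a kind is replaceable per the ranges above.
func isReplaceable(kind int) bool {
	return kind == 0 || kind == 3 || (kind >= 10000 && kind <= 19999)
}

// isParameterizedReplaceable reports whether a kind is also keyed by d-tag.
func isParameterizedReplaceable(kind int) bool {
	return kind >= 30000 && kind <= 39999
}

// dTagValue returns the value of the first d-tag, or "" if there is none.
func dTagValue(tags [][]string) string {
	for _, tag := range tags {
		if len(tag) >= 2 && tag[0] == "d" {
			return tag[1]
		}
	}
	return ""
}

func main() {
	fmt.Println(isReplaceable(0), isReplaceable(1), isReplaceable(10002))
	fmt.Println(isParameterizedReplaceable(30023))
	fmt.Println(dTagValue([][]string{{"t", "nostr"}, {"d", "my-article"}}))
}
```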

Event Deletion (NIP-09)

Delete events (kind 5) are processed via graph traversal:

MATCH (target:Event {id: $targetId})
MATCH (delete:Event {kind: 5})-[:REFERENCES]->(target)
WHERE delete.pubkey = $pubkey OR delete.pubkey IN $admins
RETURN delete.id

Only same-author or admin deletions are allowed.
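
The authorization rule can be expressed as a small predicate. The sketch below assumes that admins are identified by a list of hex-encoded pubkeys, as the $admins parameter suggests; mayDelete is a hypothetical name, not a function from the ORLY codebase.

```go
package main

import "fmt"

// mayDelete reports whether a kind 5 event signed by deleter may remove
// an event written by targetAuthor: same author, or an admin pubkey.
func mayDelete(deleter, targetAuthor string, admins []string) bool {
	if deleter == targetAuthor {
		return true
	}
	for _, admin := range admins {
		if deleter == admin {
			return true
		}
	}
	return false
}

func main() {
	admins := []string{"adminkey"}
	fmt.Println(mayDelete("alice", "alice", admins))    // same author
	fmt.Println(mayDelete("adminkey", "alice", admins)) // admin
	fmt.Println(mayDelete("bob", "alice", admins))      // neither
}
```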

Comparison with Other Backends

| Feature | Badger | DGraph | Neo4j |
|---|---|---|---|
| Storage Type | Key-value | Graph (distributed) | Graph (native) |
| Query Language | Custom indexes | DQL | Cypher |
| Tag Queries | Index lookups | Graph traversal | Native relationships |
| Scaling | Single-node | Distributed | Cluster/Causal cluster |
| Memory Usage | Low | Medium | High |
| Setup Complexity | Minimal | Medium | Medium |
| Best For | Small relays | Large distributed | Relationship-heavy |

Development Guide

Adding New Indexes

1. Update schema.go with the new index definition
2. Add it to the applySchema() function
3. Restart the relay to apply the schema changes

Example:

CREATE FULLTEXT INDEX event_content_fulltext IF NOT EXISTS
FOR (e:Event) ON EACH [e.content]
OPTIONS {indexConfig: {`fulltext.analyzer`: 'english'}}

Custom Queries

To add custom query methods:

1. Add a method to query-events.go
2. Build the Cypher query with parameterization
3. Use ExecuteRead() or ExecuteWrite() as appropriate
4. Parse results with parseEventsFromResult()

Testing

Because the tests depend on Neo4j, they require a running Neo4j instance:

# Start Neo4j via Docker
docker run -d --name neo4j-test \
  -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/test \
  neo4j:5.15

# Run tests
ORLY_NEO4J_URI="bolt://localhost:7687" \
ORLY_NEO4J_USER="neo4j" \
ORLY_NEO4J_PASSWORD="test" \
go test ./pkg/neo4j/...

# Cleanup
docker rm -f neo4j-test

Future Enhancements

- Full-text Search: Leverage Neo4j's full-text indexes for content search
- Graph Analytics: Implement social graph metrics (centrality, communities)
- Advanced Queries: Support NIP-50 search via Cypher full-text capabilities
- Clustering: Deploy Neo4j cluster for high availability
- APOC Procedures: Utilize APOC library for advanced graph algorithms
- Caching Layer: Implement query result caching similar to Badger backend

Troubleshooting

Connection Issues

# Test connectivity
cypher-shell -a bolt://localhost:7687 -u neo4j -p password

# Check Neo4j logs
docker logs neo4j

Performance Issues

// View query execution plan
EXPLAIN MATCH (e:Event) WHERE e.kind = 1 RETURN e LIMIT 10

// Profile query performance
PROFILE MATCH (e:Event)-[:AUTHORED_BY]->(a:Author) RETURN e, a LIMIT 10

Schema Issues

// List all constraints
SHOW CONSTRAINTS

// List all indexes
SHOW INDEXES

// Drop and recreate the event ID constraint
DROP CONSTRAINT event_id_unique IF EXISTS;
CREATE CONSTRAINT event_id_unique FOR (e:Event) REQUIRE e.id IS UNIQUE;

License

This Neo4j backend implementation follows the same license as the ORLY relay project.
