Pubkey Graph System
Overview
The pubkey graph system provides efficient social graph queries by creating bidirectional, direction-aware edges between events and pubkeys in the ORLY relay.
Architecture
1. Pubkey Serial Assignment
Purpose: Compress 32-byte pubkeys to 5-byte serials for space efficiency.
Tables:
- pks|pubkey_hash(8)|serial(5) - Hash-to-serial lookup (16 bytes)
- spk|serial(5) → 32-byte pubkey (value) - Serial-to-pubkey reverse lookup
Space Savings: Each graph edge saves 27 bytes per pubkey reference (32 → 5 bytes).
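A minimal sketch of how these two keys could be assembled. The big-endian serial encoding and the use of a truncated SHA-256 digest for pubkey_hash are assumptions, since the layout above specifies neither; the function names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// putUint40 writes the low 40 bits of v into dst[0:5], big-endian (assumed byte order).
func putUint40(dst []byte, v uint64) {
	dst[0] = byte(v >> 32)
	dst[1] = byte(v >> 24)
	dst[2] = byte(v >> 16)
	dst[3] = byte(v >> 8)
	dst[4] = byte(v)
}

// pksKey builds pks|pubkey_hash(8)|serial(5) = 16 bytes.
// ASSUMPTION: pubkey_hash is the first 8 bytes of SHA-256(pubkey).
func pksKey(pubkey [32]byte, serial uint64) []byte {
	key := make([]byte, 16)
	copy(key, "pks")
	h := sha256.Sum256(pubkey[:])
	copy(key[3:11], h[:8])
	putUint40(key[11:], serial)
	return key
}

// spkKey builds spk|serial(5) = 8 bytes; the 32-byte pubkey is stored as the value.
func spkKey(serial uint64) []byte {
	key := make([]byte, 8)
	copy(key, "spk")
	putUint40(key[3:], serial)
	return key
}

func main() {
	var pk [32]byte
	pk[0] = 0xab
	fmt.Printf("pks key: %x (%d bytes)\n", pksKey(pk, 1), len(pksKey(pk, 1)))
	fmt.Printf("spk key: %x (%d bytes)\n", spkKey(1), len(spkKey(1)))
}
```

A 5-byte serial can address 2^40 distinct pubkeys, which is why the serial is truncated from a wider integer rather than stored as a full uint64.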
2. Graph Edge Storage
Bidirectional edges with metadata:
EventPubkeyGraph (Forward)
epg|event_serial(5)|pubkey_serial(5)|kind(2)|direction(1) = 16 bytes
PubkeyEventGraph (Reverse)
peg|pubkey_serial(5)|kind(2)|direction(1)|event_serial(5) = 16 bytes
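The two layouts can be sketched as key builders. This is an illustration under assumptions: big-endian encoding for serials and kind is assumed, and epgKey and pegKey are hypothetical names rather than functions from the relay.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// putUint40 writes the low 40 bits of v into dst[0:5], big-endian (assumed byte order).
func putUint40(dst []byte, v uint64) {
	dst[0] = byte(v >> 32)
	dst[1] = byte(v >> 24)
	dst[2] = byte(v >> 16)
	dst[3] = byte(v >> 8)
	dst[4] = byte(v)
}

// epgKey builds epg|event_serial(5)|pubkey_serial(5)|kind(2)|direction(1).
func epgKey(eventSerial, pubkeySerial uint64, kind uint16, direction byte) []byte {
	key := make([]byte, 16)
	copy(key, "epg")
	putUint40(key[3:8], eventSerial)
	putUint40(key[8:13], pubkeySerial)
	binary.BigEndian.PutUint16(key[13:15], kind)
	key[15] = direction
	return key
}

// pegKey builds peg|pubkey_serial(5)|kind(2)|direction(1)|event_serial(5).
func pegKey(pubkeySerial uint64, kind uint16, direction byte, eventSerial uint64) []byte {
	key := make([]byte, 16)
	copy(key, "peg")
	putUint40(key[3:8], pubkeySerial)
	binary.BigEndian.PutUint16(key[8:10], kind)
	key[10] = direction
	putUint40(key[11:16], eventSerial)
	return key
}

func main() {
	fmt.Printf("%x\n", epgKey(100, 7, 1, 0))
	fmt.Printf("%x\n", pegKey(7, 1, 0, 100))
}
```

Placing kind and direction before event_serial in the reverse key is what lets a query narrow by kind with a longer prefix instead of decoding events.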
3. Direction Byte
The direction byte distinguishes relationship types:
| Value | Direction | From Event Perspective | From Pubkey Perspective |
|---|---|---|---|
| 0 | Author | This pubkey is the event author | I am the author of this event |
| 1 | P-Tag Out | Event references this pubkey | (not used in reverse) |
| 2 | P-Tag In | (not used in forward) | I am referenced by this event |
Location in keys:
- EventPubkeyGraph: Byte 15, zero-indexed (after 3+5+5+2)
- PubkeyEventGraph: Byte 10, zero-indexed (after 3+5+2)
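As an illustration, the direction values and their positions can be expressed as constants plus a small reader. The offsets here are derived from the two key layouts given earlier (3+5+5+2 = 15 for epg, 3+5+2 = 10 for peg); the constant and function names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Direction values from the table above.
const (
	DirAuthor  byte = 0 // pubkey is the event author
	DirPTagOut byte = 1 // event references the pubkey (forward index only)
	DirPTagIn  byte = 2 // pubkey is referenced by the event (reverse index only)
)

// directionOf reads the direction byte from a 16-byte graph edge key.
func directionOf(key []byte) (byte, error) {
	if len(key) != 16 {
		return 0, errors.New("graph edge keys are 16 bytes")
	}
	switch string(key[:3]) {
	case "epg":
		return key[15], nil // prefix(3) + event(5) + pubkey(5) + kind(2)
	case "peg":
		return key[10], nil // prefix(3) + pubkey(5) + kind(2)
	}
	return 0, errors.New("not a graph edge key")
}

func main() {
	key := make([]byte, 16)
	copy(key, "peg")
	key[10] = DirPTagIn
	d, err := directionOf(key)
	fmt.Println(d, err)
}
```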
Graph Edge Creation
When an event is saved:
- Extract pubkeys:
  - Event author: ev.Pubkey
  - P-tags: All ["p", "<hex-pubkey>", ...] tags
- Get or create serials: Each unique pubkey gets a monotonic 5-byte serial
- Create bidirectional edges:
  For author (pubkey = event author):
  epg|event_serial|author_serial|kind|0 (author edge)
  peg|author_serial|kind|0|event_serial (is-author edge)
  For each p-tag (referenced pubkey):
  epg|event_serial|ptag_serial|kind|1 (outbound reference)
  peg|ptag_serial|kind|2|event_serial (inbound reference)
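The edge-creation step can be sketched as a function that turns already-assigned serials into the list of edges to write. The edge struct and edgesForEvent are hypothetical and stand in for the encoded keys; the author is skipped when it reappears among the p-tags, in line with the edge cases described later in this document.

```go
package main

import "fmt"

// edge is a hypothetical, decoded view of one graph key.
type edge struct {
	Table        string // "epg" (forward) or "peg" (reverse)
	EventSerial  uint64
	PubkeySerial uint64
	Kind         uint16
	Direction    byte // 0 = author, 1 = p-tag out, 2 = p-tag in
}

// edgesForEvent returns two edges for the author and two per unique p-tag.
func edgesForEvent(eventSerial uint64, kind uint16, author uint64, ptags []uint64) []edge {
	edges := []edge{
		{"epg", eventSerial, author, kind, 0},
		{"peg", eventSerial, author, kind, 0},
	}
	seen := map[uint64]bool{author: true}
	for _, p := range ptags {
		if seen[p] {
			continue // duplicate p-tag, or the author tagging themselves
		}
		seen[p] = true
		edges = append(edges,
			edge{"epg", eventSerial, p, kind, 1},
			edge{"peg", eventSerial, p, kind, 2},
		)
	}
	return edges
}

func main() {
	for _, e := range edgesForEvent(10, 1, 100, []uint64{200, 100, 200}) {
		fmt.Printf("%+v\n", e)
	}
}
```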
Query Patterns
Find all events authored by a pubkey
Prefix scan: peg|pubkey_serial|*|0|*
Filter: direction == 0 (author)
Find all events mentioning a pubkey (inbound p-tags)
Prefix scan: peg|pubkey_serial|*|2|*
Filter: direction == 2 (p-tag inbound)
Find all kind-1 events mentioning a pubkey
Prefix scan: peg|pubkey_serial|0x0001|2|*
Exact match: kind == 1, direction == 2
Find all pubkeys referenced by an event (outbound p-tags)
Prefix scan: epg|event_serial|*|*|1
Filter: direction == 1 (p-tag outbound)
Find the author of an event
Prefix scan: epg|event_serial|*|*|0
Filter: direction == 0 (author)
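The reverse-index patterns above can be sketched against a sorted in-memory key list, which stands in for an ordered key-value store. Everything here is illustrative: the helper names are hypothetical and big-endian encoding is an assumption. Because kind precedes direction in the peg key, a direction-only query scans the 8-byte peg|pubkey_serial prefix and filters on byte 10, while a kind-plus-direction query can use an 11-byte prefix with no filter.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"sort"
)

// putUint40 writes the low 40 bits of v into dst[0:5], big-endian (assumed byte order).
func putUint40(dst []byte, v uint64) {
	dst[0] = byte(v >> 32)
	dst[1] = byte(v >> 24)
	dst[2] = byte(v >> 16)
	dst[3] = byte(v >> 8)
	dst[4] = byte(v)
}

// pegKey builds peg|pubkey_serial(5)|kind(2)|direction(1)|event_serial(5).
func pegKey(pubkey uint64, kind uint16, dir byte, event uint64) []byte {
	k := make([]byte, 16)
	copy(k, "peg")
	putUint40(k[3:8], pubkey)
	binary.BigEndian.PutUint16(k[8:10], kind)
	k[10] = dir
	putUint40(k[11:16], event)
	return k
}

// sortedKeys returns the keys in byte order, as an ordered store would iterate them.
func sortedKeys(keys ...[]byte) [][]byte {
	sort.Slice(keys, func(a, b int) bool { return bytes.Compare(keys[a], keys[b]) < 0 })
	return keys
}

// scan seeks to the first key >= prefix and walks forward while the prefix matches.
func scan(sorted [][]byte, prefix []byte, keep func([]byte) bool) [][]byte {
	i := sort.Search(len(sorted), func(i int) bool {
		return bytes.Compare(sorted[i], prefix) >= 0
	})
	var out [][]byte
	for ; i < len(sorted) && bytes.HasPrefix(sorted[i], prefix); i++ {
		if keep == nil || keep(sorted[i]) {
			out = append(out, sorted[i])
		}
	}
	return out
}

// eventsByDirection scans peg|pubkey_serial and filters on the direction byte.
func eventsByDirection(sorted [][]byte, pubkey uint64, dir byte) [][]byte {
	prefix := pegKey(pubkey, 0, 0, 0)[:8]
	return scan(sorted, prefix, func(k []byte) bool { return k[10] == dir })
}

// eventsByKindAndDirection uses the 11-byte prefix peg|pubkey_serial|kind|direction.
func eventsByKindAndDirection(sorted [][]byte, pubkey uint64, kind uint16, dir byte) [][]byte {
	return scan(sorted, pegKey(pubkey, kind, dir, 0)[:11], nil)
}

func main() {
	keys := sortedKeys(
		pegKey(7, 1, 0, 100), // pubkey 7 authored kind-1 event 100
		pegKey(7, 1, 2, 101), // kind-1 event 101 mentions pubkey 7
		pegKey(7, 7, 2, 102), // kind-7 event 102 mentions pubkey 7
		pegKey(8, 1, 2, 103), // kind-1 event 103 mentions pubkey 8
	)
	fmt.Println(len(eventsByDirection(keys, 7, 2)), len(eventsByKindAndDirection(keys, 7, 1, 2)))
}
```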
Implementation Details
Thread Safety
The GetOrCreatePubkeySerial function:
- Opens a read transaction to check for an existing serial
- If none is found, gets the next sequence number
- Opens a write transaction and double-checks, to handle race conditions
- Returns the existing serial if another goroutine created it concurrently
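The check, reserve, double-check pattern can be sketched with an in-memory map standing in for the database. This is an assumption-laden illustration, not the relay's code: serialStore and getOrCreate are hypothetical, a sync.RWMutex plays the role of read and write transactions, and an atomic counter plays the role of the sequence.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// serialStore is a hypothetical in-memory stand-in for the pks/spk tables.
type serialStore struct {
	mu      sync.RWMutex
	serials map[[32]byte]uint64
	next    atomic.Uint64
}

func newSerialStore() *serialStore {
	return &serialStore{serials: make(map[[32]byte]uint64)}
}

func (s *serialStore) getOrCreate(pubkey [32]byte) uint64 {
	// Step 1: read-only check for an existing serial.
	s.mu.RLock()
	serial, ok := s.serials[pubkey]
	s.mu.RUnlock()
	if ok {
		return serial
	}
	// Step 2: reserve the next sequence number. It is discarded if we lose the race.
	candidate := s.next.Add(1)
	// Step 3: take the write lock and check again before storing.
	s.mu.Lock()
	defer s.mu.Unlock()
	if existing, ok := s.serials[pubkey]; ok {
		return existing // another goroutine created it concurrently
	}
	s.serials[pubkey] = candidate
	return candidate
}

func main() {
	store := newSerialStore()
	var pk [32]byte
	pk[0] = 1
	results := make([]uint64, 8)
	var wg sync.WaitGroup
	for i := range results {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = store.getOrCreate(pk)
		}(i)
	}
	wg.Wait()
	fmt.Println(results) // all eight entries hold the same serial
}
```

One consequence of reserving before the write lock is that serials stay unique and increasing but may have gaps when two goroutines race on the same pubkey.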
Deduplication
The save-event function deduplicates pubkeys before creating serials:
- Map keyed by hex-encoded pubkey
- Prevents duplicate edges when author is also in p-tags
Edge Cases
- Author in p-tags: Only creates author edge (direction=0), skips duplicate p-tag edge
- Invalid p-tags: Silently skipped if hex decode fails or length != 32 bytes
- No p-tags: Only author edge is created
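A sketch of pubkey collection that applies the deduplication rule and these edge cases. The event struct is a hypothetical minimal shape (only ev.Pubkey is named in this document), and collectPubkeys and pTag are illustrative helpers.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// event is a hypothetical minimal event shape for this sketch.
type event struct {
	Pubkey []byte     // 32-byte author pubkey
	Tags   [][]string // e.g. ["p", "<hex-pubkey>", ...]
}

// pTag builds a p-tag for the given raw pubkey.
func pTag(pubkey []byte) []string {
	return []string{"p", hex.EncodeToString(pubkey)}
}

// collectPubkeys returns the author and the unique, valid p-tag pubkeys (hex),
// excluding the author so that only the author edge is created for it.
func collectPubkeys(ev event) (author string, ptags []string) {
	author = hex.EncodeToString(ev.Pubkey)
	seen := map[string]struct{}{author: {}}
	for _, tag := range ev.Tags {
		if len(tag) < 2 || tag[0] != "p" {
			continue
		}
		raw, err := hex.DecodeString(tag[1])
		if err != nil || len(raw) != 32 {
			continue // invalid p-tag: silently skipped
		}
		key := hex.EncodeToString(raw) // normalizes to lowercase hex
		if _, dup := seen[key]; dup {
			continue
		}
		seen[key] = struct{}{}
		ptags = append(ptags, key)
	}
	return author, ptags
}

func main() {
	alice := make([]byte, 32)
	bob := make([]byte, 32)
	alice[0], bob[0] = 0x01, 0x02
	ev := event{Pubkey: alice, Tags: [][]string{
		pTag(bob), pTag(alice), pTag(bob), {"p", "not-hex"}, {"e", "abc"},
	}}
	author, ptags := collectPubkeys(ev)
	fmt.Println(author[:8], len(ptags))
}
```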
Performance Characteristics
Space Efficiency
Per event with N unique pubkeys:
- Old approach (storing full pubkeys): N × 32 bytes = 32N bytes
- New approach (using serials): N × 5 bytes = 5N bytes
- Savings: 27N bytes per event (84% reduction)
Example: Event with author + 10 p-tags:
- Old: 11 × 32 = 352 bytes
- New: 11 × 5 = 55 bytes
- Saved: 297 bytes (84%)
Query Performance
- Pubkey lookup: O(1) hash lookup via 8-byte truncated hash
- Serial generation: O(1) atomic increment
- Graph queries: Sequential scan bounded by a key prefix, so only edges under the queried serial are read
- Kind filtering: Built into key ordering, no event decoding needed
Testing
Comprehensive tests verify:
- ✅ Serial assignment and deduplication
- ✅ Bidirectional graph edge creation
- ✅ Multiple events sharing pubkeys
- ✅ Direction byte correctness
- ✅ Edge cases (invalid pubkeys, non-existent keys)
Future Query APIs
The graph structure supports efficient queries for:
- Social Graph Queries:
  - Who does Alice follow? (p-tags authored by Alice)
  - Who follows Bob? (p-tags referencing Bob)
  - Common connections between Alice and Bob
- Event Discovery:
  - All replies to Alice's events (kind-1 events with p-tag to Alice)
  - All events Alice has replied to (kind-1 events by Alice with p-tags)
  - Quote reposts, mentions, reactions by event kind
- Analytics:
  - Most-mentioned pubkeys (count p-tag-in edges)
  - Most active authors (count author edges)
  - Interaction patterns by kind
Migration Notes
This is a new index that:
- Runs alongside existing event indexes
- Is populated automatically for all new events
- Does NOT require reindexing existing events (yet)
- Can be backfilled via a migration if needed
To backfill existing events, run a migration that:
- Iterates all events
- Extracts pubkeys and creates serials
- Creates graph edges for each event
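The backfill steps can be sketched as a loop over stored events. This is a hypothetical in-memory model, not a migration for the real database: storedEvent, graphIndex, and backfill are invented for illustration, pubkeys are plain strings, and edges are rendered as text rather than 16-byte keys.

```go
package main

import "fmt"

// storedEvent is a hypothetical, already-decoded event record.
type storedEvent struct {
	Serial uint64
	Kind   uint16
	Author string
	PTags  []string
}

// graphIndex is a hypothetical in-memory stand-in for the serial and edge tables.
type graphIndex struct {
	serials map[string]uint64
	edges   map[string]struct{} // keyed by edge, so rewriting an edge is a no-op
}

func newGraphIndex() *graphIndex {
	return &graphIndex{serials: map[string]uint64{}, edges: map[string]struct{}{}}
}

// serialFor returns the existing serial for pubkey or assigns the next one.
func (g *graphIndex) serialFor(pubkey string) uint64 {
	if s, ok := g.serials[pubkey]; ok {
		return s
	}
	s := uint64(len(g.serials) + 1)
	g.serials[pubkey] = s
	return s
}

// indexEvent creates the author edges and one edge pair per unique p-tag.
func (g *graphIndex) indexEvent(ev storedEvent) {
	author := g.serialFor(ev.Author)
	g.edges[fmt.Sprintf("epg|%d|%d|%d|0", ev.Serial, author, ev.Kind)] = struct{}{}
	g.edges[fmt.Sprintf("peg|%d|%d|0|%d", author, ev.Kind, ev.Serial)] = struct{}{}
	seen := map[string]bool{ev.Author: true}
	for _, p := range ev.PTags {
		if seen[p] {
			continue
		}
		seen[p] = true
		s := g.serialFor(p)
		g.edges[fmt.Sprintf("epg|%d|%d|%d|1", ev.Serial, s, ev.Kind)] = struct{}{}
		g.edges[fmt.Sprintf("peg|%d|%d|2|%d", s, ev.Kind, ev.Serial)] = struct{}{}
	}
}

// backfill indexes every event and returns how many were processed.
func backfill(events []storedEvent, g *graphIndex) int {
	for _, ev := range events {
		g.indexEvent(ev)
	}
	return len(events)
}

func main() {
	g := newGraphIndex()
	n := backfill([]storedEvent{
		{Serial: 1, Kind: 1, Author: "alice", PTags: []string{"bob", "alice"}},
		{Serial: 2, Kind: 1, Author: "bob"},
	}, g)
	fmt.Println(n, len(g.serials), len(g.edges))
}
```

Because each edge is fully determined by its key, rerunning the loop over the same events rewrites identical keys, which suggests a backfill of this shape could be restarted safely after an interruption.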