replicatr is a nostr relay written in pure Go, aimed at becoming a single, modular, and extensible reference implementation of the nostr protocol as described in the nostr NIP (nostr implementation possibilities) specification.

In its initial form it will use a badger data store and interface with the Internet Computer database for out-of-band replication and, potentially, ICP-based relay subscription payments. Its primary mechanism of event propagation will be a publish/subscribe system based on the Scionic Merkle DAG content addressing scheme, which aggregates user events with associated media and complex data types such as Git repositories and immutable, tree-structured filesystem data.

on-demand data distribution and replication

Social networks function as a generalised publish-subscribe distributed database system.

In order to implement this, users are connected to one or a few nodes in the network: a primary, and secondaries that pick up the slack if the primary is unresponsive.

Their messages are stored on these nodes. When other users subscribe to their published events, the subscribers' nodes set up subscriptions with the nodes the publishing user is attached to. When new events are published, they are propagated to the subscribing relays, which deliver them to the subscribed users.
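The propagation described above can be sketched in a few lines of Go. This is a minimal in-memory illustration, not replicatr's implementation: the `Event` and `Relay` types and the `Subscribe` and `Publish` methods are hypothetical names, and a real relay would do this over network connections with signature verification.

```go
package main

import "fmt"

// Event is a simplified, hypothetical stand-in for a nostr event.
type Event struct {
	ID      string
	PubKey  string
	Content string
}

// Relay stores events and forwards them to relays subscribed to the author.
type Relay struct {
	Name   string
	Stored map[string]Event    // events by ID
	subs   map[string][]*Relay // author pubkey -> subscribing relays
}

// NewRelay returns an empty relay with the given name.
func NewRelay(name string) *Relay {
	return &Relay{Name: name, Stored: map[string]Event{}, subs: map[string][]*Relay{}}
}

// Subscribe registers peer to receive future events authored by pubkey.
func (r *Relay) Subscribe(pubkey string, peer *Relay) {
	r.subs[pubkey] = append(r.subs[pubkey], peer)
}

// Publish stores ev and propagates it to the subscribing relays.
// It returns false if the event was already stored, which also stops
// an event from circulating forever between mutually subscribed relays.
func (r *Relay) Publish(ev Event) bool {
	if _, ok := r.Stored[ev.ID]; ok {
		return false
	}
	r.Stored[ev.ID] = ev
	for _, peer := range r.subs[ev.PubKey] {
		peer.Publish(ev)
	}
	return true
}

func main() {
	home := NewRelay("home")     // the relay the author is attached to
	remote := NewRelay("remote") // the relay of a user following the author
	home.Subscribe("alice", remote)
	home.Publish(Event{ID: "e1", PubKey: "alice", Content: "hello"})
	fmt.Println("events on remote:", len(remote.Stored))
}
```

Only events by authors that a peer has subscribed to are forwarded, so a relay holds just the data its own users have asked for.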

This model applies whether all of the replicas are owned by one organisation that distributes the data geographically for storage and network efficiency, or many organisations and individuals run replicas of data on the network.

There are considerable challenges in engineering these systems so that they neither over-replicate data, burdening individual nodes with data that is never actually used or propagated from them, nor allow data to become unavailable or insufficiently replicated.

economising replication costs

Thus, replicatr adds logic to the standard nostr relay model that efficiently provides a compact distributed database index of users and their aggregated event streams, and allows relays to collaborate effectively to ensure availability without adding excessive redundancy and burdening the network with idle data.

The indexing is done using a new technique called "Scionic Merkle DAGs" (SMD), https://github.com/HORNET-Storage/scionic-merkletree, which enables relays to quickly query each other for copies of data that their users are requesting or interlinking with their own data (via reply threads and tagging). This lets relays efficiently replicate and distribute users' published events, while enabling users to shard their data storage efficiently and minimise infrastructure costs and propagation delays. This covers not just simple text-based events but also media files and complex data structures such as Git repositories.
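To illustrate the general idea of content addressing over chunked data, here is a minimal sketch in Go. It is not the Scionic Merkle DAG structure itself: it builds a plain binary Merkle tree with SHA-256, and the `chunk` and `merkleRoot` functions and the chunk size are assumptions made for this example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// chunk splits data into pieces of at most size bytes. size must be positive.
func chunk(data []byte, size int) [][]byte {
	var out [][]byte
	for len(data) > size {
		out = append(out, data[:size])
		data = data[size:]
	}
	return append(out, data)
}

// merkleRoot hashes every chunk, then repeatedly hashes adjacent pairs of
// hashes until one root remains. An unpaired hash is carried up unchanged.
// chunks must contain at least one element.
func merkleRoot(chunks [][]byte) [32]byte {
	level := make([][32]byte, len(chunks))
	for i, c := range chunks {
		level[i] = sha256.Sum256(c)
	}
	for len(level) > 1 {
		var next [][32]byte
		for i := 0; i < len(level); i += 2 {
			if i+1 == len(level) {
				next = append(next, level[i])
				continue
			}
			pair := make([]byte, 0, 64)
			pair = append(pair, level[i][:]...)
			pair = append(pair, level[i+1][:]...)
			next = append(next, sha256.Sum256(pair))
		}
		level = next
	}
	return level[0]
}

func main() {
	media := []byte("an image or other media blob attached to an event")
	root := merkleRoot(chunk(media, 16))
	fmt.Println("root:", hex.EncodeToString(root[:]))
}
```

Two relays holding the same data compute the same root, so a single short hash is enough to ask a peer whether it has a given object, and individual chunks can be fetched and verified independently.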

extensible storage and network implementations

The standard connectivity protocol of nostr is websockets, and the standard encoding is JSON. These were chosen as a baseline to enable easy integration with web browser JavaScript execution engines, which use these two technologies as their primary methods of transferring data.

However, for bulk replication, and for scalability generally, JSON is a terrible, extremely inefficient format with a complex syntax, and websockets add excessive complexity by layering on top of HTTP and TCP. For rapid synchronisation between two nodes on a network, a UDP-based, binary-encoded format is far more suitable. Any number of implementations could be added for this; at minimum, QUIC and Protocol Buffers should be added, as these are widely supported and more efficient.
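The size overhead of JSON can be seen with a small comparison. The sketch below is an assumption-laden illustration, not a proposed wire format and not Protocol Buffers: the `Event` struct holds only a subset of the fields of a nostr event, and `encodeBinary` uses a hand-rolled fixed layout invented for this example.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// Event holds a subset of nostr event fields; tags and signature are omitted.
type Event struct {
	ID        string `json:"id"`     // hex encoded
	PubKey    string `json:"pubkey"` // hex encoded
	CreatedAt int64  `json:"created_at"`
	Kind      uint16 `json:"kind"`
	Content   string `json:"content"`
}

// encodeJSON returns the standard JSON encoding of ev.
func encodeJSON(ev Event) ([]byte, error) {
	return json.Marshal(ev)
}

// encodeBinary writes ev in a hypothetical layout: raw ID bytes, raw pubkey
// bytes, 8-byte timestamp, 2-byte kind, 4-byte content length, content.
func encodeBinary(ev Event) ([]byte, error) {
	id, err := hex.DecodeString(ev.ID)
	if err != nil {
		return nil, err
	}
	pk, err := hex.DecodeString(ev.PubKey)
	if err != nil {
		return nil, err
	}
	var buf bytes.Buffer
	buf.Write(id)
	buf.Write(pk)
	// Writes to a bytes.Buffer do not fail, so these errors are ignored.
	binary.Write(&buf, binary.BigEndian, ev.CreatedAt)
	binary.Write(&buf, binary.BigEndian, ev.Kind)
	binary.Write(&buf, binary.BigEndian, uint32(len(ev.Content)))
	buf.WriteString(ev.Content)
	return buf.Bytes(), nil
}

func main() {
	const h = "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"
	ev := Event{ID: h, PubKey: h, CreatedAt: 1700000000, Kind: 1, Content: "hello"}
	js, _ := encodeJSON(ev)
	bin, err := encodeBinary(ev)
	if err != nil {
		panic(err)
	}
	fmt.Println("JSON bytes:", len(js), "binary bytes:", len(bin))
}
```

Most of the saving comes from sending hashes and keys as raw bytes rather than hex strings, and from dropping field names and punctuation.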

Likewise, for bulk synchronisation without a latency minimisation requirement, larger, more complex multiplexed storage schemes that use an efficiently encoded form of the map of data objects associated with user accounts would be more suitable than piecemeal delivery of individual events. The same applies to efficiently discovering the missing pieces on either side of a connection, so as to minimise traffic and processing.
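Discovering the missing pieces on either side of a connection amounts to a set difference over event identifiers. The following is a naive sketch in Go that assumes both relays exchange their full ID lists; the `missing` and `reconcile` functions are hypothetical names, and a production system would more likely exchange compact summaries rather than complete lists.

```go
package main

import (
	"fmt"
	"sort"
)

// missing returns the IDs present in have but absent from peer, sorted.
func missing(have, peer []string) []string {
	seen := make(map[string]struct{}, len(peer))
	for _, id := range peer {
		seen[id] = struct{}{}
	}
	var out []string
	for _, id := range have {
		if _, ok := seen[id]; !ok {
			out = append(out, id)
		}
	}
	sort.Strings(out)
	return out
}

// reconcile compares the ID sets of two relays. fetch lists what the local
// relay lacks, and send lists what the remote relay lacks.
func reconcile(local, remote []string) (fetch, send []string) {
	return missing(remote, local), missing(local, remote)
}

func main() {
	local := []string{"e1", "e2", "e4"}
	remote := []string{"e2", "e3", "e4", "e5"}
	fetch, send := reconcile(local, remote)
	fmt.Println("fetch from remote:", fetch)
	fmt.Println("send to remote:", send)
}
```

After one such exchange, each side transfers only the events the other lacks, instead of replaying the whole stream.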

leveraging existing distributed data storage systems

In addition to aiming to provide a framework within replicatr to enable a multiplicity of data storage and data replication systems, a key target for the initial release of this project is to integrate it with one or more highly consistent distributed database ledger protocols, aka "blockchains", both as reliable replicas of the indexes of data and for directly storing the data, where the protocol in question has this facility built in.

This will be implemented in the initial release of replicatr on the Internet Computer Protocol.

multimedia distribution

In addition, with the assistance of the SMD data chunking/indexing system, it is possible to distribute not only the data of simple text-based events, but also associated media.

Two main functionalities are targeted for this. The first is the capture and caching of data stored on referenced CDNs, with relays creating special event types that aggregate these references so that relays can distribute the data directly, instead of forcing clients to reveal access timing metadata to such CDNs, which is a serious privacy risk and surveillance method (aka "web bugs"). The second is enabling clients to publish the data directly to their relays.

In this way, the data can remain purely distributed across the relays as its primary network location, and users can aggregate their access without leaking their access metadata to potentially malicious surveillance operators.

As well as simple multimedia, it should be possible and simple for users to publish complex data types, such as filesystem archives and the more complex, branched, Merkle DAG based Git-style filesystem with change history. This would also enable the distribution of software and source code with auditability via open source, without centralised silos that have the potential to suppress the distribution of user data against the interests of users and the community at large.

conclusion

replicatr aims to build a resilient, efficiently redundant distribution system for event publication and associated media that is loosely coupled, extensible, and largely self-healing, and that can be easily monetised directly via subscription fees and per-access micropayment systems.

structure of this repository

The pieces from which this repository is composed are taken from a scattered collection of repositories mostly written by fiatjaf and gathered together to become a single reference point for implementing nostr relays in pure Go.

pkg/nostr

This directory contains a revised form of the content found at github.com/nbd-wtf/go-nostr, rewritten as correct, idiomatic and properly documented Go code, covering everything required by replicatr to implement the nostr protocol.