Netfarm is an attempt at creating a distributed, trustless object system. We develop Netfarm to ﬁll a perceived void in the rapidly-expanding “decentralisation” bubble, to utilise the greater fault tolerance of a distributed hash table and simplify programming via a client language’s meta object protocol, and to devise less hierarchical moderation techniques that do not leave them at the whim of their servers’ operators and administrators.
The design of Netfarm is not ideal due to historical and planning issues; but it should serve well as an introduction to trustless object systems, which we believe can be easier to program and have better throughput than some of the models that are used frequently today, such as blockchains (demonstrated to be usable for programming in Ethereum) and various federated protocols (such as ActivityPub, notably used in Mastodon among other “fediverse” blogging servers).
To summarise the design, Netfarm implements a synchronous message-passing object system (like Smalltalk or Self) hosted on a distributed hash table. A Netfarm network can reach eventual consistency as state is either so unlikely to collide, such as objects which are stored by hashes, which are probably less likely to collide than for a computer to incorrectly compute a hash; or it is simple to unify, such as computed values. However, Netfarm does not decide on an ordering of computed values, and so has fairly weak consistency guarantees.
Some of these points are made redundant by the preceding paragraphs, but it may be worthwhile to explain why we made those claims, and why they are desirable.
It is also not uncommon for federated systems to have clustering problems. If the developers of a system run a server in that system, it is likely they will encourage users to use their server, or not encourage users to use other servers. In one egregious system, the developers sell servers for their network. It’s likely that they’re all hosted in the same data centres; what happens if something there goes down? Ouch. This phenomenon is not only inﬂuenced by the developers, though. When one person tells their friends that they are moving onto a federated system, and provide their username (which includes the domain name of the server), it is likely their friends will use that server too.
If one is unhappy with the way their server is moderated, some ever-so-wise users remind them they are “free” to leave for another, which is incredibly impolite. We feel obligated to state that it’s analogous to "suggesting" that one should leave their country if they are not happy with how it is governed, which is usually considered by them to be a very lousy argument, and one we consider to be similar in sentiment to theirs (except that moving country is often harder than moving server).
The questions of node operation and moderation in Netfarm are fully separated, allowing users to decide with case-by-case granularity on what content they want to see, and to aggregate their peers’ decisions with various techniques.
We see the Netfarm system as a synthesis of various components:
However, a user should be able to modify and use any component separately, and should have extensive control over many aspects of the system, to implement any changes while still acting as a “client” of the system, that does not modify the system itself.
We would like to thank Luke Nuttall1 for providing a drawing of the Netfarm mascot Demeter.
We would also like to thank Robert Strandh for providing advice on optimising the Netfarm script machine and how Netfarm should interact with client code, and Gilbert Baumann for also suggesting optimisations for the script machine and several stylistic concerns, among other many other insightful discussions.
The boundaries that separate a Netfarm implementation and its host language are not yet clear. Several questions remain, such as how much a Netfarm node must know about the data it is verifying, and what it is expected to do for its clients.
We forsee that a new distributed object system will have to be created, to overcome ﬂaws that were introduced to Netfarm through its age and attempts to persue multiple goals. One such problem was that we chose the means of interaction with the network to be sending and receiving data, which creates an awkward split in mechanisms, between the creation of an object and objects sending messages to each other.
As such, we would like to remind the reader that they should take care to form their own opinions on how such a system should be structured from any problems they have noticed, and that Netfarm is not the end-all of distributed object systems. We would hope for something like Pierre-Joseph Proudhon’s dream in this quotation:
Je rêve une société où je serai guillotiné comme conservateur.
(I dream of a world in which I would be executed as a reactionary.)
A “block” in decentralise2 is represented in two ways, depending on how it is used:
multiple-value-call can be used to convert from the second representation to the ﬁrst, but there are currently no reasons why that should be useful.
The interesting set and uninteresting set for a decentralise2 system are represented as predicates, which usually should not change.
When the interesting set must be changed, the system is notiﬁed using a function which takes two set speciﬁers. A set speciﬁer is either:
The latter is very useful when sets are almost “inﬁnite”, such as in a distributed hash table where the change (from “rehashing” some types of distributed hash tables, or the operator deciding they can commit to storing more objects) can well exceed 2100, and is impractical to represent as a pair of lists.
Strings are always encoded using UTF-8 in the Netfarm systems. No buts, no ifs there. The sequence of code points used to form a string must be preserved in encoding and decoding.1
Integers are encoded in binary data in big endian format, i.e. such that the most signiﬁcant octet (or other unit) is written ﬁrst, and the least signiﬁcant is written last. This format is most common in networking applications, and is the most convenient for users of human languages that write left-to-right to read.
Octet vectors are written in hexadecimal, with the ﬁrst octet to be written or read appearing at the leftmost position in a vector, and the last at the right.
For example, 10 01 is the vector consisting of the octets 16 and 1, and can be interpreted as the integer 4097.
The ﬁrst component that was developed for Netfarm was the decentralise1 library, which allows a programmer to implement some kind of distributed network by writing the components that they are most interested in, and using provided implementations of the components they are not interested in writing.
A decentralise user may implement one or more of:
In a peer-to-peer context, servers and clients do not exist, as there is no distinction between them. This is still true for decentralise2, as servers do not distinguish between servers, clients, and any other program creating connections and sending and receiving messages. However, the client is written to retrieve speciﬁc objects for a user and provide a synchronous interface to them, and a system retrieves many objects asynchronously, and stores them in some kind of database; so it may be argued that there is something that looks like a client, and something that looks like a server.
On the other hand, a server also may handle accepting connections and parsing messages, and thus a system would not be a server, as it does not do those things; so we should stick with the term system to be precise in what a component does.
A connection connects a client and a server together, allowing the two to exchange messages between each other. decentralise2 is able to handle many diﬀerent types of connections, including in-memory passing connections and socketed connections that use the operating system’s networking capabilities, which only have to implement a simple protocol.
⇒connection [Protocol Class]
Note that a connection is implicitly started when it is instantiated.
⇒stop-connection connection [Generic Function]
Stop the connection, which should free any resources that creating the connection acquired, such as threads and sockets. This generic function uses the progn method combination, and thus every method should be qualiﬁed with progn (or :around).
⇒handle-message connection message [Generic Function]
Handle a message that was sent to the connection.
⇒handle-message (connection connection) message [Method]
The default method calls the message handler of the connection with the message.
⇒connection-message-handler connection [Accessor]
The message handler is a function that should be called with each received message. The default handler will wait for a new handler to be set, in order to prevent some race conditions, where a message could be received before the client has set a message handler.
⇒write-message connection message [Generic Function]
Write a message to a connection. The implementation of this must be thread safe, such that two threads can call write-message simultaneously, and will correctly write both messages.
The types of data that a connection can send are represented by a connection’s class, not unlike the stream classes such as fundamental-character-stream and fundamental-binary-stream in the Gray streams de-facto standard.
Note that, unlike Gray streams, all connections are expected to be bidirectional, but a connection that drops all outgoing messages, or never receives any incoming messages is arguably still correct.
⇒character-connection [Protocol Class]
A connection that can send blocks with vectors of characters (strings) as data.
⇒binary-connection [Protocol Class]
A connection that can send blocks with vectors of octets as data.
⇒threaded-connection [Protocol Class]
A connection that uses a thread to read messages.
⇒read-message connection [Generic Function]
Read a message from the connection, blocking until one is present, then returning it and T, or return something and NIL if a message cannot be read ever again (e.g the other node closed the connection).
⇒listener-loop connection [Generic Function]
The thread of the connection will call this function, and stop when it returns.
⇒listener-loop (connection threaded-connection) [Method]
Repeatedly calls read-message, handling its return values appropriately.
A passing connection passes its messages to another connection, sending messages that are written to the connection to the target. Two of these connections in a loop can be used to create a hidden socket.
Hidden sockets are very useful for testing, as they function entirely in the Lisp image, without acquiring external resources or having to encode and decode any messages.
⇒passing-connection-target connection [Accessor]
The connection that the passing connection should pass its messages to.
⇒passing-connection-direction connection [Reader]
The “direction” of a passing connection. Any object that provides context to the use of a connection may be supplied, and this object is printed when printing a passing connection.
The “name” of a passing connection, a string which provides context to the use of a connection. This defaults to "Passing-Connection", and is printed when printing a passing connection.
⇒make-hidden-socket &key class name initargs [Function]
Create two passing connections that are each others’ targets. The connections are instances of the provided class (defaulting to passing-connection), and are instantiated with the provided initargs.
A character connection that exchanges messages using a simple textual format, based on the wire protocol Nettle used.
As the name suggests, this was intended to be the default connection class for Netfarm; but the creation of Netfarm’s much more eﬃcient binary format suggests that we should use a binary connection class as default.
Nonetheless, it is quite easy to read, and superﬁcially looks like the HTTP/1.0 wire protocol. Messages begin with unique “verbs”, and are separated by new lines.
|boolean||::=||yes ∣ no|
|integer||::=||(0 … 9)*|
|length-preﬁxed-string||::=||integer : character*|
|block-header||::=||block block-name version line-count channel-name*|
|error||::=||error block-name reason|
|node||::=||node uri id|
|subscription||::=||subscription block-name version channel-name*|
The base decentralise2 messages are represented as follows:
|boolean||::=||00 ∣ 01|
|byte||::=||00 … ff|
|long-integer||::=||short-integer × 8|
|metadata||::=||block-name long-integer channel-list|
|character-block||::=||02 metadata long-string|
|binary-block||::=||03 metadata long-bytes|
|error||::=||05 block-name long-string|
|announce||::=||09 short-string short-string|
A message is an abstract object that is sent over a connection, and can be considered the lowest-level communication protocol decentralise2 is concerned with.
⇒message-case message &body case* [Macro]
|case||::=||( pattern form* )|
|pattern||::=||( keyword variable* )|
|dont-care||::=||∼ ∣ - ∣ _ ∣ NIL|
|variable||::=||dont-care ∣ symbol|
Match the value of message against some patterns, which look like the arguments to the function message, and evaluate the forms of the ﬁrst matching case, with all variables that don’t match the rule dont-care bound to the arguments.
⇒message keyword &rest value* [Function]
Create a message from the type designated by the keyword with the given values. The number of values must be the same as the number of accessors of the type.
⇒get-blocks (:get names) [Message Type]
⇒put-block (:block name version channels data) [Message Type]
⇒ok-response (:ok name) [Message Type]
⇒error-response (:error name reason) [Message Type]
⇒new-subscriptions (:subscription name version channels) [Message Type]
⇒subscription (:subscribe channels) [Message Type]
⇒announcement-control (:allow-announcement allow?) [Message Type]
⇒node-announcement (:announce uri id) [Message Type]
⇒define-message-type keyword type-speciﬁer constructor-name &rest accessors [Macro]
Deﬁne a message type named by the given keyword, which corresponds to the given Common Lisp type speciﬁer, can be constructed by calling the function named by the constructor-name, and can be destructured using all the functions named in accessors.
The content of messages (particuarly :block messages) is often subject to translation, so that logic in a system can be separated cleanly from serializing and deserializing for connections.
⇒object->data object target source [Generic Function]
“Serialize” or “render” an object (which came from, or represents an instance of the source), into a value that can be sent as data on the target.
⇒data->object data source target [Generic Function]
“Deserialize” or “parse” the data (which represents an instance of, or something the target can consume), into an object, which was received from the source.
Note that all three arguments should be specialized, else weird things can occur. For example, an ansible-connection may be both a binary-connection and a character-connection, so it is possible it will receive either character or binary messages. If you write methods to parse some data with specializers (t binary-connection target) and (t character-connection target), either method may be selected, based on the class precedence list of ansible-connection, independent of the type of data in the message; and the wrong parser may be used. Instead, you should specialize on (string character-connection target) and (vector binary-connection target), so that the correct method will be selected.
An acceptor creates connections by accepting them from something that faces the outside world, such as a server socket, and gives them to a system.
⇒acceptor [Protocol Class]
The protocol class for an acceptor.
⇒threaded-acceptor [Protocol Class]
The protocol class for an acceptor that uses a thread to give a system connections.
⇒give-system-connection system connection [Generic Function]
Give a connection to the system the acceptor accepts connections for.
An acceptor may have to deﬁne :after methods on:
⇒start-acceptor acceptor system [Generic Function]
Start the acceptor, allowing it to create connections on behalf of the system.
⇒start-acceptor (acceptor threaded-acceptor) system [Method]
Start a thread that repeatedly accepts connections by calling acceptor-loop.
⇒stop-acceptor acceptor [Generic Function]
⇒stop-acceptor (acceptor acceptor) [Method]
An acceptor must deﬁne a method on:
⇒accept-connection (acceptor) [Generic Function]
Return a new connection that should be added to the acceptor’s system.
An acceptor should usually only have to deﬁne a primary method on:
⇒acceptor-loop acceptor [Generic Function]
⇒acceptor-loop (acceptor acceptor) [Method]
Forever accept connections and add them to a system.
An acceptor that accepts connections from a socket, also wrapping the connections in SSL.
The class to make instances of. This must be a subclass of either character-connection or binary-connection.
The hostname and port to bind to. These default to "0.0.0.0" and 1892, respectively.
Pathnames for the certiﬁcate and key ﬁles.
⇒start-acceptor (acceptor socket-acceptor) system [:Before Method]
This method starts listening on the given hostname and port.
⇒stop-acceptor (acceptor socket-acceptor) system [:Before Method]
This method stops listening on the given hostname and port.
⇒accept-connection (acceptor socket-acceptor) system [Method]
A socket method accepts a connection by accepting a connection from the socket, and instantiating an instance of the acceptor’s connection class.
Connections are instantiated with initargs :socket and :stream, with values for the raw socket and the SSL-wrapped stream, respectively.
A system handles the requests sent to a node in a distributed system, and sends requests for blocks it has not yet retrieved to synchronise itself. A block is a tuple consisting of a name, version, channel set, and some data; of which the metadata can be used by a system to determine what it must retrieve and update.
⇒system [Protocol Class]
The protocol class for a system. A system is maintained by some maintenance threads. One of which is the leader thread, which runs a leader loop that schedules requests for synchronisation and stops itself when the system is stopped.
TODO: The standard-system should track maintenance threads like it tracks the leader thread, and eventually, this should all be integrated into the bailout thread supervision library.
A system that uses the database and synchronisation protocols to implement the basic behaviour of a node.
A silly system that responds to any messages other than invalid syntax with the message sent.
⇒system-id system [Generic Function]
The ID of a system, as an integer in the range [0,2256)
The acceptors of a system.
The database protocol is part of the system protocol, because a system may process the data sent over the wire into some other format that should be stored and serialised by some other protocol. For example, the Netfarm system implements get-block in terms of netfarm-system:get-vague-object, and provides the rendered vague object and all the object names that aﬀected it via computed values.
It is possible to separate the synchronisation and database protocol implementations,
and combine them into a complete system implementation using multiple inheritance
when using the reference implementation:
The simplest database implementation, which stores block data in a concurrent hash table.
A condition type that is raised when get-block or get-block-metadata are called with the name of a block that is not stored by the system.
The following generic functions must be implemented by a database:
⇒get-block system name [Generic Function]
Return block structure for the block named name from the system, or signal a condition of type not-found.
⇒put-block system name version channels data [Generic Function]
Store a block with the provided block structure in the system, or signal a condition of type error.
⇒map-blocks function system [Generic Function]
Call function repeatedly with block structure for the metadata of every object in the system.
⇒get-block-metadata system name [Generic Function]
Return block structure for the metadata of the block named name from the system, or signal a condition of type not-found.
This generic function will be called for each block’s metadata sent to a system, so adding a method which does not have to load a block’s data is highly recommended. Otherwise, for the implementors’ convenience, the following method will be used:
⇒get-block-metadata (system system) name [Method]
The default method for get-block-metadata calls get-block, and returns the metadata from the data it retrieved, but implementations which are even slightly concerned with eﬃciency should implement a method for get-block-metadata that does not have to read the data for a block.
When metadata about a block is received, it is categorised into either the interesting set or the uninteresting set, or discarded.
These two sets are represented by predicates, and changes between the sets can be represented in several ways (see Set speciﬁers for a reference of these set speciﬁers). The interesting and uninteresting set predicates must usually remain referentially transparent and not change, to allow the system to cache block metadata and schedule requests; but the sets can be modiﬁed when the system is notiﬁed, using update-system-for-new-interesting-block-predicate.
⇒interesting-block-p system name version channels [Generic Function]
Determine if a block should be requested, based on its metadata.
⇒interesting-block-p (system standard-system) name version channels [Method]
Everything is interesting by default.
⇒uninteresting-block-p system name version channels [Generic Function]
Determine if a block’s metadata should be kept, in case it may become interesting later, due to update-system-for-new-interesting-block-predicate.
⇒uninteresting-block-p (system standard-system) name version channels [Method]
Everything that has a newer version than the system stores is uninteresting to the system.
⇒update-system-for-new-interesting-block-predicate system interesting-set uninteresting-set [Generic Function]
Update the system’s cache of the interesting set, adding every member of the interesting-set to the interesting set and removing every member of the uninteresting-set from it.
The leader loop of a system performs tasks that are not easy to delegate to a connection, such as watching for requests that timed out, and replacing them with new requests.
⇒leader-loop system [Generic Function]
Run the leader loop for the system. This is expected to return only when the system is stopped.
⇒leader-loop (system standard-system) [Method]
The leader loop of a standard-system implements the description we gave of a leader loop.
⇒start-system system [Generic Function]
Start a system, by starting all its acceptors and a leader thread.
⇒stop-system system [Generic Function]
Stop a system by closing all its acceptors and connections, and advising maintenance threads to stop.
⇒system-stopping-p system [Generic Function]
Should the threads maintaining the system stop now?
The time (in seconds) that the scheduler should wait for a request to be responded to. This defaults to 10 seconds.
The number of requests that the scheduler can wait for at a time. This defaults to 80 requests.
TODO: It may be useful to add a(n optional) limit on how many requests can be sent to one connection at a time.
If a block names a special block, the :get function for the special block is called. Otherwise, the block data is retrieved using get-block.
In either case, a :ok message is sent if no error is signalled. Otherwise, describe-decentralise-error is called to produce a reason string, and a :error message is sent instead.
TODO: A subclass of standard-system could handle :announce in order to implement peer discovery.
Retrieving and storing (using :get and :block messages, respectively, with the names of) some blocks from a standard system will cause them to perform some special behaviour that does not invoke get-block or put-block as per normal.
⇒define-special-block (class-name block-name &optional (data-type-name t)) &key get put [Macro]
Deﬁne a special block for instances of a class, with associated get and put functions.
NOTE: Because we use generic functions to implement special block dispatch, it is possible to use call-next-method. Should this be standard behaviour?
The data returned by the get function is translated by object->data, with the given data-type-name as source, and the connection as target. The data given as an argument to the put function is translated by data->object, with the connection as source, and the given data-type-name as target.
The standard system provides some special blocks:
⇒list [Special block]
This block contains a list of every block name stored (excluding special block names).
On a character connection, the list is encoded with the metadata of a block per line, each line containing the length-preﬁxed name, the version number, and then length-preﬁxed channels, somewhat mirroring the syntax of the header of :block messages when sent from a netfarm-format-connection.
On a binary connection, the list is encoded as repeating metadata records. These records do not have any delimiters between them; as their length can be detected from the record data. Each record contains the name encoded as a short-string, the version as a long-integer, the number of channels as a long-integer, then each channel name as a short-string.
More formally, both encodings can be generated with the following grammars:
On character connections (building upon the rules in Figure 2.1.3):
|block-metadata||::=||block-name version channel-name*|
||||(block-metadata newline)* block-metadata|
On binary connections (building upon the rules in Figure 2.1.4):
|metadata-record||::=||block-name version channel-list|
When a standard system receives a listing, it assumes that the blocks in the listings may be retrieved from the client, and it may send :get requests for those blocks. If the listing cannot be parsed, a system will instead respond with an :error message, for the block list, and with the reason invalid listing.
⇒id [Special block]
This block contains the ID of the node.
On a character connection, the ID is encoded as a 64-character hexadecimal string, with uppercase A through F characters, and the most signiﬁcant digit written ﬁrst. For example, the ID 9781484261347 would be encoded as 000000000000000000000000000000000000000000000000000008E56DE50FE3.
TODO: Add a binary encoding. Currently, all our binary connection classes are also character connections, and are thus capable of sending character blocks, but this may not be ideal. On the other hand, a 50% reduction of the size of a 64-byte block that is only sent once is not anything to write home about.
When a standard system receives an ID, it associates that ID with the connection used to send the ID. This ID appears in node listings, should the connection then allow itself to be announced.
⇒nodes [Special block]
This block contains a listing of all nodes that have announced themselves to the system, with each node record separated by a newline. Each item in the listing has a length-preﬁxed URI, which may be used to connect to a node, a space, and its ID.
TODO: Add a binary encoding. Again, this block is only retrieved once (as nodes are announced after that with another message type), but it would still be nice to have a binary encoding.
⇒client [Protocol Class]
A client which uses a connection to perform actions.
⇒client-connection client [Reader]
The connection the client uses to perform actions.
A client which uses multiple connections to perform actions, deciding on which connections to use and searching for more connections using the Kademlia distributed hash table search algorithm.
Asynchronous execution is handled using a action protocol. An action is some computation that can be composed (in a method similar to monads in functional programming), and run later. Actions have some beneﬁts over some other asynchronous protocols, such as callbacks:
⇒chain &body binding* return-form [Macro]
A macro which chains some actions to create a more complex action.
Each binding is either of the form (variable <- action), indicating that the value of the action should be bound to a variable, or (variable = form), indicating that the value of a form should be bound to a variable. The package of <- or = does not matter, much like the keywords in extended loop (but :<- doesn’t look very nice, and := wouldn’t really re-assign anything). The value of the action is produced by the return-form.
The method for expanding chain is quite similar to the method for desugaring do notation. The expansion of chain is deﬁned by a recursive function:
|E[(chain value)]||= value|
|E[(chain (v = f) . rest)]||= (let ((v f)) E[(chain . rest)])|
|E[(chain (v ← a) . rest)]||= (then a (lambda (v) E[(chain . rest)])|
⇒run action [Function]
Run an action, either returning the value it succeeded with, or signalling the condition it failed with.
⇒run-away action [Function]
Run an action in the background, returning nothing of interest immediately.
⇒%run action success failure [Generic Function]
Implement run by setting up the action to be run somewhere, which should end up calling either success with a value, or failure with a condition. This probably should not block.
We have picked these actions as primitives to compose other actions out of. It is possible to implement every other action in the client, in terms of these actions, but this is not necessary. (NOTE: We don’t.)
⇒finish value [Function]
An action which always returns value.
⇒functional-action function [Function]
An action which calls function with a success continuation and a failure continuation. If the function calls the success continuation with a value, it succeeds with that value. If the function calls the failure continuation with a value, it fails with that value. If the function signals an error, the action fails with that error.
From these basic actions, it is possible to derive more useful actions.
⇒then action action-producer [Function]
Produce an action which runs action, then calls action-producer with the value action produced to produce another action, then runs that action.
then could be implemented so that (then a p) ≡ (functional-action (lambda (s f) (%run a (lambda (v) (%run (funcall p v) s f)) f))).
⇒then* action1 action2 [Function]
Produce an action which runs action1 and discards its result, then calls action2 and produces its result.
then* could be implemented so that (then* a1 a2) ≡ (then* a1 (constantly a2)).
⇒parallel actions [Function]
Run a sequence of actions in parallel. If all actions succeed, this action succees with a vector of all the values of the actions, or if one action fails, this action fails with the ﬁrst condition.
Evidently (parallel empty-sequence) ≡ (finish #()), and (parallel (singleton-sequence-of a)) ≡ (chain (x <- a) (finish (vector a))). The other cases are much more hairy.
⇒attempt action failure-producer [Function]
Similar to then, except that instead failure-producer will be called with a failure condition, and the action will succeed with the value of calling failure-producer.
The implementation can be similar to that of then, but extending the failure continuation instead of the success continuation.
⇒get-block* client name &key &allow-other-keys [Generic Function]
An action which retrieves a block from a client.
⇒put-block* client name version channels data &key fail-if-too-old &allow-other-keys [Generic Function]
An action which puts a block to a client. If fail-if-too-old is true, we expect that this block should be the newest version, and fail if it is too old. If fail-if-too-old is false, then we consider the action successful if a node has a newer version of the block..
The second component that was developed was the Netfarm object system. The Netfarm object system introduces a text format for representing an object, as well as code to serialize and deserialize objects.
The part of the Netfarm system that implements an object system is also named Netfarm. It may be helpful to describe this component as the “Netfarm object system”, and the entire system as the “Netfarm software stack” or words to those meanings.
As an example of how Netfarm enables its users to exchange objects with little
friction, we will imagine that there are two car enthusiasts, Adrian and Bob, who
want to discuss their cars. Here is a class deﬁnition for a car that they could have
used in a client program:
This class functions identically to any other class deﬁned using the Common Lisp
We will discuss how Adrian can send that description of a car to Bob later.
The Netfarm object system has objects, which are instances of schemas, which are themselves objects.
This system maps well to the Common Lisp object system, and we can describe schemas using classes that are instances of a special metaclass.
The class of a Netfarm class.
The class of a Netfarm object.
A Netfarm object also has a list of signatures, and a source function, that is called to retrieve referenced objects lazily.
⇒object-signatures object [Accessor]
The signatures of an object, stored as an association list of user object-signature data pairs.
⇒object-source object [Accessor]
A function that takes an object name, and returns the object with that object name. This function typically requests the object from a client.
Much like objects in the Common Lisp meta-object protocol, Netfarm objects are represented as vectors of slots; an object is comprised of a vector of “normal” slots, and a vector of computed slots.
Each Netfarm slot maps to a Common Lisp slot, but some Common Lisp slots don’t map to Netfarm slots1.
To avoid some potential non-determinism in how slot lists of schemas expressed with Common Lisp classes are ordered when read by Netfarm, slots are sorted by string< on their names in slot vectors.
⇒slot-netfarm-name slot-deﬁnition [Reader]
The name of a slot, a string or nil if the slot should not be present in the Netfarm object. This defaults to the downcased name of the slot-definition-name of the slot.
Multiple slots of a class cannot have the same name.
⇒slot-renderer slot [Reader]
⇒slot-parser slot [Reader]
The names of functions which are used to render and parse slot values. The parser is called when calling apply-class, and is used to parse a Netfarm representation into the preferred representation of your program. The renderer is called when calling render-object or binary-render-object, and is used to render your representation as a Netfarm representation.
For all values v, it is expected that (netfarm-equal (funcall (slot-renderer slot) (funcall (slot-parser slot) v)) v); the renderer and parser should round trip. If this is not the case, weird things can happen when processing objects, and you should instead use :netfarm-name to create duplicate slots with your preferred representation, preserving the Netfarm representation in the original slot.
⇒fix-up-copied-object object [Generic Function]
Fix up the slots of an object that has been loaded (called from apply-class). This should be used like initialize-instance; you should add :after methods to it, and should probably be used to reproduce your preferred representation slots from the Netfarm representation slots.
⇒script [Protocol Class]
⇒netfarm-class-scripts class [Reader]
The scripts of a class.
Slots are inherited as usual for Common Lisp.
The presentation script, if not provided, is inherited from the ﬁrst superclass of the class that has a presentation script. The message script, if not provided, is inherited from the ﬁrst superclass of the class that has a message script.
The initialization scripts of a class are the union of the direct initialization scripts of a class, and the initialization scripts of the superclasses of the class, with duplicates removed.
Inbuilt schemas do not have meaningful values for the documentation slot, because portable code could depend on the values otherwise, and we would like to be able to modify the documentation if it is unclear or incorrect.
⇒inbuilt@schema[Inbuilt instance of inbuilt@schema]
|Slot name|| |
(("computed-slots") ("documentation") ("scripts") ("slots"))
Schemas are special objects that describes the behaviour and storage of their instances. Note that the schema inbuilt@schema describes itself.
⇒inbuilt@user[Inbuilt instance of inbuilt@schema]
|Slot name|| |
Users are the primary form of authentication in Netfarm, as object signatures are veriﬁed with the keys of the users attached with them.
A user object stores a public Ed52219 signing key (in sign-key), and a Curve25519 exchange key (in ecdh-key).
⇒inbuilt@script[Inbuilt instance of inbuilt@schema]
|Slot name|| |
(("entry-points") ("methods") ("program") ("variables"))
Scripts are “programs” that describe the behaviour of an object. A script has entry points that describe internal procedures, methods that describe external procedures, an octet vector encoding instructions in the program, and a list of the initial values of its variables.
⇒inbuilt@user-script[Inbuilt instance of inbuilt@script]
|Slot name|| |
(("add-datum" 2 0))
#(7 73 1 12 134 71 16 0 1 0 0 8 2 0 4 1 0 130 8)
This script can be assembled with
⇒inbuilt@mirror-schema[Inbuilt instance of inbuilt@schema]
|Slot name|| |
The schema of an object that can ‘reﬂect’ messages in a way that makes the messages anonymous, by making the reference to the sender useless.
For example, this object could be used in an authentication scheme where one object has to challenge another to present a reference to itself. Without a mirror, the other object could cheat by always presenting the sender object; but with a mirror, it cannot do so.
This schema cannot be initialized by the user.
⇒inbuilt@mirror[Inbuilt instance of inbuilt@mirror-schema]
The only instance of inbuilt@mirror-schema.
Objects are usually serialized in order to transmit them to another client that does not share memory with the origin of the object, and that client must then deserialize the serialized form to produce a usable object.
In Netfarm, serialization is done by converting an object to a vague object, which has as much context of the object removed as possible, which can then be serialized into a character or octet vector. These vectors can then be deserialized by deserializing the vector into a vague object, and then re-introducing the context that the origin of the object supplied.
The octet-vector representation is also used for generating and verifying signatures.
The class of a vague object.
⇒apply-class vague-object class [Function]
Recontextualize a vague object by “applying” a class to it, producing an object.
⇒render value &optional stream [Function]
Render a value. If a stream is given, then write the value to that stream, else return the rendered value as a string.
⇒render-object object &optional stream [Function]
Render an object. If a stream is given, then write the object to that stream, else return the rendered object as a string.
⇒parse stream-or-string [Function]
Parse a value from a stream or a string.
⇒parse-block string &key source name [Function]
Parse an object from a string. The returned vague object has the supplied source and name.
|byte||::=||00 … ff|
|string||::=||01 length byte*|
|byte-vector||::=||02 length byte*|
|list||::=||05 length value*|
|reference||::=||06 length byte*|
|value||::=||string ∣ byte-vector ∣ positive-integer|
|∣ negative-integer ∣ list|
|∣ reference ∣ true ∣ false|
|slot||::=||value ∣ slot-unbound|
|metadata-element||::=||string ∣ value|
|object||::=||metadata slots computed-slots|
The integer 0 must be represented as a positive-integer, and integers must
be represented with as few bytes as possible; for example, 12,345 must be
represented as 03 02 30 39, and not 03 03 00 30 39 or with more zero bytes.
⇒binary-render value &optional function [Function]
Render a value. If a function is given, then write the value by calling it with octet vectors that it should write, else return the rendered value as an octet vector.
⇒binary-render-object object &key function (emit-computed-values t) (emit-signatures t) [Function]
Render an object. If a function is given, then write the value by calling it with octet vectors that it should write, else return the rendered value as an octet vector.
emit-computed-values and emit-signatures control if computed values and signatures will be rendered, respectively. The resulting vector will not be parseable if computed values and/or signatures are not rendered. These arguments are present to facilitate hashing and signing, and usually should be left to their defaults.
⇒binary-parse function-or-vector [Function]
Parse a value from a function (called with no arguments, which returns octets) or a vector.
⇒binary-parse-block function-or-vector &key source name [Function]
Parse an object from a function (as previously described for binary-parse) or a vector.
Netfarm uses cryptographic hashes to “name” objects, so that they can be referenced and retrieved by clients that do not already have them stored. Hashes are also used in the process of signing objects.
An object is hashed by rendering it in the binary format, and then hashing the resulting octet vector using the SHA-256 hash algorithm. Usually, computed values are not included in the render, as computed values must not aﬀect object identity. Furthermore, when hashes are used in signing objects, signatures are not included, as they would need to be already present in the object to be signed, or break identity (by changing the resultant hash, or by allowing signatures to be added or removed to an object while preserving hash value).
⇒hash-object object &key (emit-signatures t) [Function]
Produce a binary hash. Note that inbuilt objects cannot be hashed, and thus hashing them will signal an error. emit-signatures will control if signatures are included in the hash.
The most obvious inbuilt object that could not possibly be hashed is inbuilt@schema, which is its own schema. Of course, you could hash it by referencing its inbuilt name and not its hash, but the renderer would have to have a special case to use inbuilt objects’ names instead of the result of hash-object*, which felt worse than disallowing hashing inbuilt objects somehow.
⇒hash-object* object [Function]
Produce a textual hash, equivalent to (bytes->base64 (hash-object object)) for non-inbuilt objects, and in the form inbuilt@name for inbuilt objects.
⇒with-hash-cache () &body body [Macro]
Cache hashes produced with both hashing functions in the body.
Hashing can result in accidentally quadratic runtime2 when traversing graphs of objects and hashing them (such as saving a graph of objects with the Netfarm client). with-hash-cache will reduce hashing operations that have already been done (usually by recursive calls between hashing and the renderer) to hash-table lookups, reducing that kind of pattern to roughly linear runtime.
Hashes used in signing are currently not cached (TODO: though they probably should be.)
If objects that have been hashed are modiﬁed, then the results of hash-object and hash-object* are undeﬁned; but the dynamic extent of with-hash-cache should allow for fairly easy analysis of where hashes can be cached.
Object hashes are stored in a weak hash table, so they will be removed when the objects are garbage, and the garbage collector runs.
⇒with-lexical-hash-cache () &body body [Macro]
Bind the hash caches in a way that allows for them to be closed over, and restored using the local macro with-restored-lexical-hash-cache. This is intended to be used to share cached hashes between threads; which takes some more deliberation to ensure correctness, but can be beneﬁcial.
The Netfarm server uses a hash cache shared between all threads; and this is safe as objects are never mutated inside the server.
⇒with-restored-lexical-hash-cache () &body body [Macro]
Restore the hash caches from with-lexical-hash-cache.
Netfarm uses a small network protocol to exchange objects and updates to objects. This protocol is implemented on decentralise2, but it is not necessary at all for a Netfarm implementation to use any given networking protocol. The basic requirements of the network protocol are:
Any other extensions or mechanisms are not necessary to communicate Netfarm objects, but some may be very useful, such as permitting chunking; so that a user may begin using partially transmitted information before it has all been rendered and received.
The netfarm-client system provides functions to read and write Netfarm objects from a decentralise client, as well as a client class which can search through a network for an object.
Continuing our car example, Adrian can create a client and use it to save his
save-object returns the hash of the object, which can be used by Bob to retrieve
the object. Having to pass around a hash is not very convienent, and we can attach
our car to some other object Bob may already know about, but we will begin by
using this hash to retrieve the car.
While it is a component that is packaged with the Netfarm object system, the Netfarm script machine is a suﬃciently large and complex component that it deserves its own chapter. Scripts are portable programs that a Netfarm node can run to handle the behaviour of objects.
The script machine is an interpreter for a
mostly-functional bytecode, which is modelled on the SECD machine, which may be
familiar to functional language compiler writers. A simple assembler is provided with
the reference Netfarm implementation, which can compile programs such
While this (symbolic) assembly code looks much diﬀerent to a program written in a high-level language (and also unlike assembly code for real processors), it is similar to the bytecode generated by CLISP for Common Lisp programs, and CPython for Python programs.
Scripts are usually written to implement methods for objects, which describe the behaviour of an object. Some common methods include:
A method call may also materialise as a message, if there is no method available to handle a method call.
The actions (typically, side eﬀects) which a script machine may perform are controlled by a capability list. Some instructions will only execute correctly when a capability has been provided to the script machine. These capabilities currently are:
Scripts running on networks only receive the :write-computed-values capability, as reading introduces non-determinism into the system, and there is no recommender to query. However, scripts running on, for example, a graphical environment used by the end-user to interact with objects, may have all three capabilities.
⇒setup-interpreter script arguments &optional capabilities [Function]
⇒run-interpreter interpreter &key cons-limit cycle-limit print-cycles step [Function]
⇒run-script script &rest arguments [Function]
TODO: Add a dynamic variable holding the capabilities the script machine should have.
⇒run-script-machines object &key apply-side-eﬀects recipient-ﬁlter [Function]
TODO: This entire section.
Presentations are the other use of scripts in Netfarm. A presentation is a graphical representation of an object that retains its identity. A presentation model allows Netfarm to exchange machine-readable objects between clients, while also being able to present human-readable descriptions of objects. This model was introduced by [Cic84], and has been used in the Common Lisp Interface Manager to great eﬀect.
Presentations are generated by presentation methods that are implemented using the script machine.
An interpreter may produce side eﬀects to interact with other objects. Currently, there is only one category of side eﬀect: side eﬀects which modify the computed value sets of an object.
The ﬁnal state of an object is produced by applying all the applicable side eﬀects. Depending on the nature of the implementation, it may or may not be appropriate to implement side eﬀects as suggested.
The order of which side eﬀects are applied does not aﬀect the produced object. It is for this reason that Netfarm exhibits strong eventual consistency: how objects are synchronised does not aﬀect the computation performed, and so all nodes storing an object will eventually converge on objects with the same values.
We maintain a mapping of computed values to counts for every slot of every object. When all side eﬀects have been considered, the table is traversed, and each computed value is added according to the count if it is not negative, and no computed values are added when a count is negative.
⇒add-computed-value :target :name :value [Side eﬀect]
Increment the count of the given computed value.
⇒remove-computed-value :target :name :value [Side eﬀect]
Decrement the count of the given computed value.
A related technique for implementing these eﬀects, which is useful when saving eﬀects to a disk, is to add each eﬀect to a log, and then compute the computed values by applying side eﬀects when loading an object.
It is trivial to show that computed values exhibit strong eventual consistency. For each relevant eﬀect, we add 1 or add -1 to each counter, and addition of numbers is commutative, i.e. eﬀects can be performed in any order to retrieve the same result. It is also possible to remove eﬀects that cancel out, and to compress the result into one ﬁnal state periodically. However, the objects-aﬀecting graph would have to be maintained outside the eﬀect log for it to still be correct.
The script machine is a stack machine with a representation of lexical environments and automatic memory management. The major changes to the SECD machine are additional registers for convenience, and to facilitate method calls between multiple objects, and a more common bytecode-based instruction format, rather than an instruction format based on S-expressions.
The script machine has these registers:
A stack containing intermediate data.
The lexical environment, represented as a list of vector “frames”.
The position the interpreter is reading instructions from.
A stack telling the interpreter where to return to.
A vector of global variables.
The script the interpreter is current reading instructions from.
Flags that allow the machine to perform operations that only make sense in some situations.
The current object, or the empty list, if we have not called a method.
The last object, which was the value of O before calling a method.
Eﬀects that the interpreter is going to make on the outside world.
A script has these values:
An octet vector that encodes instructions to execute.
A vector of descriptions for procedures, which are combined with environments by some instructions to produce procedures.
A vector of descriptions for methods.
A vector of the initial values of global variables.
A procedure has these values:
The script to execute instructions from.
The environment to extend when called.
|P0||Initial program counter||
The program counter to start executing from.
The number of arguments that this procedure must be called with.
The object that the procedure belongs to.
We write the state of a running interpreter as the tuple ⟨S,E,P,D⟩. We sometimes reference the registers G, C, A, O and O′ outside the tuple, because they are seldom used by instructions, and so they are just tedious to write most of the time.
We write the states of a halted interpreter as halt[S], should it halt correctly, or error[message, cause] should it fail. (Only the data stack is preserved when an interpreter stops.)
We write the eﬀects of interpreting an instruction as
old state → new state
Should an instruction concern itself with only the Store, we may write eﬀects in the form
i1…im — o1…on
(borrowing from Forth conventions) which can be translated to the other notation as
⟨(im…i1 . S),E,P,D⟩ → ⟨(on…o1 . S),E,P + length,D⟩
where length is the total length of the instruction. Note that the equivalent transition has the stack preﬁxes “reversed”; the last input is popped ﬁrst, and the ﬁrst output is pushed ﬁrst. In Forth, we may write a program 3 2 - (⇒ 1) and the stack before evaluating - as a list would be (2 3). To maintain a natural argument order for the programmer, the machine handles arguments “backwards” when described using lists.
To access a value of a structure without destructuring it, we use the notation SA to get the value of the variable (slot) A of S. We also use subscripts to retrieve values from lists and vectors; Sn denotes the nth value in S. The ﬁrst element of a sequence is retrieved with a zero subscript, as with most programming languages; and retrieving an element that would be past the last element of a sequence causes interpretation to fail.
We create many procedures while running programs, so we write a procedure in the form proc[C,E,P0,A,O].
The machine ﬁrst reads the instruction to execute, by reading the Pth octet in CP, and then incrementing P. The instruction arguments are then read the same way. Finally, a transition is chosen based on the state of the machine at this point. Unless the machine has transitioned to a halt state or unwound entirely to a error state, it then proceeds to execute another instruction.
|getenv[(_.E),variable,frame]||= getenv[E,variable,frame − 1]|
⇒1: get-proc n [Instruction]
⟨S,E,P,D⟩ → ⟨(proc[C,E,P0,A,O] . S),E,P + 2,D⟩
where (P0 A) = (CE)n
This closes over the current environment, whereas get-proc* does not.
⇒2: get-value n [Instruction]
⟨S,E,P,D⟩ → ⟨(Gn . S),E,P + 2,D⟩
⇒3: set-value! n [Instruction]
⟨(value.S),E,P,D⟩ → ⟨S,E,P + 2,D⟩
where Gn ← value
⇒4: get-env variable frame [Instruction]
⟨S,E,P,D⟩ → ⟨S,E,P + 3,(value.D)⟩
where value = getenv[E,variable,frame]
⇒5: set-env! variable frame [Instruction]
⟨(value . S),E,P,D⟩ → ⟨S,E′,P + 3,D⟩
where getenv[E,variable,frame] ← value
⇒6: get-proc* n [Instruction]
⟨S,E,P,D⟩ → ⟨(proc[C,(),P0,A,O] . S),E,P + 2,D⟩
where (P0 A) = (CE)n
⇒11: byte n [Instruction]
⇒8: return [Instruction]
⟨S,E,P,()⟩ → halt[S]
⟨S,E,P,(normalframe[E′,P′,C′,O′].D)⟩ → ⟨S,E′,P′,D⟩
where O ← O′,C ← C′
⟨S,E,P,(methodframe[E′,P′,C′,O′,O2′,F′,S].D)⟩ → ⟨(("ok" . S) . S′),E′,P′,D⟩
where O ← O′,O2 ← O2′,F ← F′
This transition handles half of the “isolation” of method calls, returning a value that is distinguishable from a value produced by propagating an error (with unwind).
⇒9: call n [Instruction]
⟨(proc[C′,E,P0,count,O′] An…A1 . S),E,P,D⟩ → ⟨S,((A1…An) . E′),P0,(normalframe[E,P + 2,C,O] . D)⟩
when count = n, where O ← O′,C ← C′
Note that the return address must be placed at the instruction after the call instruction.
⇒10: tail-call n [Instruction]
⟨(proc[C′,E,P0,count,O′] An…A1 . S),E,P,D⟩ → ⟨S,((A1…An) . E′),P0,D⟩
when count = n, where O ← O′,C ← C′
⇒16: jump-cond thenhigh thenlow elsehigh elselow [Instruction]
⟨(#f.S),E,P,D⟩ → ⟨S,E,P + 5 + else,D⟩
where else = elsehigh × 256 + elselow
⟨(v.S),E,P,D⟩ → ⟨S,E,P + 5 + then,D⟩
where then = thenhigh × 256 + thenlow,v≠#f
Jumps are relative to the end of the instruction, so jumping 0 bytes basically does nothing.
⇒17: jump high low [Instruction]
⟨S,E,P,D⟩ → ⟨S,E,P + 3 + high × 256 + low,D⟩
⇒19: call-method n [Instruction]
⟨S,E,P,(Name Object An…A1…. D)⟩ → ⟨(),((Methods A1…An)),P0,(methodframe[E,P,C,O,O2,F,S].D)⟩
where primary[(... . Methods)] = methods[Object,Name] then O ← Object,O2 ← O
TODO: How to compute methods, and how to handle does-not-understand methods.
Every script of the class of an object contains a list of lists (Name P0 A′), where A′ is the number of “real” arguments. Methods have an implicit “next method list” argument. When calling a method, the next method list is the ﬁrst argument, and every other method is shifted over one, so the ﬁrst argument appears in the second position in the new frame.
The global variable vectors for each ⟨object, script⟩ pair must be the same. This ensures that one method on an object and one script can observe the changes made by another method on that object and that script.
It is usually a good idea to cache computation of methods; method caching in old Smalltalk was a fairly cheap way to get a reasonable performance gain, as approximately every method call requires dispatching. This is true to a lesser extent for Netfarm code, which has fewer, but still many, method calls.
⇒24: error [Instruction]
⟨(cause message.S),E,P,D⟩ → error[message, cause]
⇒34: drop [Instruction]
⇒35: dup [Instruction]
x — x x
⇒36: swap [Instruction]
x y — y x
⇒64: cons [Instruction]
car cdr — (car . cdr)
cdr must be a list; we can’t represent improper lists in Netfarm, so we don’t give the script machine improper lists.
⇒65: car [Instruction]
(car . _) — car
⇒66: cdr [Instruction]
(_ . cdr) — cdr
⇒67: null [Instruction]
() — #t
_ — #f
⇒68: consp [Instruction]
(_ . _) — #t
_ — #f
⇒69: append [Instruction]
a b — a++b
⇒71: equal [Instruction]
a b — equal[a,b]
|equal[||(aa . ad),||(ba . bd)]||= equal[aa,ba] ∧ equal[ad,bd]|
|equal[||a ∈ Integer,||b ∈ Integer]||= a = b|
|equal[||a ∈ String,||b ∈ String]||= a = b|
|equal[||a ∈ Object,||b ∈ Object]||= hash[a] = hash[b]|
|equal[||proc[aa,ae,ac],||proc[ba,be,bc]]||= aa = ba ∧ ae = be ∧ ac = bc|
⇒72: string= [Instruction]
a b — equal[a,b]
when a ∈ String,b ∈ String
⇒73: list n [Instruction]
e1…en — (e1…en)
⇒96: + [Instruction]
a b — a+b
⇒97: - [Instruction]
a b — a-b
⇒98: * [Instruction]
a b — a × b
⇒99: / [Instruction]
a b — ⌊⌋
The results of division are rounded to the closest integer below the actual value; i.e -5 2 / ⇒−3
⇒100: abs [Instruction]
a — |a|
⇒112: = [Instruction]
a b — a = b
when a ∈ Integer,b ∈ Integer
The Netfarm server is a decentralise2 system, which retrieves a partition of all the objects available to it, and makes them available to its connections.
In the introduction to decentralise2, we argued that the terms “client” and “server” do not have enough meaning to be useful when discussing Netfarm software; but we ended up calling this module a “server”.
The class of a Netfarm server.
A system that stores everything in memory. Compare this to decentralise2’s memory-database-mixin.
A system that stores everything in memory, using the simplest feasible implementations of the database protocol.
A mixin that stores objects in a ﬁlesystem.
Objects are stored in a “trie” constructed of directories, to avoid slowdowns that occur on some ﬁle systems, should one directory contain many ﬁles. For each object, three ﬁles are stored:
Some ﬁles are also stored in the storage directory:
The directory that objects should be stored in.
A mixin that caches some objects in memory, but defers to another database implementation when it does not have an object stored.
The maximum rendered size of objects that can be cached at once, in bytes. This defaults to 100 megabytes (108 bytes).
The size that vague objects should be assumed to take when rendered, should the real database not provide a size.
⇒put-vague-object system vague-object [Generic Function]
⇒get-vague-object system name [Generic Function]
⇒map-vague-objects function system [Generic Function]
⇒vague-object-stored-p system name [Generic Function]
⇒count-vague-objects system [Generic Function]
⇒presentable-name-p system name [Accessor]
⇒count-vague-objects system [Generic Function]
⇒apply-side-effect system name cause side-eﬀect-type &key [Generic Function]
⇒map-computed-values-caused-by function cause-name system [Generic Function]
⇒objects-affected-by system name [Generic Function]
Return a list of the object hashes aﬀected by a given object.
⇒objects-affecting system name [Generic Function]
Return a list of the object hashes which aﬀect a given object.
⇒objects-affecting-hash system name [Generic Function]
Compute a hash value for the objects-aﬀecting list, by
This can be implemented incrementally, but the default method computes it from the entire objects-aﬀecting list. (Recall multiplication and modular multiplication are commutative - a database could store the last hash, and then compute h′≡ h × a (mod 2256) when adding a.)
(We were initially going to reduce logxor over the set, but bitwise exclusive-or, among all additive groups, can be attacked by [Wag02]. It is unclear what such an attack could achieve, but we are going to avoid it by using a multiplicative group instead.)
⇒add-other-interesting-block system name [Generic Function]
⇒remove-other-interesting-block system name [Generic Function]
⇒other-interesting-block-p system name [Generic Function]
A Netfarm server maintains a dependency graph, consisting of a set of dependency edges, which each store the name of a dependent (a string), the name of a dependency (another string), and a type (a keyword).
For the avoidance of doubt (because we frequently got confused by the names): when a relevant edge exists, storing a dependent object is prevented until a dependency object is retrieved.
All functions that take edge types except for add-dependency-edge also accept the type :all to retrieve all edges, regardless of type.
⇒add-dependency-edge system type dependent dependency [Generic Function]
⇒remove-dependents system dependency [Generic Function]
⇒map-dependents function system type dependency [Generic Function]
⇒map-dependencies function system type dependencies [Generic Function]
⇒map-dependency-edges function system [Generic Function]
⇒no-dependencies-p system type name [Generic Function]
A mixin that implements a dependency graph by storing a list of its edges. This mixin may be slower than dependency-table-mixin, but it is easier to read about, as it does not do any tricky indexing.
A mixin that implements a dependency graph by storing the edges in hash tables. This may look up dependencies and dependents more eﬃciently than dependency-table-mixin, especially with many dependencies to manage.
We have not thoroughly tested this voting-web system, but we should describe it, because it is a very useful but infrequently applied technique for moderating an online system. This system may be seen as an extension of many Usenet clients’ kill lists, which contain a set of patterns that match posts the user does not want to see; but it can also be used as a whitelist, or just to rank some objects in some order.
It is very important that a distributed system has a means of distributed “moderation”. We wonder why distributed systems are built without distributed moderation of some form, and we believe that using centralised moderation on such a system would be an equivalent of the “people’s stick” described in [Bak73]. Unlike some kinds of moderation in the real world, we are not required to give computers some kind of “objective” truth, and we do not attempt that with Netfarm. Freeing ourselves from that obligation allows us to approximate the desires of each user with greater precision.
This system allows a user to maintain a map of “facts” that they assert to be true or false. A presentation script may query the voting-web system while rendering an object, to decide what related objects it should render, or a client program may just refuse to render some presentation forms of an object that has certain facts about it asserted.
In the absence of a fact asserted by the user, we believe it is possible to determine the probability that the user would assert the fact, as a function of some delegates’ assertions. Suppose Pr(F is true to U) is the probability that a user U takes the fact F to be true, and that delegates(U) denotes the set of delegates of the user U. We can deﬁne the probability to be:
|Pr(F is true to U)||= 1||(if U asserts F to be true)|
|Pr(F is true to U)||= 0||(if U asserts F to be false)|
|Pr(F is true to U)||=|
We also introduced a function correlation(U,D) which estimates the correlation coeﬃcient between two users, which we have appropriated from the memory-based collaborative ﬁltering techniques used in recommender systems. This could be deﬁned as:
|and facts||= the rules U asserts ∩ the rules D asserts|
In the absence of a large number of common facts, setting the correlation coeﬃcient to be a small constant may yield results that are not very good, but are better than not attempting to calculate a probability.
It may also be wise to bias the correlation and resultant probabilities towards zero, to produce less conﬁdent values for fewer common facts, and fewer delegates, respectively. This can be done by adding a small positive integer to the denominators of our deﬁnitions of correlation(U,D) and Pr(F is true to U).
How we write Common Lisp code is relatively uninteresting. We mostly follow the SICL style guide (chapter 30 of http://metamodular.com/SICL/sicl-specification.pdf), with the exception of making package designators keywords. This is not set in stone, and using uninterned symbol syntax (eg #:decentralise-system) is also acceptable. Generally, we attempt to use one ﬁle per concept (such as “scheduling requests”, or “rendering an object to an octet vector”), and one package per module (such as “the client” or “the script machine”).
Most of our style is derived from the purposes we write code for, and this is reﬂected in the protocols we write. There are some principles behind how we decide on these protocols:
Abstraction is not the enemy! It is much easier to reason about an abstract protocol, than more concrete and lower-level constructs.
We are not the only users of any layer of our system, so a seemingly “over-abstracted” design may be beneﬁcial for another user, when someone else attempts to use a layer for a diﬀerent purpose. This is not something that can easily spiral out of control, contrary to popular belief. As the layers of our Netfarm implementation are completely separated, it is easy to observe what abstractions and generalisations must be provided to support the other layers of this system.
Performing thought experiments and attempting to design systems for other use cases may provide some insight into what features should be included. However, it is likely that armchair philosophy won’t provide details which would only be seen by testing such a system; so prototype it and watch what happens! If this is diﬃcult to do, then we have failed further in designing an easily adaptable system. The contrapositive, however, is that designing an easily adaptable system makes such experiments easy to do. This further provides insights into how to design an easily adaptable system, and we beneﬁt greatly from this feedback loop.
What is the smallest feature set you can use to implement as much as possible? Then, what do you need to add to have users want to attempt as much as possible?
We are not particuarly interested in keeping the system simple, because that tends to mean making stupid systems to many people. Overspecialising the system, by prescribing a use case to the system, such as presenting documents, or creating a chat application, is also not desirable. To repeat part of the introduction, the aim is, by “using schemas and their scripts, it [should be] possible to implement many diﬀerent programs atop Netfarm.” Implementing a paradigm or program possible should be made possible at almost all costs, but making implementation of one program easier, at the cost of making other programs harder to implement, is never worth it.
If we need to remove a feature from a module, we must consider what a user of that feature could use instead, and test if the resulting designs are still as eﬀective (or, in the case that the feature was of poor quality, more eﬀective). If the programming style is too hampered by modifying a feature, it should not be modiﬁed, and an alternative should be found.
For example, we took eﬀorts to understand and analyse the uses of the old client protocol before updating the protocol. The implementation of the protocol in decentralise2 and the one known user (the Netfarm client) were examined, and we compared the ergonomics and performance of either protocol. Constructs were introduced where we couldn’t easily express features of the old protocol in the new, such as introducing run-away to run an action “in the background”, which would occur by creating a request but not waiting for its value, and we also observed what code could be simpliﬁed by using features of the new protocol, such as being able to generate actions just like any other value, using loops and conditionals, as they are run independently of being created.
Ignoring all other connotations with the word modern, use any features that you would expect a modern programming language to provide.
There should be some concern towards portability, between implementations and programming languages. The system should not be based on “magic” constructs which can only be used in few implementations or programming languages.
However, it is not a goal to support implementations which are designed without useful constructs, such as automatic memory management and exception handling, which greatly decrease the work required to implement a system that is designed with them in mind. There are features in Netfarm which require these constructs, such as the script machine, which does not perform explicit memory management, and also has unwinding, which is easier to program with exceptions; and most of the object semantics demand being able to create multiple references to an object, with non-trivial lifetimes.
Anything which is almost universally considered easy to compile today could probably be expected. Counter-examples include call-with-current-continuation and fexprs, for which new compilation techniques are being produced, but the performance is rarely close to the performance of systems without these constructs. (Also consider that most programming language implementations are old, and if they implement these constructs, they may or may not be using the latest techniques.) Note that this is not related to wide-spread adoption, which is not strictly required. For example, many so-called object-oriented languages, such as C++ and Java, do not feature metaobject protocols, and a convenient way to create new classes at runtime. Implementing systems, such as the Common Lisp meta-object protocol, relatively eﬃciently has been understood almost since it was introduced. (Okay, compiling with call/cc has been understood for a long time, notably with Steele’s RABBIT Scheme compiler and Appel’s book Compiling with Continuations, but an eﬃcient implementation of CLOS is not much slower than an eﬃcient implementation of a worse object system, such as the Java object system.)
accept-connection Generic Function, 25
accept-connection Method, 26
acceptor Protocol Class, 28
acceptor-loop Generic Function, 29
acceptor-loop Method, 30
add-computed-value Side eﬀect, 32
add-dependency-edge Generic Function, 33
add-other-interesting-block Generic Function, 34
announcement-control Message Type, 35
apply-class Function, 36
apply-side-effect Generic Function, 37
attempt Function, 38
caching-mixin Class, 45
capability list, 46
chain Macro, 47
character-connection Protocol Class, 48
client Protocol Class, 49
client-connection Reader, 50
computed value sets, 51
connection Protocol Class, 53
connection-client Class, 54
connection-message-handler Accessor, 55
count-vague-objects Generic Function, 56, 57
data->object Generic Function, 58
decentralise-kademlia:kademlia-client Class, 59
define-message-type Macro, 60
define-special-block Macro, 61
dependency edges, 62
dependency graph, 63
dependency-list-mixin Class, 64
dependency-table-mixin Class, 65
get-block Generic Function, 73
get-block* Generic Function, 74
get-block-metadata Generic Function, 75
get-block-metadata Method, 76
get-blocks Message Type, 77
get-vague-object Generic Function, 78
give-system-connection Generic Function, 79
id Special block, 86
inbuilt@mirror Inbuilt object, 87
inbuilt@mirror-schema Inbuilt object, 88
inbuilt@schema Inbuilt object, 89
inbuilt@script Inbuilt object, 90
inbuilt@user Inbuilt object, 91
inbuilt@user-script Inbuilt object, 92
interesting set, 93
interesting-block-p Generic Function, 94
interesting-block-p Method, 95
maintenance threads, 103
make-hidden-socket Function, 104
map-blocks Generic Function, 105
map-computed-values-caused-by Generic Function, 106
map-dependencies Generic Function, 107
map-dependency-edges Generic Function, 108
map-dependents Generic Function, 109
map-vague-objects Generic Function, 110
memory-database-mixin Class, 111
memory-system Class, 112
message, 113, 114
message Function, 115
message handler, 116
message-case Macro, 117
Netfarm object system, 119
Netfarm server, 120
netfarm-class Metaclass, 121
netfarm-class-scripts Reader, 122
netfarm-format-connection Class, 123
netfarm-system Class, 124
new-subscriptions Message Type, 125
no-dependencies-p Generic Function, 126
node-announcement Message Type, 127
nodes Special block, 128
not-found Condition, 129
object Class, 130
object->data Generic Function, 131
object-signatures Accessor, 132
object-source Accessor, 133
objects-affected-by Generic Function, 134
objects-affecting Generic Function, 135
objects-affecting-hash Generic Function, 136
ok-response Message Type, 137
other-interesting-block-p Generic Function, 138
parallel Function, 139
parse Function, 140
parse-block Function, 141
passing connection, 142
passing-connection Class, 143
passing-connection-direction Reader, 144
passing-connection-target Accessor, 145
presentable-name-p Accessor, 146
presentation methods, 148
put-block Generic Function, 149
put-block Message Type, 150
put-block* Generic Function, 151
put-vague-object Generic Function, 152
read-message Generic Function, 153
remove-computed-value Side eﬀect, 154
remove-dependents Generic Function, 155
remove-other-interesting-block Generic Function, 156
render Function, 157
render-object Function, 158
run Function, 159
run-away Function, 160
run-interpreter Function, 161
run-script Function, 162
run-script-machines Function, 163
script machine, 164
script Protocol Class, 165
SECD machine, 167
set speciﬁers, 169
setup-interpreter Function, 170
side eﬀects, 171
simple-memory-system Class, 172
slot-netfarm-name Reader, 173
slot-parser Reader, 174
slot-renderer Reader, 175
socket-acceptor Class, 176
standard-system Class, 178
start-acceptor :Before Method, 179
start-acceptor Generic Function, 180
start-acceptor Method, 181
start-system Generic Function, 182
stop-acceptor :Before Method, 183
stop-acceptor Generic Function, 184
stop-acceptor Method, 185
stop-connection Generic Function, 186
stop-system Generic Function, 187
strong eventual consistency, 188
subscription Message Type, 189
system Protocol Class, 191
system-id Generic Function, 192
system-stopping-p Generic Function, 193
[Cic84] Eugene C. Ciccarelli. Presentation based user interface. https://dspace.mit.edu/handle/1721.1/6946, 1984.
[Eth19] Ethereum developers. Sharding faq. https://github.com/ethereum/wiki/wiki/Sharding-FAQ, 2019.
[Rou17] Indhi Rousseau. Mastodon instances. https://instances.social/list/advanced, 2017.
[Wag02] David Wagner. A generalised birthday problem. https://www.iacr.org/archive/crypto2002/24420288/24420288.pdf, 2002.
1I believe the Common Lisp babel library preserves code points, as the encoding and decoding functions are quite simple, but some guarantees on how code points are treated would be nice.
1We do need a more unique and less vague name for this...
1This may, for example, be useful to separate internal and external representations; it would be wise to represent a mapping of objects as a hash table in Lisp, but provide an association list for Netfarm to work with.
1The instruction set is neither overly minimalist, as it provides instructions for list, number and string processing, nor does it provide any particular “shortcut” instructions, such as retrieving and then calling a procedure, like a real CISC processor may have.