Major progress on capability-based syndicate-rkt implementation
I’ve been working on the novy branch of syndicate-rkt (Update: this is now the main branch), following the new design I developed for the novy-syndicate TypeScript prototype, driving the design further and working out new syntax ideas.
Syndicate/rkt example
Here’s an example program, box-and-client.rkt, in the new Syndicate/rkt language:
    #lang syndicate

    (message-struct set-box (new-value))
    (assertion-struct box-state (value))

    (module+ main
      (actor-system/dataspace (ds)
        (spawn #:name 'box
          (define-field current-value 0)
          (at ds
            (assert (box-state (current-value)))
            (on (message (set-box $new-value))
              (log-info "box: taking on new-value ~v" new-value)
              (current-value new-value)))
          (stop-on-true (= (current-value) 10)
            (log-info "box: terminating")))

        (spawn #:name 'client
          (at ds
            (stop-on (retracted (Observe (:pattern (set-box ,_)) _))
              (log-info "client: box has gone"))
            (on (asserted (box-state $v))
              (log-info "client: learned that box's value is now ~v" v)
              (send! ds (set-box (+ v 1))))
            (on (retracted (box-state _))
              (log-info "client: box state disappeared"))))))
The program consists of two actors, 'box and 'client. The box actor publishes the value of its current-value field, wrapped in a box-state record constructor, to the dataspace (line 11). It reacts to set-box messages sent by peers (lines 12–14); in this case, the client actor, which sends set-box to increment the value each time it learns of an updated value from the box (lines 22–24). The box actor terminates once current-value reaches 10. The client notices the termination of the box actor in two ways (just to show them off): first, by noticing that the box-state record was unpublished from the dataspace (lines 25–26); and second, by noticing that all subscribers to set-box messages have vanished (lines 20–21).
What’s different? What’s new?
Explicit object references
The most notable change from previous dataspace programs is the explicit reference to the dataspace, ds. Assertions and subscriptions are now located at a specific (possibly remote) object, usually but not always a dataspace.
Capability-based security
Related is support for macaroon-style “sturdy references” (analogous to the SturdyRef concept from E). Here’s an example from a secure* chat demo app:
    <ref "syndicate" [[<or [
      <rewrite <bind p <compound <rec Present 1> {0: <lit "tonyg">}>> <ref p>>,
      <rewrite <bind p <compound <rec Says 2> {
        0: <lit "tonyg">,
        1: String
      }>> <ref p>>
    ]>]] #[oHFy7B4NPVqhD6zJmNPbhg==]>
The oid ("syndicate" on line 1) identifies the target object. The patterns (lines 2–6) attenuate the authority of the capability to only permit transmission of Present and Says records. The signature (line 7) proves to the target object that the capability is genuine and untampered-with.
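The attenuation step can be pictured with a small sketch in plain Racket. This is a hypothetical model, not the Syndicate/rkt API: records are modelled as lists headed by their label, the names `present-by-tonyg`, `says-by-tonyg`, `attenuate` and `chat-caveat` are invented for illustration, and signature checking is not modelled at all.

```racket
#lang racket
;; Hypothetical model, not the Syndicate/rkt API: a record is a list
;; whose first element is its label, e.g. '(Says "tonyg" "hello").

;; Each alternative mirrors one <rewrite ...> clause of the capability
;; above: it returns the value to forward, or #f if it does not apply.
(define (present-by-tonyg v)
  (match v
    [(list 'Present "tonyg") v]
    [_ #f]))

(define (says-by-tonyg v)
  (match v
    [(list 'Says "tonyg" (? string?)) v]
    [_ #f]))

;; Models <or [...]>: try each alternative in order; the first that
;; succeeds decides what is forwarded. If none succeeds, the result is #f.
(define (attenuate alternatives v)
  (for/or ([alt (in-list alternatives)])
    (alt v)))

(define chat-caveat (list present-by-tonyg says-by-tonyg))

(attenuate chat-caveat '(Present "tonyg"))        ; forwarded unchanged
(attenuate chat-caveat '(Says "tonyg" "hello"))   ; forwarded unchanged
(attenuate chat-caveat '(Says "mallory" "hello")) ; #f: wrong speaker
(attenuate chat-caveat '(Says "tonyg" 42))        ; #f: second field is not a string
```

In this model, anything that is not a Present or Says record naming "tonyg" is simply dropped, which is the effect the patterns in the capability are meant to have.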
I’ve implemented most of the necessary plumbing for these, but have yet to complete the client/server portion of the system that actually makes use of them. For an example of their use, see novy-syndicate.
Schema support
Another interesting change is support for (the relatively new) Preserves Schema. You can use assertion-struct and message-struct as in previous dialects, or you can use Schema-defined types to establish subscriptions and place assertions with a peer.
Full pattern-matching dataspace implementation
Unlike the novy-syndicate prototype, this implementation is the first capability-based design to have a proper “skeleton”-based dataspace that supports the full range of dataspace patterns. This allows us to write, for example, subscriptions like
    (on (retracted (box-state _)) ...)
which only fires when all box-state assertions are withdrawn.
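One way to picture that behaviour is a counter over matching assertions. The sketch below is a plain-Racket model under that assumption, not the skeleton-based index the implementation uses; `make-tracker`, `box-state?`, `note!` and `events` are hypothetical names.

```racket
#lang racket
;; Hypothetical model: count the assertions that match a pattern, and
;; signal "retracted" only when the count drops back to zero.
(define (make-tracker matches? on-asserted on-retracted)
  (define count 0)
  (lambda (event v)
    (when (matches? v)
      (match event
        ['assert
         (set! count (add1 count))
         (when (= count 1) (on-asserted))]
        ['retract
         (set! count (sub1 count))
         (when (= count 0) (on-retracted))]))))

;; Stands in for the pattern (box-state _).
(define (box-state? v)
  (match v
    [(list 'box-state _) #t]
    [_ #f]))

(define events '())
(define (note! e) (set! events (cons e events)))

(define tracker
  (make-tracker box-state?
                (lambda () (note! 'asserted))
                (lambda () (note! 'retracted))))

(tracker 'assert  '(box-state 1))
(tracker 'assert  '(box-state 2))
(tracker 'retract '(box-state 1)) ; one box-state remains: no event
(tracker 'retract '(box-state 2)) ; last one withdrawn: retracted fires

(reverse events) ; '(asserted retracted)
```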
Patterns over hash-tables
Previous implementations could only match fields in records (with constant labels) and elements of arrays/lists. This new implementation is also able to express and match patterns over named-key elements in Dictionary Values.
This lets actors express patterns over JSON-like Preserves documents, for example.
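As a rough illustration of what matching by named key involves, here is a tiny matcher in plain Racket. Everything in it is an assumption made for the sketch: the pattern encoding ('_ for discard, '($ name) for a binding, a hash for a dictionary pattern), the name `match-pattern`, and the choice to ignore keys the pattern does not mention. It is not the Syndicate/rkt pattern language.

```racket
#lang racket
;; Hypothetical mini-matcher. Patterns are:
;;   '_          discard: matches anything
;;   '($ name)   bind: matches anything and records it under name
;;   a hash      dictionary pattern: each listed key must be present in
;;               the value and its entry must match the subpattern
;;   otherwise   a literal, compared with equal?
;; Returns a hash of bindings on success, or #f on failure.
(define (match-pattern pat v [bindings (hash)])
  (cond
    [(not bindings) #f]
    [(eq? pat '_) bindings]
    [(and (pair? pat) (eq? (car pat) '$))
     (hash-set bindings (cadr pat) v)]
    [(hash? pat)
     (and (hash? v)
          (for/fold ([bs bindings]) ([(k sub) (in-hash pat)])
            (and bs
                 (hash-has-key? v k)
                 (match-pattern sub (hash-ref v k) bs))))]
    [else (and (equal? pat v) bindings)]))

;; A JSON-like document represented as a Racket hash.
(define doc (hash "name" "box" "value" 10 "tags" '("demo")))

(match-pattern (hash "name" "box" "value" '($ v)) doc) ; binds v to 10
(match-pattern (hash "name" "client") doc)             ; #f: literal differs
(match-pattern (hash "missing" '_) doc)                ; #f: key absent
```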
Pattern quasiquotation
One of the issues I hoped the new architecture would shed light on is pattern quotation. In order to express interest in interest expressed by some other party, you need to be able to describe the subscriptions that are of interest to you. That means you must be able to write patterns over patterns.
Previous implementations didn’t get this right. It was not possible to precisely express interest in subscriptions that bound (or did not bind) certain portions of their input; and it was not possible to precisely express the difference between being interested in a binding or binding a portion of the pattern to be matched itself.
The new design solves these issues with a quasiquote-like facility. Here’s a pattern that matches “subscriptions to unary set-box records”:
    (Observe (:pattern (set-box ,_)) _)
The :pattern wrapper introduces a quoted pattern, and unquote-discard (“,_”) pops back out a level to say that we don’t care what the subscriber has put in their pattern at that position. For example, they may have elected to bind the value inside the set-box, or they may have elected to ignore it, or they may have elected to match only certain values of it, and so on. By discarding that portion of the pattern, we ignore the specific choice the matching subscriber made.
If instead we use unquote-bind (“,$id”), we extract a portion of the pattern each subscriber placed in the dataspace:
    (Observe (:pattern (set-box ,$value-pat)) _)
For example, if some subscriber is binding the value in the set-box to an identifier new-value, but otherwise placing no constraints on it, we will be given the following value for value-pat:
    <bind new-value <_>>
If, on the other hand, a subscriber is completely ignoring the value in the set-box, caring only about the set-box wrapper itself, we will be given <_>, the “discard” pattern.
Crucially, we are now able to distinguish between binding-a-portion-of-the-matched-pattern and matching-a-portion-that-is-a-binding. We’ve seen the former already with unquote-bind; the latter is accomplished by using unquote in the structured syntax for a binding:
    (Observe (:pattern (set-box ($ ,$their-id ,$further-constraint))) _)
Here we unquote twice. The ($ ...) constructor itself specifies that we require matching subscriptions to have a binding at this position. The first unquote extracts the name in the binding, and the second extracts the subpattern for the binding. For the example above, we would end up with their-id bound to the symbol new-value and further-constraint bound to the subpattern <_>.
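Racket’s own `match` has quasiquote patterns that behave analogously, which makes for a compact way to see the three cases side by side. The sketch below is plain Racket, not Syndicate/rkt: subscriber patterns are encoded as ordinary lists ('(discard) for <_>, '(bind id pat) for <bind id pat>), and all the names are hypothetical.

```racket
#lang racket
;; Hypothetical encoding of subscriber patterns as plain data:
;;   '(discard)       stands for <_>
;;   '(bind id pat)   stands for <bind id pat>
;;   '(set-box pat)   stands for a pattern over unary set-box records
(define binding-subscriber  '(set-box (bind new-value (discard))))
(define ignoring-subscriber '(set-box (discard)))

;; Like (:pattern (set-box ,_)): any subscription to unary set-box
;; records, whatever the subscriber wrote in the field position.
(define (set-box-subscription? p)
  (match p
    [`(set-box ,_) #t]
    [_ #f]))

;; Like (:pattern (set-box ,$value-pat)): extract what the subscriber
;; wrote in the field position.
(define (value-pattern p)
  (match p
    [`(set-box ,value-pat) value-pat]
    [_ #f]))

;; Like (:pattern (set-box ($ ,$their-id ,$further-constraint))):
;; require a binding at that position, then take it apart.
(define (binding-parts p)
  (match p
    [`(set-box (bind ,their-id ,further-constraint))
     (list their-id further-constraint)]
    [_ #f]))

(set-box-subscription? ignoring-subscriber) ; #t
(value-pattern binding-subscriber)          ; '(bind new-value (discard))
(value-pattern ignoring-subscriber)         ; '(discard)
(binding-parts binding-subscriber)          ; '(new-value (discard))
(binding-parts ignoring-subscriber)         ; #f: no binding at that position
```

The last two calls show the distinction drawn above: the subscriber that ignores the field still matches the first two queries, but not the one that requires a binding.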
Finally, let’s examine a couple of alternatives that don’t work. This one is missing the :pattern wrapper, meaning that instead of asking about patterns over set-box records, it is asking about observers that (mistakenly?) specified an actual set-box record instead of a pattern!
    (Observe (set-box ,_) _)
The compiler won’t actually let you use this version, because the unquote-discard is out of place. There’s no quasiquotation to escape from, so this is a syntax error.
We might try repairing this by simply removing the unquote:
    (Observe (set-box _) _)
But this is still asking the wrong question, and will never receive any interesting matches from other subscribers in the system.
How does it perform?
Very well! At present it is roughly twice as fast as the previous Racket implementation. Running a benchmark based on the example program above yields the following on one thread of my Ryzen 3960X:
    syndicate/actor: #<actor:0:dataspace> booting
    syndicate/task: #<engine:0> starting
    syndicate/actor: #<actor:3:box> booting
    syndicate/actor: #<actor:6:client> booting
    Box got 100000 (66222.3817422824 Hz)
    Box got 200000 (68740.1257393843 Hz)
    Box got 300000 (68724.52087884837 Hz)
    Box got 400000 (68670.19562623161 Hz)
    Box got 500000 (68786.55545277878 Hz)
    syndicate/actor: #<actor:3:box> terminated OK
    Client detected box termination
    syndicate/actor: #<actor:6:client> terminated OK
    syndicate/task: #<engine:0> stopping
    cpu time: 7330 real time: 7330 gc time: 68
This means the program completes roughly 68,000 full round-trips per second of update and signalling between the box and client actors.
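As a rough cross-check of that figure against the timing line, the overall rate can be recomputed from the totals in the log. The snippet assumes the reported time is in milliseconds (as Racket’s `time` form reports it) and that it covers the whole run, startup included, so it comes out slightly below the per-interval rates.

```racket
#lang racket
;; Totals taken from the log output above.
(define round-trips 500000) ; last "Box got" count
(define elapsed-ms 7330)    ; "real time", assumed to be milliseconds

;; Overall round-trips per second across the whole run.
(define rate (/ round-trips (/ elapsed-ms 1000.0)))

rate ; a little over 68,200 round-trips per second
```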
Preserves and Preserves Schema
The new implementation depends heavily on Preserves and Preserves Schema, so I’ve ended up doing a fair bit of work on those in order to get things working in Syndicate/rkt. (Among other things, fixing the raco pkg install process for the preserves and syndicate Racket packages!)
First, one nice bit of news is a new Preserves implementation, preserves-nim by Emery Hemingway, for the Nim programming language. I’ve linked the various implementations of Preserves and Syrup on the main Preserves webpage.
There have also been changes to the Schema language and tooling. The main change to the Schema language is a reappraisal of the role of Embedded values in schemas. Previously, they were treated as black boxes - given just enough machinery to parse them out of and serialize them back into a Value, but nothing more. Now, they’re given both a (de)serializer and an “interface type”; the idea is that an Embedded represents a capability to some behavioural object - a closure, an object pointer, an actor reference, a web service, that kind of thing - and so there may be an associated API that can be usefully schematized. This makes schematization of Embedded values something closely related to the higher-order contracts of Dimoulas; see the bit on future work in the spec for some additional thoughts along these lines, as well as a little example.
The main change to the Schema tooling is support for plugins in the Schema compiler, allowing Syndicate/rkt to supply a plugin for generating dataspace patterns from parsed Schema values. The #lang preserves-schema support has been likewise extended so you can supply plugins in the #lang line.
Last (and probably least), here’s a fun little schema example:
    version 1 .
    JSON =
      / @string string
      / @integer int
      / @double double
      / @boolean JSONBoolean
      / @null =null
      / @array [JSON ...]
      / @object { string: JSON ...:... } .
    JSONBoolean = =true / =false .
It recognises the JSON-interoperable subset of Preserves Values!
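To make the shape of that schema concrete, here is a hand-written predicate in plain Racket that accepts the same family of values. It is only a sketch under an assumed representation, not output of the Schema compiler: strings, exact integers and flonums stand for themselves, the symbols true, false and null stand for the literals =true, =false and =null, lists stand for sequences, and hashes stand for dictionaries. The name `json-value?` is hypothetical.

```racket
#lang racket
;; Assumed representation (for illustration only): see the paragraph
;; above. Each clause corresponds to one alternative of the JSON
;; definition in the schema.
(define (json-value? v)
  (cond
    [(string? v) #t]                     ; @string
    [(exact-integer? v) #t]              ; @integer
    [(flonum? v) #t]                     ; @double
    [(memq v '(true false)) #t]          ; @boolean (JSONBoolean)
    [(eq? v 'null) #t]                   ; @null
    [(list? v) (andmap json-value? v)]   ; @array
    [(hash? v)                           ; @object: string keys only
     (for/and ([(k x) (in-hash v)])
       (and (string? k) (json-value? x)))]
    [else #f]))

(json-value? (hash "name" "box" "value" 10 "tags" (list "a" 'true 'null 1.5))) ; #t
(json-value? (hash 'name "box"))  ; #f: keys must be strings
(json-value? (list 1 2 'other))   ; #f: other is not true, false or null
```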
Licensing
A note about licensing: I’ve chosen LGPL 3.0+ as the license for Syndicate/rkt. Many thanks to Massimo Zaniboni for pointing out the lack of license, discussing various options with me, and helping sort out the per-file license headers.