Major progress on capability-based syndicate-rkt implementation

I’ve been working on the novy branch of syndicate-rkt (Update: this is now the main branch), following the new design I developed for the novy-syndicate TypeScript prototype, driving the design further and working out new syntax ideas.

Syndicate/rkt example

Here’s an example program, box-and-client.rkt in the new Syndicate/rkt language:

#lang syndicate

(message-struct set-box (new-value))
(assertion-struct box-state (value))

(module+ main
  (actor-system/dataspace (ds)
    (spawn #:name 'box
           (define-field current-value 0)
           (at ds
             (assert (box-state (current-value)))
             (on (message (set-box $new-value))
               (log-info "box: taking on new-value ~v" new-value)
               (current-value new-value)))
           (stop-on-true (= (current-value) 10)
             (log-info "box: terminating")))

    (spawn #:name 'client
           (at ds
             (stop-on (retracted (Observe (:pattern (set-box ,_)) _))
               (log-info "client: box has gone"))
             (on (asserted (box-state $v))
               (log-info "client: learned that box's value is now ~v" v)
               (send! ds (set-box (+ v 1))))
             (on (retracted (box-state _))
               (log-info "client: box state disappeared"))))))

The program consists of two actors, 'box and 'client. The box actor publishes the value of its current-value field, wrapped in a box-state record constructor, to the dataspace (line 11). It reacts to set-box messages sent by peers (lines 12–14). Here the only peer is the client actor, which sends a set-box message to increment the value each time it learns of an updated value from the box (lines 22–24).

The box actor terminates once current-value reaches 10. The client notices the termination of the box actor in two ways (just to show them off): first, by noticing that the box-state record was unpublished from the dataspace (lines 25–26); and second, by noticing that all subscribers to set-box messages have vanished (lines 20–21).

What’s different? What’s new?

Explicit object references

The most notable change from previous dataspace programs is the explicit reference to the dataspace, ds. Assertions and subscriptions are now located at a specific (possibly remote) object, usually but not always a dataspace.

Capability-based security

Related is support for macaroon-style “sturdy references” (analogous to the SturdyRef concept from E). Here’s an example from a secure* chat demo app:

<ref "syndicate" [[<or [
  <rewrite <bind p <compound <rec Present 1> {0: <lit "tonyg">}>> <ref p>>,
  <rewrite <bind p <compound <rec Says 2> {
    0: <lit "tonyg">,
    1: String
  }>> <ref p>>
]>]] #[oHFy7B4NPVqhD6zJmNPbhg==]>

The oid ("syndicate" on line 1) identifies the target object. The patterns (lines 2–6) attenuate the authority of the capability to only permit transmission of Present and Says records. The signature (line 7) proves to the target object that the capability is genuine and untampered-with.
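As a rough illustration of what the attenuation amounts to, here is a minimal sketch in plain Racket. It is not the syndicate-rkt implementation: the present and says structs and the attenuate-for-tonyg procedure are hypothetical stand-ins, and the sketch ignores the signature check and the wire format entirely. It only shows the filtering idea, where a value passes if one of the alternatives accepts it and is dropped otherwise.

```racket
#lang racket/base
(require racket/match)

;; Hypothetical stand-ins for the Present and Says records named in the caveat.
(struct present (who) #:transparent)
(struct says (who what) #:transparent)

;; Returns the value unchanged if one of the two alternatives accepts it,
;; or #f if the attenuated capability would refuse to transmit it.
(define (attenuate-for-tonyg v)
  (match v
    [(present "tonyg") v]               ; Present, arity 1, field 0 is "tonyg"
    [(says "tonyg" (? string?)) v]      ; Says, arity 2, field 1 is any string
    [_ #f]))                            ; everything else is rejected
```

Under these assumptions, (attenuate-for-tonyg (says "alice" "hi")) is rejected because field 0 is not "tonyg", while (attenuate-for-tonyg (present "tonyg")) passes through unchanged.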

I’ve implemented most of the necessary plumbing for these, but have yet to complete the client/server portion of the system that actually makes use of them. For an example of their use, see novy-syndicate.

Schema support

Another interesting change is support for (the relatively new) Preserves Schema. You can use assertion-struct and message-struct as in previous dialects, or you can use Schema-defined types to establish subscriptions and place assertions with a peer.

Full pattern-matching dataspace implementation

Unlike the novy-syndicate prototype, this implementation has a proper “skeleton”-based dataspace that supports the full range of dataspace patterns, making it the first capability-based design to do so. This allows us to write, for example, subscriptions like

(on (retracted (box-state _)) ...)

which fires only when all box-state assertions have been withdrawn.
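One way to picture this behaviour is a counter of currently matching assertions, with a notification only on the transitions between zero and one. The sketch below is plain Racket and makes no claim about how the skeleton index is actually built; make-match-counter and the box-state struct are hypothetical names used for illustration.

```racket
#lang racket/base

;; Hypothetical stand-in for the box-state record from the example program.
(struct box-state (value) #:transparent)

;; Returns two procedures that share a count of matching assertions.
;; Each returns an event symbol when a subscriber should be notified,
;; and #f otherwise. The caller is assumed to retract only what it asserted.
(define (make-match-counter matches?)
  (define count 0)
  (define (assert! v)
    (and (matches? v)
         (begin (set! count (+ count 1))
                (and (= count 1) 'asserted))))    ; first match appeared
  (define (retract! v)
    (and (matches? v)
         (begin (set! count (- count 1))
                (and (zero? count) 'retracted)))) ; last match went away
  (values assert! retract!))
```

With two box-state assertions present, withdrawing the first produces no event; only withdrawing the second yields 'retracted.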

Patterns over hash-tables

Previous implementations could only match fields in records (with constant labels) and elements of arrays/lists. This new implementation is also able to express and match patterns over named-key elements in Dictionary Values.

This lets actors express patterns over JSON-like Preserves documents, for example.
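To make the idea concrete, here is a small sketch of matching by named key, written in plain Racket rather than in the Syndicate pattern language. The pattern representation (an association list from key to either the symbol capture or a literal value) and the match-dict procedure are assumptions made for this illustration only.

```racket
#lang racket/base

;; Sentinel for "key absent", distinct from any value a document can hold.
(define missing (gensym 'missing))

;; pattern: list of (key . spec) pairs, where spec is either the symbol
;;          capture (bind the stored value) or a literal compared with equal?.
;; doc:     a hash table standing in for a Dictionary value.
;; Returns the captured values in pattern order, or #f if the match fails.
;; Keys in doc that the pattern does not mention are ignored.
(define (match-dict pattern doc)
  (let loop ([entries pattern] [captures '()])
    (cond
      [(null? entries) (reverse captures)]
      [else
       (define key (car (car entries)))
       (define spec (cdr (car entries)))
       (define v (hash-ref doc key missing))
       (cond
         [(eq? v missing) #f]                                   ; key not present
         [(eq? spec 'capture) (loop (cdr entries) (cons v captures))]
         [(equal? spec v) (loop (cdr entries) captures)]        ; literal matches
         [else #f])])))
```

For a document such as (hash "type" "user" "name" "tonyg" "age" 42), the pattern '(("type" . "user") ("name" . capture)) succeeds and captures "tonyg", without needing to mention "age".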

Pattern quasiquotation

One of the issues I hoped the new architecture would shed light on is pattern quotation. In order to express interest in interest expressed by some other party, you need to be able to describe the subscriptions that are of interest to you. That means you must be able to write patterns over patterns.

Previous implementations didn’t get this right. It was not possible to express interest precisely in subscriptions that bound (or did not bind) certain portions of their input, nor to distinguish between being interested in a binding and binding a portion of the pattern to be matched itself.

The new design solves these issues with a quasiquote-like facility. Here’s a pattern that matches “subscriptions to unary set-box records”:

(Observe (:pattern (set-box ,_)) _)

The :pattern wrapper introduces a quoted pattern, and unquote-discard (“,_”) pops back out a level to say that we don’t care what the subscriber has put in their pattern at that position. For example, they may have elected to bind the value inside the set-box, or they may have elected to ignore it, or they may have elected to match only certain values of it, and so on. By discarding that portion of the pattern, we ignore the specific choice the matching subscriber made.

If instead we use unquote-bind (“,$id”), we extract a portion of the pattern each subscriber placed in the dataspace:

(Observe (:pattern (set-box ,$value-pat)) _)

For example, if some subscriber is binding the value in the set-box to an identifier new-value, but otherwise placing no constraints on it, we will be given the following value for value-pat:

<bind new-value <_>>

If, on the other hand, a subscriber is completely ignoring the value in the set-box, caring only about the set-box wrapper itself, we will be given <_>, the “discard” pattern.

Crucially, we are now able to distinguish between binding-a-portion-of-the-matched-pattern and matching-a-portion-that-is-a-binding. We’ve seen the former already with unquote-bind; the latter is accomplished by using unquote in the structured syntax for a binding:

(Observe (:pattern (set-box ($ ,$their-id ,$further-constraint))) _)

Here we unquote twice. The ($ ...) constructor itself specifies that we require matching subscriptions to have a binding at this position. The first unquote extracts the name in the binding, and the second extracts the subpattern for the binding. For the example above, we would end up with their-id bound to the symbol new-value and further-constraint bound to the subpattern <_>.
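Racket's own match quasipatterns offer a loose analogy for the two cases. The sketch below encodes subscriber patterns as S-expressions, which is an assumption made purely for illustration and not the representation Syndicate/rkt uses; field-pattern and field-binding are hypothetical helper names.

```racket
#lang racket/base
(require racket/match)

;; Hypothetical S-expression encoding of subscriber patterns:
;;   '(discard)             stands for <_>
;;   '(bind id subpattern)  stands for <bind id subpattern>
;;   '(rec set-box p)       stands for a pattern over unary set-box records

;; Analogue of (Observe (:pattern (set-box ,$value-pat)) _):
;; extract whatever pattern the subscriber placed in the field.
(define (field-pattern sub)
  (match sub
    [`(rec set-box ,value-pat) value-pat]
    [_ #f]))

;; Analogue of (Observe (:pattern (set-box ($ ,$their-id ,$further-constraint))) _):
;; succeed only if the subscriber has a binding at that position.
(define (field-binding sub)
  (match sub
    [`(rec set-box (bind ,their-id ,further-constraint))
     (list their-id further-constraint)]
    [_ #f]))
```

Given '(rec set-box (bind new-value (discard))), field-pattern returns the whole binding pattern, whereas field-binding takes it apart into the name new-value and the subpattern '(discard). Given '(rec set-box (discard)), field-pattern still succeeds but field-binding does not.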

Finally, let’s examine a couple of alternatives that don’t work. This one is missing the :pattern wrapper, meaning that instead of asking about patterns over set-box records, it is asking about observers that (mistakenly?) specified an actual set-box record instead of a pattern!

(Observe (set-box ,_) _)

The compiler won’t actually let you use this version, because the unquote-discard is out of place. There’s no quasiquotation to escape from, so this is a syntax error.

We might try repairing this by simply removing the unquote:

(Observe (set-box _) _)

But this is still asking the wrong question, and will never receive any interesting matches from other subscribers in the system.

How does it perform?

Very well! At present it is roughly twice as fast as the previous Racket implementation. Running a benchmark based on the example program above yields the following on one thread of my Ryzen 3960X:

syndicate/actor: #<actor:0:dataspace> booting
syndicate/task: #<engine:0> starting
syndicate/actor: #<actor:3:box> booting
syndicate/actor: #<actor:6:client> booting
Box got 100000 (66222.3817422824 Hz)
Box got 200000 (68740.1257393843 Hz)
Box got 300000 (68724.52087884837 Hz)
Box got 400000 (68670.19562623161 Hz)
Box got 500000 (68786.55545277878 Hz)
syndicate/actor: #<actor:3:box> terminated OK
Client detected box termination
syndicate/actor: #<actor:6:client> terminated OK
syndicate/task: #<engine:0> stopping
cpu time: 7330 real time: 7330 gc time: 68

This means the program completes roughly 68,000 round trips per second of update and signalling between the box and client actors.
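As a sanity check on that figure, the overall average can be recomputed from the last line of the log, assuming the reported times are in milliseconds (as Racket's time form prints them) and that all 500,000 round trips fall within the measured interval. The name average-rate is just a helper for this sketch.

```racket
#lang racket/base

;; Average round trips per second, given a count and a duration in milliseconds.
(define (average-rate round-trips milliseconds)
  (/ round-trips (/ milliseconds 1000.0)))

;; 500,000 round trips in 7330 ms of CPU time, taken from the log above.
(define overall (average-rate 500000 7330))
```

The result comes out a little over 68,200 Hz, slightly below the steady-state readings because it also covers start-up and the slower first batch.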

Preserves and Preserves Schema

The new implementation depends heavily on Preserves and Preserves Schema, so I’ve ended up doing a fair bit of work on those in order to get things working in Syndicate/rkt. (Among other things, fixing the raco pkg install process for the preserves and syndicate Racket packages!)

First, one nice bit of news is a new Preserves implementation, preserves-nim by Emery Hemingway, for the Nim programming language. I’ve linked the various implementations of Preserves and Syrup on the main Preserves webpage.

There have also been changes to the Schema language and tooling. The main change to the Schema language is a reappraisal of the role of Embedded values in schemas. Previously, they were treated as black boxes - given just enough machinery to parse them out of and serialize them back into a Value, but nothing more. Now, they’re given both a (de)serializer and an “interface type”; the idea is that an Embedded represents a capability to some behavioural object - a closure, an object pointer, an actor reference, a web service, that kind of thing - and so there may be an associated API that can be usefully schematized. This makes schematization of Embedded values something closely related to the higher-order contracts of Dimoulas; see the bit on future work in the spec for some additional thoughts along these lines, as well as a little example.

The main change to the Schema tooling is support for plugins in the Schema compiler, allowing Syndicate/rkt to supply a plugin for generating dataspace patterns from parsed Schema values. The #lang preserves-schema support has been likewise extended so you can supply plugins in the #lang line.

Last (and probably least), here’s a fun little schema example:

version 1 .
JSON =
     / @string string
     / @integer int
     / @double double
     / @boolean JSONBoolean
     / @null =null
     / @array [JSON ...]
     / @object { string: JSON ...:... } .
JSONBoolean = =true / =false .

It recognises the JSON-interoperable subset of Preserves Values!
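For readers who prefer code to grammar, the same recognizer can be sketched as a recursive predicate in plain Racket. Two assumptions are worth flagging: that the literals =true, =false and =null denote the symbols true, false and null, and that strings, exact integers, flonums, lists and hash tables are acceptable stand-ins for the corresponding Preserves Values (the preserves package may represent some of these differently). The name json-value? is hypothetical.

```racket
#lang racket/base

;; Mirrors the alternatives of the JSON schema definition, one clause each.
(define (json-value? v)
  (cond
    [(string? v) #t]                      ; @string
    [(exact-integer? v) #t]               ; @integer
    [(flonum? v) #t]                      ; @double
    [(memq v '(true false)) #t]           ; @boolean (assumed to be symbols)
    [(eq? v 'null) #t]                    ; @null (assumed to be a symbol)
    [(list? v) (andmap json-value? v)]    ; @array: every element is JSON
    [(hash? v)                            ; @object: string keys, JSON values
     (for/and ([(k val) (in-hash v)])
       (and (string? k) (json-value? val)))]
    [else #f]))
```

Under these assumptions, a nested value such as (hash "a" (list 1 2.5 "x" 'true 'null)) is accepted, while a dictionary with a symbol key is rejected.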

Licensing

A note about licensing: I’ve chosen LGPL 3.0+ as the license for Syndicate/rkt. Many thanks to Massimo Zaniboni for pointing out the lack of license, discussing various options with me, and helping sort out the per-file license headers.