Journal entries

An Atom feed of these posts is also available.

Syndicated Actors for Python 3

Previously, the mini-syndicate package for Python 3 implemented an older version of the Syndicate network protocol.

A couple of weeks ago, I dusted it off, updated it to the new capability-oriented Syndicate protocol, and fleshed out its nascent Syndicated Actor Model code to be a full implementation of the model, including capabilities, object references, actors, facets, assertions and so on.

The new implementation makes heavy use of Python decorators to work around Python’s limited lambda forms and its poor support for syntactic extensibility. The result is surprisingly not terrible!

The revised codebase is different enough to the previous one that it deserves its own new git repository:

git clone https://git.syndicate-lang.org/syndicate-lang/syndicate-py

It’s also available on pypi.org, as package syndicate-py.

Updated Preserves for Python

As part of the work, I updated the Python Preserves implementation (for both Python 2 and Python 3) to include the text-based Preserves syntax as well as the binary syntax, plus implementations of Preserves Schema and Preserves Path. Version 0.7.0 or newer of the preserves package on pypi.org has the new features.

A little “chat” demo

(Based on chat.py in the repository.)

import sys
import asyncio
import random
import syndicate
from syndicate import patterns as P, actor, dataspace
from syndicate.schema import simpleChatProtocol, sturdy

Present = simpleChatProtocol.Present
Says = simpleChatProtocol.Says

ds_capability = syndicate.parse('<ref "syndicate" [] #[pkgN9TBmEd3Q04grVG4Zdw==]>')

@actor.run_system()
def main(turn):
    root_facet = turn._facet

    @syndicate.relay.connect(turn, '<tcp "localhost" 8001>', sturdy.SturdyRef.decode(ds_capability))
    def on_connected(turn, ds):
        me = 'user_' + str(random.randint(10, 1000))

        turn.publish(ds, Present(me))

        @dataspace.during(turn, ds, P.rec('Present', P.CAPTURE), inert_ok=True)
        def on_presence(turn, who):
            print('%s joined' % (who,))
            turn.on_stop(lambda turn: print('%s left' % (who,)))

        @dataspace.on_message(turn, ds, P.rec('Says', P.CAPTURE, P.CAPTURE))
        def on_says(turn, who, what):
            print('%s says %r' % (who, what))

        @turn.linked_task()
        async def accept_input(f):
            reader = asyncio.StreamReader()
            await actor.find_loop().connect_read_pipe(lambda: asyncio.StreamReaderProtocol(reader), sys.stdin)
            while line := (await reader.readline()).decode('utf-8'):
                actor.Turn.external(f, lambda turn: turn.send(ds, Says(me, line.strip())))
            actor.Turn.external(f, lambda turn: turn.stop(root_facet))

Syndicated Actors for Rust (and a new extensible server implementation)

For my system layer project, I need a fast, low-RAM-usage, flexible, extensible daemon that speaks the Syndicate network protocol and exposes dataspace and system management services to other processes.

So I built one in Rust.

git clone https://git.syndicate-lang.org/syndicate-lang/syndicate-rs

It uses the Syndicated Actor Model internally to structure its concurrent activities. The actor model implementation is split out into a crate of its own, syndicate, that can be used by other programs. There is also a crate of macros, syndicate-macros, that makes working with dataspace patterns over Preserves values a bit easier.[1]

The syndicate crate is reasonably extensively documented. The server itself is documented here.

The implementation includes:

  1. Future work for syndicate-macros is to add syntactic constructs for easily establishing handlers for responding to assertions and messages, cutting out the boilerplate, in the same way that Racket’s syndicate macros do. In particular, having a during! macro would be very useful. 

Rust Preserves v1.0.0 released

As part of my other implementation efforts, I made enough improvements to the Rust Preserves implementation to warrant releasing version 1.0.0.

This release supports the Preserves data model and the binary and text codecs. It also includes a Preserves Schema compiler for Rust (for use e.g. in build.rs) and an implementation of Preserves Path.

There are four crates:

Preserves Path: a query language inspired by XPath

At the beginning of August, I designed a query language for Preserves documents inspired by XPath. Here’s the draft specification. To give just a taste of the language, here are a couple of example selectors:

.annotations ^ Documentation . 0 /

// [.^ [= Test + = NondeterministicTest]] [. 1 rec]

They’re properly explained in the Examples section of the spec.

So far, I’ve used Preserves Path to access portions of network packets received from the server in an implementation of the Syndicate network protocol for bash.

Syndicate for Bash

🙂 Really! The Syndicate network protocol is simple, easily within reach of a shell script plus e.g. netcat or openssl.

Here’s the code so far. The heart of it is about 90 SLOC, showing how easy it can be to interoperate with a Syndicate ecosystem.

Instructions are in the README, if you’d like to try it out!

The code so far contains functions to interact with a Syndicate server along with a small demo, which implements an interactive chat program with presence notifications.

Because Syndicate’s network protocol is polyglot, and the bash demo uses generic chat assertions, the demo automatically interoperates with other implementations, such as the analogous Python chat demo.

The next step would be to factor out the protocol implementation from the demo and come up with a simple make install step to make it available for system scripting.

I actually have a real use for this: it’ll be convenient for implementing simple system monitoring services as part of a Syndicate-based system layer. Little bash script services could easily publish the battery charge level, screen brightness level, WiFi connection status, etc. etc., by reading files from /sys and publishing them to the system-wide Syndicate server using this library.

Services and Service Activation

One promising application of dataspaces is dependency tracking for orderly service startup.

The problem of service startup appears at all scales. It could be services cooperating within a single program and process; services cooperating as separate processes on a single machine; containers running in a distributed system; or some combination of them all.

Syndicate programs are composed of multiple services running together, with dependencies on each other, so it makes sense to express service dependency tracking and startup within the programming language.

In the following, I’ll sketch service dependency support for cooperating modules within a single program and process. The same pattern can be used in larger systems; the only essential differences are the service names and the procedures for loading and starting services.

A scenario

Let’s imagine we have the following situation:

[Figure: dependency graph. The top-level program depends on syndicate/drivers/tcp and syndicate/drivers/timer; syndicate/drivers/tcp depends on syndicate/drivers/stream.]

A program we are writing depends on the “tcp” service, which in turn depends on the “stream” service. Separately, the top-level program depends on the “timer” service.

Describing the data and protocol

A small protocol for services and service activations describes the data involved:

RequireService = <require-service @service-name any>.
ServiceRunning = <service-running @service-name any>.

An asserted RequireService record indicates demand for a running instance of the named service; an asserted ServiceRunning record indicates presence of the same; and interest in a ServiceRunning implies assertion of a RequireService.

A library “service manager” process, started alongside the top level program, translates observed interest in ServiceRunning into RequireService, and then translates observed RequireService assertions into service startup actions and provision of matching ServiceRunning assertions.

(during (Observe (:pattern (ServiceRunning ,(DLit $service-name))) _)
  (assert (RequireService service-name)))

(during/spawn (RequireService $service-name)
  ;; ... code to load and start the named service ...
  )

Putting these pieces together, we can write a program that waits for a service called 'some-service-name to be running as follows:

(during (ServiceRunning 'some-service-name) ...)

When the service appears, the facet in the ellipsis will be started, and if the service crashes, the facet will be stopped (and restarted if the service is restarted).

Services can wait for their own dependencies, of course. This automatically gives a topologically sorted startup order.
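To see why waiting on dependencies yields a topologically sorted startup order, here is a small Python sketch of the scenario above. It is a toy model, not the Syndicate/rkt service manager: the `DEPENDENCIES` table and the `start_all` helper are hypothetical names, the `required` set stands in for asserted RequireService records and the `running` list for ServiceRunning records, and the real system reacts to assertions asynchronously rather than recursing.

```python
# Toy model of demand-driven service startup (illustrative names throughout).
# The dependency table mirrors the scenario described above.
DEPENDENCIES = {
    "program": ["syndicate/drivers/tcp", "syndicate/drivers/timer"],
    "syndicate/drivers/tcp": ["syndicate/drivers/stream"],
    "syndicate/drivers/timer": [],
    "syndicate/drivers/stream": [],
}


def start_all(root, dependencies):
    """Return the order in which services would announce ServiceRunning."""
    required = set()  # stands in for asserted RequireService records
    running = []      # stands in for asserted ServiceRunning records

    def require(name):
        if name in required:
            return  # demand for this service has already been expressed
        required.add(name)
        # Interest in each dependency implies demand for it.
        for dep in dependencies[name]:
            require(dep)
        # All dependencies are running by now, so this service can announce itself.
        running.append(name)

    require(root)
    return running


order = start_all("program", DEPENDENCIES)
print(order)
```

Each service appears in the result only after everything it depends on, which is the topological ordering described above. The sketch assumes the dependency graph has no cycles.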

Modules as services, and macros for declaring dependencies

In the Syndicate/rkt implementation, a few standard macros and functions implement the necessary protocols.

First, services can be required using a with-services macro:

(with-services [syndicate/drivers/tcp
                syndicate/drivers/timer]
  ;; ... body expressions ...
  )

Second, each Racket module can offer a service named after the module by using a provide-service macro at module toplevel. For example, in the syndicate/drivers/tcp Racket module, we find the following form:

(provide-service [ds]
  (with-services [syndicate/drivers/stream]
    (at ds
      ;; ... set up tcp driver subscriptions ...
      )))

Finally, the main entry point to a Syndicate/rkt program can use a standard-actor-system macro to arrange for the startup of the “service manager” process and a few of the most frequently-used library services:

(standard-actor-system [ds]
  ;; ... code making use of a pre-made dataspace (ds) and
  ;;     preloaded standard services ...
  )

Implementing the SSH protocol in Syndicate

This past week I have been dusting off my old implementation of the SSH protocol in order to exercise the new Syndicated Actor design and implementations. You can find the new SSH code here. (You’ll need the latest Syndicate/rkt implementation to run it.)

The old SSH code used a 2014 dataspace-like language dialect called “Marketplace”, which shared some concepts with modern Syndicate but relied much more heavily on a pure-functional programming style within each actor. The new code uses mutability within each actor where it makes sense, and takes advantage of Syndicate DSL features like facets and dataflow variables that I didn’t have back in 2014 to make the code more modular and easier to read and work with.

Big changes include:

plus a couple of small Syndicate/rkt syntax changes (renaming when to on, making the notion of “event expander” actually useful, and a new once form for convenient definition of state-machine-like behaviour).

Diagram of Syndicate Features

I just found an old diagram, part of a talk on Marketplace I gave at RacketCon back in 2013, which relates a bunch of ideas that all fall under the broader Syndicate umbrella. Here it is:

[Figure: Syndicate Features]

I quite like it. I also think it’s interesting how I had so many of the core ideas already in place as far back as 2013.

Fixing up protocol mismatches on-the-fly

I’ve been fleshing out the syndicate-rkt Racket implementation based on the novy-syndicate TypeScript sketch. I just reached a milestone of TCP-based interoperability between the two implementations (yay!), but there’s an interesting little side track involved that I thought I’d write about.

The novy-syndicate code had a placeholder “dataspace” implementation that had extremely limited pattern-matching. It was only able to offer subscribers the ability to select (1) record assertions having (2) a user-selected, constant label.

For example, a subscriber could elect to receive all records labelled with Present; or with Says. Subscribers were not able to even specify arity of matched records. It really was a placeholder for a proper implementation to come later (ported across from a previous syndicate/js implementation).

By contrast, the syndicate-rkt code has a full-fledged dataspace able to index assertions according to quite sophisticated patterns:

; Dataspace patterns: a sublanguage of attenuation patterns.
Pattern = DDiscard / DBind / DLit / DCompound .

DDiscard = <_>.
DBind = <bind @name symbol @pattern Pattern>.
DLit = <lit @value any>.
DCompound = @rec  <compound @ctor CRec  @members { int: Pattern ...:... }>
          / @arr  <compound @ctor CArr  @members { int: Pattern ...:... }>
          / @dict <compound @ctor CDict @members { any: Pattern ...:... }> .

CRec = <rec @label any @arity int>.
CArr = <arr @arity int>.
CDict = <dict>.

Now, I managed to get the novy-syndicate example programs to talk to the full syndicate/rkt dataspace - without changing the code!

The way I did it was to rewrite assertions travelling between the programs on the fly.

And the way I did that was to include “rewrite” statements in the capability I gave to the novy-syndicate client to allow it to connect to the syndicate/rkt server.

The idea was to rewrite assertions-of-interest (subscriptions) from the simple label-only pattern of novy-syndicate to the equivalent full-dataspace pattern of syndicate/rkt, and to rewrite the responses from the dataspace from the arbitrary-arity responses of syndicate/rkt to the simple unary responses of novy-syndicate.

Here’s the rewrite specification,[1] which ultimately appears embedded as a “caveat” inside the Macaroon-style capabilities that Syndicate uses:[2]

[ <or [
    <rewrite

     ; Step 1:
     <compound <rec Observe 2> {0: <bind label Symbol>, 1: <bind observer Embedded>}>
     <compound <rec Observe 2> {

       ; Step 1(a):
       0: <compound <rec bind 2> {
         0: <lit assertion>
         1: <compound <rec compound 2> {
           0: <compound <rec rec 2> {0: <ref label>, 1: <lit 1>}>,
           1: <compound <dict> {}>
         }>
       }>

       ; Step 1(b):
       1: <attenuate <ref observer> [
         <rewrite
          <compound <arr 1> {0: <bind v <_>>}>
          <ref v>>
       ]>

     }>>

    ; Step 2:
    <rewrite <bind n <_>> <ref n>>

  ]> ]

It reads:

  1. try matching <Observe L C>, where L is a symbol and C an embedded capability; if it does not match, skip the remainder of this step; otherwise, rewrite it into <Observe <bind assertion ⌜P⌝> f(C)>, where

    1. P is a pattern matching records of the form <L _>, and the quotation operator ⌜·⌝ quotes a pattern over assertions into a term conforming to the Pattern schema above; and

    2. f(C) “attenuates” C by attaching rewrites to it. Any assertion sent to C is required to be of the form [V], and is rewritten into just V.

  2. if the rewrite in step 1 didn’t apply then match anything; call it n; and rewrite it to itself.
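The two steps can be mimicked in a few lines of Python. This is only a sketch of the logic, not the Syndicate rewrite engine: records are modelled as plain tuples, an embedded capability as a Python callable, and the quoted pattern as a nested tuple loosely following the Pattern schema above. The function name `rewrite` and all of these representations are assumptions made for illustration.

```python
def rewrite(assertion):
    """Apply the two-step rewrite to one assertion (toy model)."""
    # Step 1: match <Observe L C>, with L a symbol (a str here) and
    # C an embedded capability (a callable here).
    if (
        isinstance(assertion, tuple)
        and len(assertion) == 3
        and assertion[0] == "Observe"
        and isinstance(assertion[1], str)
        and callable(assertion[2])
    ):
        _, label, observer = assertion

        # Step 1(a): a quoted pattern matching unary records <label _>,
        # bound as a whole under the name "assertion".
        pattern = ("bind", "assertion", ("compound", ("rec", label, 1), {}))

        # Step 1(b): attenuate the observer so that [V] is delivered as V.
        def attenuated(reply):
            if isinstance(reply, list) and len(reply) == 1:
                observer(reply[0])
            # Replies of any other shape are dropped in this sketch.

        return ("Observe", pattern, attenuated)

    # Step 2: anything else is rewritten to itself.
    return assertion


received = []
rewritten = rewrite(("Observe", "Present", received.append))
rewritten[2]([("Present", "Tony")])  # unwrapped and delivered
rewritten[2](("Present", "Tony"))    # wrong shape, dropped
print(rewritten[1])
print(received)
```

The client-side callback never learns that the server wrapped its reply in a single-element list; the attenuated capability hides that difference.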

The net effect is that when the simple chat example from novy-syndicate asserts

<Observe Present #!C>

the syndicate-rkt server actually sees

<Observe <bind assertion ⌜<Present _>⌝> #!f(C)>

and when syndicate-rkt replies with an actual concrete presence record, for example[3]

[<Present "Tony">]

the novy-syndicate client will actually receive just

<Present "Tony">

Cool huh?

Now, this works great for Present, which is unary, but not so well for the client’s subscription to Says, which is binary: <Says who what>. So our interoperability is limited here: the client only sees presence information from its peers, and the actual utterances sent get dropped on the floor for lack of an appropriate pattern at the syndicate-rkt dataspace. To fix this, we could include a more complex rewrite specification that treated Present and Says subscriptions separately and explicitly, with the correct arity for each. But I’m done for now, and will focus on getting a proper dataspace implementation into novy-syndicate instead.

  1. We need a DSL for these rewrite specifications! I’m working on it. It’ll probably look like the existing Syndicate DSL syntax for patterns. 

  2. Here’s the whole capability, including an “oid” identifying the service to be accessed, the sequence of “caveats” rewriting and attenuating information flowing through the capability, and the signature proving the capability’s validity:

    
    <ref "syndicate" [[<or [
      <rewrite <compound <rec Observe 2> {
        0: <bind label Symbol>,
        1: <bind observer Embedded>
      }> <compound <rec Observe 2> {
        0: <compound <rec bind 2> {
          0: <lit assertion>,
          1: <compound <rec compound 2> {
            0: <compound <rec rec 2> {
              0: <ref label>,
              1: <lit 1>
            }>,
            1: <compound <dict> {}>
          }>
        }>,
        1: <attenuate <ref observer> [<rewrite <compound <arr 1> {0: <bind v <_>>}> <ref v>>]>
      }>>,
      <rewrite <bind n <_>> <ref n>>
    ]>]] #[1oCXyvdXylgpWRhgg0w+iw==]>
    

  3. The single-element list is there because the rewritten pattern included a single binding named assertion, so there’s a single value in the list of potentially-many values sent back to the subscriber. The simplified novy-syndicate patterns included exactly one implicit whole-assertion binding, and so the list wrapper is also implicit in the novy-syndicate variation, which is why it has to be explicitly removed to get interoperability here. 

Major progress on capability-based syndicate-rkt implementation

I’ve been working on the novy branch of syndicate-rkt (Update: this is now the main branch), following the new design I developed for the novy-syndicate TypeScript prototype, driving the design further and working out new syntax ideas.

Syndicate/rkt example

Here’s an example program, box-and-client.rkt, in the new Syndicate/rkt language:

#lang syndicate

(message-struct set-box (new-value))
(assertion-struct box-state (value))

(module+ main
  (actor-system/dataspace (ds)
    (spawn #:name 'box
           (define-field current-value 0)
           (at ds
             (assert (box-state (current-value)))
             (on (message (set-box $new-value))
               (log-info "box: taking on new-value ~v" new-value)
               (current-value new-value)))
           (stop-on-true (= (current-value) 10)
             (log-info "box: terminating")))

    (spawn #:name 'client
           (at ds
             (stop-on (retracted (Observe (:pattern (set-box ,_)) _))
               (log-info "client: box has gone"))
             (on (asserted (box-state $v))
               (log-info "client: learned that box's value is now ~v" v)
               (send! ds (set-box (+ v 1))))
             (on (retracted (box-state _))
               (log-info "client: box state disappeared"))))))

The program consists of two actors, 'box and 'client. The box actor publishes the value of its current-value field, wrapped in a box-state record constructor, to the dataspace (line 11). It reacts to set-box messages sent by peers (lines 12–14); in this case, the client actor, which sends set-box to increment the value each time it learns of an updated value from the box (lines 22–24).

The box actor terminates once current-value reaches 10. The client notices the termination of the box actor in two ways (just to show them off): first, by noticing that the box-state record was unpublished from the dataspace (lines 25–26); and second, by noticing that all subscribers to set-box messages have vanished (lines 20–21).

What’s different? What’s new?

Explicit object references

The most notable change from previous dataspace programs is the explicit reference to the dataspace, ds. Assertions and subscriptions are now located at a specific (possibly remote) object, usually but not always a dataspace.

Capability-based security

Related is support for macaroon-style “sturdy references” (analogous to the SturdyRef concept from E). Here’s an example from a secure* chat demo app:

<ref "syndicate" [[<or [
  <rewrite <bind p <compound <rec Present 1> {0: <lit "tonyg">}>> <ref p>>,
  <rewrite <bind p <compound <rec Says 2> {
    0: <lit "tonyg">,
    1: String
  }>> <ref p>>
]>]] #[oHFy7B4NPVqhD6zJmNPbhg==]>

The oid ("syndicate" on line 1) identifies the target object. The patterns (lines 2–6) attenuate the authority of the capability to only permit transmission of Present and Says records. The signature (line 7) proves to the target object that the capability is genuine and untampered-with.

I’ve implemented most of the necessary plumbing for these, but have yet to complete the client/server portion of the system that actually makes use of them. For an example of their use, see novy-syndicate.
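For readers unfamiliar with macaroons, the following Python sketch shows the general idea of how a signature can cover both the oid and a chain of caveats. It is the generic macaroon-style HMAC chain, not necessarily the exact algorithm, hash function, or encoding that Syndicate uses; the function names, the secret key, and the caveat string are all made up for illustration.

```python
import hashlib
import hmac


def mint(secret_key: bytes, oid: str) -> bytes:
    """Signature of an unattenuated capability: HMAC of the oid under the secret."""
    return hmac.new(secret_key, oid.encode("utf-8"), hashlib.sha256).digest()


def attenuate(signature: bytes, caveat: str) -> bytes:
    """Adding a caveat re-keys the HMAC with the previous signature."""
    return hmac.new(signature, caveat.encode("utf-8"), hashlib.sha256).digest()


def verify(secret_key: bytes, oid: str, caveats: list, signature: bytes) -> bool:
    """Only the holder of the secret can recompute the chain from scratch."""
    expected = mint(secret_key, oid)
    for caveat in caveats:
        expected = attenuate(expected, caveat)
    return hmac.compare_digest(expected, signature)


key = b"hypothetical server-side secret"
caveat = 'permit only Present and Says records from "tonyg"'

root_signature = mint(key, "syndicate")
limited_signature = attenuate(root_signature, caveat)

print(verify(key, "syndicate", [caveat], limited_signature))  # genuine
print(verify(key, "syndicate", [], limited_signature))        # caveat stripped
```

The useful property is that anyone holding a capability can add caveats without knowing the secret, but nobody can remove one without invalidating the signature.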

Schema support

Another interesting change is support for (the relatively new) Preserves Schema. You can use assertion-struct and message-struct as in previous dialects, or you can use Schema-defined types to establish subscriptions and place assertions with a peer.

Full pattern-matching dataspace implementation

Unlike the novy-syndicate prototype, this implementation is the first capability-based design to have a proper “skeleton”-based dataspace that supports the full range of dataspace patterns. This allows us to write, for example, subscriptions like

(on (retracted (box-state _)) ...)

which only fires when all box-state assertions are withdrawn.
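One way to picture this behaviour is as a counter per subscription. The Python sketch below is a deliberately simplified model and not the skeleton index itself: it assumes that a pattern with no captures, such as `(box-state _)`, treats every matching assertion as the same match, so the subscriber hears about the first one arriving and the last one leaving. The class name `DiscardSubscription` and the tuple representation of records are hypothetical.

```python
class DiscardSubscription:
    """Tracks a capture-free pattern such as (label _) in a toy dataspace."""

    def __init__(self, label):
        self.label = label
        self.matches = 0   # how many matching assertions are currently present
        self.events = []   # what the subscriber has been told so far

    def on_assert(self, record):
        if record[0] == self.label:
            self.matches += 1
            if self.matches == 1:
                self.events.append("asserted")

    def on_retract(self, record):
        if record[0] == self.label:
            self.matches -= 1
            if self.matches == 0:
                self.events.append("retracted")


sub = DiscardSubscription("box-state")
sub.on_assert(("box-state", 1))
sub.on_assert(("box-state", 2))
sub.on_retract(("box-state", 1))
events_before_last = list(sub.events)  # one assertion still present
sub.on_retract(("box-state", 2))
print(events_before_last, sub.events)
```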

Patterns over hash-tables

Previous implementations could only match fields in records (with constant labels) and elements of arrays/lists. This new implementation is also able to express and match patterns over named-key elements in Dictionary Values.

This lets actors express patterns over JSON-like Preserves documents, for example.
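As a rough illustration of what matching over named keys looks like, here is a small Python matcher over nested dictionaries. It is a sketch under assumptions, not the Syndicate/rkt pattern language: `CAPTURE` and `DISCARD` are hypothetical sentinels, Python dicts stand in for Dictionary Values, and the sketch assumes that keys not mentioned in the pattern are simply ignored.

```python
CAPTURE = object()  # capture whatever value sits at this position
DISCARD = object()  # accept any value at this position, capturing nothing


def match(pattern, value, captures=None):
    """Return the list of captured values, or None if the value does not match."""
    if captures is None:
        captures = []
    if pattern is DISCARD:
        return captures
    if pattern is CAPTURE:
        captures.append(value)
        return captures
    if isinstance(pattern, dict):
        if not isinstance(value, dict):
            return None
        # Only the keys named in the pattern are examined.
        for key, subpattern in pattern.items():
            if key not in value:
                return None
            if match(subpattern, value[key], captures) is None:
                return None
        return captures
    # Anything else is treated as a literal.
    return captures if pattern == value else None


doc = {"user": {"name": "tonyg", "id": 7}, "status": "online"}

online_names = match({"user": {"name": CAPTURE}, "status": "online"}, doc)
away_names = match({"user": {"name": CAPTURE}, "status": "away"}, doc)
print(online_names, away_names)
```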

Pattern quasiquotation

One of the issues I hoped the new architecture would shed light on is pattern quotation. In order to express interest in interest expressed by some other party, you need to be able to describe the subscriptions that are of interest to you. That means you must be able to write patterns over patterns.

Previous implementations didn’t get this right. It was not possible to precisely express interest in subscriptions that bound (or did not bind) certain portions of their input; and it was not possible to precisely express the difference between being interested in a binding and binding a portion of the pattern to be matched itself.

The new design solves these issues with a quasiquote-like facility. Here’s a pattern that matches “subscriptions to unary set-box records”:

(Observe (:pattern (set-box ,_)) _)

The :pattern wrapper introduces a quoted pattern, and unquote-discard (“,_”) pops back out a level to say that we don’t care what the subscriber has put in their pattern at that position. For example, they may have elected to bind the value inside the set-box, or they may have elected to ignore it, or they may have elected to match only certain values of it, and so on. By discarding that portion of the pattern, we ignore the specific choice the matching subscriber made.

If instead we use unquote-bind (“,$id”), we extract a portion of the pattern each subscriber placed in the dataspace:

(Observe (:pattern (set-box ,$value-pat)) _)

For example, if some subscriber is binding the value in the set-box to an identifier new-value, but otherwise placing no constraints on it, we will be given the following value for value-pat:

<bind new-value <_>>

If, on the other hand, a subscriber is completely ignoring the value in the set-box, caring only about the set-box wrapper itself, we will be given <_>, the “discard” pattern.

Crucially, we are now able to distinguish between binding-a-portion-of-the-matched-pattern and matching-a-portion-that-is-a-binding. We’ve seen the former already with unquote-bind; the latter is accomplished by using unquote in the structured syntax for a binding:

(Observe (:pattern (set-box ($ ,$their-id ,$further-constraint))) _)

Here we unquote twice. The ($ ...) constructor itself specifies that we require matching subscriptions to have a binding at this position. The first unquote extracts the name in the binding, and the second extracts the subpattern for the binding. For the example above, we would end up with their-id bound to the symbol new-value and further-constraint bound to the subpattern <_>.
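The distinction between the two forms is easier to see once patterns are treated as plain data. The Python sketch below models a subscriber's pattern as nested tuples and then asks the two different questions about it. The tuple encoding and the helper names `bind`, `value_pattern`, and `binding_parts` are assumptions for illustration; they are not the Syndicate/rkt representation.

```python
DISCARD = ("_",)


def bind(name, pattern=DISCARD):
    """A binding pattern as data: <bind name pattern>."""
    return ("bind", name, pattern)


def value_pattern(subscription):
    """Like (set-box ,$value-pat): hand back whatever pattern sits in the field."""
    label, field = subscription
    return field if label == "set-box" else None


def binding_parts(subscription):
    """Like (set-box ($ ,$their-id ,$further-constraint)): match only bindings."""
    field = value_pattern(subscription)
    if field is not None and field[0] == "bind":
        _, name, inner = field
        return name, inner
    return None


binds_value = ("set-box", bind("new-value"))  # subscriber binds the value
ignores_value = ("set-box", DISCARD)          # subscriber ignores the value

print(value_pattern(binds_value))
print(value_pattern(ignores_value))
print(binding_parts(binds_value))
print(binding_parts(ignores_value))
```

The first helper succeeds for both subscribers and returns their differing subpatterns; the second succeeds only for the subscriber whose pattern has a binding at that position.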

Finally, let’s examine a couple of alternatives that don’t work. This one is missing the :pattern wrapper, meaning that instead of asking about patterns over set-box records, it is asking about observers that (mistakenly?) specified an actual set-box record instead of a pattern!

(Observe (set-box ,_) _)

The compiler won’t actually let you use this version, because the unquote-discard is out of place. There’s no quasiquotation to escape from, so this is a syntax error.

We might try repairing this by simply removing the unquote:

(Observe (set-box _) _)

But this is still asking the wrong question, and will never receive any interesting matches from other subscribers in the system.

How does it perform?

Very well! At present it is roughly twice as fast as the previous Racket implementation. Running a benchmark based on the example program above yields the following on one thread of my Ryzen 3960X:

syndicate/actor: #<actor:0:dataspace> booting
syndicate/task: #<engine:0> starting
syndicate/actor: #<actor:3:box> booting
syndicate/actor: #<actor:6:client> booting
Box got 100000 (66222.3817422824 Hz)
Box got 200000 (68740.1257393843 Hz)
Box got 300000 (68724.52087884837 Hz)
Box got 400000 (68670.19562623161 Hz)
Box got 500000 (68786.55545277878 Hz)
syndicate/actor: #<actor:3:box> terminated OK
Client detected box termination
syndicate/actor: #<actor:6:client> terminated OK
syndicate/task: #<engine:0> stopping
cpu time: 7330 real time: 7330 gc time: 68

This means that the program completes about 68,000 full round trips per second of update and signalling between the box and client actors.
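As a quick sanity check on that figure, the overall rate can be recomputed from the last lines of the log. This assumes the 7330 reported as cpu time is in milliseconds and that 500000 is the total number of round trips in the run. The result includes startup and shutdown, so it is only expected to land near the per-interval rates printed above.

```python
round_trips = 500_000        # final "Box got" count in the log
cpu_seconds = 7330 / 1000    # "cpu time: 7330", assumed to be milliseconds

overall_rate = round_trips / cpu_seconds
microseconds_per_round_trip = 1_000_000 / overall_rate

print(f"{overall_rate:.0f} round trips per second")
print(f"{microseconds_per_round_trip:.2f} microseconds per round trip")
```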

Preserves and Preserves Schema

The new implementation depends heavily on Preserves and Preserves Schema, so I’ve ended up doing a fair bit of work on those in order to get things working in Syndicate/rkt. (Among other things, fixing the raco pkg install process for the preserves and syndicate Racket packages!)

First, one nice bit of news is a new Preserves implementation, preserves-nim by Emery Hemingway, for the Nim programming language. I’ve linked the various implementations of Preserves and Syrup on the main Preserves webpage.

There have also been changes to the Schema language and tooling. The main change to the Schema language is a reappraisal of the role of Embedded values in schemas. Previously, they were treated as black boxes - given just enough machinery to parse them out of and serialize them back into a Value, but nothing more. Now, they’re given both a (de)serializer and an “interface type”; the idea is that an Embedded represents a capability to some behavioural object - a closure, an object pointer, an actor reference, a web service, that kind of thing - and so there may be an associated API that can be usefully schematized. This makes schematization of Embedded values something closely related to the higher-order contracts of Dimoulas; see the bit on future work in the spec for some additional thoughts along these lines, as well as a little example.

The main change to the Schema tooling is support for plugins in the Schema compiler, allowing Syndicate/rkt to supply a plugin for generating dataspace patterns from parsed Schema values. The #lang preserves-schema support has been likewise extended so you can supply plugins in the #lang line.

Last (and probably least), here’s a fun little schema example:

version 1 .
JSON =
     / @string string
     / @integer int
     / @double double
     / @boolean JSONBoolean
     / @null =null
     / @array [JSON ...]
     / @object { string: JSON ...:... } .
JSONBoolean = =true / =false .

It recognises the JSON-interoperable subset of Preserves Values!
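To make the shape of the schema concrete, here is a Python sketch of an equivalent recogniser. It does not use the preserves package or a schema compiler: it models values with ordinary Python objects, standing in `None`, `True`, and `False` for the `null`, `true`, and `false` literals in the schema, and the function name `classify` is hypothetical. It returns the name of the matching alternative, or `None` when the value falls outside the JSON-interoperable subset.

```python
def classify(value):
    """Return the name of the JSON alternative that matches, or None."""
    if isinstance(value, str):
        return "string"
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "double"
    if value is None:
        return "null"
    if isinstance(value, list):
        # @array [JSON ...]: every element must itself be JSON.
        return "array" if all(classify(v) is not None for v in value) else None
    if isinstance(value, dict):
        # @object { string: JSON ...:... }: string keys, JSON values.
        ok = all(
            isinstance(k, str) and classify(v) is not None
            for k, v in value.items()
        )
        return "object" if ok else None
    return None


print(classify({"a": [1, 2.5, True, None, "x"]}))
print(classify({1: "non-string key"}))
print(classify(b"byte strings are not JSON"))
```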

Licensing

A note about licensing: I’ve chosen LGPL 3.0+ as the license for Syndicate/rkt. Many thanks to Massimo Zaniboni for pointing out the lack of license, discussing various options with me, and helping sort out the per-file license headers.