Multiaddr

In the #spritely IRC channel, mala mentioned the multiaddr specification.

From the site, “Multiaddr is a format for encoding addresses from various well-established network protocols.” The aim is to avoid “[leaving] much to interpretation and side-band context”, and to allow people to “build applications that will work with network protocols of the future, and do not accidentally ossify the stack.”

Syndicate defines something similar (though rudimentary) in its transportAddress module and in its notion of a Route.

Semantics for Multiaddr

Multiaddr informally defines its semantics, and gives both a human-readable and a machine-readable concrete syntax.

Let’s borrow the preserves data model and propose a better specification for multiaddr semantics:

  • A multiaddr value (a Multiaddr) is a Sequence of protocol addresses. The sequence is implicitly understood to describe a layered protocol stack, with leftward addresses (the “bottom” of the stack) acting as substrate for rightward addresses.

  • A protocol address (an Address) is a record with its label being a Symbol representing a protocol name (a ProtocolName), and its fields being zero or more protocol-specific Values.

  • Each kind of Address must define a canonical form. Use of the canonical form is mandatory. For example, an ip6 address could define a rule that the associated string must always conform to the IETF canonical text representation for IPv6 addresses (RFC 5952).
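
The three rules above can be sketched in Python. This is only an illustration under stated assumptions: the names Address, Multiaddr, and make_ip6 are hypothetical and belong to neither Multiaddr nor preserves, and the standard-library ipaddress module's compressed string form is used as a stand-in for whatever canonical form an ip6 Address would actually mandate.

```python
from dataclasses import dataclass
from ipaddress import IPv6Address
from typing import Any, List, Tuple


@dataclass(frozen=True)
class Address:
    """A protocol address: a record with a protocol-name label and fields."""
    label: str                    # the ProtocolName (a Symbol in preserves terms)
    fields: Tuple[Any, ...] = ()  # zero or more protocol-specific values


# A Multiaddr is a sequence of Addresses, bottom of the stack first.
Multiaddr = List[Address]


def make_ip6(text: str) -> Address:
    """Build an ip6 Address, normalising the string to one compressed form.

    Assumption: ipaddress's compressed form stands in for the canonical form.
    """
    return Address("ip6", (str(IPv6Address(text)),))


# Two spellings of the same address yield the same canonical record.
stack: Multiaddr = [make_ip6("0:0:0:0:0:0:0:1"), Address("tcp", (3217,))]
```

Because the canonical form is applied on construction, two Addresses can be compared for equality structurally, without any protocol-specific knowledge at comparison time.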

For example, the multiaddrs written

/ip4/127.0.0.1/udp/9090/quic
/ip6/::1/tcp/3217
/ip4/127.0.0.1/tcp/80/http/baz.jpg
/dns4/foo.com/tcp/80/http/bar/baz.jpg
/dns6/foo.com/tcp/443/https

could denote the Values

[<ip4 127 0 0 1> <udp 9090> <quic>]
[<ip6 "::1"> <tcp 3217>]
[<ip4 127 0 0 1> <tcp 80> <http "/baz.jpg">]
[<dns4 "foo.com"> <tcp 80> <http "/bar/baz.jpg">]
[<dns6 "foo.com"> <tcp 443> <https "/">]
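
The mapping from the text syntax to these Values can be sketched as a small Python parser. It is a rough illustration, not an implementation of the multiaddr specification: the function names parse_multiaddr and to_preserves_text are hypothetical, only the protocols used in the examples are recognised, the treatment of everything after http or https as a path is an assumption taken from the examples here, and string escaping is ignored.

```python
from typing import List, Tuple, Union

Field = Union[int, str]
Record = Tuple[str, Tuple[Field, ...]]  # (protocol name, fields)


def parse_multiaddr(text: str) -> List[Record]:
    """Split a human-readable multiaddr into (label, fields) pairs."""
    if not text.startswith("/"):
        raise ValueError("multiaddr text must start with '/'")
    parts = text[1:].split("/")
    result: List[Record] = []
    i = 0
    while i < len(parts):
        name = parts[i]
        i += 1
        if name == "ip4":
            # Dotted quad becomes four integer fields.
            result.append((name, tuple(int(p) for p in parts[i].split("."))))
            i += 1
        elif name in ("ip6", "dns4", "dns6"):
            result.append((name, (parts[i],)))
            i += 1
        elif name in ("udp", "tcp"):
            result.append((name, (int(parts[i]),)))
            i += 1
        elif name in ("http", "https"):
            # Assumption: the remainder of the text is the request path.
            result.append((name, ("/" + "/".join(parts[i:]),)))
            i = len(parts)
        else:
            # Zero-argument (e.g. quic) or unknown protocol.
            result.append((name, ()))
    return result


def to_preserves_text(addr: List[Record]) -> str:
    """Render records in preserves text syntax (no string escaping)."""
    def field(v: Field) -> str:
        return str(v) if isinstance(v, int) else '"' + v + '"'
    records = ["<" + " ".join([name, *map(field, fields)]) + ">"
               for name, fields in addr]
    return "[" + " ".join(records) + "]"


print(to_preserves_text(parse_multiaddr("/ip4/127.0.0.1/udp/9090/quic")))
```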

Schema for Multiaddr

Much of this structure can be captured by a preserves schema definition. Given such a schema, the multiaddr text and binary syntaxes become special-purpose syntax for preserves Values conforming to the schema.

version 1 .

Multiaddr = [Address ...] .
Address = <<rec> @protocolName symbol @detail [any ...]> .

WellKnownProtocol =
/ <ip4 @a int @b int @c int @d int>
/ <ip6 @addr string>
/ <dns4 @dnsName string>
/ <dns6 @dnsName string>
/ <udp @port int>
/ <tcp @port int>
/ <quic>
/ <http @path string>
/ <https @path string>
/ @unknown Address
.

Side-conditions such as bounding the integers in an ip4 address to the range [0..255] are not currently expressible in preserves schema.
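
Until such side-conditions can be stated in the schema, they have to be checked in code after a Value has been parsed. A minimal Python sketch follows; the function names check_ip4 and check_port are hypothetical, and the port check (16-bit range for tcp and udp) is an extra example of the same kind of condition rather than something the schema above mentions.

```python
from typing import Sequence


def _is_int(value: object) -> bool:
    """True for real integers; Python's bool is excluded on purpose."""
    return isinstance(value, int) and not isinstance(value, bool)


def check_ip4(fields: Sequence[object]) -> bool:
    """True if fields are exactly four integers, each in the range 0..255."""
    return len(fields) == 4 and all(
        _is_int(f) and 0 <= f <= 255 for f in fields
    )


def check_port(port: object) -> bool:
    """True if port is an integer in the 16-bit range 0..65535."""
    return _is_int(port) and 0 <= port <= 65535


print(check_ip4((127, 0, 0, 1)), check_ip4((256, 0, 0, 1)))
```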

Also, these definitions are too simple, particularly in the case of HTTP(S), where much more of the structure of an HTTP URL (username, password, path, query, fragment, etc.) can and should be parsed out.
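
To give a feel for how much structure is available, here is a small Python sketch using the standard-library urllib.parse.urlsplit. The function name http_detail, the dictionary keys, and the example URL are all hypothetical; how these parts would be laid out as fields of an http or https record is left open.

```python
from urllib.parse import urlsplit


def http_detail(url: str) -> dict:
    """Break an HTTP(S) URL into the parts a richer record could carry."""
    parts = urlsplit(url)
    return {
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


# Made-up URL, for illustration only.
detail = http_detail("https://alice:secret@foo.com:443/bar/baz.jpg?size=large#top")
print(detail)
```

Note that the host and port recovered here overlap with what the dns4/dns6 and tcp layers of a multiaddr already carry, so a real design would have to decide which layer owns them.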