Universal Binary Format

UBF 2.2 User's Guide

UBF is a framework that permits Erlang and the outside world [UBFPAPER] to talk with each other. The acronym "UBF" stands for "Universal Binary Format", designed and implemented by Joe Armstrong.

This document and the corresponding open-source code repositories hosted on GitHub [UBF] are based on Joe Armstrong’s original UBF site [ORIGUBFSITE] and UBF code with an MIT license file added to the distribution. Since then, a large number of enhancements and improvements have been added.

UBF is a language for transporting and describing complex data structures across a network. It has three components:

  • UBF(a) is a "language neutral" data transport format, roughly equivalent to well-formed XML.
  • UBF(b) is a programming language for describing types in UBF(a) and protocols between clients and servers. This layer is typically called the "protocol contract". UBF(b) is roughly equivalent to Verified XML, XML-schemas, SOAP and WSDL.
  • UBF(c) is a meta-level protocol used between a UBF client and a UBF server.

While the XML series of languages had the goal of providing a human-readable format, the UBF languages take the opposite view and provide a "machine friendly" format. UBF is designed to be easy to implement.

Programming By Contract

Central to UBF is the idea of a "Contract" which regulates the set of legal conversations that can take place between a client and a server. The client-side is depicted in "red" and the server-side is depicted in "blue". The client and server communicate with each other via a TCP/IP connection. All data sent by both the client and the server is verified by the "Contract Manager" (an Erlang process on the "server" side of the protocol). Any data that violates the contract is rejected.

The UBF framework itself is designed to be easy to extend for supporting other data transport formats and other network transports. For example, JSON, Thrift, and Erlang native binary serialization data formats over TCP/IP and JSON-RPC over HTTP are supported alternatives to the original UBF(a) implementation.

UBF(a)

UBF(a) is a transport format. It was designed to be easy to parse and easy to write with a text editor. UBF(a) is based on a byte-encoded virtual machine in which 26 byte codes are reserved. Instead of allocating the byte codes from 0, the printable character codes are used to make the format easy to read.

UBF(a) has four primitive types: Integer, String, Binary, and Atom. When a primitive type is recognized, it is pushed onto the "recognition stack" in our decoder. UBF(a) has two types of "glue" for making compound objects: Tuple and List. Lastly, the operator $ (i.e. "end of object") signifies when objects are finished.

For example, the following UBF(a) object:

'person'>p # {p "Joe" 123} & {p 'fred' 3~abc~} & $

Represents the following UBF(b) term, a list that contains two 3-tuples:

[{'person', 'fred', <<"abc">>}, {'person', "Joe", 123}].

Tip In UBF(a), white space and commas are both treated as delimiters.

For this example, the recognition stack for parsing this UBF(a) object would be as follows:

    'person'>p # {p "Joe" 123} & {p 'fred', 3~abc~} & $
             ^ ^ ^^     ^   ^^ ^                  ^ ^ ^
             | | ||     |   || |                  | | |

             1 2 ab     c   d3 4                  5 6 7

Time  Stack

1   'person'

2   []

2a  { ... incomplete
    []

2b  {'person' ... incomplete
    []

2c  {'person', "Joe",  ... incomplete
    []

2d  {'person', "Joe", 123 ... incomplete
    []

3   {'person', "Joe", 123}
    []

4   [{'person', "Joe", 123}]

5   {'person', 'fred', <<"abc">>}
    [{'person', "Joe", 123}]

6   [{'person', 'fred', <<"abc">>}, {'person', "Joe", 123}]

7   [{'person', 'fred', <<"abc">>}, {'person', "Joe", 123}]

See [ABNF-UBFa] for a formal definition of the UBF(a) syntax.

Caution There is no "Float" primitive type in the original and current UBF(a) implementation. After Joe Armstrong’s original implementation, a "Float" type was added to UBF(b) for use in network transports other than UBF(a). In the future, UBF(a) could be enhanced to support a "Float" primitive type.

Integer: [-][0-9]+

Integers are sequences of bytes described by the pattern [-][0-9]+, in which the leading minus sign is optional (it denotes a negative integer) and is followed by a sequence of at least one digit.

String: "…"

Strings are written enclosed in double quotes. Within a string, two quoting conventions are observed: " must be written \" and \ must be written \\. No other quotings are allowed.
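As a small illustration of the two quoting rules, the following Python sketch encodes a piece of text as a UBF(a) string. The helper name encode_ubf_string is hypothetical and is not part of any UBF library.

```python
def encode_ubf_string(text: str) -> str:
    """Encode text as a UBF(a) string (hypothetical helper, not a UBF API)."""
    # Escape backslashes first so that the backslashes introduced for
    # double quotes are not escaped a second time.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


print(encode_ubf_string('say "hi"'))
print(encode_ubf_string("Joe"))
```

The order of the two replacements matters: escaping the double quote first would cause its backslash to be doubled by the second step.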

Binary: [0-9]+ ~…~

Uninterpreted blocks of binary data are encoded as follows. First, an integer representing the length of the binary data is encoded. This is followed by a tilde, the data itself (which must be exactly the length given by the integer), and then a closing tilde. The closing tilde has no significance and is retained for readability. White space can be added between the integer length and the data for readability.
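The length-prefixed layout can be sketched in Python as follows. The helper name encode_ubf_binary is hypothetical. Because a decoder counts bytes rather than searching for the closing tilde, the data may itself contain tilde characters.

```python
def encode_ubf_binary(data: bytes) -> bytes:
    """Encode raw bytes as <length>~<data>~ (hypothetical helper, not a UBF API)."""
    length = str(len(data)).encode("ascii")
    return length + b"~" + data + b"~"


print(encode_ubf_binary(b"abc"))
print(encode_ubf_binary(b"a~b"))
```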

Atom: '…'

Atoms are encoded as strings, only using a single quote instead of a double quote. Atoms are commonly found in symbolic languages like Lisp, Prolog or Erlang. In C, they would be represented by hashed strings. The essential property of an atom is that two atoms can be compared for equality in constant time. These are used for representing symbolic constants.

Tuple: { Obj1 Obj2 … ObjN-1 ObjN }

Tuples represent fixed numbers of objects. The byte codes for "{" and "}" are used to delimit a tuple. Obj1, Obj2, ObjN-1, and ObjN are arbitrary UBF(a) objects.

List: # ObjN & ObjN-1 & … & Obj2 & Obj1 &

Lists represent variable numbers of objects. The first object in the list is Obj1, the second object in the list is Obj2, etc. Objects are presented in reverse order, and each object is followed by the & operator.

Lisp programmers will recognize # as an operator that pushes NIL (or the end of list) onto the recognition stack and & as an operator that takes the top two items on the recognition stack and replaces them by a list cell.
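The two list operators can be sketched in Python with an ordinary list acting as the recognition stack. The function names push_nil and cons are illustrative only, and Python lists stand in for list cells.

```python
def push_nil(stack):
    """'#': push the empty list (NIL) onto the recognition stack."""
    stack.append([])


def cons(stack):
    """'&': replace the top two items (head, then tail) by one list."""
    head = stack.pop()
    tail = stack.pop()
    stack.append([head] + tail)


stack = []
push_nil(stack)          # '#'
stack.append("Obj2")     # the second element is presented first
cons(stack)              # '&'
stack.append("Obj1")     # the first element is presented last
cons(stack)              # '&'
print(stack)             # [['Obj1', 'Obj2']]
```

Because each & prepends the newest object to the list built so far, presenting the objects in reverse order yields a list in the intended order.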

Term

Terms represent primitive types and compound types.

White space: \s \n \r \t , %…%

For convenience, blank, carriage return, line feed, tab, comma, and comments are treated as white space. Comments can be included in UBF(a) with the syntax %…% and the usual quoting convention applies.

Tag: `…`

In addition, any item can be followed by a semantic tag, which is written `…`. Within the tag, the closing quote is quoted as in the string encoding. This tag has no meaning in UBF(a) but might have a meaning in UBF(b). For example:

12456 ~...~ `jpg`

Represents 12,456 bytes of raw data with the semantic tag "jpg". UBF(a) does not know what "jpg" means; the tag is passed on to UBF(b), which might know what it means. Finally, the end application is expected to know what to do with an object of type "jpg"; it might, for example, know that this represents an image. UBF(a) will just encode the tag, UBF(b) will type check the tag, and the application should be able to understand the tag.

Caution Currently, this feature of integrating a "tag" in UBF(a) for the purpose of a "type" in UBF(b) is not implemented. Tags can be specified in UBF(a) but there is no way for the application to act upon this semantic information.

Register: >C C

So far, exactly 26 control characters have been used, namely: %"~'`{}#&\s\n\t\r,-01234567890

This leaves us with 230 unallocated byte codes. These are used as follows:

>C

Where C is not one of the reserved byte codes, > means store the top of the recognition stack in the register C and pop the recognition stack. For caching optimization, subsequent reuse of the single character C means push register C onto the recognition stack.

Object

Objects represent either a Term, a Register push, or a Register pop with an optional Tag. The operator $ signifies "end of object". When $ is encountered there should be only one item on the recognition stack.
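To make the recognition stack concrete, here is a minimal decoder sketch in Python. It is not the UBF implementation: the Atom type, the choice of a Python str as input, the Latin-1 encoding of binaries, and the absence of error handling are all assumptions made for brevity. Tags and comments are parsed and then discarded.

```python
from collections import namedtuple

Atom = namedtuple("Atom", "name")   # distinguishes 'atoms' from "strings"
TUPLE_START = object()              # marker pushed when "{" is seen


def read_quoted(text, pos, quote):
    """Read up to the closing quote; return (unescaped body, next position)."""
    out = []
    while text[pos] != quote:
        if text[pos] == "\\":       # a backslash quotes the next character
            pos += 1
        out.append(text[pos])
        pos += 1
    return "".join(out), pos + 1


def decode(text):
    """Decode one UBF(a) object terminated by "$"."""
    stack, registers, pos = [], {}, 0
    while True:
        ch = text[pos]
        if ch in " \n\r\t,":                      # white space
            pos += 1
        elif ch == "%" or ch == "`":              # comment or tag: skipped
            _, pos = read_quoted(text, pos + 1, ch)
        elif ch == "-" or ch.isdigit():           # integer or binary length
            end = pos + 1
            while text[end].isdigit():
                end += 1
            stack.append(int(text[pos:end]))
            pos = end
        elif ch == "~":                           # binary: length is on the stack
            length = stack.pop()
            data = text[pos + 1:pos + 1 + length]
            stack.append(data.encode("latin-1"))
            pos += length + 2                     # skip data and closing tilde
        elif ch == '"':
            value, pos = read_quoted(text, pos + 1, '"')
            stack.append(value)
        elif ch == "'":
            value, pos = read_quoted(text, pos + 1, "'")
            stack.append(Atom(value))
        elif ch == "{":
            stack.append(TUPLE_START)
            pos += 1
        elif ch == "}":
            items = []
            while stack[-1] is not TUPLE_START:
                items.append(stack.pop())
            stack.pop()                           # discard the marker
            stack.append(tuple(reversed(items)))
            pos += 1
        elif ch == "#":                           # push NIL
            stack.append([])
            pos += 1
        elif ch == "&":                           # build a list cell
            head, tail = stack.pop(), stack.pop()
            stack.append([head] + tail)
            pos += 1
        elif ch == ">":                           # >C: pop into register C
            registers[text[pos + 1]] = stack.pop()
            pos += 2
        elif ch == "$":                           # end of object
            assert len(stack) == 1, "exactly one item must remain"
            return stack[0]
        else:                                     # C: push register C
            stack.append(registers[ch])
            pos += 1


print(decode("'person'>p # {p \"Joe\" 123} & {p 'fred' 3~abc~} & $"))
```

The final line decodes the example object from the beginning of this section and yields a list of two 3-tuples in the same order as shown there.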

UBF(b)

UBF(b) is a language independent type system and protocol description language. The protocol description language allows one to specify client server interaction in terms of a non-deterministic finite state machine. The type system allows one to specify the asynchronous events and synchronous request/response pairs that define transitions of this finite state machine.

The type system and protocol description language together define the basis of "Contracts" between clients and servers. All data sent by both the client and the server is verified by the "Contract Manager" (an Erlang process on the "server" side of the protocol). Any data that violates the contract is rejected.

A UBF(b) contract is defined by 2 mandatory sections and 3 optional sections. The mandatory sections are the "+NAME" and the "+VSN" (version) of the contract. The optional sections are the "+TYPES", the "+STATE", and the "+ANYSTATE" of the contract.

For example, the following UBF(b) contract having the filename "irc_plugin.con" defines a simple IRC (Internet Relay Chat) protocol between clients and a server:

+NAME("irc").

+VSN("ubf2.0").

+TYPES
info()            :: info;
description()     :: description;
contract()        :: contract;

ok()              :: ok;
bool()            :: true | false;
nick()            :: ubfstring();
oldnick()         :: nick();
newnick()         :: nick();
group()           :: ubfstring();
groups()          :: [group()];

logon()           :: logon;
proceed()         :: {ok, nick()};
listGroups()      :: groups;
joinGroup()       :: {join, group()};
leaveGroup()      :: {leave, group()};
changeNick()      :: {nick, nick()};
msg()             :: {msg, group(), ubfstring()};

msgEvent()        :: {msg, nick(), group(), ubfstring()};
joinEvent()       :: {joins, nick(), group()};
leaveEvent()      :: {leaves, nick(), group()};
changeNameEvent() :: {changesName, oldnick(), newnick(), group()}.

+STATE start
   logon()       => proceed() & active. %% Nick randomly assigned

+STATE active
   listGroups()  => groups() & active;
   joinGroup()   => ok() & active;
   leaveGroup()  => ok() & active;
   changeNick()  => bool() & active;
   msg()         => bool() & active;    %% False if you have not joined a group

   EVENT         => msgEvent();         %% Group sends me a message
   EVENT         => joinEvent();        %% Nick joins group
   EVENT         => leaveEvent();       %% Nick leaves group
   EVENT         => changeNameEvent().  %% Nick changes name

+ANYSTATE
   info()        => ubfstring();
   description() => ubfstring();
   contract()    => term().

See [ABNF-UBFb] for a formal definition of the UBF(b) syntax.

Note The astute reader (and otherwise :) ) may notice that UBF(a) and UBF(b) are Erlang-centric. By design, the two languages are supposed to be language neutral and yet by design the two are highly influenced by Erlang. For example, the difference between a string type and a binary type is directly due to Erlang’s implementation of binaries and strings. Similarly, the reason for supporting a record type and extended record type is also directly due to Erlang’s implementation of records.

Name: +NAME("…").

The name of the contract is specified as a double-quoted string.

Version: +VSN("…").

The version of the contract is specified as a double-quoted string.

Types: +TYPES.

The UBF(b) type system has user-defined types, builtin types, and predefined types. All types are either primitive types or complex types.

The primitive types are Integer, Range, Float, Binary, String, Atom, and Reference. The complex types are Alternative, Tuple, Record, Extended Record, and List. Builtin and User-defined "complex types" are defined recursively.

Definition: X() :: T

New types are defined by the notation:

X() :: T;

and the last type definition in the section must end with a period instead of a semicolon:

X() :: T.

The name of the type is X and the type’s definition T is either a user-defined type or a predefined type.

Integer: [-][0-9]+ or [0-9]+#[0-9a-f]+

Positive and negative integer constants are expressed as in UBF(a). Integer constants may also be expressed in other bases using Erlang syntax.

Range: [-][0-9]+..[-][0-9]+ or [-][0-9]+.. or ..[-][0-9]+

Bounded, left unbounded, and right unbounded integer ranges are supported.
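A range check can be sketched in Python as follows. The function name in_range is hypothetical, and the bounds are assumed to be inclusive, which is consistent with the builtin definition of byte() as 0..255.

```python
def in_range(value, low=None, high=None):
    """Return True if the integer value lies within low..high.

    low=None models the form "..high" and high=None models "low..".
    Both bounds are assumed to be inclusive.
    """
    if not isinstance(value, int) or isinstance(value, bool):
        return False
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True


print(in_range(255, 0, 255))    # bounded range 0..255
print(in_range(-5, high=-1))    # range ..-1
print(in_range(0, low=1))       # range 1..
```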

Float: [-][0-9]+.[0-9]+

Positive and negative float constants are supported for network transports other than UBF(a).

Note In future, the implementation of UBF(b) could be enhanced to specify a float more compactly using scientific notation (e.g. "6.02e23").

Binary: <<"…">>

Binary constants are expressed like strings in UBF(a) but are enclosed in two leading "less than" brackets and two trailing "greater than" brackets.

String: "…"

String constants are expressed as in UBF(a).

Atom: '…' or [a-z][a-zA-Z0-9_]*

Atom constants are expressed as UBF(a) atoms. Atom constants starting with lowercase letters do not require single quotes.

Reference: R()

Defined types are referenced by the notation:

R()

The name of the type is R.

Alternative: T1 | T2

A type X is of type "T1 | T2" if X is of type T1 or if X is of type T2.

Tuple: {T1, T2, …, Tn}

A type {X1, X2, …, Xn} is of type "{T1, T2, …, Tn}" if X1 is of type T1, X2 is of type T2, … and Xn is of type Tn.

Record: name#{x::T1, y::T2, …, z::Tn} or name#{x=D1::T1, y=D2::T2, …, z=Dn::Tn}

A record type is syntactic sugar for a tuple of type "{name, T1, T2, …, Tn}" where name, x, y, …, and z are atoms.

Optionally, the default value of a record field can be specified.

Caution Caution is required when using another record as the default value of a record field. This feature has not been fully tested.

Extended Record: name##{x::T1, y::T2, …, z::Tn} or name##{x=D1::T1, y=D2::T2, …, z=Dn::Tn}

An extended record type is syntactic sugar for a tuple of type "{name, T1, T2, …, Tn, $fields::[x,y,…,z], $extra::Extra}" where name, x, y, …, and z are atoms and Extra is any valid term.

Optionally, the default value of a record field can be specified.

Caution Caution is required when using another record as the default value of a record field. This feature has not been fully tested.

List: [T] or [T]{} or [T]{[0-9],} or [T]{,} or [T]{[0-9],[0-9]+}

A type [X1, X2, …, Xn] is of type [T] if all of Xi are of type T.

Unbounded, bounded, left unbounded, and right unbounded lists are supported.
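The matching rules for the Alternative, Tuple, and List types are recursive, which the following Python sketch tries to show. The tagged-tuple representation of types and the function name matches are assumptions made for this example, atoms are modelled as plain Python strings, and list bounds are not handled.

```python
def matches(value, type_):
    """Return True if value is of the given type (a tagged Python tuple)."""
    kind = type_[0]
    if kind == "const":                   # a constant such as the atom ok
        return value == type_[1]
    if kind == "integer":
        return isinstance(value, int) and not isinstance(value, bool)
    if kind == "alt":                     # T1 | T2
        return matches(value, type_[1]) or matches(value, type_[2])
    if kind == "tuple":                   # {T1, T2, ..., Tn}
        elems = type_[1]
        return (isinstance(value, tuple)
                and len(value) == len(elems)
                and all(matches(v, t) for v, t in zip(value, elems)))
    if kind == "list":                    # [T]
        return (isinstance(value, list)
                and all(matches(v, type_[1]) for v in value))
    raise ValueError("unknown type: %r" % (kind,))


BOOL = ("alt", ("const", "true"), ("const", "false"))    # true | false
PAIR = ("tuple", [("const", "ok"), ("integer",)])        # {ok, integer()}

print(matches("true", BOOL))
print(matches(("ok", 3), PAIR))
print(matches([1, 2, "x"], ("list", ("integer",))))
```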

Builtin: B()

Builtin types are referenced by the notation:

B()

The name of the builtin type is B and builtin types are of 2 categories: Erlang and UBF. The names used for builtin types are reserved and cannot be used for user-defined types.

Erlang builtin types are a subset of the types defined by EEP8 [EEP8]. EEP8 types are used for authoring Erlang types and specs definitions.

Type               Definition
nil()              []
term()             any()
boolean()          false | true
byte()             0..255
char()             0..16#10ffff
non_neg_integer()  0..
pos_integer()      1..
neg_integer()      ..-1
number()           integer() | float()
string()           [char()]
nonempty_string()  [char()]+
module()           atom()
mfa()              {atom(), atom(), byte()}
node()             atom()
timeout()          infinity | non_neg_integer()
no_return()        none()

UBF builtin types are types particular to UBF that can be helpful for encoding and decoding proplists and strings.

Type           Definition
ubfproplist()  {#P, [{term(), term()}]}
ubfstring()    {#S, [byte()]}

Predefined: P() or P(A1, A2, …, An)

Predefined types are referenced by the notation:

P()

or by the notation:

P(A1, A2, ..., An)

The name of the predefined type is P. The names used for predefined types are reserved and cannot be used for user-defined types.

Using the second notation, attributes can be specified to make the predefined type less general and thus more specific when matching objects.

Type     ascii  asciiprintable  nonempty  nonundefined
any      X      X               O         O
none     X      X               X         X
integer  X      X               X         X
float    X      X               X         X
binary   O      O               O         X
atom     O      O               O         O
tuple    X      X               O         X
list     X      X               O         X

The above table summarizes the set of supported predefined types and their respective optional attributes. An "O" marks an attribute that the type supports and an "X" marks one that it does not.

The "any" predefined type matches any object.

The "none" predefined type is a placeholder to describe the return value of a function call that does not return to the caller.

The "integer", "float", "binary", "atom", "boolean", "tuple", and "list" predefined types match directly to the corresponding primitive or complex type.

The "ascii" attribute permits matches with binaries and atoms containing only ASCII values [RFC20]. Similarly, the "asciiprintable" attribute permits matches with only printable ASCII values.

The "nonempty" attribute permits matches with binaries, atoms, tuples, lists, and any that are of length greater than zero. The following objects would not be matched with the "nonempty" attribute:

<<"">>
''
{}
[]

The "nonundefined" attribute permits matches with atoms and terms that are not equal to the undefined atom.

Note By convention, the undefined atom is commonly used to indicate a default value or an undefined value in Erlang programs. The purpose of undefined is similar to NULL in C, to None in Python, etc.
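The four attributes can be pictured as simple predicates. The Python sketch below is illustrative only: the function names are hypothetical, binaries are modelled as bytes, atoms as strings, and the printable range 32..126 is an assumption about what "printable ASCII" covers.

```python
def is_ascii(data: bytes) -> bool:
    """ascii: every byte is a 7-bit ASCII value."""
    return all(b < 128 for b in data)


def is_ascii_printable(data: bytes) -> bool:
    """asciiprintable: every byte is in 32..126 (assumed printable range)."""
    return all(32 <= b <= 126 for b in data)


def is_nonempty(value) -> bool:
    """nonempty: binary, atom name, tuple, or list of length greater than zero."""
    return len(value) > 0


def is_nonundefined(value) -> bool:
    """nonundefined: anything except the atom undefined (a str here)."""
    return value != "undefined"


print(is_ascii(b"abc"), is_ascii_printable(b"a\n"))
print(is_nonempty(()), is_nonundefined("undefined"))
```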

State: +STATE.

The "+STATE" sections of UBF(b) define a finite state machine (FSM) to model the interaction between the client and server. Symbolic names expressed as "atoms" are the states of the FSM.

Transitions expressed as request, response, and next state triplets are the edges of the FSM. Transitions are "synchronous" calls from the client to the server. Any request sent by the client that cannot match at least one valid transition is ignored and a "client broke contract" error response is returned to the client. Likewise, any response returned by the server that cannot match at least one valid transition is ignored and a "server broke contract" error response is returned to the client.

The states of the FSM may also be annotated with events expressed as "asynchronous" casts. Events are asynchronous casts either from the client to the server or from the server to the client. See the next section for additional details.

Note The terminology of "call" and "cast" to distinguish between synchronous and asynchronous interaction is borrowed from Erlang.
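The checking of synchronous transitions described above can be sketched in Python. This is a simplified model, not the contract manager: the table layout, the predicate functions, and the helper check_call are hypothetical, only two transitions of the IRC example are modelled, and the state is assumed to stay unchanged when a contract is broken. In a real server the response is produced by the plugin after the request has been accepted; here both are passed in for brevity.

```python
def is_logon(x):
    return x == "logon"


def is_proceed(x):
    return isinstance(x, tuple) and len(x) == 2 and x[0] == "ok"


def is_list_groups(x):
    return x == "groups"


def is_groups(x):
    return isinstance(x, list)


# state -> list of
# (request name, request check, response name, response check, next state)
TRANSITIONS = {
    "start": [("logon()", is_logon, "proceed()", is_proceed, "active")],
    "active": [("listGroups()", is_list_groups, "groups()", is_groups, "active")],
}


def check_call(state, request, response):
    """Return (reply, next state) for one synchronous call."""
    edges = [e for e in TRANSITIONS[state] if e[1](request)]
    if not edges:
        expects_in = [e[0] for e in TRANSITIONS[state]]
        return ("clientBrokeContract", request, expects_in), state
    for _, _, _, response_ok, next_state in edges:
        if response_ok(response):
            return response, next_state
    expects_out = [e[2] for e in edges]
    return ("serverBrokeContract", response, expects_out), state


print(check_call("start", "logon", ("ok", "joe")))
print(check_call("start", "groups", []))
```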

Anystate: +ANYSTATE.

The "+ANYSTATE" section of UBF(b) is used to define request and response pairs and to define events that are valid in all states of the FSM.

Events are checked based on direction first, on the current state’s valid events next, and finally on the valid anystate events. Any cast sent by the client or sent by the server that cannot match at least one valid event is ignored and dropped.
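The order in which events are checked can be sketched as follows. The table layout and the names accept_event, STATE_EVENTS, and ANYSTATE_EVENTS are hypothetical, and the "ping" event is invented purely for illustration.

```python
def accept_event(direction, state, event, state_events, anystate_events):
    """Return True if the event is valid; False means it is dropped.

    The direction selects the candidate checks, the current state's
    events are tried first, and the anystate events are tried last.
    """
    for table in (state_events.get(state, {}), anystate_events):
        if any(check(event) for check in table.get(direction, [])):
            return True
    return False


def is_msg_event(e):
    return isinstance(e, tuple) and len(e) == 4 and e[0] == "msg"


def is_ping(e):
    return e == "ping"


STATE_EVENTS = {"active": {"out": [is_msg_event]}}   # server to client
ANYSTATE_EVENTS = {"in": [is_ping]}                  # client to server

msg = ("msg", "joe", "erlang", "hello")
print(accept_event("out", "active", msg, STATE_EVENTS, ANYSTATE_EVENTS))
print(accept_event("in", "active", msg, STATE_EVENTS, ANYSTATE_EVENTS))
```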

UBF(c)

UBF(c) is a meta-level protocol used between a UBF client and a UBF server. UBF(c) has two primitives: synchronous "calls" and asynchronous "casts".

Calls: Request $ ⇒ {Response, NextState} $

Synchronous calls have the following form for the request:

Request $

and for the response:

{Response, NextState} $

where "Request" is a UBF(a) type sent by the client, and "Response" is a UBF(a) type and "NextState" is a UBF(a) atom, both sent by the server.

If the client sends an invalid request, the server will respond with the following "client broke contract" error:

{{'clientBrokeContract', Request, ExpectsIn}, State} $

where "ExpectsIn" is a UBF(a) type to describe the acceptable list of input types and "State" is a UBF(a) atom.

If the server sends an invalid response, the following "server broke contract" error is returned to the client instead:

{{'serverBrokeContract', Response, ExpectsOut}, State} $

where "ExpectsOut" is a UBF(a) type to describe the acceptable list of output types and "State" is a UBF(a) atom.

Caution By convention, the 3-tuples {'clientBrokeContract', _, _} and {'serverBrokeContract', _, _} are reserved terms for responses. Please be careful when designing your application not to use either of these 3-tuples.

Casts: {'event_in', Event} $ or {'event_out', Event} $

Asynchronous casts from the client to server have the following form:

{'event_in', Event} $

and from the server to the client have the following form:

{'event_out', Event} $

where "Event" is a UBF(a) type.

If the client or the server sends an invalid event, the event is ignored and dropped by the server.

See [ABNF-UBFc] for a formal definition of the UBF(c) syntax.

Caution By convention, the 2-tuples {'event_in', _} and {'event_out', _} are reserved terms for requests and responses respectively. Please be careful when designing your application not to use either of these two tuples. This limitation, introduced unintentionally after the original UBF implementation, may be removed in the future.

"Contracts" and "Plugins" are the basic building blocks of an Erlang UBF server. Contracts are a server’s specifications. Plugins are a server’s implementations.

Contract

A contract is a UBF(b) specification stored in a file. By convention, a contract’s filename has ".con" as the suffix part. Since all sections of a UBF(b) specification are optional except for the "+NAME" and "+VSN" sections, it is possible to have "+TYPES" only contracts, "+STATE" only contracts, "+ANYSTATE" only contracts, or any combination of such contracts.

For example, a "+TYPES" only contract having the filename "irc_types_plugin.con" is as follows:

+NAME("irc_types").

+VSN("ubf2.0").

+TYPES
info()            :: info;
description()     :: description;
contract()        :: contract;

ok()              :: ok;
bool()            :: true | false;
nick()            :: ubfstring();
oldnick()         :: nick();
newnick()         :: nick();
group()           :: ubfstring();
groups()          :: [group()];

logon()           :: logon;
proceed()         :: {ok, nick()};
listGroups()      :: groups;
joinGroup()       :: {join, group()};
leaveGroup()      :: {leave, group()};
changeNick()      :: {nick, nick()};
msg()             :: {msg, group(), ubfstring()};

msgEvent()        :: {msg, nick(), group(), ubfstring()};
joinEvent()       :: {joins, nick(), group()};
leaveEvent()      :: {leaves, nick(), group()};
changeNameEvent() :: {changesName, oldnick(), newnick(), group()}.

For example, a "+STATE" and "+ANYSTATE" contract having the filename "irc_fsm_plugin.con" is as follows:

+NAME("irc").

+VSN("ubf2.0").

+STATE start
   logon()       => proceed() & active. %% Nick randomly assigned

+STATE active
   listGroups()  => groups() & active;
   joinGroup()   => ok() & active;
   leaveGroup()  => ok() & active;
   changeNick()  => bool() & active;
   msg()         => bool() & active;    %% False if you have not joined a group

   EVENT         => msgEvent();         %% Group sends me a message
   EVENT         => joinEvent();        %% Nick joins group
   EVENT         => leaveEvent();       %% Nick leaves group
   EVENT         => changeNameEvent().  %% Nick changes name

+ANYSTATE
   info()        => ubfstring();
   description() => ubfstring();
   contract()    => term().

Plugin

A plugin is just a "normal" Erlang module that follows a few simple rules. For a "+TYPES" only contract, the plugin contains just the name of its contract. Otherwise, the plugin contains the name of its contract plus the necessary Erlang "glue code" needed to bind the UBF server to the server’s application. In either case, a plugin can also import all or a subset of "+TYPES" from other plugins. This simple yet powerful import mechanism permits sharing and re-use of types between plugins and servers.

Note The necessary Erlang "glue code" is presented later in the [Servers] section.

For the full example IRC contract described in a previous section, the plugin having the filename "irc_plugin.erl" is as follows:

-module(irc_plugin).

-compile({parse_transform,contract_parser}).
-add_contract("irc_plugin").

The plugin for the "+TYPES" only contract having the filename "irc_types_plugin.erl" is as follows:

-module(irc_types_plugin).

-compile({parse_transform,contract_parser}).
-add_contract("irc_types_plugin").

Importing Types

The plugin for the "+STATE" and "+ANYSTATE" contract having the filename "irc_fsm_plugin.erl" is as follows:

-module(irc_fsm_plugin).

-compile({parse_transform,contract_parser}).
-add_types(irc_types_plugin).
-add_contract("irc_fsm_plugin").

The "-add_types('there')." directive imports all "+TYPES" from the plugin named 'there' into the containing plugin. An alternative syntax "-add_types({'elsewhere', ['t1', 't2', …, 'tn']})." for this directive imports a subset of "+TYPES" from the plugin named 'elsewhere' into the containing plugin. Multiple import directives of either syntax can be freely declared as long as the "-add_types" directives are listed before the "-add_contract" directive. A plugin can have only one "-add_contract" directive.

By using this Erlang "parse transform", the contract is parsed and the imported types (if any) are processed during the compilation of the plugin’s Erlang module. The normal search path used by Erlang’s compiler to locate modules is used to import types from other plugins.

Compilation Errors

The plugin will fail to compile if the plugin’s contract cannot be found, cannot be parsed properly, or if one of the following errors occurs:

{'duplicated_records', L}
    One or more records having the same name are found.
{'duplicated_states', L}
    One or more states having the same name are found.
{'duplicated_types', L}
    One or more types having the same name are found.
{'duplicated_unmatched_import_types', L}
    One or more imported types having the same name but different definitions are found. Type duplicates are permitted as long as the type(s) are imported and all duplicates have the same definition.
{'missing_states', L}
    One or more states were found to be missing.
{'missing_types', L}
    One or more types were found to be missing.
{'unused_types', L}
    One or more types were found to be unused in the contract. Unused types are permitted as long as the unused type(s) are imported.

where L is an Erlang list.

Miscellaneous

As a by-product of a plugin’s compilation, and if one or more "record" or "extended record" types were declared in a plugin’s contract, an Erlang "header" file containing the plugin’s record definitions is automatically created in an application’s ebin directory. This Erlang "header" file can be included by the plugin module itself or by other Erlang modules used by the server’s application. By convention, this Erlang "header" file has the same base filename as the plugin but with ".hrl" as the suffix part.

Tip There are 2 experimental prototypes for extending UBF’s type and plugin framework. [UBF_ABNF] is a framework for integrating UBF and ABNF specifications. [UBF_EEP8] is a framework for integrating UBF and EEP8 types.

The original "UBF" network transport is UBF(a) over TCP/IP. Since then, a number of new transports not based on UBF(a) and not based on TCP/IP have been added. Nevertheless, these transports are still considered as part of the overall UBF framework. Most importantly, applications can share and re-use the same UBF contracts and plugins regardless of the network transport.

TCP/IP

UBF: Universal Binary Format

The name "UBF" is short for "Universal Binary Format". UBF is commonly used to refer to the network transport based on UBF(a) and to the overall UBF framework.

See [UBFa] for further information.

EBF: Erlang Binary Format

EBF is an implementation of UBF(b) but it does not use UBF(a) for the client and server communication. Instead, Erlang-style conventions are used:

  • Structured terms are serialized via the Erlang BIFs term_to_binary() and binary_to_term().
  • Terms are framed using the gen_tcp {packet, 4} format: a 32-bit unsigned integer (big-endian) specifies packet length.
    +-------------------------+-------------------------------+
    | Packet length (32 bits) | Packet data (variable length) |
    +-------------------------+-------------------------------+
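The framing above can be sketched from a non-Erlang peer's point of view. The Python helpers frame and unframe are hypothetical names. The payload is treated as opaque bytes (in EBF it would be the output of term_to_binary()), and the 4-byte header is read as a big-endian unsigned integer, which is how gen_tcp's {packet, 4} option encodes it.

```python
import struct


def frame(payload: bytes) -> bytes:
    """Prefix payload with its length as a 32-bit big-endian unsigned integer."""
    return struct.pack(">I", len(payload)) + payload


def unframe(buffer: bytes):
    """Split one complete packet off the front of buffer.

    Returns (payload, rest), or (None, buffer) if the packet is incomplete.
    """
    if len(buffer) < 4:
        return None, buffer
    (length,) = struct.unpack(">I", buffer[:4])
    if len(buffer) < 4 + length:
        return None, buffer
    return buffer[4:4 + length], buffer[4 + length:]


packet = frame(b"abc")
print(packet)
print(unframe(packet + b"xy"))
```

Returning the unconsumed remainder lets a caller keep appending bytes received from the socket and call unframe again until a whole packet is available.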

The name "EBF" is short for "Erlang Binary Format".

JSF: JavaScript Format

JSF is an implementation of UBF(b) but it does not use UBF(a) for the client and server communication. Instead, JSON [RFC4627] is used as the wire format. The name "JSF" is short for "JavaScript Format".

There is no generally agreed upon convention for converting Erlang terms to JSON objects. JSF uses the convention set forth by MochiWeb’s JSON library [MOCHIJSON2]. In addition, there are a couple of other conventions layered on top of MochiWeb’s implementation.

  • The UBF(b) contract checker has been modified to make a distinction between an Erlang record and an arbitrary Erlang tuple. An experienced Erlang developer would view such a distinction either with skepticism or with approval.
  • For the skeptics, the contract author has the option of having the UBF(b) contract compiler automatically generate Erlang -record() definitions for appropriate tuples within the contract. Such record definitions are very convenient for developers on the Erlang side of the world, but they introduce more complication to the JavaScript side of the world. For example, JavaScript does not have a concept of an arbitrary atom, as Erlang does. Also, the JavaScript side must make a distinction between {foo, 42} and {bar, 42} when #foo is a record on the Erlang side but #bar is not.

This extra convention creates something slightly messy-looking, if you look at the raw JSON passed back-and-forth. The examples of the Erlang record {foo, 42} and the general tuple {bar, 42} would look like this:

   record (defined in the contract as "foo() :: #foo{attribute1 :: term()};")

      {"$R":"foo", "attribute1":42}

   general tuple

      {"$T":[{"$A":"bar"}, 42]}

However, it requires very little JavaScript code to convert objects with the "$R", "$T", and "$A" notation (for records, tuples, and atoms) into whatever object is most convenient.
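The same conversion is short in other languages as well. The Python sketch below handles only the "$R", "$T", and "$A" notation shown above. The Atom type, the function name from_jsf, and the choice to return a record as a (name, fields) pair are assumptions made for this example and are not part of JSF.

```python
import json
from collections import namedtuple

Atom = namedtuple("Atom", "name")


def from_jsf(value):
    """Recursively convert decoded JSON that uses the $R, $T, and $A notation."""
    if isinstance(value, list):
        return [from_jsf(v) for v in value]
    if isinstance(value, dict):
        if "$A" in value:                         # atom
            return Atom(value["$A"])
        if "$T" in value:                         # general tuple
            return tuple(from_jsf(v) for v in value["$T"])
        if "$R" in value:                         # record: name plus fields
            fields = {k: from_jsf(v) for k, v in value.items() if k != "$R"}
            return (Atom(value["$R"]), fields)
        return {k: from_jsf(v) for k, v in value.items()}
    return value


print(from_jsf(json.loads('{"$R":"foo", "attribute1":42}')))
print(from_jsf(json.loads('{"$T":[{"$A":"bar"}, 42]}')))
```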

See [UBF_JSONRPC] for further information.

Tip Gemini Mobile Technologies, Inc. has implemented and open-sourced a module for classifying the input character set to detect non-UTF8 JSON inputs [JSFCHARSET].

TBF / FTBF / NTBF / FNTBF: Binary Format - Thrift / Framed Thrift / Native Thrift / Framed Native Thrift

TBF and NTBF are implementations of UBF(b) but they do not use UBF(a) for the client and server communication. Instead, Thrift [THRIFT] is used as the wire format. The name "TBF" is short for "Thrift Binary Format". The name "NTBF" is short for "Native Thrift Binary Format". FTBF and FNTBF are framed versions of TBF and NTBF, respectively.

TBF follows the conventions set forth by the Thrift community by re-using Thrift’s binary wire-protocol, with the following exceptions:

  • The names of Thrift messages are hard-coded to the Thrift name "$UBF".
  • The names of Thrift structs are not removed before being written to the network.
  • TBF does not use nor require a Thrift IDL.
  • TBF by convention requires the client to read a "server hello" message at the start of establishing a new TCP/IP connection.

TBF can encode and decode all UBF(b) objects. Synchronous calls are implemented as Thrift T-CALL and T-REPLY message pairs. Asynchronous casts are implemented as Thrift T-ONEWAY messages.

Caution TBF is not compatible with standard Thrift clients and servers.

NTBF follows all of the conventions set forth by the Thrift community by re-using Thrift’s binary wire-protocol. A standard Thrift client can communicate with a UBF "NTBF" server and a UBF "NTBF" client can communicate with a standard Thrift server.

NTBF cannot encode and decode all UBF(b) objects. There is no straightforward convention for converting Erlang terms to Thrift messages. Synchronous calls are implemented as Thrift T-CALL and T-REPLY message pairs or T-CALL and T-EXCEPTION message pairs. Asynchronous casts are implemented as Thrift T-ONEWAY messages.

The NTBF transport is under active development to enhance, improve, and simplify the integration of Thrift into the UBF framework. The impedance mismatch between the two approaches of Thrift and UBF can only be addressed by further development.

Caution Currently, NTBF only implements the encoding and decoding of Thrift’s binary wire-protocol. Unlike standard Thrift clients and servers, an NTBF client and server must "manually" implement the features provided by the Thrift IDL.

See [UBF_THRIFT] for further information.

Miscellaneous

Two new TCP/IP transports, PBF and ABF, are under investigation. The name "PBF" is short for "Google’s Protocol Buffers Format" [PROTOBUF]. The name "ABF" is short for "Avro Binary Format" [AVRO].

HTTP

JSON-RPC

JSON-RPC [JSONRPC] is a lightweight remote procedure call protocol similar to XML-RPC. The UBF framework implementation of JSON-RPC brings together JSF’s encoder/decoder, UBF(b)'s contract checking, and an HTTP transport.

Programming By Contract w/ Multiple Transports

As previously stated, central to UBF is the idea of a "Contract" which regulates the set of legal conversations that can take place between a client and a server. The client-side is depicted in "red" and the server-side is depicted in "blue". The client and server communicate with each other via TCP/IP and/or HTTP.

Equally central is the idea that contracts can be shared and re-used by multiple transports. Any data that violates a contract is rejected regardless of the transport.

See [UBF_JSONRPC] for further information.

Miscellaneous

Several transports that do not require an explicit network socket have been added to the UBF framework. These transports permit an application to call a plugin directly without the need for TCP/IP or HTTP.

ETF: Erlang Term Format

The concept "ETF" was added to the UBF framework. This transport relies on Erlang’s Native Distribution for synchronous calls and asynchronous casts.

The name "ETF" is short for "Erlang Term Format".

LPC: Local Procedure Call

The concept "LPC" was added to the UBF framework. This transport is a "non-transport" that invokes synchronous calls directly to a plugin. Support for asynchronous casts has not been added (or designed) yet.

The name "LPC" is short for "Local Procedure Call".

Note LPC is used to implement the JSON-RPC transport.

The UBF framework provides two types of Erlang servers: "stateless" and "stateful". The stateless server is an extension of Joe Armstrong’s original UBF server implementation. The "stateful" server is Joe Armstrong’s original UBF server implementation.

UBF servers are introspective - which means the servers can describe themselves. The following commands (described in UBF(a) format) are always available:

'help' $
Help information
'info' $
Short information about the current service
'description' $
Long information about the current service
'services' $
A list of available services
'contract' $
Return the service contract
{'startSession', "Name", Args} $
Start a new session for the Name service. Args are the initial arguments for the Name service and are specific to that service.
{'restartService', "Name", Args} $
Restart the Name service. Args are the restart arguments for the Name service and are specific to that service.

The "ubf_server" Erlang module implements most of the commonly-used server-side functions and provides several ways to start a server. Configuration options for both types of servers are the same. However, the plugin callback API is different.

-module(ubf_server).

-type name() :: atom().
-type plugins() :: [module()].
-type ipport() :: pos_integer().
-type options() :: [{atom(), term()}].

-spec start(plugins(), ipport()) -> true.
-spec start(name(), plugins(), ipport()) -> true.
-spec start(name(), plugins(), ipport(), options()) -> true.

-spec start_link(plugins(), ipport()) -> true.
-spec start_link(name(), plugins(), ipport()) -> true.
-spec start_link(name(), plugins(), ipport(), options()) -> true.

The start/{2,3,4} and start_link/{2,3,4} functions start a registered server and a TCP listener on ipport() and register all of the protocol implementation modules in the plugins() list. If name() is undefined, the server is not registered. The supported options() are as follows:

{'idletimer', non_neg_integer() | 'infinity'}
Maximum time (in milliseconds) that a client connection may remain idle before the server will close the connection. Default: 'infinity'
{'maxconn', non_neg_integer()}
Maximum number of simultaneous TCP connections allowed. Default: 10000.
{'proto', {'ubf' | 'ebf' | 'jsf' | 'tbf' | 'ftbf' | atom()}}
Enable the UBF, EBF, JSF, TBF, FTBF, or an alternative protocol wire format. Default: 'ubf'.
{'proto', {'ubf' | 'ebf' | 'jsf' | 'tbf' | 'ftbf' | atom(), [atom() | tuple()]}}
Enable the UBF, EBF, JSF, TBF, FTBF, or an alternative protocol wire format with options. Default: {'ubf', []}. Supported options:
'safe'
Prevents decoding data that may be used to attack the Erlang system. In the event of receiving unsafe data, decoding fails with a badarg error.
{'registeredname', name()}
Set the name to be registered for the TCP listener. If 'undefined', a default name is automatically registered. Default: 'undefined'.
{'statelessrpc', boolean()}
Run the stateless variety of a UBF(b) contract. A stateless contract is an extension of Joe Armstrong’s original UBF server implementation. Default: 'false'.
{'startplugin', module()}
Set the starting plugin, set after a client first connects to the server. If not set, client may select the service using the startSession() API. There is no default setting.
{'serverhello', ubfstring() | 'undefined'}
Meta contract greeting string, sent when a client first connects to the server. If 'undefined', server hello is not sent to the client. Default: "meta_server".
{'simplerpc', boolean()}
Set the simple RPC mode. If 'true', server returns only the rpc reply to client. If 'false', server returns the rpc reply and next state to client. Default: 'false'.
{'verboserpc', boolean()}
Set the verbose RPC mode. If 'true', server calls the plugin handler with the rpc request and matched contract types. If 'false', server calls the plugin handler only with the rpc request. Default: 'false'.
{'tlog_module', module() | {module(), boolean()}}
Set the transaction log callback module and optionally control the built-in calls by 'contract_manager_tlog' to the 'error_logger' module. If the 2-tuple representation is used and the boolean() member is 'false', then calls to 'error_logger' will not be attempted. Default: 'undefined'. See [TLOG] for further information.
{'process_options', list()}
Specify additional options used for spawning server and/or client related erlang processes. Typically used to specify non-default, garbage collection options. Default: [].

The "ubf_server" Erlang module doesn’t provide a "stop" function. To stop the server, instead stop the TCP listener that controls it. See the "proc_socket_server" Erlang module for extra details.

Note The NTBF and FNTBF transport protocols are indirectly enabled by specifying the following options: [{'proto', 'tbf'}, {'serverhello', 'undefined'}, {'simplerpc', 'true'}] for NTBF or [{'proto', 'ftbf'}, {'serverhello', 'undefined'}, {'simplerpc', 'true'}] for FNTBF.
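
As a rough sketch of how the options above fit together, the following hypothetical module builds two option lists and passes them to ubf_server:start_link/4 using the signature listed earlier. The module name, the server names, the port numbers, and the plugin "my_plugin" are placeholders and not part of UBF. The option keys come from the list above, and the plain {'proto', 'tbf'} form follows the note above.

```erlang
-module(ubf_server_sketch).
-export([default_options/0, ntbf_options/0, start_default/0, start_ntbf/0]).

%% Options for a plain UBF(a) listener. Every key is taken from the
%% option list above; the values are illustrative only.
default_options() ->
    [{proto, ubf},
     {idletimer, 60000},      % close connections idle for 60 seconds
     {maxconn, 1000},
     {statelessrpc, true}].   % my_plugin is assumed to be a stateless plugin

%% Options that indirectly enable NTBF, as described in the note above.
ntbf_options() ->
    [{proto, tbf},
     {serverhello, undefined},
     {simplerpc, true}].

%% my_plugin is a hypothetical plugin module; 2000 and 2001 are
%% arbitrary ports. Both calls need the UBF application on the code path.
start_default() ->
    ubf_server:start_link(my_ubf_server, [my_plugin], 2000, default_options()).

start_ntbf() ->
    ubf_server:start_link(my_ntbf_server, [my_plugin], 2001, ntbf_options()).
```

Keeping the option lists in separate functions makes it possible to inspect or reuse them without starting a listener.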

Stateless

The stateless server provides a simplified callback API and implementation in comparison to Joe Armstrong’s original UBF server. The stateless server is helpful to applications that do not require explicit state management by the UBF server.

The "ubf_plugin_stateless.hrl" Erlang header file defines the callback APIs to be implemented by a stateless plugin. The seven callbacks are mandatory for all stateless plugins.

%% common callback API
-spec info() -> ubfstring().
-spec description() -> ubfstring().
-spec handlerStop(Handler::pid(), Reason::term(), StateData::term()) ->
                  NewStateData::term().

%% stateless callback API
-spec moduleStart(Args::term()) -> any().
-spec moduleRestart(Args::term()) -> any().

-spec handlerStart(Args::term()) ->
                  {accept, Reply::term(), StateName::atom(), StateData::term()} |
                  {reject, Reply::term()}.
-spec handlerRpc(Call::term()) -> Reply::term().

The info/0 and description/0 functions provide short and long information about the plugin’s service, respectively.

The moduleStart/1 function is called once after the plugin’s service is started. The moduleRestart/1 function is called once each time the plugin’s service is restarted. In both cases the return value is ignored.

The handlerStart/1 function is called when starting a new session for the plugin’s service. The plugin may accept or reject the start session request. When accepted, the plugin returns the reply for the client, the name of the state to be used for the entire session, and optional data for the state. When rejected, the plugin returns the error for the client.

The handlerStop/3 function is called when stopping a session of the plugin’s service. The plugin may perform some cleanup inside the handlerStop function.

The handlerRpc/1 function is called when processing a synchronous call.

For example, the following "skeleton" implementation of a [BERTRPC] server implemented by UBF illustrates a typical stateless server. The source code for this implementation can be found on GitHub [UBF_BERTRPC].

%%% -*- mode: erlang -*-
%%% @doc Sample BERT-RPC plugin.
%%%
%%%

-module(ubf_bertrpc_plugin).
-behaviour(ubf_plugin_stateless).

%% Required (except keepalive/0) callback API for UBF stateless
%% implementations.
-export([info/0, description/0, keepalive/0]).
-export([moduleStart/1, moduleRestart/1]).
-export([handlerStart/1, handlerStop/3, handlerRpc/1, handlerEvent/1]).

-import(ubf_plugin_handler, [sendEvent/2, install_handler/2]).

-compile({parse_transform,contract_parser}).
-add_contract("src/ubf_bertrpc_plugin").

-include_lib("ubf/include/ubf.hrl").
-include_lib("ubf/include/ubf_plugin_stateless.hrl").

info() ->
    "I am a BERT-RPC server".

description() ->
    "A BERT-RPC server programmed by UBF".

keepalive() ->
    ok.

%% @doc start module
moduleStart(_Args) ->
    unused.

%% @doc restart module
moduleRestart(Args) ->
    moduleStart(Args).

%% @doc start handler
handlerStart(_Args) ->
    ack = install_handler(self(), fun handlerEvent/1),
    {accept,ok,none,unused}.

%% @doc stop handler
handlerStop(_Pid, _Reason, _StateData) ->
    unused.

%% @doc rpc handler
%% @TODO Implement BERT-RPC 1.0 synchronous events
handlerRpc(Event) when Event==info; Event==description ->
    ?S(?MODULE:Event());
handlerRpc(Event) when Event==keepalive ->
    ?MODULE:Event().

%% @doc event handler
%% @TODO: Implement BERT-RPC 1.0 asynchronous events
handlerEvent(Event) ->
    %% Let's fake it and echo the request
    sendEvent(self(), Event),
    fun handlerEvent/1.

The above example also introduces three new concepts:

  • The install_handler/2 and handlerEvent/1 functions illustrate how to receive asynchronous casts sent from the client to the server. The handler Fun should be a function of arity 1. When an asynchronous UBF message is received, the Fun is called with the event as its single argument. Because the Fun is called by the UBF plugin handler process, a Fun that crashes or blocks will crash or block that process. The Fun should also return the same or a new Fun for the next asynchronous event. If the Fun must maintain its own state, then an intermediate anonymous fun must be used to bind the state.
  • The sendEvent/2 function illustrates how to send asynchronous casts from the server to the client.
  • The "?S(X)" macro definition plus other helpers are located in the "ubf.hrl" Erlang header file. For Erlang, the implementation of a UBF ubfstring() is a two-tuple having '#S' as the first element and a list of bytes as the second element. A similar technique is also used for the implementation of a UBF ubfproplist() (i.e. '#P' and "?P(X)").
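
The remark in the first bullet about binding state with an intermediate anonymous fun can be sketched as follows. This is a minimal illustration and not code from the UBF distribution: the module name "event_counter_sketch" and the function new_handler/1 are hypothetical, and the handler only counts and prints the events it sees.

```erlang
-module(event_counter_sketch).
-export([new_handler/1]).

%% Returns a handler fun of arity 1 that closes over Count. Each call
%% prints the event and returns a fresh fun bound to Count + 1, which
%% serves as the "new Fun" for the next asynchronous event.
new_handler(Count) ->
    fun(Event) ->
            io:format("event #~p: ~p~n", [Count + 1, Event]),
            new_handler(Count + 1)
    end.
```

Inside a plugin's handlerStart/1, such a handler could be installed with install_handler(self(), event_counter_sketch:new_handler(0)) in place of the fun handlerEvent/1 used in the example above.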

Stateful

The stateful server is Joe Armstrong’s original UBF server. The stateful server permits a plugin to transition from one state to another and also supports a manager framework for managing application state between multiple clients.

The "ubf_plugin_stateful.hrl" Erlang header file defines the callback APIs to be implemented by a stateful plugin. The eight callbacks are mandatory for all stateful plugins.

%% common callback API
-spec info() -> ubfstring().
-spec description() -> ubfstring().
-spec handlerStop(Handler::pid(), Reason::term(), ManagerData::term()) ->
                  NewManagerData::term().

%% stateful callback API
-spec handlerStart(Args::term(), Manager::pid()) ->
                  {accept, Reply::term(), StateName::atom(), StateData::term()} |
                  {reject, Reply::term()}.
-spec handlerRpc(StateName::atom(), Call::term(), StateData::term(), Manager::pid()) ->
                {Reply::term(), NewStateName::atom(), NewStateData::term()}.

-spec managerStart(Args::term()) ->
                   {ok, ManagerData::term()}.
-spec managerRestart(Args::term(), Manager::pid()) ->
                     ok | {error, Reason::term()}.
-spec managerRpc(Args::term(), ManagerData::term()) ->
                 {ok, NewManagerData::term()} | {error, Reason::term()}.

The info/0 and description/0 functions provide short and long information about the plugin’s service, respectively.

The handlerStart/2 function is called when starting a new session for the plugin’s service. The plugin may accept or reject the start session request. When accepted, the plugin returns the reply for the client, the name of the initial state to be used for the session, and optional data for the state. When rejected, the plugin returns the error for the client.

The handlerStop/3 function is called when stopping a session of the plugin’s service. The plugin may perform some cleanup inside the handlerStop function.

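The handlerRpc/4 function is called when processing a synchronous call. It receives the current state name, the call, the current state data, and the manager’s pid, and it returns the reply together with the next state name and the new state data.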

The managerStart/1 function is called once at the start of the server’s initialization for each plugin’s service.

The managerRestart/2 function is called to restart a plugin’s service. The callback function is expected to forward the request to the manager process and relay the manager’s reply.

The managerRpc/2 function is called when processing a call from a handler. A handler uses the ubf_plugin_handler:ask_manager/2 function to make a synchronous call to the manager.

For an example stateful server plugin, please see the "test/unit/irc_plugin.erl" Erlang module in the [UBF] repository. This plugin is the actual server-side implementation for the IRC protocol application described earlier.
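
To make the shape of the eight callbacks concrete, here is a minimal sketch of a stateful plugin that moves between two states, "idle" and "running". It is not taken from the UBF repository. The module name, the state names, and the calls start, bump, and stop are invented for illustration. The sketch also omits the UBF(b) contract, the contract_parser parse transform, and the header includes that a real plugin needs, so it shows only the return values required by the specs above.

```erlang
-module(stateful_sketch).
-export([info/0, description/0]).
-export([handlerStart/2, handlerStop/3, handlerRpc/4]).
-export([managerStart/1, managerRestart/2, managerRpc/2]).

%% Plain strings, mirroring the BERT-RPC example earlier in this guide.
info() -> "I am a counter".
description() -> "Counts bump calls made while in the running state".

%% Accept every session; start in state idle with a counter of zero.
handlerStart(_Args, _Manager) ->
    {accept, ok, idle, 0}.

%% Nothing to clean up; hand the manager data back unchanged.
handlerStop(_Handler, _Reason, ManagerData) ->
    ManagerData.

%% Each clause returns {Reply, NewStateName, NewStateData}.
handlerRpc(idle, start, Count, _Manager) ->
    {ok, running, Count};
handlerRpc(running, bump, Count, _Manager) ->
    {Count + 1, running, Count + 1};
handlerRpc(running, stop, Count, _Manager) ->
    {Count, idle, 0}.

%% This sketch keeps no manager state. A real managerRestart/2 would
%% forward the request to the manager process and relay its reply.
managerStart(_Args) ->
    {ok, undefined}.

managerRestart(_Args, _Manager) ->
    ok.

managerRpc(_Args, ManagerData) ->
    {ok, ManagerData}.
```

Calls that do not match any clause of handlerRpc/4 would crash this sketch. In a real plugin the contract is expected to reject such calls before they reach the plugin.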

Erlang

The UBF framework provides two types of Erlang clients: "rpc" and "lpc". The rpc client is the default client that supports TCP/IP-based and ETF transports. The lpc client is an alternative client for making a synchronous local procedure call to a plugin’s implementation.

The "ubf_client" Erlang module implements most of the commonly-used client-side functions and contains the implementation for the two types of Erlang clients.

-module(ubf_client).

-type host() :: nonempty_string().
-type ipport() :: pos_integer().
-type name() :: atom().
-type server() :: name() | pid().
-type plugin() :: module().
-type plugins() :: [plugin()].
-type options() :: [{atom(), term()}].
-type service() :: {'#S', nonempty_string()} | undefined.
-type statename() :: atom().
-type tlogger() :: module().

-spec connect(host() | plugins(), ipport() | server()) ->
              {ok, Client::pid(), service()} | {error, term()}.
-spec connect(host() | plugins(), ipport() | server(), timeout()) ->
              {ok, Client::pid(), service()} | {error, term()}.
-spec connect(host() | plugins(), ipport() | server(), options(), timeout()) ->
              {ok, Client::pid(), service()} | {error, term()}.

-spec rpc(Client::pid(), Call::term()) -> timeout | term() | no_return().
-spec rpc(Client::pid(), Call::term(), timeout()) -> timeout | term() | no_return().

-spec stop(Client::pid()) -> ok.

-spec sendEvent(Handler::pid(), Cast::term()) -> ok | no_return().

-spec install_default_handler(Client::pid()) -> ack.
-spec install_handler(Client::pid(), Fun::fun()) -> ack.

-spec lpc(plugin(), Call::term()) -> term().
-spec lpc(plugin(), Call::term(), statename()) -> term().
-spec lpc(plugin(), Call::term(), statename(), tlogger()) -> term().

The connect/{2,3,4} functions connect to a UBF server. Upon success, the UBF client’s pid() and the name of the UBF server’s service (if known) are returned. For TCP/IP transports, the default method is to connect to the specified host() and TCP ipport(). For the ETF transport, the alternative method is to connect to server() using the specified plugins(). The server() is either the process id or the registered process name of an already-started UBF server.

The supported options() are as follows:

{'clientport', ipport() | {ipport(), ipport()}}
Specifies the TCP port to be used by the client. If tuple format, a port is automatically selected within the specified range. If 'undefined', a random port is automatically selected. Default: 'undefined'.
{'proto', {'ubf' | 'ebf' | 'jsf' | 'tbf' | 'ftbf' | atom()}}
Enable the UBF, EBF, JSF, TBF, FTBF, or an alternative protocol wire format. Default: 'ubf'.
{'proto', {'ubf' | 'ebf' | 'jsf' | 'tbf' | 'ftbf' | atom(), [atom() | tuple()]}}
Enable the UBF, EBF, JSF, TBF, FTBF, or an alternative protocol wire format with options. Default: {'ubf', []}. Supported options:
'safe'
Prevents decoding data that may be used to attack the Erlang system. In the event of receiving unsafe data, decoding fails with a badarg error.
{'startplugin', module()}
Set the starting plugin, set after a client first connects to the server. If not set, client’s caller may select the service using the startSession() API. Default: 'undefined'.
{'serverhello', true | undefined}
Meta contract greeting string, sent to a client when it first connects to the server. If 'undefined', client does not expect server hello to be sent by the server. Default: 'true'.
{'simplerpc', boolean()}
Set the simple RPC mode. If 'true', client expects only the rpc reply from the server. If 'false', client expects the rpc reply and next state from the server. Default: 'false'.

The rpc/{2,3} functions make a synchronous call to the server.

The stop/1 function closes the connection with the server and stops the client.

The sendEvent/2, install_default_handler/1, and install_handler/2 functions behave in the same way as the server-side implementation to send and receive asynchronous casts.

The lpc/{2,3,4} functions make a synchronous local procedure call to a plugin’s implementation. Regarding the tlogger(), see [TLOG] for further information.
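
The client functions above can be combined roughly as follows. This is a sketch under stated assumptions and not a reference implementation: the module name "ubf_client_sketch", its function names, and the 5000 millisecond timeouts are invented for illustration. Only connect/4, rpc/3, stop/1, and lpc/2 are taken from the specs above.

```erlang
-module(ubf_client_sketch).
-export([connect_options/0, call_once/3, call_local/2]).

%% Client options taken from the list above; the values are illustrative.
connect_options() ->
    [{proto, ubf}, {serverhello, true}, {simplerpc, false}].

%% Connect, make one synchronous call, and always stop the client,
%% even if the call raises. Host, Port, and Call come from the caller.
call_once(Host, Port, Call) ->
    case ubf_client:connect(Host, Port, connect_options(), 5000) of
        {ok, Client, _Service} ->
            try
                ubf_client:rpc(Client, Call, 5000)
            after
                ubf_client:stop(Client)
            end;
        {error, _Reason} = Error ->
            Error
    end.

%% Call a plugin's implementation directly, without a socket.
call_local(Plugin, Call) ->
    ubf_client:lpc(Plugin, Call).
```

Both call_once/3 and call_local/2 need the UBF application on the code path, and call_once/3 additionally needs a running server whose options match connect_options/0.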

Testing

Unit Tests

The unit tests in the "test/unit" directory provide small examples of how to use all of the public API. In particular, the *client*.erl files contain comments at the top with a list of prerequisites and small examples, recipe-style, for starting each server and using the client.

EUnit Tests

The eunit tests in the "test/eunit" directory cover several smoke and error handling use cases. The stateless_plugin and stateful_plugin test applications are concrete examples of how to integrate one or more UBF listeners into an Erlang/OTP application.

QuickCheck Tests

The quickcheck tests and related helper libraries in the "test/eqc" directory are deprecated until further notice.

See [QUVIQ] for further information about quickcheck.

Utilities

Transaction Logging

For Erlang, the UBF server and the UBF "LPC" client can be configured to generate a transaction log. The transaction log module must implement the following tlog/6 callback API.

-type op() :: rpc | lpc | event_in | event_out.
-type now() :: {pos_integer(), pos_integer(), pos_integer()}.
-type plugin() :: module().

-spec tlog(op(), Start::now(), plugin(), Q::term(), Reply::term(), Status::term()) -> ok.
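
A transaction log module only has to export tlog/6 with the argument order shown above. The following sketch prints one line per transaction. The module name "tlog_sketch", the helper format_entry/6, and the line layout are assumptions made for this example and are not part of UBF.

```erlang
-module(tlog_sketch).
-export([tlog/6, format_entry/6]).

%% Callback with the arity and argument order given by the spec above.
tlog(Op, Start, Plugin, Q, Reply, Status) ->
    io:put_chars(format_entry(Op, Start, Plugin, Q, Reply, Status)),
    ok.

%% Build one log line. Start is a {MegaSecs, Secs, MicroSecs} triple,
%% which is converted here to microseconds.
format_entry(Op, {Mega, Secs, Micro}, Plugin, Q, Reply, Status) ->
    StartMicros = (Mega * 1000000 + Secs) * 1000000 + Micro,
    lists:flatten(
      io_lib:format("~p ~p start=~p q=~p reply=~p status=~p~n",
                    [Op, Plugin, StartMicros, Q, Reply, Status])).
```

With this module on the code path, a server could be pointed at it through the {'tlog_module', tlog_sketch} option described in the server options.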

Canonical Contracts

For documentation purposes, it is helpful to generate a "canonical" version of a UBF contract. This feature is especially helpful when importing UBF(b) types from one or more plugins.

For UBF, the ubf_utils:ubf_contract/{1,2} functions are available for this purpose. For JSON-RPC (and JSF indirectly), the jsf_utils:ubf_contract/{1,2} functions are available for this purpose.

Option 1

To download, build, and test the UBF application in one shot, please follow this recipe:

$ mkdir working-directory-name
$ cd working-directory-name
$ git clone https://github.com/ubf/ubf.git ubf
$ cd ubf
$ make deps clean compile test

For an alternative recipe with other "features" albeit more complex, please read further.

Option 2

This section describes the basic recipes for the following items:

  • UBF Downloading
  • UBF Building, Testing, and Dialyzing
  • UBF Documentation
  • Erlang/OTP System
  • GitHub - Forking Your Own Repositories

Before getting started, review this checklist of tools and software. Please install and setup as needed.

Erlang/OTP (Mandatory)
Git (Mandatory)
AsciiDoc (Optional)

UBF Documentation

This section is the first step to download and to build your own UBF documentation.

  1. Building UBF’s "Guide" and "Website" basic recipe
    $ cd working-directory-name/deps/ubf/priv/doc/src
    $ make clean
    $ make

    UBF’s documentation is authored using AsciiDoc and a few auxiliary tools:

    • Dia

Erlang/OTP System

This section is the first step to download, to build, and to install your own Erlang/OTP system.

  1. Downloading basic recipe
    1. Get and install Git
    2. Download the source code for your Erlang/OTP system
      $ cd working-directory-name
      $ wget http://www.erlang.org/download/otp_src_R16B.tar.gz
    3. Untar the source code for your Erlang/OTP system.
      $ cd working-directory-name
      $ tar -xzf otp_src_R16B.tar.gz
  2. Building basic recipe
    1. Change to your working directory and configure Erlang/OTP
      $ cd working-directory-name/otp_src_R16B
      $ ./configure --prefix=otp-installing-directory-name
    2. Build Erlang/OTP
      $ cd working-directory-name/otp_src_R16B
      $ make
  3. Installing basic recipe
    $ cd working-directory-name/otp_src_R16B
    $ sudo make install

Caution Please make sure "otp-installing-directory-name/bin" is added to your $PATH environment.

GitHub - Forking Your Own Repositories

If you are interested in making your own changes to UBF or to one or more of the other UBF-related repositories, it is a straightforward process to fork and to build your own repositories using [GITHUB]. GitHub provides a friendly and easy to use environment for developers and the like.

  1. If you haven’t already done so, create your own account on GitHub and setup access with your public ssh key. Next using your web browser, login as yourself to GitHub.
  2. Choose all or a subset of the UBF-related repositories that you are interested in forking. For the sake of an example, let’s choose the ubf-bertrpc repository and open the front page of this repository [UBF_BERTRPC] using your web browser.
  3. Click on the "Fork" button near the top of the page. This action creates a clone of the ubf-bertrpc repository in your own account on GitHub.

Acknowledgments

Many, many thanks to Joe Armstrong, UBF’s designer and original implementer.

Gemini Mobile Technologies, Inc. has approved the release of its extensions, improvements, etc. under an MIT license. Joe Armstrong has also given his blessing to Gemini’s license choice.

The MIT License

Copyright (C) 2011-2016 by Joseph Wayne Norton
Copyright (c) 2009-2011 Gemini Mobile Technologies, Inc.
Copyright (C) 2002 by Joe Armstrong

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

ABNF Definition

The formal syntax for UBF(a), UBF(b), and UBF(c) is defined in ABNF format per [RFC5234] except for one extension - single quoted strings are case-sensitive.

UBF(a)

ubf-a          = *ubf-a-wsp ubf-a-object *ubf-a-wsp "$"

ubf-a-object   = (ubf-a-term / ubf-a-pop / ubf-a-push) *ubf-a-wsp [ubf-a-tag] *ubf-a-wsp

ubf-a-wsp      = ubf-a-comment / ubf-a-ignore

ubf-a-term     = ubf-a-atom
               / ubf-a-string
               / ubf-a-binary
               / ubf-a-integer
               / ubf-a-list
               / ubf-a-tuple

ubf-a-pop      = ">" ubf-a-register

ubf-a-push     = ubf-a-register

ubf-a-atom     = "'" *(%x20-26 / %x28-5B / %x5D-7E / "\\" / "\'") "'"

ubf-a-string   = '"' *(%x20-21 / %x23-5B / %x5D-7E / '\\' / '\"') '"'

ubf-a-binary   = ubf-a-integer *ubf-a-wsp "~" *OCTET "~"

ubf-a-integer  = ["-"] 1*DIGIT

ubf-a-list     = "#" *ubf-a-wsp [ubf-a-object *ubf-a-wsp "&"]

ubf-a-tuple    = "{" *ubf-a-wsp [ubf-a-object *ubf-a-wsp] "}"

ubf-a-tag      = "`" 1*(%x20-5B / %x5D-5F / %x61-7E / "\\" / "\`") "`"

ubf-a-comment  = "%" *(%x20-24 / %x26-5B / %x5D-7E / "\\" / "\%") "%"

ubf-a-ignore   = SP    ;; %x20
               / LF    ;; %x0A
               / CR    ;; %x0D
               / HTAB  ;; %x09
               / ","   ;; %x2C

ubf-a-control  = "%"   ;; %x25
               / '"'   ;; %x22
               / "~"   ;; %x7E
               / "'"   ;; %x27
               / "`"   ;; %x60
               / "{"   ;; %x7B
               / "}"   ;; %x7D
               / "#"   ;; %x23
               / "&"   ;; %x26
               / "-"   ;; %x2D
               / DIGIT ;; %x30-39
               / ubf-a-ignore

ubf-a-register = %x21  ;; any octet except ubf-a-control
               / %x00-08
               / %x0B-0C
               / %x0E-1F
               / %x23-24
               / %x28-2B
               / %x2F
               / %x3A-5F
               / %x61-7A
               / %x7C
               / %x7F
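
To relate the UBF(a) grammar above to Erlang terms, the following sketch encodes a small subset of terms as UBF(a) text. It is not the encoder shipped with UBF, and the module name "ubfa_sketch" is hypothetical. It makes several assumptions: list elements are written in reverse order, each followed by "&", which is the stack convention described in [UBFPAPER]; tuples and lists may hold any number of elements; commas between tuple elements are emitted only for readability because "," is ignorable. Registers and tags are not used.

```erlang
-module(ubfa_sketch).
-export([encode/1]).

%% Encode a term as a complete UBF(a) message, terminated by "$".
encode(Term) ->
    lists:flatten([enc(Term), "$"]).

%% Integers, atoms, UBF strings ({'#S', String}), binaries, tuples, lists.
enc(I) when is_integer(I) -> integer_to_list(I);
enc(A) when is_atom(A)    -> ["'", escape(atom_to_list(A), $'), "'"];
enc({'#S', Str})          -> ["\"", escape(Str, $"), "\""];
enc(B) when is_binary(B)  ->
    [integer_to_list(byte_size(B)), "~", binary_to_list(B), "~"];
enc(T) when is_tuple(T)   ->
    ["{", lists:join(",", [enc(E) || E <- tuple_to_list(T)]), "}"];
enc(L) when is_list(L)    ->
    %% "#" starts an empty list; each "&" adds the preceding object.
    ["#", [[enc(E), "&"] || E <- lists:reverse(L)]].

%% Backslash-escape the quote character and the backslash itself.
escape(Chars, Quote) ->
    lists:flatmap(fun(C) when C =:= Quote; C =:= $\\ -> [$\\, C];
                     (C) -> [C]
                  end, Chars).
```

For example, ubfa_sketch:encode({startSession, {'#S', "irc"}, []}) yields the text {'startSession',"irc",#}$ and ubfa_sketch:encode([1,2,3]) yields #3&2&1&$.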

UBF(b)

ubf-b          = ubf-b-name ubf-b-vsn [ubf-b-type] *ubf-b-state [ubf-b-anystate]

ubf-b-name     = "+" 'NAME' "(" NONEMTPYSTRING ")" dot
ubf-b-vsn      = "+" 'VSN' "(" NONEMTPYSTRING ")" dot
ubf-b-type     = "+" 'TYPES' 1*WSP types dot
ubf-b-state    = "+" 'STATE' 1*WSP statename 1*WSP transitions dot
ubf-b-anystate = "+" 'ANYSTATE' 1*WSP anyrules dot

dot            = "." *c-wsp c-nl
semi           = ";" *c-wsp c-nl
comment        = "%" *(WSP / VCHAR) CRLF
c-nl           = comment / CRLF
c-wsp          = WSP / (c-nl WSP)

statename      = NONEMTPYATOM
typename       = NONEMTPYATOM
recordname     = NONEMTPYATOM
fieldname      = NONEMTPYATOM

types          = typedef
               / (typedef semi types)

typedef        = typeref *c-wsp "::" *c-wsp type [1*WSP annotation] *c-wsp

transitions    = transition
               / (transition semi transitions)

transition     = typeref *c-wsp "=>" *c-wsp outputs *c-wsp
               / event

anyrules       = anyrule
               / (anyrule semi anyrules)

anyrule        = typeref *c-wsp "=>" *c-wsp typeref *c-wsp
               / event

event          = 'EVENT' *c-wsp ("=>" / "<=") *c-wsp typeref *c-wsp

type           = primtype
               / (primtype *c-wsp "|" *c-wsp type)

annotation     = TAG / STRING / BINARY

outputs        = output
               / (output *c-wsp "|" *c-wsp outputs)

output         = typeref *c-wsp "&" *c-wsp statename

primtype       = (typeref [ "?" ])
               / ("{" [typeseq] "}")
               / ("#" recordname "{" [typerec] "}")
               / ("##" recordname "{" [typerec] "}")
               / typelist
               / (INTEGER *WSP ".." *WSP INTEGER)
               / (".." *WSP INTEGER)
               / (INTEGER *WSP "..")
               / ATOM
               / BINARY
               / FLOAT
               / INTEGER
               / STRING
               / (predefinedtype [ "?" ])

typelist       = ("[" [type] "]" [ "?" / "+" / ("{" listrange "}") ])

typeref        = typename "()"

typeseq        = type
               / (type *WSP "," *WSP typeseq)

typerec        = (fieldname *WSP "::" *WSP type)
               / (fieldname *WSP "::" *WSP type "," *WSP typerec)
               / (fieldname *WSP "=" *WSP default *WSP "::" *WSP type)
               / (fieldname *WSP "=" *WSP default *WSP "::" *WSP type "," *WSP typerec)

default        = ("{" [defaultseq] "}")
               /  ("[" [defaultseq] "]")
               / ATOM
               / BINARY
               / FLOAT
               / INTEGER
               / STRING
defaultseq     = default
               / (default *WSP "," *WSP defaultseq)

listrange      = (1*DIGIT)
               / (1*DIGIT *WSP ",")
               / ("," *WSP 1*DIGIT)
               / (1*DIGIT *WSP "," *WSP 1*DIGIT)

ATOM           = (%x61-7A *(ALPHA / DIGIT / "_" / "@")) ;; a-z
               / ("'" *(%x20-26 / %x28-7E) "'")

NONEMTPYATOM   = (%x61-7A 1*(ALPHA / DIGIT / "_" / "@")) ;; a-z
               / ("'" 1*(%x20-26 / %x28-7E) "'")

BINARY         = "<<" STRING ">>"

FLOAT          = ["-"] 1*DIGIT "." 1*DIGIT

INTEGER        = (["-"] 1*DIGIT)
               / (1*DIGIT "#" 1*(DIGIT / 'a' / 'b' / 'c' / 'd' / 'e' / 'f'))

BTICK          = %x60

TAG            = BTICK *(%x20-5F / %x61-7E) BTICK

STRING         = DQUOTE *(%x20-21 / %x23-7E) DQUOTE

NONEMTPYSTRING = DQUOTE 1*(%x20-21 / %x23-7E) DQUOTE

predefinedtype = ('any' "(" [anyattrs] ")")
               / ('none' "(" [noneattrs] ")")
               / ('atom' "(" [atomattrs] ")")
               / ('binary' "(" [binaryattrs] ")")
               / ('float' "(" [floatattrs] ")")
               / ('integer' "(" [integerattrs] ")")
               / ('list' "(" [listattrs] ")")
               / ('tuple' "(" [tupleattrs] ")")

anyattrs       = anyattr
               / (anyattr *WSP "," *WSP anyattrs)

noneattrs      = *WSP

atomattrs      = atomattr
               / (atomattr *WSP "," *WSP atomattrs)

binaryattrs    = binaryattr
               / (binaryattr *WSP "," *WSP binaryattrs)

floatattrs     = *WSP

integerattrs   = *WSP

listattrs      = listattr
               / (listattr *WSP "," *WSP listattrs)

tupleattrs     = tupleattr
               / (tupleattr *WSP "," *WSP tupleattrs)

anyattr        = 'nonempty' / 'nonundefined'
atomattr       = 'ascii' / 'asciiprintable' / 'nonempty' / 'nonundefined'
binaryattr     = 'ascii' / 'asciiprintable' / 'nonempty'
listattr       = 'nonempty'
tupleattr      = 'nonempty' / 'nonundefined'

UBF(c)

ubf-c           = ubf-c-rpc-req
                / ubf-c-rpc-res
                / ubf-c-event-in
                / ubf-c-event-out

ubf-c-rpc-req   = ubf-msg "$"

ubf-c-rpc-res   = "{" (ubf-msg / ubf-error) "," ubf-nextstate "}" "$"

ubf-c-event-in  = "{" 'event_in' "," ubf-msg "}" "$"

ubf-c-event-out = "{" 'event_out' "," ubf-msg "}" "$"

ubf-msg         = ubf-a-term

ubf-nextstate   = ubf-a-atom

ubf-error       = ubf-client-error
                / ubf-server-error

ubf-client-error = "{" 'clientBrokeContract' "," ubf-msg "," ubf-expects-in "}" "$"

ubf-server-error = "{" 'serverBrokeContract' "," ubf-msg "," ubf-expects-out "}" "$"

ubf-expects-in   = ubf-a-term  ;; list of acceptable input types (for debugging purposes)

ubf-expects-out  = ubf-a-term  ;; list of acceptable output types (for debugging purposes)