Skip to content

line-o/xbow

Repository files navigation

xBow

xBow logo

Test and Release semantic-release

XQuery helper function library to be used with the arrow operator. Should be read as crossbow, a tool to shoot arrows fast and accurately.

Usage

The library provides a small number of useful functions when working with the arrow operator with sequences or sequences in general.

Install the XAR and

import module namespace xbow="http://line-o.de/xq/xbow";

Below you'll find some key things xbow has to offer. For more, have a look at the examples and tests.

Even if the module was developed with arrow expressions in mind, all of it can be used without this language feature, as well.

Requirements

  • XQuery version 3.1+
  • eXist-db version 4.7.0+

Accessors

Many of the functions in this module have a second signature, expecting an accessor function. This concept was inspired by D3, since it is an excellent tool to mangle data.

The idea is that you tell the operation you want to perform how to retrieve the value it should operate on. Maybe you want to filter user elements by let's sey their karma points property, but you do not want to lose the elements itself in the process. Then, passing an accessor to the comparison function would do the trick. Since the key function (or accessor) knows how to deal with the datatype of your sequence items, most operations can be used on atomics, nodes, maps and arrays. All item()* welcome.

In the section Grouping is a code example.

There are two functions, xbow:pluck and xbow:pluck-deep, that can help making use of accessors. The second example in Filtering uses pluck.

Filtering

(0 to 9) => filter(xbow:gt(4)) => filter(xbow:lt(6))

The above code outputs 5 since all numbers less and greater were filtered out.

xbow implements the usual comparison functions eq, ne, lt, le, gt and ge.

They all return a function, so that they can be used in combination with filter and for-each. They all accept an accessor as a second argument.

Here it is in action (this time using an array as input):

[
  map { 'a': 1, 'b': 2 },
  map { 'a': 2, 'b': 1 },
  map { 'a': 3, 'b': 1 }
] 
    => array:filter(xbow:eq(1, xbow:pluck('b')))
    => array:for-each(xbow:pluck('a'))
    (: yields (2, 3) :)

Grouping

Grouping items after an arrow would require some boilerplate code. This is where xbow:groupBy comes in handy.

(
  <user first='Mike' last='Hill'/>,
  <user first='Paula' last='Moon'/>,
  <user first='Carla' last='Harlowe'/>,
  <user first='Fela' last='Kuti'/>
)
  => xbow:groupBy(function ($item as element()) {
      substring($item/@last, 1, 1) })

The second parameter is an accessor function. A concept shamelessly copied from D3.

The function will always return a map. It will have all items of the input sequence. Any value produced by the accessor function, will be a key in the resulting map.

map {
  'H': (
    <user first='Mike' last='Hill'/>,
    <user first='Carla' last='Harlowe'/>
  )
  'M': <user first='Paula' last='Moon'/>,
  'K': <user first='Fela' last='Kuti'/>
}

Sorting

This is just a small wrapper around the normal sort function. Mainly, to not have to remember to add an empty sequence, if you just want to use a sorting function.

This will sort numerical entries in descending order.

(0, 3, 9, 8) => xbow:descending()

Returns (9, 8, 3, 0)

Using fn:sort produces the same output.

(0, 3, 9, 8) => sort((), function ($a) { -$a })

Folding

fold-left, fold-right

Sometimes you want to test a sequence of items if all, some or none of them meet a certain condition.

All

xbow:all returns true, if the testing function returns true() for each item.

(1 to 4) => xbow:all(xbow:lt(5))

Some

xbow:some returns true, if the testing function returns true() for at least one item.

('1', '2', '3') => xbow:some(xbow:eq('2'))

None

xbow:none returns true, if the testing function returns false() for each item.

(['1', '1'], ['1', '2'], ['1', '3']) => xbow:none(xbow:eq('2', xbow:pluck(1)))

Nodes

There is a number of functions to output or operate on nodes, attributes and elements.

For example: Wrapping a single value in an element of a certain type with wrap-element or each item in a sequence (wrap-each).

(1 to 3) 
  => xbow:wrap-each('item')
  => xbow:wrap-element('root')

outputs

<root>
  <item>1</item>
  <item>2</item>
  <item>3</item>
</root>

Utility

Some handy expressions in XQuery, like ?* to convert an array into a sequence, cannot be easily used after the arrow operator. xbow wraps those in functions for you so that you do not have to.

[1,2] => xbow:to-sequence(),
(1,2) => xbow:to-array()

FLWOR operations allow you to operate on the position of each element. Since fn:for-each does not allow that, xbow:for-each-index was added. It can operate on a sequence or an array.

(2 to 4)
  => xbow:for-each-index(function ($v, $i) {
    if ($i = 2) then ($v - $i) else ($v)
  })

results in (2, 1, 4). Only the value for the second item changed.

xbow:array-fill lets you create an arrayfill an array with a value

xbow:array-fill(3, true())

or the return value of a function

xbow:array-fill(10, function ($value, $position) {
  xbow:lt(($value - 5))
})

The above will output an array of 10 anonymous functions. Each of them comparators suitable to use in xbow:categorize. The range will be from xbow:lt(-5) to xbow:lt(5).

xbow:last is the latest addition to the utility functions. It returns the last element of a sequence or array. If the list does not contain elements an empty sequence is returned.

[1, 2, 9] => xbow:last() (: returns 9 :)
(1, 2, 9) => xbow:last() (: returns 9 :)
(1, 2, []) => xbow:last() (: returns [] :)

xbow:get-type will return type information of the provided item (not a sequence).

NOTE: It will look deep into the inner structure of the item, if it is nested. Sequences as well as arities are not detected. Whenever mixed content is found "*" is returned.

1 => xbow:get-type() (: returns "xs:integer" :)
[1, 2, true()] => xbow:get-type() (: returns "array(*)" :)
[map {1: "one", 2: "two", 9:"nine"}, map {1: "eins", 2: "zwei", 9:"neun"}] 
  => xbow:get-type() (: returns "array(map(xs:integer, xs:string))" :)

General Arrow Syntax

Remember:

The arrow operator must be followed by a function expression. The first argument will be the left hand side, there is no way around that.

(0 to 9) => sum(),
(0 to 9) => (function($a) { sum($a) })()

is fine.

(0 to 9) => concat(?)

will return a function with an arity of 1.

But

(0 to 9) => sum(.),

(0 to 9) => sum#1 or

(0 to 9) => function ($a) { sum($a) }

will throw an exception!

Roadmap / TODOs

  • Add convenience functions for boolean tests on sequence items (all, none, some)

  • Rename xbow:map-reverse (to xbow:map-flip for example)

  • Change namespace to xb for brevity

  • Test/support elements with namespaces in wrap*

  • More examples

  • Extend documentation on concepts

  • Generate XQDoc documentation at build

  • Replace ant with gulp-exist +watcher

  • Create packages for other XQuery runtimes

  • Split up into modules with separate scopes (DOM, utility, ...)

  • Add element constructors (inspired by Micheal Kays proposal)

  • Rename xbow:groupBy to xbow:group-by to adhere to the XQuery naming convetions

Compatibility

The xBow module tests are written for eXist-db and this is also its build target. In theory the core library module should run on any XQuery 3.1 processor. It depends on functions only available in XQuery version 3.1 (or higher).

The xBow module is compatible with Saxon 10 (HE) since v1.2.0. Older versions of the Home Edition of Saxon do not allow the use of higher order functions. Saxon 9 PE and 9 EE might work as well.

You can run xBow on baseX (tested with version 9.4.5). To install the most recent version run REPO INSTALL https://raw.githubusercontent.com/line-o/xbow/master/src/content/xbow.xqm in basex REPL.

The released package requires eXist-db 4.7.0 or higher, due to a bug that caused unpredictable behaviour when using xbow:wrap-element and related functions (see original issue for details).

Performance

We are trading readability, maintainability and comfort for speed with this module. FLOWR-expressions are well optimized and execute at least twice as fast. This module will only really shine when the runtime is able to parallelize function calls and would also benefit heavily from lazy evaluation.