View the project on GitHub. jakob-schuster/matchbox

Expressions

Expressions are the smallest unit of matchbox syntax.


Variables

A variable expression is a name that was bound earlier in the script.

Variable names can include letters, numbers and underscores, and must start with a letter or an underscore. Each name can be bound only once: a variable can't be overwritten after it is defined.

a = 'hello'
# expressions can refer to bound variables
b = a

# variables bound in patterns are accessible 
# inside the body of the branch
if read matches [fst:|5| _] => fst.seq.stdout!()

# but NOT outside it! 
fst.seq.len().average!()
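
The naming rules at the start of this section can be illustrated with a short sketch. The names barcode_1 and _scratch are hypothetical, and the commented-out lines show bindings that those rules would reject; how matchbox reports such errors is not covered here.

```
# valid: starts with a letter, then letters, numbers and underscores
barcode_1 = AAAA

# valid: starts with an underscore
_scratch = 'tmp'

# not valid: starts with a number
# 1_barcode = CCCC

# not valid: barcode_1 is already bound and can't be overwritten
# barcode_1 = GGGG
```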

Boolean literals

There are only two Bool literals: true and false.
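
As a minimal sketch, Bool literals can be bound to variables and combined with the logical operators listed under Operators. The variable names here are hypothetical.

```
keep = true
discard = false

# not binds more tightly than and,
# so ok will be true
ok = keep and not discard
```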


Numeric literals

Numeric literals can be negative, and can include a decimal component.

a = 10000

# b will be -1000
b = a / -10

# c will be -999.99
c = b + 0.01

String literals

Strings must be constructed using single quotes; for example, 'hello world'. Values can be inserted into strings using curly braces {}.

message = 'read named {read.id} is {read.seq.len()} bases'

stdout!(message)
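
A smaller sketch of the same mechanism, using only a bound variable inside the braces. The names sample and label are hypothetical.

```
sample = 'sample_1'

# label should be 'processing sample_1'
label = 'processing {sample}'

stdout!(label)
```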

List literals

A list literal is a pair of square brackets enclosing a comma-separated sequence of values. All values must be of the same type.

# creating a list of Str
barcodes = [AAAA, CCCC, GTGT]

# trying each barcode in the list as part of the pattern
if read matches [_ b _] for b in barcodes => read.id |> stdout!()
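
The same-type rule can be sketched as follows. The names lengths and mixed are hypothetical, and the commented-out line shows a list that the rule would reject.

```
# a list of Num
lengths = [10, 20, 30]

# not valid: mixes Num and Str
# mixed = [10, 'ten']
```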

Record literals

A record literal is a pair of curly braces enclosing a comma-separated list of fields. Each field has a name and an expression, separated by an equals sign.

Fields can be accessed using a dot.

# creating a record, assigning values to some fields
rec = {
    primer = AAGTCGATGCTAGTG,
    output = 'out.fq',
}

# accessing a field of a record with a dot
if read matches [_ rec.primer _] => read.out!(rec.output)

Function application

A function can be applied by writing the function name followed by parentheses enclosing a comma-separated list of arguments.

n = len(read.seq)

Alternatively, a function can be applied with the first argument in front and a dot before the function name. All of the remaining arguments are still written inside the parentheses.

# equivalent
n = read.seq.len()

Similarly, the pipe operator |> can also be used to apply functions.

# also equivalent
n = read.seq |> len()

# useful when chaining functions together!
read 
    |> tag('length={len(read.seq)}') 
    |> out!('file.fq')
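
The three application styles also apply to functions that take more than one argument. The sketch below uses out!, which appears elsewhere on this page with a read and a file name; the plain form on the first line is inferred from the rules above rather than taken from an existing example. Each line performs the same output, so a real script would use only one of them.

```
# plain application: both arguments inside the parentheses
out!(read, 'file.fq')

# dot application: first argument in front, the rest in parentheses
read.out!('file.fq')

# pipe application: first argument piped in, the rest in parentheses
read |> out!('file.fq')
```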

Some functions take optional named arguments. These must be given after all the mandatory arguments. The optional arguments themselves can then be given in any order.

read.describe(
    { polya = AAAAAAAAAA }, 
    reverse_complement = true
).count!()

Operators

Matchbox provides a set of common built-in operators. Unary operators are written prefix, in front of their operand; binary operators are written infix, between their two operands.

# + and > are both operators
if 10 + 2 > 11 {
    'basic maths' |> stdout!()
}

Some operators bind more tightly than others. The full list of operators is given below, from tightest to loosest precedence.

Precedence  Operator           Description
0           (all other expressions)
1           -Num               Unary negation
            -( Str | Read )    Reverse-complementation
            not Bool           Logical NOT
2           Num * Num          Multiplication
            Num / Num          Division
            Num % Num          Modulo
3           Num + Num          Addition
            Num - Num          Subtraction
4           Num < Num          Less than
            Num > Num          Greater than
            Num <= Num         Less than or equal
            Num >= Num         Greater than or equal
5           Any == Any         Equality
            Any != Any         Inequality
6           Bool and Bool      Logical AND
7           Bool or Bool       Logical OR
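
A few worked cases may help when reading the precedence list. This is a sketch based only on the operators and levels listed above; the variable names are hypothetical, and the parentheses appear only in comments to show the grouping.

```
# * binds more tightly than +,
# so this reads as 2 + (3 * 4), and a will be 14
a = 2 + 3 * 4

# % sits at the same level as * and /,
# so this reads as 2 + (10 % 4), and b will be 4
b = 2 + 10 % 4

# + binds more tightly than ==,
# so this reads as (1 + 1) == 2, and c will be true
c = 1 + 1 == 2

# and binds more tightly than or,
# so this reads as true or (false and false), and d will be true
d = true or false and false

# prefix - on a Str is reverse-complementation,
# so rc should be ACTT
primer = AAGT
rc = -primer
```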