View the project on GitHub. jakob-schuster/matchbox
Expressions are the smallest unit of matchbox syntax.
An expression can be a variable name, defined earlier in the script.
Variable names can include letters, numbers and underscores. A name must start with a letter or an underscore. Variable names must be unique, and can't be overwritten.
a = 'hello'
# expressions can refer to bound variables
b = a
# variables bound in patterns are accessible
# inside the body of the branch
if read is [fst:|5| _] => fst.seq.stdout!()
# but NOT outside it!
fst.seq.len().average!()
There are only two Bool literals: true and false.
Numeric literals can be negative, and can include a decimal component.
a = 10000
# b will be -1000
b = a / -10
# c will be -999.99
c = b + 0.01
Strings must be constructed using single quotes; for example, 'hello world'. Values can be inserted into strings using curly braces {}.
message = 'read named {read.id} is {read.seq.len()} bases'
stdout!(message)
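For readers more familiar with Python, the interpolation above behaves much like an f-string. The sketch below is a Python analogy, not matchbox code. The Read class and the describe function are hypothetical stand-ins introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Read:
    """Hypothetical stand-in for a matchbox read (not part of matchbox)."""
    id: str
    seq: str


def describe(read: Read) -> str:
    # Each {...} is evaluated and its value spliced into the string,
    # mirroring '{read.id}' and '{read.seq.len()}' in the matchbox example.
    return f"read named {read.id} is {len(read.seq)} bases"


message = describe(Read(id="r1", seq="ACGTAC"))
print(message)
```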
A record literal is a pair of curly braces containing a number of fields. Each field has a name and an expression, separated by an equals sign.
Fields can be accessed using a dot.
# creating a record, assigning values to some fields
rec = {
primer = AAGTCGATGCTAGTG,
output = 'out.fq',
}
# accessing a field of a record with a dot
if read is [_ rec.primer _] => read.out!(rec.output)
New functions can be defined using function literal syntax. Arguments must be declared with their types, in parentheses, separated by commas. The body of the function comes after an arrow =>. The function's return type is inferred from its body.
Variable names can be assigned to functions, just like any other value.
# a function that multiplies two numbers
f = (n1: Num, n2: Num) => n1 * n2
# a function which formats the result of f into a Str
g = (n: Num) => '{n} times two equals {f(n, 2)}'
Functions can also take optional named arguments. These must come after positional arguments in the function definition, and they have a default value which is an expression.
print_both = (v1: Str, v2: Str, separator: Str = ' & ') =>
'{v1}{separator}{v2}'
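The same idea can be sketched in Python, where a parameter with a default value plays the role of an optional named argument. This is an analogy only, not matchbox syntax. The argument values are made up for illustration.

```python
def print_both(v1: str, v2: str, separator: str = " & ") -> str:
    # separator is optional: callers that omit it get the default value
    return f"{v1}{separator}{v2}"


# positional arguments only, so the default separator is used
default_joined = print_both("forward", "reverse")

# the optional argument is passed by name, after the positional ones
custom_joined = print_both("forward", "reverse", separator=" | ")

print(default_joined)
print(custom_joined)
```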
A function can be applied by writing the function name followed by parentheses enclosing a comma-separated list of arguments.
n = len(read.seq)
Alternatively, a function can be applied with the first argument in front and a dot before the function name. All of the remaining arguments are still written inside the parentheses.
# equivalent
n = read.seq.len()
Similarly, the pipe operator |> can also be used to apply functions.
# also equivalent
n = read.seq |> len()
# useful when chaining functions together!
read
|> tag('length={len(read.seq)}')
|> out!('file.fq')
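One way to picture the three application forms is that each passes the leading value to the function as its first argument. The Python sketch below models that reading. It is an analogy under that assumption, not a description of how matchbox is implemented. The pipe and tag helpers are hypothetical.

```python
def pipe(value, *steps):
    """Thread value through each step as that step's first argument."""
    for func, *args in steps:
        value = func(value, *args)
    return value


def tag(read_id: str, label: str) -> str:
    # hypothetical stand-in for a tagging function
    return f"{read_id} {label}"


seq = "ACGTAC"

# call form and dot form, using a built-in Python method as the example
upper_call = str.upper("read1")
upper_dot = "read1".upper()

# call form and pipe form of the same application
n_call = len(seq)
n_pipe = pipe(seq, (len,))

# chaining: each result becomes the first argument of the next step
result = pipe("read1", (tag, f"length={len(seq)}"), (str.upper,))
print(result)
```

Written this way, a chain reads top to bottom in the order the steps run, which is the main appeal of the pipe form.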
Some functions take optional named arguments. These must be given after all the mandatory arguments. The optional arguments themselves can then be given in any order.
read.describe(
{ polya = AAAAAAAAAA },
reverse_complement = true
).count!()
A number of common built-in operators can be used. They are applied as prefix or infix operators, as appropriate.
# + and > are both operators
if 10 + 2 > 11 => 'basic maths' |> stdout!()
Some operators bind more tightly than others. The full list of operators is given below, from tightest to loosest precedence.
Precedence | Operators
---|---
0 | All other expressions
1 |
2 |
3 |
4 |
5 |
6 |
7 |