Expressions are the smallest unit of matchbox syntax.
A variable expression is a variable name, defined earlier in the script.
Variable names can include letters, numbers and underscores. A name must start with a letter or an underscore. Variable names must be unique, and can't be overwritten.
a = 'hello'
# expressions can refer to bound variables
b = a
# variables bound in patterns are accessible
# inside the body of the branch
if read matches [fst:|5| _] => fst.seq.stdout!()
# but NOT outside it!
fst.seq.len().average!()
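The naming rule described above can be sketched as a regular expression. This is a minimal illustration in Python, not matchbox's actual lexer; the helper `is_valid_name` is hypothetical, and the sketch assumes that "letters" means ASCII letters.

```python
import re

# Assumed rule, taken from the prose above: the first character is a letter
# or an underscore, and the rest are letters, digits or underscores.
# Restricting "letters" to ASCII is an assumption of this sketch.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_name(name: str) -> bool:
    """Return True if `name` satisfies the assumed naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None
```

Under these assumptions, `fst` and `_tmp1` are accepted, while `5prime` is rejected because it starts with a digit.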
There are only two Bool literals: true and false.
Numeric literals can be negative, and can include a decimal component.
a = 10000
# b will be -1000
b = a / -10
# c will be -999.99
c = b + 0.01
Strings must be constructed using single quotes; for example, 'hello world'. Values can be inserted into strings using curly braces {}.
message = 'read named {read.id} is {read.seq.len()} bases'
stdout!(message)
List literals are a set of square brackets, containing a number of values. All values must be of the same type.
# creating a list of Str
barcodes = [AAAA, CCCC, GTGT]
# checking whether the read contains any barcode in the list
if read matches [_ b _] for b in barcodes => read.id |> stdout!()
Record literals are a set of curly braces, containing a number of fields. Each field has a name and an expression, separated by an equals sign (=).
Fields can be accessed using a dot.
# creating a record, assigning values to some fields
rec = {
primer = AAGTCGATGCTAGTG,
output = 'out.fq',
}
# accessing a field of a record with a dot
if read matches [_ rec.primer _] => read.out!(rec.output)
A function can be applied by writing the function name followed by parentheses enclosing a comma-separated list of arguments.
n = len(read.seq)
Alternatively, a function can be applied with the first argument in front and a dot before the function name. All of the remaining arguments are still written inside the parentheses.
# equivalent
n = read.seq.len()
Similarly, the pipe operator |> can also be used to apply functions.
# also equivalent
n = read.seq |> len()
# useful when chaining functions together!
read
|> tag('length={len(read.seq)}')
|> out!('file.fq')
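The three application forms can be read as different spellings of the same call. The Python sketch below models that reading; it is an analogy rather than matchbox's implementation, and `pipe` and `seq_len` are hypothetical helper names introduced for the example.

```python
from typing import Any, Callable


def seq_len(seq: str) -> int:
    """Stand-in for a `len`-style function: count the bases in a sequence."""
    return len(seq)


def pipe(value: Any, func: Callable[..., Any], *rest: Any) -> Any:
    """Model `value |> func(rest...)` and `value.func(rest...)`.

    Both forms are treated as `func(value, rest...)`: the value on the left
    becomes the first argument, and the parenthesised arguments follow it.
    """
    return func(value, *rest)


seq = "ACGTACGT"
direct = seq_len(seq)       # models len(read.seq)
piped = pipe(seq, seq_len)  # models read.seq.len() and read.seq |> len()
```

In this model `direct` and `piped` hold the same value, which mirrors the equivalence of the three forms shown above.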
Some functions take optional named arguments. These must be given after all the mandatory arguments. The optional arguments themselves can then be given in any order.
read.describe(
{ polya = AAAAAAAAAA },
reverse_complement = true
).count!()
A number of built-in common operators can be used. They are applied prefix or infix as appropriate.
# + and > are both operators
if 10 + 2 > 11 {
'basic maths' |> stdout!()
}
Some operators bind more tightly than others. The full list of operators is given below, from tightest to loosest precedence.
| Precedence | Operators |
|---|---|
| 0 | All other expressions |
| 1 | |
| 2 | |
| 3 | |
| 4 | |
| 5 | |
| 6 | |
| 7 | |
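To illustrate what "binds more tightly" means in practice, the following Python sketch parses and evaluates a tiny expression language using precedence climbing. The binding strengths in `BINDING` are assumptions made for this example (arithmetic above comparison), not matchbox's actual precedence table, and the function names are hypothetical.

```python
import operator
import re

# Assumed binding strengths for illustration only: a larger number binds
# more tightly. These are NOT taken from the matchbox precedence table.
BINDING = {
    "*": (3, operator.mul),
    "/": (3, operator.truediv),
    "+": (2, operator.add),
    "-": (2, operator.sub),
    ">": (1, operator.gt),
    "<": (1, operator.lt),
}

TOKEN = re.compile(r"\s*(\d+(?:\.\d+)?|[-+*/<>()])")


def tokenize(text: str) -> list[str]:
    """Split the input into number, operator and parenthesis tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(text: str):
    """Evaluate `text`, grouping operators by their binding strength."""
    tokens = tokenize(text)
    pos = 0

    def atom():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = expression(1)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("expected ')'")
            pos += 1
            return value
        if token == "-":  # prefix minus applies to the next atom only
            return -atom()
        return float(token)

    def expression(min_strength: int):
        nonlocal pos
        left = atom()
        while pos < len(tokens) and tokens[pos] in BINDING:
            strength, apply = BINDING[tokens[pos]]
            if strength < min_strength:
                break
            pos += 1
            # Left-associative: the right operand may only contain
            # operators that bind strictly more tightly.
            right = expression(strength + 1)
            left = apply(left, right)
        return left

    result = expression(1)
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return result
```

With these assumed strengths, `evaluate("10 + 2 > 11")` groups the addition first and then applies the comparison, so the condition in the example above is read as `(10 + 2) > 11`.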