View the project on GitHub. jakob-schuster/matchbox

Fixed-length regions

Sometimes we want to extract a fixed number of bases from a read. A fixed-length region is specified by writing its length between vertical bars, so |5| means exactly five bases:

if read is [first_five:|5| _] => 
    first_five.out!('starts.fq')

Fixed-length regions can be taken from either end of the read. To take the last five bases instead:

if read is [_ last_five:|5|] => 
    last_five.out!('ends.fq')

If we also want to extract the rest of the sequence separately, we can give it a name as well:

if read is [start:|5| rest:_ end:|5|] => {
    start.out!('starts.fq')
    rest.out!('rest.fq')
    end.out!('ends.fq')
}
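
For readers who think in terms of string slicing, the pattern above can be sketched in plain Python. This is only an illustration of the idea, not how matchbox is implemented: the function name split_fixed is hypothetical, the read is treated as a bare string, and it is an assumption that a read too short to hold both fixed-length regions simply does not match and that the middle region may be empty.

```python
from typing import Optional, Tuple


def split_fixed(seq: str, n_start: int, n_end: int) -> Optional[Tuple[str, str, str]]:
    """Mimic [start:|n_start| rest:_ end:|n_end|] on a plain string.

    Returns None when the sequence cannot hold both fixed-length regions
    (assumed here to correspond to "the pattern does not match").
    """
    if len(seq) < n_start + n_end:
        return None
    cut = len(seq) - n_end      # index where the trailing region begins
    start = seq[:n_start]       # first n_start bases
    rest = seq[n_start:cut]     # everything in between (may be empty)
    end = seq[cut:]             # last n_end bases
    return start, rest, end
```

Calling split_fixed(seq, 5, 5) on a 13-base sequence yields a 5-base start, a 3-base middle, and a 5-base end, mirroring the three output files in the matchbox example.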

Fixed-length regions around known sequences

Fixed-length regions can also be extracted from either side of known sequences:

primer = AGCTAGCTGATCGATGAGCT

if read is [_ primer umi:|8| _] =>
    read.tag('umi={umi.seq}')
        .out!('with_umis.fq')

Note: The tag function allows you to easily append to a read's metadata. See Working with metadata.
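
As a rough mental model, the UMI pattern above amounts to "find the primer, then take the next eight bases". The Python sketch below illustrates that under stated assumptions: the function name umi_after_primer is hypothetical, the read is a bare string, and the primer is located by exact string search only, which may differ from how matchbox itself locates known sequences.

```python
from typing import Optional

PRIMER = "AGCTAGCTGATCGATGAGCT"  # the primer from the example above


def umi_after_primer(seq: str, primer: str = PRIMER, umi_len: int = 8) -> Optional[str]:
    """Return the umi_len bases that directly follow the first exact primer hit.

    Returns None if the primer is absent or fewer than umi_len bases follow it.
    Exact matching is an assumption made for this sketch.
    """
    pos = seq.find(primer)
    if pos == -1:
        return None                      # primer not found
    umi_start = pos + len(primer)        # first base after the primer
    umi = seq[umi_start:umi_start + umi_len]
    return umi if len(umi) == umi_len else None
```

A read that contains the primer followed by at least eight bases gives back those eight bases; any other read gives None, which corresponds to the matchbox pattern not matching.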