align

package
v0.0.0-...-6fbfe6d Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Sep 4, 2015 License: BSD-3-Clause Imports: 7 Imported by: 0

Documentation

Overview

Package align provide basic sequence alignment types and helpers.

Index

Examples

Constants

This section is empty.

Variables

View Source
var (
	ErrMismatchedTypes     = errors.New("align: mismatched sequence types")
	ErrMismatchedAlphabets = errors.New("align: mismatched alphabets")
	ErrNoAlphabet          = errors.New("align: no alphabet")
	ErrNotGappedAlphabet   = errors.New("align: alphabet does not have gap at position 0")
	ErrTypeNotHandled      = errors.New("align: sequence type not handled")
	ErrMatrixNotSquare     = errors.New("align: scoring matrix is not square")
)

Functions

func Format

func Format(a, b seq.Slicer, f []feat.Pair, gap alphabet.Letter) [2]alphabet.Slice

Format returns a [2]alphabet.Slice representing the formatted alignment of a and b described by the list of feature pairs in f, with gap used to fill gaps in the alignment.

Types

type Affine

type Affine struct {
	Matrix  Linear
	GapOpen int
}

An Affine is a basic affine gap penalty alignment description.

type Aligner

type Aligner interface {
	Align(reference, query AlphabetSlicer) ([]feat.Pair, error)
}

An Aligner aligns the sequence data of two type-matching Slicers, returning an ordered slice of features describing matching and mismatching segments. The sequences to be aligned must have a valid gap letter in the first position of their alphabet; the alphabets {DNA,RNA}{gapped,redundant} and Protein provided by the alphabet package satisfy this.

type AlphabetSlicer

type AlphabetSlicer interface {
	Alphabet() alphabet.Alphabet
	Slice() alphabet.Slice
}

type Fitted

type Fitted Linear

Fitted is the linear gap penalty fitted Needleman-Wunsch aligner type.

func (Fitted) Align

func (a Fitted) Align(reference, query AlphabetSlicer) ([]feat.Pair, error)

Align aligns two sequences using a modified Needleman-Wunsch algorithm that finds a local region of the reference with high similarity to the query. It returns an alignment description or an error if the scoring matrix is not square, or the sequence data types or alphabets do not match.

Example
fsa := &linear.Seq{Seq: alphabet.BytesToLetters([]byte("GTTGACAGACTAGATTCACG"))}
fsa.Alpha = alphabet.DNAgapped
fsb := &linear.Seq{Seq: alphabet.BytesToLetters([]byte("GACAGACGA"))}
fsb.Alpha = alphabet.DNAgapped

//		   Query letter
//  	 -	 A	 C	 G	 T
// -	 0	-5	-5	-5	-5
// A	-5	10	-3	-1	-4
// C	-5	-3	 9	-5	 0
// G	-5	-1	-5	 7	-3
// T	-5	-4	 0	-3	 8
fitted := Fitted{
	{0, -5, -5, -5, -5},
	{-5, 10, -3, -1, -4},
	{-5, -3, 9, -5, 0},
	{-5, -1, -5, 7, -3},
	{-5, -4, 0, -3, 8},
}

aln, err := fitted.Align(fsa, fsb)
if err == nil {
	fmt.Printf("%s\n", aln)
	fa := Format(fsa, fsb, aln, '-')
	fmt.Printf("%s\n%s\n", fa[0], fa[1])
}
Output:

[[3,10)/[0,7)=62 [10,12)/-=-10 [12,14)/[7,9)=17]
GACAGACTAGA
GACAGAC--GA

type FittedAffine

type FittedAffine Affine

FittedAffine is the affine gap penalty fitted Needleman-Wunsch aligner type.

func (FittedAffine) Align

func (a FittedAffine) Align(reference, query AlphabetSlicer) ([]feat.Pair, error)

Align aligns two sequences using a modified Needleman-Wunsch algorithm that finds a local region of the reference with high similarity to the query. It returns an alignment description or an error if the scoring matrix is not square, or the sequence data types or alphabets do not match.

Example
fsa := &linear.Seq{Seq: alphabet.BytesToLetters([]byte("ATTGGCAATGA"))}
fsa.Alpha = alphabet.DNAgapped
fsb := &linear.Seq{Seq: alphabet.BytesToLetters([]byte("ATAGGAA"))}
fsb.Alpha = alphabet.DNAgapped

//		   Query letter
//  	 -	 A	 C	 G	 T
// -	 0	-1	-1	-1	-1
// A	-1	 1	-1	-1	-1
// C	-1	-1	 1	-1	-1
// G	-1	-1	-1	 1	-1
// T	-1	-1	-1	-1	 1
//
// Gap open: -5
fitted := FittedAffine{
	Matrix: Linear{
		{0, -1, -1, -1, -1},
		{-1, 1, -1, -1, -1},
		{-1, -1, 1, -1, -1},
		{-1, -1, -1, 1, -1},
		{-1, -1, -1, -1, 1},
	},
	GapOpen: -5,
}

aln, err := fitted.Align(fsa, fsb)
if err == nil {
	fmt.Printf("%s\n", aln)
	fa := Format(fsa, fsb, aln, '-')
	fmt.Printf("%s\n%s\n", fa[0], fa[1])
}
Output:

[[0,7)/[0,7)=3]
ATTGGCA
ATAGGAA

type Linear

type Linear [][]int

A Linear is a basic linear gap penalty alignment description. It is a square scoring matrix with the first column and first row specifying gap penalties.

type NW

type NW Linear

NW is the linear gap penalty Needleman-Wunsch aligner type.

func (NW) Align

func (a NW) Align(reference, query AlphabetSlicer) ([]feat.Pair, error)

Align aligns two sequences using the Needleman-Wunsch algorithm. It returns an alignment description or an error if the scoring matrix is not square, or the sequence data types or alphabets do not match.

Example
nwsa := &linear.Seq{Seq: alphabet.BytesToLetters([]byte("AGACTAGTTA"))}
nwsa.Alpha = alphabet.DNAgapped
nwsb := &linear.Seq{Seq: alphabet.BytesToLetters([]byte("GACAGACG"))}
nwsb.Alpha = alphabet.DNAgapped

//		   Query letter
//  	 -	 A	 C	 G	 T
// -	 0	-5	-5	-5	-5
// A	-5	10	-3	-1	-4
// C	-5	-3	 9	-5	 0
// G	-5	-1	-5	 7	-3
// T	-5	-4	 0	-3	 8
needle := NW{
	{0, -5, -5, -5, -5},
	{-5, 10, -3, -1, -4},
	{-5, -3, 9, -5, 0},
	{-5, -1, -5, 7, -3},
	{-5, -4, 0, -3, 8},
}

aln, err := needle.Align(nwsa, nwsb)
if err == nil {
	fmt.Printf("%s\n", aln)
	fa := Format(nwsa, nwsb, aln, '-')
	fmt.Printf("%s\n%s\n", fa[0], fa[1])
}
Output:

[[0,1)/-=-5 [1,4)/[0,3)=26 [4,5)/-=-5 [5,10)/[3,8)=12]
AGACTAGTTA
-GAC-AGACG

type NWAffine

type NWAffine Affine

NWAffine is the affine gap penalty Needleman-Wunsch aligner type.

func (NWAffine) Align

func (a NWAffine) Align(reference, query AlphabetSlicer) ([]feat.Pair, error)

Align aligns two sequences using the Needleman-Wunsch algorithm. It returns an alignment description or an error if the scoring matrix is not square, or the sequence data types or alphabets do not match.

Example
nwsa := &linear.Seq{Seq: alphabet.BytesToLetters([]byte("ATAGGAAG"))}
nwsa.Alpha = alphabet.DNAgapped
nwsb := &linear.Seq{Seq: alphabet.BytesToLetters([]byte("ATTGGCAATG"))}
nwsb.Alpha = alphabet.DNAgapped

//		   Query letter
//  	 -	 A	 C	 G	 T
// -	 0	-1	-1	-1	-1
// A	-1	 1	-1	-1	-1
// C	-1	-1	 1	-1	-1
// G	-1	-1	-1	 1	-1
// T	-1	-1	-1	-1	 1
//
// Gap open: -5
needle := NWAffine{
	Matrix: Linear{
		{0, -1, -1, -1, -1},
		{-1, 1, -1, -1, -1},
		{-1, -1, 1, -1, -1},
		{-1, -1, -1, 1, -1},
		{-1, -1, -1, -1, 1},
	},
	GapOpen: -5,
}

aln, err := needle.Align(nwsa, nwsb)
if err == nil {
	fmt.Printf("%s\n", aln)
	fa := Format(nwsa, nwsb, aln, '-')
	fmt.Printf("%s\n%s\n", fa[0], fa[1])
}
Output:

[[0,7)/[0,7)=3 -/[7,9)=-7 [7,8)/[9,10)=1]
ATAGGAA--G
ATTGGCAATG

type SW

type SW Linear

SW is the Smith-Waterman aligner type. Matrix is a square scoring matrix with the last column and last row specifying gap penalties. Currently gap opening is not considered.

func (SW) Align

func (a SW) Align(reference, query AlphabetSlicer) ([]feat.Pair, error)

Align aligns two sequences using the Smith-Waterman algorithm. It returns an alignment description or an error if the scoring matrix is not square, or the sequence data types or alphabets do not match.

type SWAffine

type SWAffine Affine

SWAffine is the affine gap penalty Smith-Waterman aligner type.

func (SWAffine) Align

func (a SWAffine) Align(reference, query AlphabetSlicer) ([]feat.Pair, error)

Align aligns two sequences using the Smith-Waterman algorithm. It returns an alignment description or an error if the scoring matrix is not square, or the sequence data types or alphabets do not match.

Example
swsa := &linear.Seq{Seq: alphabet.BytesToLetters([]byte("ATAGGAAG"))}
swsa.Alpha = alphabet.DNAgapped
swsb := &linear.Seq{Seq: alphabet.BytesToLetters([]byte("ATTGGCAATG"))}
swsb.Alpha = alphabet.DNAgapped

//		   Query letter
//  	 -	 A	 C	 G	 T
// -	 0	-1	-1	-1	-1
// A	-1	 1	-1	-1	-1
// C	-1	-1	 1	-1	-1
// G	-1	-1	-1	 1	-1
// T	-1	-1	-1	-1	 1
//
// Gap open: -5
smith := SWAffine{
	Matrix: Linear{
		{0, -1, -1, -1, -1},
		{-1, 1, -1, -1, -1},
		{-1, -1, 1, -1, -1},
		{-1, -1, -1, 1, -1},
		{-1, -1, -1, -1, 1},
	},
	GapOpen: -5,
}

aln, err := smith.Align(swsa, swsb)
if err == nil {
	fmt.Printf("%s\n", aln)
	fa := Format(swsa, swsb, aln, '-')
	fmt.Printf("%s\n%s\n", fa[0], fa[1])
}
Output:

[[0,7)/[0,7)=3]
ATAGGAA
ATTGGCA

Directories

Path Synopsis
Package matrix provides a variety of alignment scoring matrices for sequence alignment.
Package matrix provides a variety of alignment scoring matrices for sequence alignment.
Package pals implements functions and methods required for PALS sequence alignment.
Package pals implements functions and methods required for PALS sequence alignment.
dp
Package providing PALS dynamic programming alignment routines.
Package providing PALS dynamic programming alignment routines.
filter
Package providing PALS sequence hit filter routines based on 'Efficient q-gram filters for finding all 𝛜-matches over a given length.' Kim R. Rasmussen, Jens Stoye, and Eugene W. Myers.
Package providing PALS sequence hit filter routines based on 'Efficient q-gram filters for finding all 𝛜-matches over a given length.' Kim R. Rasmussen, Jens Stoye, and Eugene W. Myers.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL