blast

package
v0.0.0-...-4206233
Published: Dec 19, 2013 | License: MIT | Imports: 9 | Imported by: 0

Documentation

Overview

Package blast provides functions and types for running any of the BLAST suite of programs. It defines the `Blaster` interface; any value whose type implements it can be passed to this package's `Blast` function to execute a BLAST search.

The results of a BLAST search are captured as XML data and loaded into the `BlastResults` structure automatically.

Note that this package does not execute remote BLAST queries on NCBI's web page; it runs local programs like "blastp" against a local database.

Types

type BlastHSP

type BlastHSP struct {
	XMLName     xml.Name `xml:"Hsp"`
	Num         int      `xml:"Hsp_num"`
	BitScore    float64  `xml:"Hsp_bit-score"`
	Score       float64  `xml:"Hsp_score"`
	EValue      float64  `xml:"Hsp_evalue"`
	QueryFrom   int      `xml:"Hsp_query-from"`
	QueryTo     int      `xml:"Hsp_query-to"`
	HitFrom     int      `xml:"Hsp_hit-from"`
	HitTo       int      `xml:"Hsp_hit-to"`
	PatternFrom int      `xml:"Hsp_pattern-from"`
	PatternTo   int      `xml:"Hsp_pattern-to"`
	QueryFrame  int      `xml:"Hsp_query-frame"`
	HitFrame    int      `xml:"Hsp_hit-frame"`
	Identity    int      `xml:"Hsp_identity"`
	Positive    int      `xml:"Hsp_positive"`
	Gaps        int      `xml:"Hsp_gaps"`
	AlignLength int      `xml:"Hsp_align-len"`
	Density     int      `xml:"Hsp_density"`
	AlignQuery  string   `xml:"Hsp_qseq"`
	AlignHit    string   `xml:"Hsp_hseq"`
	AlignMiddle string   `xml:"Hsp_midline"`
}
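
As an illustration of how the HSP fields relate to one another, the sketch below computes percent identity from `Identity` and `AlignLength`. The `hsp` type and the `percentIdentity` helper are hypothetical and not part of this package; `hsp` only mirrors a few `BlastHSP` fields so the example compiles on its own.

```go
package main

import "fmt"

// hsp mirrors a few fields of BlastHSP (hypothetical local copy).
type hsp struct {
	Identity    int
	Positive    int
	Gaps        int
	AlignLength int
}

// percentIdentity returns the identical positions as a percentage of the
// alignment length, or 0 for an empty alignment.
func percentIdentity(h hsp) float64 {
	if h.AlignLength == 0 {
		return 0
	}
	return 100 * float64(h.Identity) / float64(h.AlignLength)
}

func main() {
	h := hsp{Identity: 45, Positive: 60, Gaps: 2, AlignLength: 90}
	fmt.Printf("%.1f%% identity\n", percentIdentity(h)) // prints "50.0% identity"
}
```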

type BlastHit

type BlastHit struct {
	XMLName   xml.Name   `xml:"Hit"`
	Num       int        `xml:"Hit_num"`
	Id        string     `xml:"Hit_id"`
	Def       string     `xml:"Hit_def"`
	Accession string     `xml:"Hit_accession"`
	Length    int        `xml:"Hit_len"`
	Hsps      []BlastHSP `xml:"Hit_hsps>Hsp"`
}

type BlastIteration

type BlastIteration struct {
	XMLName  xml.Name        `xml:"Iteration"`
	Num      int             `xml:"Iteration_iter-num"`
	QueryID  string          `xml:"Iteration_query-ID"`
	QueryDef string          `xml:"Iteration_query-def"`
	QueryLen int             `xml:"Iteration_query-len"`
	Hits     []BlastHit      `xml:"Iteration_hits>Hit"`
	Stats    BlastStatistics `xml:"Iteration_stat>Statistics"`
	Message  string          `xml:"Iteration_message"`
}

type BlastParams

type BlastParams struct {
	XMLName     xml.Name `xml:"Parameters"`
	Matrix      string   `xml:"Parameters_matrix"`
	Expect      float64  `xml:"Parameters_expect"`
	Include     float64  `xml:"Parameters_include"`
	ScMatch     int      `xml:"Parameters_sc-match"`
	ScMismatch  int      `xml:"Parameters_sc-mismatch"`
	GapOpen     int      `xml:"Parameters_gap-open"`
	GapExtend   int      `xml:"Parameters_gap-extend"`
	Filter      string   `xml:"Parameters_filter"`
	Pattern     string   `xml:"Parameters_pattern"`
	EntrezQuery string   `xml:"Parameters_entrez-query"`
}

type BlastResults

type BlastResults struct {
	XMLName    xml.Name         `xml:"BlastOutput"`
	Program    string           `xml:"BlastOutput_program"`
	Version    string           `xml:"BlastOutput_version"`
	Reference  string           `xml:"BlastOutput_reference"`
	DB         string           `xml:"BlastOutput_db"`
	QueryID    string           `xml:"BlastOutput_query-ID"`
	QueryDef   string           `xml:"BlastOutput_query-def"`
	QueryLen   int              `xml:"BlastOutput_query-len"`
	QuerySeq   string           `xml:"BlastOutput_query-seq"`
	Params     BlastParams      `xml:"BlastOutput_param>Parameters"`
	Iterations []BlastIteration `xml:"BlastOutput_iterations>Iteration"`
}

BlastResults is the top-level struct for representing the XML output of the BLAST family of programs. Nested XML elements are represented by the other `Blast*` types.

The types are meant to be comprehensive with respect to NCBI's DTD found here: http://www.ncbi.nlm.nih.gov/dtd/NCBI_BlastOutput.dtd. Note that the meat is really here: http://www.ncbi.nlm.nih.gov/dtd/NCBI_BlastOutput.mod.dtd.
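
The struct tags above follow the `encoding/xml` conventions, including the `parent>child` path syntax for nested elements. The following is a minimal sketch of how such tags decode a BLAST XML document. The `results`, `iteration`, and `hit` types are cut-down local copies, the `parse` helper is hypothetical, and the sample document is hand-written rather than real BLAST output.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Cut-down local copies of BlastHit, BlastIteration and BlastResults.
type hit struct {
	Num int    `xml:"Hit_num"`
	Def string `xml:"Hit_def"`
}

type iteration struct {
	Num  int   `xml:"Iteration_iter-num"`
	Hits []hit `xml:"Iteration_hits>Hit"`
}

type results struct {
	XMLName    xml.Name    `xml:"BlastOutput"`
	Program    string      `xml:"BlastOutput_program"`
	Iterations []iteration `xml:"BlastOutput_iterations>Iteration"`
}

// sample is a hand-written document, not real BLAST output.
const sample = `<BlastOutput>
  <BlastOutput_program>blastp</BlastOutput_program>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_iter-num>1</Iteration_iter-num>
      <Iteration_hits>
        <Hit>
          <Hit_num>1</Hit_num>
          <Hit_def>hypothetical protein</Hit_def>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>`

// parse decodes a BLAST XML document into the cut-down results type.
func parse(data string) (*results, error) {
	var r results
	if err := xml.Unmarshal([]byte(data), &r); err != nil {
		return nil, err
	}
	return &r, nil
}

func main() {
	r, err := parse(sample)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, it := range r.Iterations {
		for _, h := range it.Hits {
			fmt.Println(r.Program, it.Num, h.Num, h.Def)
		}
	}
}
```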

func Blast

func Blast(blaster Blaster) (*BlastResults, error)

Blast executes the search query described by blaster. It runs BLAST in its XML output format mode and returns the parsed results.

Example

ExampleBlast demonstrates a very simple protein BLAST search. Note that you'll need to change `dbPath` to your own local BLAST database. The one I used in the example is a BLAST database containing all of the protein sequences from each strain of yeast from http://www.yeastgenome.org.

dbPath := "/home/andrew/research/repeats/data/blast/amino"
sequence := seq.Sequence{
	Name: "YAL001C",
	Residues: []seq.Residue(`
	MVLTIYPDELVQIVSDKIASNKGKITLNQLWDISGKYFDLSDKKVKQFVLSCVILKKDIE
	VYCDGAITTKNVTDIIGDANHSYSVGITEDSLWTLLTGYTKKESTIGNSAFELLLEVAKS
	GEKGINTMDLAQVTGQDPRSVTGRIKKINHLLTSSQLIYKGHVVKQLKLKKFSHDGVDSN
	PYINIRDHLATIVEVVKRSKNGIRQIIDLKRELKFDKEKRLSKAFIAAIAWLDEKEYLKK
	VLVVSPKNPAIKIRCVKYVKDIPDSKGSPSFEYDSNSADEDSVSDSKAAFEDEDLVEGLD
	NFNATDLLQNQGLVMEEKEDAVKNEVLLNRFYPLQNQTYDIADKSGLKGISTMDVVNRIT
	GKEFQRAFTKSSEYYLESVDKQKENTGGYRLFRIYDFEGKKKFFRLFTAQNFQKLTNAED
	EISVPKGFDELGKSRTDLKTLNEDNFVALNNTVRFTTDSDGQDIFFWHGELKIPPNSKKT
	PNKNKRKRQVKNSTNASVAGNISNPKRIKLEQHVSTAQEPKSAEDSPSSNGGTVVKGKVV
	NFGGFSARSLRSLQRQRAILKVMNTIGGVAYLREQFYESVSKYMGSTTTLDKKTVRGDVD
	LMVESEKLGARTEPVSGRKIIFLPTVGEDAIQRYILKEKDSKKATFTDVIHDTEIYFFDQ
	TEKNRFHRGKKSVERIRKFQNRQKNAKIKASDDAISKKSTSVNVSDGKIKRRDKKVSAGR
	TTVVVENTKEDKTVYHAGTKDGVQALIRAVVVTKSIKNEIMWDKITKLFPNNSLDNLKKK
	WTARRVRMGHSGWRAYVDKWKKMLVLAIKSEKISLRDVEELDLIKLLDIWTSFDEKEIKR
	PLFLYKNYEENRKKFTLVRDDTLTHSGNDLAMSSMIQREISSLKKTYTRKISASTKDLSK
	SQSDDYIRTVIRSILIESPSTTRNEIEALKNVGNESIDNVIMDMAKEKQIYLHGSKLECT
	DTLPDILENRGNYKDFGVAFQYRCKVNELLEAGNAIVINQEPSDISSWVLIDLISGELLN
	MDVIPMVRNVRPLTYTSRRFEIRTLTPPLIIYANSQTKLNTARKSAVKVPLGKPFSRLWV
	NGSGSIRPNIWKQVVTMVVNEIIFHPGITLSRLQSRCREVLSLHEISEICKWLLERQVLI
	TTDFDGYWVNHNWYSIYEST*
	`),
}

blaster := NewBlastp([]seq.Sequence{sequence}, dbPath)
blaster.SetFlag("evalue", 0.1)

results, err := Blast(blaster)
if err != nil {
	fmt.Println(err)
	return
}

hit := results.Iterations[0].Hits[0].Def
fmt.Println(strings.Contains(strings.ToLower(hit), "tfc3"))
Output:

true

type BlastStatistics

type BlastStatistics struct {
	XMLName      xml.Name `xml:"Statistics"`
	NumSequences int      `xml:"Statistics_db-num"`
	Length       int      `xml:"Statistics_db-len"`
	HSPLength    int      `xml:"Statistics_hsp-len"`
	EffSpace     float64  `xml:"Statistics_eff-space"`
	Kappa        float64  `xml:"Statistics_kappa"`
	Lambda       float64  `xml:"Statistics_lambda"`
	Entropy      float64  `xml:"Statistics_entropy"`
}

type Blaster

type Blaster interface {
	// Executable should return the blast executable to run.
	Executable() string

	// CmdArgs should return a list of command line flags to pass to the
	// blast executable. This list must not include the `-outfmt` flag,
	// since clients of this interface may set it in order to retrieve
	// results in an expected format.
	CmdArgs() []string

	// Stdin, when not nil, will be used for the stdin of the blast process.
	Stdin() io.Reader
}

Blaster represents values that can execute a BLAST search. This package provides some slim implementations of this interface for a couple of variations of BLAST. Clients requiring access to some of BLAST's more sophisticated options should provide their own Blaster.
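
A client-provided Blaster might look like the following minimal sketch. The `fileSearch` type, its fields, and the file and database names are hypothetical; the `Blaster` interface is copied locally so the example compiles on its own. This variant passes the query with `-query` instead of stdin, so `Stdin` returns nil, and it leaves `-outfmt` unset as the interface requires.

```go
package main

import (
	"fmt"
	"io"
	"strconv"
)

// Blaster is a local copy of the interface documented in this package.
type Blaster interface {
	Executable() string
	CmdArgs() []string
	Stdin() io.Reader
}

// fileSearch is a hypothetical Blaster that reads its query from a
// FASTA file on disk rather than from stdin.
type fileSearch struct {
	Program   string  // e.g. "tblastn"
	QueryPath string  // path to a FASTA file
	Database  string  // local BLAST database
	EValue    float64 // expectation value threshold
}

func (s fileSearch) Executable() string { return s.Program }

// CmdArgs deliberately omits -outfmt, as the interface requires.
func (s fileSearch) CmdArgs() []string {
	return []string{
		"-query", s.QueryPath,
		"-db", s.Database,
		"-evalue", strconv.FormatFloat(s.EValue, 'g', -1, 64),
	}
}

// Stdin returns nil because the query is supplied with -query.
func (s fileSearch) Stdin() io.Reader { return nil }

// Compile-time check that fileSearch satisfies Blaster.
var _ Blaster = fileSearch{}

func main() {
	s := fileSearch{
		Program:   "tblastn",
		QueryPath: "query.fasta",
		Database:  "nt_local",
		EValue:    0.001,
	}
	fmt.Println(s.Executable(), s.CmdArgs())
}
```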

type Query

type Query struct {
	// The BLAST executable to use.
	Exec string
	// contains filtered or unexported fields
}

Query is a generic blaster for any type of BLAST search. It provides a thin wrapper around setting command line flags to pass to a BLAST executable.

func NewBlastn

func NewBlastn(queries []seq.Sequence, database string) *Query

NewBlastn calls NewQuery with "blastn" as the executable.

func NewBlastp

func NewBlastp(queries []seq.Sequence, database string) *Query

NewBlastp calls NewQuery with "blastp" as the executable.

func NewQuery

func NewQuery(exec string, queries []seq.Sequence, database string) *Query

NewQuery constructs a generic blast search with default parameters. Parameters can be overridden using the `SetFlag` method.

Note that `queries` may have length 0. If it does, then the obligation is on the caller to set the `-query` flag (or provide some other means of giving BLAST a search query).

This also sets the `-num_threads` flag to the number of logical CPUs on your machine.

func (Query) CmdArgs

func (fs Query) CmdArgs() []string

func (*Query) Executable

func (b *Query) Executable() string

func (*Query) SetFlag

func (b *Query) SetFlag(name string, value interface{})

SetFlag adds a command line switch (without the preceding "-") to the set of BLAST arguments. `value` should be a string, integer, float, bool or other type with an appropriate `Stringer` implementation that results in a valid command line flag value.

If `value` is `false`, then the flag is removed from the BLAST arguments.

func (*Query) Stdin

func (b *Query) Stdin() io.Reader
