pipeline

package
v0.0.0-...-1be76fa Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: May 11, 2020 License: MIT Imports: 25 Imported by: 0

Documentation

Overview

Package pipeline contains a streaming pipeline implementation based on the Gopher Academy article by S. Lampa - Patterns for composable concurrent pipelines in Go (https://blog.gopheracademy.com/advent-2015/composable-pipelines-improvements/)

Index

Constants

View Source
const BUFFERSIZE int = 64

BUFFERSIZE is the size of the buffer used by the pipeline channels

Variables

This section is empty.

Functions

This section is empty.

Types

type AlignCmd

type AlignCmd struct {
	Fasta           bool
	BloomFilter     bool
	MinKmerCoverage float64
	BAMout          string
	NoExactAlign    bool // turn off the exact alignment and BAM output - only used by WASP currently
}

AlignCmd stores the runtime info for the sketch command

type DataStreamer

type DataStreamer struct {
	// contains filtered or unexported fields
}

DataStreamer is a pipeline process that streams data from STDIN/file

func NewDataStreamer

func NewDataStreamer(info *Info) *DataStreamer

NewDataStreamer is the constructor

func (*DataStreamer) Connect

func (proc *DataStreamer) Connect(input []string)

Connect is the method to connect the DataStreamer to some data source

func (*DataStreamer) Run

func (proc *DataStreamer) Run()

Run is the method to run this process, which satisfies the pipeline interface

type EMpathFinder

type EMpathFinder struct {
	// contains filtered or unexported fields
}

EMpathFinder is a pipeline process to identify graph paths using Expectation Maximization

func NewEMpathFinder

func NewEMpathFinder(info *Info) *EMpathFinder

NewEMpathFinder is the constructor

func (*EMpathFinder) Connect

func (proc *EMpathFinder) Connect(previous *GFAreader)

Connect is the method to connect the MCMCpathFinder to the output of a GFAreader

func (*EMpathFinder) ConnectPruner

func (proc *EMpathFinder) ConnectPruner(previous *GraphPruner)

ConnectPruner is the method to connect the MCMCpathFinder to the output of a GraphPruner

func (*EMpathFinder) Run

func (proc *EMpathFinder) Run()

Run is the method to run this process, which satisfies the pipeline interface

type FastqChecker

type FastqChecker struct {
	// contains filtered or unexported fields
}

FastqChecker is a process to quality check FASTQ reads and send the sequence on for mapping

func NewFastqChecker

func NewFastqChecker(info *Info) *FastqChecker

NewFastqChecker is the constructor

func (*FastqChecker) Connect

func (proc *FastqChecker) Connect(previous *FastqHandler)

Connect is the method to join the input of this process with the output of FastqHandler

func (*FastqChecker) Run

func (proc *FastqChecker) Run()

Run is the method to run this process, which satisfies the pipeline interface TODO: I've removed the QC bits for now

type FastqHandler

type FastqHandler struct {
	// contains filtered or unexported fields
}

FastqHandler is a pipeline process to convert a pipeline to the FASTQ type

func NewFastqHandler

func NewFastqHandler(info *Info) *FastqHandler

NewFastqHandler is the constructor

func (*FastqHandler) Connect

func (proc *FastqHandler) Connect(previous *DataStreamer)

Connect is the method to join the input of this process with the output of a DataStreamer

func (*FastqHandler) ConnectWASM

func (proc *FastqHandler) ConnectWASM(previous *WASMstreamer)

ConnectWASM is a tmp solution for WASM

func (*FastqHandler) Run

func (proc *FastqHandler) Run()

Run is the method to run this process, which satisfies the pipeline interface

type GFAreader

type GFAreader struct {
	// contains filtered or unexported fields
}

GFAreader is a pipeline process that reads in the weighted GFAs

func NewGFAreader

func NewGFAreader(info *Info) *GFAreader

NewGFAreader is the constructor

func (*GFAreader) Connect

func (proc *GFAreader) Connect(input []string)

Connect is the method to connect the GFAreader to some data source

func (*GFAreader) Run

func (proc *GFAreader) Run()

Run is the method to run this process, which satisfies the pipeline interface

type GraphPruner

type GraphPruner struct {
	// contains filtered or unexported fields
}

GraphPruner is a pipeline process to prune the graphs post mapping

func NewGraphPruner

func NewGraphPruner(info *Info, conH bool) *GraphPruner

NewGraphPruner is the constructor

func (*GraphPruner) CollectOutput

func (proc *GraphPruner) CollectOutput() []string

CollectOutput is a method to return what paths are left post-pruning

func (*GraphPruner) Connect

func (proc *GraphPruner) Connect(previous *ReadMapper)

Connect is the method to join the input of this process with the output of ReadMapper

func (*GraphPruner) Run

func (proc *GraphPruner) Run()

Run is the method to run this process, which satisfies the pipeline interface

type GraphSketcher

type GraphSketcher struct {
	// contains filtered or unexported fields
}

GraphSketcher is a pipeline process that windows graph traversals and sketches them

func NewGraphSketcher

func NewGraphSketcher(info *Info) *GraphSketcher

NewGraphSketcher is the constructor

func (*GraphSketcher) Connect

func (proc *GraphSketcher) Connect(previous *MSAconverter)

Connect is the method to connect the MSAconverter to some data source

func (*GraphSketcher) Run

func (proc *GraphSketcher) Run()

Run is the method to run this process, which satisfies the pipeline interface

type HaploCmd

type HaploCmd struct {
	Cutoff        float64
	MinIterations int
	MaxIterations int
	TotalKmers    int
	HaploDir      string
}

HaploCmd stores the runtime info for the haplotype command

type HaplotypeParser

type HaplotypeParser struct {
	// contains filtered or unexported fields
}

HaplotypeParser is a pipeline process to parse the paths produced by the MCMCpathFinder process

func NewHaplotypeParser

func NewHaplotypeParser(info *Info) *HaplotypeParser

NewHaplotypeParser is the constructor

func (*HaplotypeParser) CollectOutput

func (proc *HaplotypeParser) CollectOutput() []string

CollectOutput is a method to return what paths are found via MCMC

func (*HaplotypeParser) Connect

func (proc *HaplotypeParser) Connect(previous *EMpathFinder)

Connect is the method to connect the HaplotypeParser to the output of a EMpathFinder

func (*HaplotypeParser) Run

func (proc *HaplotypeParser) Run()

Run is the method to run this process, which satisfies the pipeline interface

type Info

type Info struct {
	Version              string
	NumProc              int
	Profiling            bool
	KmerSize             int
	SketchSize           int
	WindowSize           int
	NumPart              int
	MaxK                 int
	MaxSketchSpan        int
	ContainmentThreshold float64
	IndexDir             string
	Store                graph.Store

	// the following fields are not written to disk
	Sketch    AlignCmd
	Haplotype HaploCmd
	// contains filtered or unexported fields
}

Info stores the runtime information

func (*Info) AttachDB

func (Info *Info) AttachDB(db *lshe.ContainmentIndex)

AttachDB is a method to attach a LSH Ensemble index to the runtime

func (*Info) Dump

func (Info *Info) Dump(path string) error

Dump is a method to dump the pipeline info to file

func (*Info) Load

func (Info *Info) Load(path string) error

Load is a method to load Info from file

func (*Info) LoadFromBytes

func (Info *Info) LoadFromBytes(data []byte) error

LoadFromBytes is a method to load Info from bytes

func (*Info) SaveDB

func (Info *Info) SaveDB(filePath string) error

SaveDB is a method to write an LSH Ensemble index to disk

type MSAconverter

type MSAconverter struct {
	// contains filtered or unexported fields
}

MSAconverter is a pipeline process that converts a list of MSAs to GFAs

func NewMSAconverter

func NewMSAconverter(info *Info) *MSAconverter

NewMSAconverter is the constructor

func (*MSAconverter) Connect

func (proc *MSAconverter) Connect(input []string)

Connect is the method to connect the MSAconverter to some data source

func (*MSAconverter) Run

func (proc *MSAconverter) Run()

Run is the method to run this process, which satisfies the pipeline interface

type Pipeline

type Pipeline struct {
	// contains filtered or unexported fields
}

Pipeline is the base type, which takes any types that satisfy the process interface

func NewPipeline

func NewPipeline() *Pipeline

NewPipeline is the pipeline constructor

func (*Pipeline) AddProcess

func (Pipeline *Pipeline) AddProcess(proc process)

AddProcess is a method to add a single process to the pipeline

func (*Pipeline) AddProcesses

func (Pipeline *Pipeline) AddProcesses(procs ...process)

AddProcesses is a method to add multiple processes to the pipeline

func (*Pipeline) GetNumProcesses

func (Pipeline *Pipeline) GetNumProcesses() int

GetNumProcesses is a method to return the number of processes registered in a pipeline

func (*Pipeline) Run

func (Pipeline *Pipeline) Run()

Run is a method that starts the pipeline

type ReadMapper

type ReadMapper struct {
	// contains filtered or unexported fields
}

ReadMapper is a pipeline process to query the LSH database, map reads and project alignments onto graphs

func NewReadMapper

func NewReadMapper(info *Info) *ReadMapper

NewReadMapper is the constructor

func (*ReadMapper) CollectReadStats

func (proc *ReadMapper) CollectReadStats() [4]int

CollectReadStats is a method to return the number of reads processed, how many mapped and the number of multimaps

func (*ReadMapper) Connect

func (proc *ReadMapper) Connect(previous *FastqChecker)

Connect is the method to join the input of this process with the output of FastqChecker

func (*ReadMapper) Run

func (proc *ReadMapper) Run()

Run is the method to run this process, which satisfies the pipeline interface

type SketchIndexer

type SketchIndexer struct {
	// contains filtered or unexported fields
}

SketchIndexer is a pipeline process that adds sketches to the LSH Ensemble

func NewSketchIndexer

func NewSketchIndexer(info *Info) *SketchIndexer

NewSketchIndexer is the constructor

func (*SketchIndexer) Connect

func (proc *SketchIndexer) Connect(previous *GraphSketcher)

Connect is the method to connect the MSAconverter to some data source

func (*SketchIndexer) Run

func (proc *SketchIndexer) Run()

Run is the method to run this process, which satisfies the pipeline interface

type WASMstreamer

type WASMstreamer struct {
	// contains filtered or unexported fields
}

WASMstreamer is a pipeline process that streams data from the WASM JS function

func NewWASMstreamer

func NewWASMstreamer() *WASMstreamer

NewWASMstreamer is the constructor

func (*WASMstreamer) ConnectChan

func (proc *WASMstreamer) ConnectChan(inputChan chan []byte)

ConnectChan is a to connect the pipeline to the WASM JS function

func (*WASMstreamer) Run

func (proc *WASMstreamer) Run()

Run is the method to run this process, which satisfies the pipeline interface

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL