sam

package
v4.1.6
Published: Aug 19, 2020 License: AGPL-3.0, AGPL-3.0-or-later Imports: 26 Imported by: 0

Documentation

Overview

Package sam is a library for parsing and representing SAM files, and for efficiently executing sequencing pipelines on .sam/.bam files, taking advantage of modern multi-core processors.

Modifications to headers and alignments are expressed as filters. The library comes with a number of commonly used pre-defined filters, but you can also define and use your own filters. A pipeline can be executed with the RunPipeline method of the PipelineInput interface, which accepts SAM/BAM files as input and/or output sources, but can also operate on an in-memory representation of such files. PipelineInput and PipelineOutput can be implemented to also operate on other input/output sources, such as databases.

elPrep provides high-level Filter and AlignmentFilter types that operate on SAM file header and alignment structs. elPrep then uses the pargo library for expressing pipelines of such filters for efficient parallel execution. It is normally not necessary to deal with pargo pipelines directly, but you can check the documentation at https://godoc.org/github.com/ExaScience/pargo/pipeline for details of pargo pipelines if necessary.

Constants

const (
	SamExt = ".sam"
	BamExt = ".bam"
)

SAM file extensions.

const (
	FileFormatVersion = "1.6"
	FileFormatDate    = "22 May 2018"
)

The SAM file format version and date strings supported by this library. This is entered by default in an @HD line in the header section of a SAM file, unless user code explicitly asks for a different version number. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.

const (
	// Template having multiple segments in sequencing.
	Multiple = 0x1

	// Each segment properly aligned according to the aligner.
	Proper = 0x2

	// Segment unmapped.
	Unmapped = 0x4

	// Next segment in the template unmapped.
	NextUnmapped = 0x8

	// SEQ being reverse complemented.
	Reversed = 0x10

	// SEQ of the next segment in the template being reverse
	// complemented.
	NextReversed = 0x20

	// The first segment in the template.
	First = 0x40

	// The last segment in the template.
	Last = 0x80

	// Secondary alignment.
	Secondary = 0x100

	// Not passing filters, such as platform/vendor quality controls.
	QCFailed = 0x200

	// PCR or optical duplicate.
	Duplicate = 0x400

	// Supplementary alignment.
	Supplementary = 0x800
)

Bit values for the FLAG field in the Alignment struct. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
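As an illustration of how these bit values combine, the following minimal sketch decodes a FLAG value into the names of its set bits. The constants are local copies of the values listed above so that the sketch compiles without importing the package; describeFlag and flagNames are hypothetical helpers, not part of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// Local copies of the FLAG bit values listed above.
const (
	Multiple      = 0x1
	Proper        = 0x2
	Unmapped      = 0x4
	NextUnmapped  = 0x8
	Reversed      = 0x10
	NextReversed  = 0x20
	First         = 0x40
	Last          = 0x80
	Secondary     = 0x100
	QCFailed      = 0x200
	Duplicate     = 0x400
	Supplementary = 0x800
)

// flagNames lists the bit names in ascending bit order, so that
// flagNames[i] corresponds to the bit 1<<i.
var flagNames = []string{
	"Multiple", "Proper", "Unmapped", "NextUnmapped",
	"Reversed", "NextReversed", "First", "Last",
	"Secondary", "QCFailed", "Duplicate", "Supplementary",
}

// describeFlag (hypothetical helper) returns the names of all bits
// that are set in flag, joined with "|".
func describeFlag(flag uint16) string {
	var set []string
	for i, name := range flagNames {
		if flag&(1<<uint(i)) != 0 {
			set = append(set, name)
		}
	}
	return strings.Join(set, "|")
}

func main() {
	// 99 = 0x1 + 0x2 + 0x20 + 0x40
	fmt.Println(describeFlag(99)) // Multiple|Proper|NextReversed|First
	fmt.Println(describeFlag(Unmapped | Duplicate))
}
```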

Variables

var (
	CC = utils.Intern("CC")
	LB = utils.Intern("LB")
	PG = utils.Intern("PG")
	PU = utils.Intern("PU")
	RG = utils.Intern("RG")
)

Symbols for some commonly used optional fields. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.

var (
	LIBID = utils.Intern("LIBID")
	REFID = utils.Intern("REFID")
)

Symbols for some temporary fields.

Functions

func AlignmentToBytes

func AlignmentToBytes(writer *OutputFile) pipeline.Filter

AlignmentToBytes returns a pargo pipeline.Filter that formats slices of Alignment pointers into slices of bytes representing these alignments according to the SAM/BAM file format.

func BytesToAlignment

func BytesToAlignment(reader *InputFile) pipeline.Filter

BytesToAlignment returns a pargo pipeline.Filter that parses slices of bytes representing alignments according to the SAM/BAM file format into slices of pointers to freshly allocated Alignment values.

func BytesToAlignmentFI

func BytesToAlignmentFI(reader *InputFile, setFileIndex bool) pipeline.Filter

BytesToAlignmentFI returns a pargo pipeline.Filter that parses slices of bytes representing alignments according to the SAM/BAM file format into slices of pointers to freshly allocated Alignment values, with an additional option to indicate whether a file index should be recorded with each alignment or not.

func ComposeFilters

func ComposeFilters(header *Header, hdrFilters []Filter) (receiver pipeline.Receiver)

ComposeFilters takes a Header and a slice of Filter functions, and successively calls these functions to generate the corresponding AlignmentFilter predicates. It then returns a pargo pipeline.Receiver that applies these AlignmentFilter predicates on the slices of Alignment pointers it receives. ComposeFilters may return nil if all AlignmentFilters are nil.
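The composition described here can be sketched sequentially as follows. This is not the package's implementation: the real ComposeFilters returns a pargo pipeline.Receiver, whereas the hypothetical compose function below returns a plain function over a slice, and Header and Alignment are reduced stand-in types.

```go
package main

import "fmt"

// Stand-in types mirroring the signatures documented on this page.
type Header struct{}

type Alignment struct {
	QNAME string
	FLAG  uint16
}

type AlignmentFilter func(*Alignment) bool
type Filter func(*Header) AlignmentFilter

// compose (hypothetical) calls each Filter with the header, drops nil
// results, and returns a function that keeps only the alignments
// accepted by every remaining predicate. It returns nil if every
// Filter returned nil.
func compose(header *Header, hdrFilters []Filter) func([]*Alignment) []*Alignment {
	var preds []AlignmentFilter
	for _, f := range hdrFilters {
		if p := f(header); p != nil {
			preds = append(preds, p)
		}
	}
	if len(preds) == 0 {
		return nil
	}
	return func(alns []*Alignment) []*Alignment {
		kept := alns[:0] // filter in place, reusing the backing array
	next:
		for _, aln := range alns {
			for _, p := range preds {
				if !p(aln) {
					continue next
				}
			}
			kept = append(kept, aln)
		}
		return kept
	}
}

func main() {
	dropUnmapped := func(*Header) AlignmentFilter {
		return func(aln *Alignment) bool { return aln.FLAG&0x4 == 0 }
	}
	headerOnly := func(*Header) AlignmentFilter { return nil }

	apply := compose(&Header{}, []Filter{dropUnmapped, headerOnly})
	alns := []*Alignment{{QNAME: "r1", FLAG: 0x4}, {QNAME: "r2"}, {QNAME: "r3", FLAG: 0x1}}
	for _, aln := range apply(alns) {
		fmt.Println(aln.QNAME) // r2, then r3
	}
}
```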

func CoordinateLess

func CoordinateLess(aln1, aln2 *Alignment) bool

CoordinateLess compares two alignments according to their coordinate. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD, SO.

func IsHeaderUserTag

func IsHeaderUserTag(code string) bool

IsHeaderUserTag determines whether this tag string represents a user-defined tag.

func MergeSingleEndFilesSplitPerChromosome

func MergeSingleEndFilesSplitPerChromosome(inputPath, output, inputPrefix, inputExtension string, header *Header, _ int) (funcErr error)

MergeSingleEndFilesSplitPerChromosome merges files containing single-end reads that were split with SplitSingleEndFilePerChromosome.

func MergeSortedFilesSplitPerChromosome

func MergeSortedFilesSplitPerChromosome(inputPath, output, inputPrefix, inputExtension string, header *Header, _ int) (funcErr error)

MergeSortedFilesSplitPerChromosome merges files that were split with SplitFilePerChromosome and sorted in coordinate order.

func MergeUnsortedFilesSplitPerChromosome

func MergeUnsortedFilesSplitPerChromosome(inputPath, output, inputPrefix, inputExtension string, header *Header, _ int) (funcErr error)

MergeUnsortedFilesSplitPerChromosome merges files that were split with SplitFilePerChromosome and are unsorted.

func ParseBamHeader

func ParseBamHeader(reader io.Reader) (hdr *Header, references []BAMReference, err error)

ParseBamHeader parses a complete header in a BAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 4.2.

Returns a freshly allocated header, the BAM-encoded sequence dictionary, and a non-nil error value if an error occurred during parsing.

func ParseHeaderLineFromString

func ParseHeaderLineFromString(line string) (utils.StringMap, error)

ParseHeaderLineFromString parses a SAM header line from a string, except that entries are separated by white space instead of tabulators. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.

The @ record type code must have already been scanned. ParseHeaderLineFromString cannot be used for @CO lines.
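A minimal sketch of this kind of parsing, assuming each entry has the form TAG:VALUE with a two-character tag, is shown below. parseHeaderFields is a hypothetical stand-in that returns a plain map[string]string; rejecting duplicate tags is an assumption of the sketch, not documented behavior of ParseHeaderLineFromString.

```go
package main

import (
	"fmt"
	"strings"
)

// parseHeaderFields (hypothetical) parses whitespace-separated
// TAG:VALUE entries of a header line whose @ record type code has
// already been removed.
func parseHeaderFields(line string) (map[string]string, error) {
	record := make(map[string]string)
	for _, field := range strings.Fields(line) {
		// Expect a two-character tag followed by a colon.
		if len(field) < 3 || field[2] != ':' {
			return nil, fmt.Errorf("malformed header field %q", field)
		}
		tag, value := field[:2], field[3:]
		// Assumption of this sketch: a tag may occur only once per line.
		if _, dup := record[tag]; dup {
			return nil, fmt.Errorf("duplicate tag %q", tag)
		}
		record[tag] = value
	}
	return record, nil
}

func main() {
	record, err := parseHeaderFields("SN:chr1  LN:248956422")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(record["SN"], record["LN"])
}
```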

func QNAMELess

func QNAMELess(aln1, aln2 *Alignment) bool

QNAMELess compares two alignments according to their query template name. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD, SO.

func SQLN

func SQLN(record utils.StringMap) (int32, error)

SQLN returns the LN field value, assuming that the given record represents an @SQ line in the header section of a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.

If the LN field is present, error is nil unless the value cannot be successfully parsed into an int32. If the LN field is not present, SQLN returns the maximum possible value for LN and a non-nil error value.
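The documented return convention can be sketched as follows, using a plain map[string]string in place of utils.StringMap. sqLN is a hypothetical stand-in; the value it returns when parsing fails (0) is an assumption, since the text above only specifies the error.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"strconv"
)

// sqLN (hypothetical) mirrors the convention described for SQLN:
// a missing LN yields the maximum int32 together with an error, and
// an unparsable LN yields an error.
func sqLN(record map[string]string) (int32, error) {
	ln, ok := record["LN"]
	if !ok {
		return math.MaxInt32, errors.New("LN entry missing in @SQ header line")
	}
	n, err := strconv.ParseInt(ln, 10, 32)
	if err != nil {
		// Assumption: the returned value is 0 when parsing fails.
		return 0, fmt.Errorf("invalid LN value %q: %v", ln, err)
	}
	return int32(n), nil
}

func main() {
	fmt.Println(sqLN(map[string]string{"SN": "chr1", "LN": "248956422"}))
	fmt.Println(sqLN(map[string]string{"SN": "chr1"}))
}
```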

func SetSQLN

func SetSQLN(record utils.StringMap, value int32)

SetSQLN sets the LN field value, assuming that the given record represents an @SQ line in the header section of a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.

func SkipSamHeader

func SkipSamHeader(reader *bufio.Reader) (err error)

SkipSamHeader skips the complete header in a SAM file. This is more efficient than calling ParseHeader and ignoring its result.

Returns a non-nil error value if an error occurred.

func SplitFilePerChromosome

func SplitFilePerChromosome(input, outputPath, outputPrefix, outputExtension string, contigGroupSize int) (funcErr error)

SplitFilePerChromosome splits a SAM file into: a file containing all unmapped reads, a file containing all pairs where reads map to different chromosomes, and a file per chromosome containing all pairs where the reads map to that chromosome. There are no requirements on the input file for splitting.

func SplitSingleEndFilePerChromosome

func SplitSingleEndFilePerChromosome(input, outputPath, outputPrefix, outputExtension string, contigGroupSize int) (funcErr error)

SplitSingleEndFilePerChromosome splits a SAM file containing single-end reads into a file for the unmapped reads, and a file per chromosome, containing all reads that map to that chromosome. There are no requirements on the input file for splitting.

Types

type Alignment

type Alignment struct {
	// The Query template NAME.
	QNAME string

	// The Reference sequence NAME.
	RNAME string

	// The 1-based leftmost mapping POSition (as in the SAM format).
	POS int32

	// The bitwise FLAG.
	FLAG uint16

	// The MAPping Quality.
	MAPQ byte

	// The CIGAR string as a slice of CIGAR operations.
	CIGAR []CigarOperation

	// The Reference sequence name of the mate/NEXT read.
	RNEXT string

	// The 1-based leftmost mapping Position of the mate/NEXT read (as in the SAM format).
	PNEXT int32

	// The observed Template LENgth.
	TLEN int32

	// The segment SEQuence (as in the BAM format).
	SEQ Sequence

	// The ASCII of Phred-scaled base QUALity+33.
	// A slice of the Phred-scaled base quality values (as in the BAM format,
	// without the increment of 33 to turn the values into printable ASCII characters).
	QUAL []byte

	// The optional fields in a read alignment.
	TAGS utils.SmallMap

	// Additional optional fields which are not stored in SAM files, but
	// reserved for temporary values in filters.
	Temps utils.SmallMap
}

An Alignment represents a single read alignment with mandatory and optional fields that can be contained in a SAM file alignment line. See http://samtools.github.io/hts-specs/SAMv1.pdf - Sections 1.4 and 1.5. SEQ and QUAL are represented as in the BAM format, see Section 4.2.
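Because QUAL holds raw Phred values rather than printable characters, text taken from a SAM line has to be shifted by 33 in one direction or the other. The sketch below shows that conversion in isolation; phredFromASCII and asciiFromPhred are hypothetical helpers, and the sketch does not handle the "*" placeholder that SAM uses for missing qualities.

```go
package main

import "fmt"

// phredFromASCII (hypothetical) converts the printable QUAL string of
// a SAM line into raw Phred values by subtracting 33.
func phredFromASCII(qual string) []byte {
	out := make([]byte, len(qual))
	for i := 0; i < len(qual); i++ {
		out[i] = qual[i] - 33
	}
	return out
}

// asciiFromPhred (hypothetical) performs the inverse conversion.
func asciiFromPhred(phred []byte) string {
	out := make([]byte, len(phred))
	for i, q := range phred {
		out[i] = q + 33
	}
	return string(out)
}

func main() {
	phred := phredFromASCII("II5#")
	fmt.Println(phred)                 // [40 40 20 2]
	fmt.Println(asciiFromPhred(phred)) // II5#
}
```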

func (*Alignment) FileIndex

func (aln *Alignment) FileIndex() int

FileIndex returns the index of the alignment in the original input file. May return -1 if unknown. This function may be deprecated in the future.

func (*Alignment) FlagEvery

func (aln *Alignment) FlagEvery(flag uint16) bool

FlagEvery reports whether every bit set in the given flag is also set in aln.FLAG.

func (*Alignment) FlagNotAny

func (aln *Alignment) FlagNotAny(flag uint16) bool

FlagNotAny reports whether none of the bits set in the given flag are set in aln.FLAG.

func (*Alignment) FlagNotEvery

func (aln *Alignment) FlagNotEvery(flag uint16) bool

FlagNotEvery reports whether at least one bit set in the given flag is not set in aln.FLAG.

func (*Alignment) FlagSome

func (aln *Alignment) FlagSome(flag uint16) bool

FlagSome reports whether at least one bit set in the given flag is also set in aln.FLAG.
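The four Flag predicates can be read as the following bit tests. This is a sketch of the semantics described above, written as free functions over two uint16 values; the function names are hypothetical and the package's own methods may be implemented differently.

```go
package main

import "fmt"

// Each function takes the alignment's FLAG value and the mask to test.

// flagEvery: every bit of mask is set in alnFlag.
func flagEvery(alnFlag, mask uint16) bool { return alnFlag&mask == mask }

// flagSome: at least one bit of mask is set in alnFlag.
func flagSome(alnFlag, mask uint16) bool { return alnFlag&mask != 0 }

// flagNotEvery: at least one bit of mask is missing from alnFlag.
func flagNotEvery(alnFlag, mask uint16) bool { return alnFlag&mask != mask }

// flagNotAny: no bit of mask is set in alnFlag.
func flagNotAny(alnFlag, mask uint16) bool { return alnFlag&mask == 0 }

func main() {
	const (
		multiple  = 0x1
		proper    = 0x2
		unmapped  = 0x4
		duplicate = 0x400
	)
	var flag uint16 = multiple | proper // a properly paired, mapped read

	fmt.Println(flagEvery(flag, multiple|proper))    // true
	fmt.Println(flagSome(flag, proper|unmapped))     // true
	fmt.Println(flagNotEvery(flag, proper|unmapped)) // true
	fmt.Println(flagNotAny(flag, unmapped|duplicate)) // true
}
```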

func (*Alignment) IsDuplicate

func (aln *Alignment) IsDuplicate() bool

IsDuplicate checks for PCR or optical duplicate. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsFirst

func (aln *Alignment) IsFirst() bool

IsFirst checks for being the first segment in the template. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsLast

func (aln *Alignment) IsLast() bool

IsLast checks for being the last segment in the template. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsMultiple

func (aln *Alignment) IsMultiple() bool

IsMultiple checks for template having multiple segments in sequencing. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsNextReversed

func (aln *Alignment) IsNextReversed() bool

IsNextReversed checks for SEQ of the next segment in the template being reverse complemented. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsNextUnmapped

func (aln *Alignment) IsNextUnmapped() bool

IsNextUnmapped checks for next segment in the template unmapped. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsProper

func (aln *Alignment) IsProper() bool

IsProper checks for each segment being properly aligned according to the aligner. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsQCFailed

func (aln *Alignment) IsQCFailed() bool

IsQCFailed checks for not passing filters, such as platform/vendor quality controls. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsReversed

func (aln *Alignment) IsReversed() bool

IsReversed checks for SEQ being reverse complemented. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsSecondary

func (aln *Alignment) IsSecondary() bool

IsSecondary checks for secondary alignment. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsSupplementary

func (aln *Alignment) IsSupplementary() bool

IsSupplementary checks for supplementary alignment. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsUnmapped

func (aln *Alignment) IsUnmapped() bool

IsUnmapped checks for segment unmapped. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) LIBID

func (aln *Alignment) LIBID() interface{}

LIBID returns the LIBID temporary field.

func (*Alignment) REFID

func (aln *Alignment) REFID() int32

REFID returns the REFID temporary field.

If REFID field is not set, this will panic with a log message. The AddREFID filter can be used to avoid this situation. (The elPrep command line tool ensures that AddREFID is correctly used for its default pipelines.)

func (*Alignment) RG

func (aln *Alignment) RG() interface{}

RG returns the (potentially empty) RG optional field.

func (*Alignment) SetLIBID

func (aln *Alignment) SetLIBID(libid interface{})

SetLIBID sets the LIBID temporary field.

func (*Alignment) SetREFID

func (aln *Alignment) SetREFID(refid int32)

SetREFID sets the REFID temporary field.

func (*Alignment) SetRG

func (aln *Alignment) SetRG(rg interface{})

SetRG sets the RG optional field.

type AlignmentFilter

type AlignmentFilter func(*Alignment) bool

An AlignmentFilter receives an Alignment which it can modify. It returns true if the alignment should be kept, and false if the alignment should be removed.

type AlignmentSorter

type AlignmentSorter struct {
	// contains filtered or unexported fields
}

AlignmentSorter is a helper for sorting Alignment slices that implements https://godoc.org/github.com/ExaScience/pargo/sort#StableSorter

func (AlignmentSorter) Assign

func (s AlignmentSorter) Assign(p psort.StableSorter) func(i, j, len int)

Assign implements the method of the StableSorter interface.

func (AlignmentSorter) Len

func (s AlignmentSorter) Len() int

Len implements the method of the sort.Interface.

func (AlignmentSorter) Less

func (s AlignmentSorter) Less(i, j int) bool

Less implements the method of the sort.Interface.

func (AlignmentSorter) NewTemp

func (s AlignmentSorter) NewTemp() psort.StableSorter

NewTemp implements the method of the StableSorter interface.

func (AlignmentSorter) SequentialSort

func (s AlignmentSorter) SequentialSort(i, j int)

SequentialSort implements the method of the SequentialSorter interface.

type BAMReference

type BAMReference struct {
	Name   string
	Length int32
}

BAMReference is an entry in a slice of BAM-encoded sequence dictionary entries. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 4.2.

func SkipBamHeader

func SkipBamHeader(reader io.Reader) (references []BAMReference, err error)

SkipBamHeader skips the complete header in a BAM file. This is more efficient than calling ParseBamHeader and ignoring its result.

Returns the BAM-encoded sequence dictionary and a non-nil error value if an error occurred during parsing.

type BGZFReader

type BGZFReader struct {
	// contains filtered or unexported fields
}

BGZFReader reads in parallel from a BGZF file.

func NewBGZFReader

func NewBGZFReader(r flate.Reader) (*BGZFReader, error)

NewBGZFReader returns a BGZFReader for the given flate.Reader.

func (*BGZFReader) Close

func (bgzf *BGZFReader) Close() error

Close implements the corresponding method of io.Closer.

func (*BGZFReader) Read

func (bgzf *BGZFReader) Read(p []byte) (n int, err error)

Read implements the corresponding method of io.Reader.

type BGZFWriter

type BGZFWriter struct {
	// contains filtered or unexported fields
}

BGZFWriter writes in parallel to a BGZF file.

func NewBGZFWriter

func NewBGZFWriter(w io.Writer) *BGZFWriter

NewBGZFWriter returns a BGZFWriter for the given io.Writer.

func (*BGZFWriter) Close

func (bgzf *BGZFWriter) Close() error

Close closes this BGZFWriter.

func (*BGZFWriter) Write

func (bgzf *BGZFWriter) Write(p []byte) (n int, err error)

Write implements the corresponding method of io.Writer.

type By

type By func(aln1, aln2 *Alignment) bool

By is a type for comparison predicates on Alignment pointers.

func (By) ParallelStableSort

func (by By) ParallelStableSort(alns []*Alignment)

ParallelStableSort sorts a slice of alignments according to the given comparison predicate.
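The idea of sorting with a comparison predicate can be illustrated with a sequential stand-in built on the standard library's sort.SliceStable. The stableSort method and the byPOS predicate below are hypothetical; the package's ParallelStableSort sorts in parallel, and its CoordinateLess and QNAMELess predicates may compare more fields than this sketch does.

```go
package main

import (
	"fmt"
	"sort"
)

// Reduced stand-in for the Alignment struct.
type Alignment struct {
	QNAME string
	POS   int32
}

// By mirrors the predicate type documented above.
type By func(aln1, aln2 *Alignment) bool

// stableSort (hypothetical) is a sequential analogue of
// ParallelStableSort: equal elements keep their original order.
func (by By) stableSort(alns []*Alignment) {
	sort.SliceStable(alns, func(i, j int) bool { return by(alns[i], alns[j]) })
}

// byPOS (hypothetical) orders alignments by mapping position only.
func byPOS(aln1, aln2 *Alignment) bool { return aln1.POS < aln2.POS }

func main() {
	alns := []*Alignment{
		{QNAME: "r1", POS: 30},
		{QNAME: "r2", POS: 10},
		{QNAME: "r3", POS: 10},
	}
	By(byPOS).stableSort(alns)
	for _, aln := range alns {
		fmt.Println(aln.QNAME, aln.POS) // r2 10, r3 10, r1 30
	}
}
```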

type ByteArray

type ByteArray []byte

ByteArray is a representation for byte arrays as stored in optional fields of read alignments lines using type H. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.

type CigarOperation

type CigarOperation struct {
	Length    int32
	Operation byte // 'M', 'I', 'D', 'N', 'S', 'H', 'P', '=', or 'X'
}

CigarOperation represents a CIGAR operation. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.6.

func ScanCigarString

func ScanCigarString(cigar string) ([]CigarOperation, error)

ScanCigarString converts a CIGAR string to a slice of CigarOperation. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.6.

Uses an internal cache to reduce memory overhead. It is safe for multiple goroutines to call ScanCigarString concurrently.
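To make the conversion concrete, here is a minimal CIGAR scanner without the cache. scanCigar is a hypothetical stand-in; returning an empty result for "*" and rejecting operations without a length are assumptions of this sketch, and it does not guard against length overflow.

```go
package main

import (
	"fmt"
	"strings"
)

// CigarOperation mirrors the struct documented above.
type CigarOperation struct {
	Length    int32
	Operation byte
}

// scanCigar (hypothetical) converts a CIGAR string such as "76M2I3D"
// into a slice of operations.
func scanCigar(cigar string) ([]CigarOperation, error) {
	if cigar == "*" {
		// Assumption: "*" (CIGAR unavailable) yields no operations.
		return nil, nil
	}
	var ops []CigarOperation
	var length int32
	digits := false // whether at least one digit precedes the operation
	for i := 0; i < len(cigar); i++ {
		c := cigar[i]
		switch {
		case c >= '0' && c <= '9':
			length = length*10 + int32(c-'0')
			digits = true
		case digits && strings.IndexByte("MIDNSHP=X", c) >= 0:
			ops = append(ops, CigarOperation{Length: length, Operation: c})
			length, digits = 0, false
		default:
			return nil, fmt.Errorf("invalid CIGAR string %q", cigar)
		}
	}
	if digits {
		return nil, fmt.Errorf("CIGAR string %q ends with a length", cigar)
	}
	return ops, nil
}

func main() {
	ops, err := scanCigar("76M2I3D")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, op := range ops {
		fmt.Printf("%d%c\n", op.Length, op.Operation)
	}
}
```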

type Filter

type Filter func(*Header) AlignmentFilter

A Filter receives a Header and returns an AlignmentFilter or nil.
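The two-stage shape of a Filter, with header work done once and per-alignment work returned as a predicate, might look like the sketch below. Header and Alignment are reduced stand-in types, addComment and keepKnownReferences are hypothetical filters, and the reading that a nil result means "no per-alignment work" is an assumption based on the description of ComposeFilters.

```go
package main

import "fmt"

// Reduced stand-ins for the types documented on this page.
type Header struct {
	SQ []map[string]string
	CO []string
}

type Alignment struct {
	QNAME, RNAME string
}

type AlignmentFilter func(*Alignment) bool
type Filter func(*Header) AlignmentFilter

// addComment (hypothetical) only modifies the header, so it returns
// a nil AlignmentFilter.
func addComment(text string) Filter {
	return func(hdr *Header) AlignmentFilter {
		hdr.CO = append(hdr.CO, text)
		return nil
	}
}

// keepKnownReferences (hypothetical) inspects the header once and
// returns a predicate that keeps unplaced reads ("*") and reads whose
// RNAME appears in an @SQ line.
func keepKnownReferences(hdr *Header) AlignmentFilter {
	known := make(map[string]bool, len(hdr.SQ))
	for _, sq := range hdr.SQ {
		known[sq["SN"]] = true
	}
	return func(aln *Alignment) bool {
		return aln.RNAME == "*" || known[aln.RNAME]
	}
}

func main() {
	hdr := &Header{SQ: []map[string]string{{"SN": "chr1", "LN": "1000"}}}

	if addComment("filtered by sketch")(hdr) == nil {
		fmt.Println("header comments:", hdr.CO)
	}

	keep := keepKnownReferences(hdr)
	fmt.Println(keep(&Alignment{QNAME: "r1", RNAME: "chr1"})) // true
	fmt.Println(keep(&Alignment{QNAME: "r2", RNAME: "chrX"})) // false
}
```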

type GroupingOrder

type GroupingOrder string

GroupingOrder represents the possible values for the GO tag stored in the @HD line of a header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.

const (
	None      GroupingOrder = "none"
	Query     GroupingOrder = "query"
	Reference GroupingOrder = "reference"
)

Grouping orders.

type Header

type Header struct {
	// The @HD line.
	HD utils.StringMap

	// The @SQ, @RG, and @PG lines, in the order they occur in the
	// header.
	SQ, RG, PG []utils.StringMap

	// The @CO lines in the order they occur in the header.
	CO []string

	// The lines with user-defined @ tags, for each tag in the order
	// they occur in the header.
	UserRecords map[string][]utils.StringMap
}

Header represents the information stored in the header section of a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.

Each line (except for @CO) is represented as a map[string]string, mapping string tags to string values.

The zero Header is valid and empty.

func NewHeader

func NewHeader() *Header

NewHeader allocates and initializes an empty header.

func ParseSamHeader

func ParseSamHeader(reader *bufio.Reader) (hdr *Header, err error)

ParseSamHeader parses a complete header in a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.

Returns a freshly allocated header and a non-nil error value if an error occurred during parsing.

func (*Header) AddUserRecord

func (hdr *Header) AddUserRecord(code string, record utils.StringMap)

AddUserRecord adds a header line for the given user-defined @ tag to the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.

func (*Header) EnsureHD

func (hdr *Header) EnsureHD() utils.StringMap

EnsureHD ensures that an @HD line is present in the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.

If an @HD line already exists, it is returned unchanged. Otherwise, the HD field is initialized with a default VN value.

func (*Header) EnsureUserRecords

func (hdr *Header) EnsureUserRecords() map[string][]utils.StringMap

EnsureUserRecords ensures that a map for user-defined @ tags exists in the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.

If the map already exists, it is returned unchanged. Otherwise, the UserRecords field is initialized with an empty map.

func (*Header) FormatBam

func (hdr *Header) FormatBam(out []byte) []byte

FormatBam writes the header section of a BAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 4.2.

func (*Header) FormatSam

func (hdr *Header) FormatSam(out []byte) []byte

FormatSam writes the header section of a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.

func (*Header) HDGO

func (hdr *Header) HDGO() GroupingOrder

HDGO returns the grouping order (GO) stored in the @HD line of the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.

If there is no @HD line, or the GO field is not set, returns "none".

func (*Header) HDSO

func (hdr *Header) HDSO() SortingOrder

HDSO returns the sorting order (SO) stored in the @HD line of the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.

If there is no @HD line, or the SO field is not set, returns "unknown".

func (*Header) SetHDGO

func (hdr *Header) SetHDGO(value GroupingOrder)

SetHDGO sets the grouping order (GO) stored in the @HD line of the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.

This also deletes the value for the SO field if it is set.

func (*Header) SetHDSO

func (hdr *Header) SetHDSO(value SortingOrder)

SetHDSO sets the sorting order (SO) stored in the @HD line of the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.

This also deletes the value for the GO field if it is set.
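The interplay between the SO and GO fields can be sketched with a plain map for the @HD line. The methods below are hypothetical stand-ins for EnsureHD, HDSO, SetHDSO, and SetHDGO; in particular, creating the @HD line on demand inside the setters is an assumption of this sketch.

```go
package main

import "fmt"

// Header is a reduced stand-in that only carries the @HD line.
type Header struct {
	HD map[string]string
}

// ensureHD returns the existing @HD line, or creates one with a
// default VN value (the FileFormatVersion listed above).
func (hdr *Header) ensureHD() map[string]string {
	if hdr.HD == nil {
		hdr.HD = map[string]string{"VN": "1.6"}
	}
	return hdr.HD
}

// hdSO returns the SO value, or "unknown" if it is not set.
func (hdr *Header) hdSO() string {
	if so, ok := hdr.HD["SO"]; ok { // reading a nil map is safe in Go
		return so
	}
	return "unknown"
}

// setHDSO sets SO and removes GO, since the two are kept mutually
// exclusive.
func (hdr *Header) setHDSO(value string) {
	hd := hdr.ensureHD()
	delete(hd, "GO")
	hd["SO"] = value
}

// setHDGO sets GO and removes SO.
func (hdr *Header) setHDGO(value string) {
	hd := hdr.ensureHD()
	delete(hd, "SO")
	hd["GO"] = value
}

func main() {
	hdr := &Header{}
	fmt.Println(hdr.hdSO()) // unknown

	hdr.setHDGO("query")
	hdr.setHDSO("coordinate")
	_, hasGO := hdr.HD["GO"]
	fmt.Println(hdr.hdSO(), hasGO) // coordinate false
}
```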

type InputFile

type InputFile struct {
	// contains filtered or unexported fields
}

InputFile represents a SAM or BAM file for input.

func Open

func Open(name string) (*InputFile, error)

Open a SAM or BAM file for input.

If the filename extension is not .bam, then .sam is always assumed.

If the name is "/dev/stdin", then the input is read from os.Stdin.
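The extension rule amounts to a single check on the file name, as in the sketch below. isBam is a hypothetical helper; whether the package compares the extension case-sensitively is not stated here, so the exact comparison is an assumption.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// isBam (hypothetical) reports whether a file name selects the BAM
// format. Any other extension, or none at all, means SAM.
func isBam(name string) bool {
	// Assumption: the comparison is case-sensitive.
	return filepath.Ext(name) == ".bam"
}

func main() {
	for _, name := range []string{"reads.bam", "reads.sam", "reads.txt", "/dev/stdin"} {
		fmt.Println(name, isBam(name))
	}
}
```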

func (*InputFile) Close

func (f *InputFile) Close() error

Close closes the SAM/BAM input file.

func (*InputFile) Data

func (f *InputFile) Data() interface{}

Data implements the method of the pipeline.Source interface.

func (*InputFile) Err

func (f *InputFile) Err() error

Err implements the method of the pipeline.Source interface.

func (*InputFile) Fetch

func (f *InputFile) Fetch(size int) int

Fetch implements the method of the pipeline.Source interface.

func (*InputFile) ParseAlignment

func (f *InputFile) ParseAlignment(block []byte) (*Alignment, error)

ParseAlignment parses a block of bytes into an alignment. For example in a SAM file, each block of bytes must be one line from the alignment section.

func (*InputFile) ParseHeader

func (f *InputFile) ParseHeader() (*Header, error)

ParseHeader fetches a header from a SAM or BAM file.

func (*InputFile) Prepare

func (f *InputFile) Prepare(ctx context.Context) int

Prepare implements the method of the pipeline.Source interface.

func (*InputFile) RunPipeline

func (f *InputFile) RunPipeline(output PipelineOutput, hdrFilters []Filter, sortingOrder SortingOrder) error

RunPipeline implements the PipelineInput interface for SAM/BAM InputFile values.

func (*InputFile) RunPipelineFI

func (f *InputFile) RunPipelineFI(output PipelineOutput, hdrFilters []Filter, sortingOrder SortingOrder, setFileIndex bool) error

RunPipelineFI implements a variant of the PipelineInput interface for SAM/BAM InputFile values, with an additional option to indicate whether a file index should be recorded with each alignment or not.

func (*InputFile) SkipHeader

func (f *InputFile) SkipHeader() error

SkipHeader skips the header section of a SAM or BAM file. This is more efficient than calling ParseHeader and ignoring its result.

type OutputFile

type OutputFile struct {
	// contains filtered or unexported fields
}

OutputFile represents a SAM or BAM file for output.

func Create

func Create(name string) (*OutputFile, error)

Create a SAM or BAM file for output.

If the filename extension is not .bam, then .sam is always assumed.

If the name is "/dev/stdout", then the output is written to os.Stdout.

func (*OutputFile) AddNodes

func (f *OutputFile) AddNodes(p *pipeline.Pipeline, header *Header, sortingOrder SortingOrder)

AddNodes implements the PipelineOutput interface for SAM/BAM OutputFile values.

func (*OutputFile) Close

func (f *OutputFile) Close() error

Close closes a SAM or BAM output file.

func (*OutputFile) FormatAlignment

func (f *OutputFile) FormatAlignment(aln *Alignment, out []byte) ([]byte, error)

FormatAlignment formats an alignment into a block of bytes for a SAM or BAM file.

func (*OutputFile) FormatHeader

func (f *OutputFile) FormatHeader(hdr *Header) error

FormatHeader writes the header to a SAM or BAM file.

func (*OutputFile) Write

func (f *OutputFile) Write(p []byte) (int, error)

Write can be used to write the blocks of bytes from FormatAlignment to the underlying SAM or BAM file.

type PipelineInput

type PipelineInput interface {
	RunPipeline(output PipelineOutput, filters []Filter, sortingOrder SortingOrder) error
}

A PipelineInput arranges for a pargo pipeline to be properly initialized, arranges for the pipeline to run the given filters, calls output.AddNodes(...), and eventually runs the pipeline. If RunPipeline doesn't encounter an error of its own, it should return the error of its pargo pipeline, if any.

type PipelineOutput

type PipelineOutput interface {
	AddNodes(p *pipeline.Pipeline, header *Header, sortingOrder SortingOrder)
}

A PipelineOutput can add nodes to the given pargo pipeline. AddNodes also receives a header that should be added to the output, and a sortingOrder. AddNodes should arrange for the alignments that it receives to be sorted according to that sortingOrder if possible, or report an error if it can't perform such a sort. Any error should be reported to the pipeline by calling p.Err(err) with a non-nil error value.

type Sam

type Sam struct {
	Header     *Header
	Alignments []*Alignment
	// contains filtered or unexported fields
}

Sam represents a complete SAM data set that can be contained in a SAM or BAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.

func NewSam

func NewSam() *Sam

NewSam allocates and initializes an empty SAM data set.

func (*Sam) AddNodes

func (sam *Sam) AddNodes(p *pipeline.Pipeline, header *Header, sortingOrder SortingOrder)

AddNodes implements the PipelineOutput interface for Sam values to represent complete SAM/BAM files in memory.

func (*Sam) NofBatches

func (sam *Sam) NofBatches(n int)

NofBatches sets or gets the number of batches that are created from this Sam value for the next call of RunPipeline.

NofBatches can be called safely by user programs before RunPipeline is called.

If user programs do not call NofBatches, or call it with a value < 1, then the pipeline will choose a reasonable default value that takes runtime.GOMAXPROCS(0) into account.

func (*Sam) RunPipeline

func (sam *Sam) RunPipeline(output PipelineOutput, hdrFilters []Filter, sortingOrder SortingOrder) error

RunPipeline implements the PipelineInput interface for Sam values that represent complete SAM/BAM files in memory.

type Sequence

type Sequence nibbles.Nibbles

Sequence encodes a SAM segment SEQuence as in the BAM format. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 4.2.

func (Sequence) Base

func (seq Sequence) Base(i int) (base byte)

Base returns the base in a SAM segment SEQuence at the given position.

func (Sequence) Len

func (seq Sequence) Len() int

Len returns the length of a SAM segment SEQuence.

func (Sequence) SetBase

func (seq Sequence) SetBase(i int, base byte)

SetBase sets the base in a SAM segment SEQuence at the given position.

func (Sequence) Slice

func (seq Sequence) Slice(low, high int) Sequence

Slice slices a SAM segment SEQuence.
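For readers unfamiliar with the BAM sequence encoding, the following sketch packs bases two per byte using the 4-bit codes of the BAM format ("=ACMGRSVTWYHKDBN", first base in the high nibble). pack and base are hypothetical helpers operating on a plain []byte; the internal layout of the Sequence type itself may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// bamBases maps a 4-bit BAM code (the index) to its base character.
const bamBases = "=ACMGRSVTWYHKDBN"

// pack (hypothetical) encodes a sequence with two bases per byte.
// Characters outside bamBases are stored as N (code 15).
func pack(seq string) []byte {
	packed := make([]byte, (len(seq)+1)/2)
	for i := 0; i < len(seq); i++ {
		code := strings.IndexByte(bamBases, seq[i])
		if code < 0 {
			code = 15
		}
		if i%2 == 0 {
			packed[i/2] |= byte(code) << 4 // even index: high nibble
		} else {
			packed[i/2] |= byte(code) // odd index: low nibble
		}
	}
	return packed
}

// base (hypothetical) returns the base character at position i.
func base(packed []byte, i int) byte {
	b := packed[i/2]
	if i%2 == 0 {
		return bamBases[b>>4]
	}
	return bamBases[b&0xF]
}

func main() {
	packed := pack("ACGTN")
	fmt.Printf("% x\n", packed) // 12 48 f0
	for i := 0; i < 5; i++ {
		fmt.Printf("%c", base(packed, i))
	}
	fmt.Println() // ACGTN
}
```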

type SortingOrder

type SortingOrder string

SortingOrder represents the possible values for the SO tag stored in the @HD line of a header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.

const (
	Keep       SortingOrder = "keep"
	Unknown    SortingOrder = "unknown"
	Unsorted   SortingOrder = "unsorted"
	Queryname  SortingOrder = "queryname"
	Coordinate SortingOrder = "coordinate"
)

Sorting orders.
