package sam

v1.0.6

Note: this package is not in the latest version of its module.

Published: Aug 24, 2018 License: BSD-3-Clause Imports: 18 Imported by: 0

Documentation

Overview

Package sam is a library for parsing and representing SAM files, and for efficiently executing sequencing pipelines on .sam/.bam/.cram files, taking advantage of modern multi-core processors.

Modifications to headers and alignments are expressed as filters. The library comes with a number of commonly used pre-defined filters, but you can also define and use your own filters. A pipeline can be executed with the RunPipeline method of the PipelineInput interface, which accepts SAM/BAM/CRAM files as input and/or output sources, but can also operate on an in-memory representation of such files. PipelineInput and PipelineOutput can be implemented to also operate on other input/output sources, such as databases.

elPrep provides high-level Filter and AlignmentFilter types that operate on SAM file header and alignment structs. elPrep then uses the pargo library for expressing pipelines of such filters for efficient parallel execution. It is normally not necessary to deal with pargo pipelines directly, but you can check the documentation at https://godoc.org/github.com/ExaScience/pargo/pipeline for details of pargo pipelines if necessary.

Constants

const (
	SamExt  = ".sam"
	BamExt  = ".bam"
	CramExt = ".cram"
)

SAM file extensions.

const (
	FileFormatVersion = "1.5"
	FileFormatDate    = "1 Jun 2017"
)

The SAM file format version and date strings supported by this library. This is entered by default in an @HD line in the header section of a SAM file, unless user code explicitly asks for a different version number. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.

const (
	// Template having multiple segments in sequencing.
	Multiple = 0x1

	// Each segment properly aligned according to the aligner.
	Proper = 0x2

	// Segment unmapped.
	Unmapped = 0x4

	// Next segment in the template unmapped.
	NextUnmapped = 0x8

	// SEQ being reverse complemented.
	Reversed = 0x10

	// SEQ of the next segment in the template being reverse
	// complemented.
	NextReversed = 0x20

	// The first segment in the template.
	First = 0x40

	// The last segment in the template.
	Last = 0x80

	// Secondary alignment.
	Secondary = 0x100

	// Not passing filters, such as platform/vendor quality controls.
	QCFailed = 0x200

	// PCR or optical duplicate.
	Duplicate = 0x400

	// Supplementary alignment.
	Supplementary = 0x800
)

Bit values for the FLAG field in the Alignment struct. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.
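
As an illustration of how these bit values combine, the following self-contained sketch decodes a FLAG value into the names of the bits that are set. The constants are copied from the list above; describeFlag and flagNames are hypothetical helpers that are not part of this package.

```go
package main

import (
	"fmt"
	"strings"
)

// FLAG bit values, copied from the constants listed above.
const (
	Multiple      = 0x1
	Proper        = 0x2
	Unmapped      = 0x4
	NextUnmapped  = 0x8
	Reversed      = 0x10
	NextReversed  = 0x20
	First         = 0x40
	Last          = 0x80
	Secondary     = 0x100
	QCFailed      = 0x200
	Duplicate     = 0x400
	Supplementary = 0x800
)

// flagNames lists the bit names in ascending bit order (hypothetical helper).
var flagNames = []string{
	"Multiple", "Proper", "Unmapped", "NextUnmapped",
	"Reversed", "NextReversed", "First", "Last",
	"Secondary", "QCFailed", "Duplicate", "Supplementary",
}

// describeFlag returns the names of all bits set in flag, joined by "|".
func describeFlag(flag uint16) string {
	var names []string
	for i, name := range flagNames {
		if flag&(1<<uint(i)) != 0 {
			names = append(names, name)
		}
	}
	return strings.Join(names, "|")
}

func main() {
	// 99 = 0x1 + 0x2 + 0x20 + 0x40
	fmt.Println(describeFlag(99)) // Multiple|Proper|NextReversed|First
	fmt.Println(describeFlag(Unmapped | Duplicate))
}
```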

const CigarOperations = "MmIiDdNnSsHhPpXx="

CigarOperations contains all valid CIGAR operations.

Variables

var (
	CC = utils.Intern("CC")
	LB = utils.Intern("LB")
	PG = utils.Intern("PG")
	PU = utils.Intern("PU")
	RG = utils.Intern("RG")
)

Symbols for some commonly used optional fields. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.

var (
	LIBID = utils.Intern("LIBID")
	REFID = utils.Intern("REFID")
)

Symbols for some temporary fields.

Functions

func AlignmentToString

func AlignmentToString(p *pipeline.Pipeline, _ pipeline.NodeKind, _ *int) (receiver pipeline.Receiver, _ pipeline.Finalizer)

AlignmentToString returns a pargo pipeline.Receiver that formats slices of Alignment pointers into slices of strings representing these alignments according to the SAM file format. See http://samtools.github.io/hts-specs/SAMv1.pdf - Sections 1.4 and 1.5.

func ComposeFilters

func ComposeFilters(header *Header, hdrFilters []Filter) (receiver pipeline.Receiver)

ComposeFilters takes a Header and a slice of Filter functions, and successively calls these functions to generate the corresponding AlignmentFilter predicates. It then returns a pargo pipeline.Receiver that applies these AlignmentFilter predicates on the slices of Alignment pointers it receives. ComposeFilters may return nil if all AlignmentFilters are nil.
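
The composition described above can be sketched without pargo. The example below uses minimal local stand-ins for Header, Alignment, Filter, and AlignmentFilter, and returns a plain function over a batch instead of a pipeline.Receiver. The names compose, removeUnmapped, and headerOnly are hypothetical; the real ComposeFilters may differ in how it applies the predicates.

```go
package main

import "fmt"

// Minimal local stand-ins for the package's types.
type Header struct{ SO string }

type Alignment struct {
	QNAME string
	FLAG  uint16
}

type AlignmentFilter func(*Alignment) bool
type Filter func(*Header) AlignmentFilter

const Unmapped = 0x4

// removeUnmapped is a hypothetical Filter that drops unmapped reads.
func removeUnmapped(_ *Header) AlignmentFilter {
	return func(aln *Alignment) bool { return aln.FLAG&Unmapped == 0 }
}

// headerOnly is a hypothetical Filter that only edits the header and
// therefore returns nil instead of an AlignmentFilter.
func headerOnly(hdr *Header) AlignmentFilter {
	hdr.SO = "unknown"
	return nil
}

// compose calls every Filter with the header, keeps the non-nil
// AlignmentFilters, and returns a function that filters a batch in place.
// It returns nil when no AlignmentFilter remains.
func compose(hdr *Header, filters []Filter) func([]*Alignment) []*Alignment {
	var preds []AlignmentFilter
	for _, f := range filters {
		if p := f(hdr); p != nil {
			preds = append(preds, p)
		}
	}
	if len(preds) == 0 {
		return nil
	}
	return func(alns []*Alignment) []*Alignment {
		kept := alns[:0]
	next:
		for _, aln := range alns {
			for _, p := range preds {
				if !p(aln) {
					continue next // one predicate rejected the alignment
				}
			}
			kept = append(kept, aln)
		}
		return kept
	}
}

func main() {
	hdr := &Header{SO: "coordinate"}
	run := compose(hdr, []Filter{headerOnly, removeUnmapped})
	batch := []*Alignment{{QNAME: "r1"}, {QNAME: "r2", FLAG: Unmapped}}
	for _, aln := range run(batch) {
		fmt.Println(aln.QNAME)
	}
	fmt.Println(hdr.SO)
}
```

Because a caller may receive nil, it should check the result before using it.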

func CoordinateLess

func CoordinateLess(aln1, aln2 *Alignment) bool

CoordinateLess compares two alignments according to their coordinate. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD, SO.

func FormatComment

func FormatComment(out *bufio.Writer, code, comment string) error

FormatComment writes a header comment line in a SAM file header section. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.

func FormatHeaderLine

func FormatHeaderLine(out *bufio.Writer, code string, record utils.StringMap) error

FormatHeaderLine writes a header line in a SAM file header section. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.

func FormatString

func FormatString(out *bufio.Writer, tag, value string) error

FormatString writes a SAM file TAG of type string.

func FormatTag

func FormatTag(out []byte, tag utils.Symbol, value interface{}) ([]byte, error)

FormatTag writes a SAM file TAG by appending its ASCII-string representation to out and returning the result, dispatching on the actual type of the given value. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.

The following types are accepted: byte (A), int32 (i), float32 (f), string (Z), ByteArray (H), []int8 (B:c), []uint8 (B:C), []int16 (B:s), []uint16 (B:S), []int32 (B:i), []uint32 (B:I), and []float32 (B:f).
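
The type dispatch can be pictured with a reduced sketch. The function formatTag below is a hypothetical stand-in: it takes the tag as a plain string rather than a utils.Symbol, handles only five of the accepted types, and makes no claim about whether the real FormatTag emits a leading separator.

```go
package main

import (
	"fmt"
	"strconv"
)

// formatTag appends TAG:TYPE:VALUE to out, dispatching on the dynamic type
// of value. On an unsupported type, out is returned unchanged with an error.
func formatTag(out []byte, tag string, value interface{}) ([]byte, error) {
	start := len(out)
	out = append(out, tag...)
	switch v := value.(type) {
	case byte:
		out = append(out, ":A:"...)
		out = append(out, v)
	case int32:
		out = append(out, ":i:"...)
		out = strconv.AppendInt(out, int64(v), 10)
	case float32:
		out = append(out, ":f:"...)
		out = strconv.AppendFloat(out, float64(v), 'g', -1, 32)
	case string:
		out = append(out, ":Z:"...)
		out = append(out, v...)
	case []int32:
		out = append(out, ":B:i"...)
		for _, n := range v {
			out = append(out, ',')
			out = strconv.AppendInt(out, int64(n), 10)
		}
	default:
		return out[:start], fmt.Errorf("unsupported type %T for tag %s", value, tag)
	}
	return out, nil
}

func main() {
	values := []interface{}{int32(3), "group1", byte('P'), float32(0.5), []int32{1, -2}}
	for _, value := range values {
		buf, err := formatTag(nil, "XX", value)
		fmt.Println(string(buf), err)
	}
}
```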

func IsHeaderUserTag

func IsHeaderUserTag(code string) bool

IsHeaderUserTag determines whether the given tag string represents a user-defined tag.

func MergeSingleEndFilesSplitPerChromosome

func MergeSingleEndFilesSplitPerChromosome(inputPath, output, fai, fasta, inputPrefix, inputExtension string, header *Header) (err error)

MergeSingleEndFilesSplitPerChromosome merges files containing single-end reads that were split with SplitSingleEndFilePerChromosome.

func MergeSortedFilesSplitPerChromosome

func MergeSortedFilesSplitPerChromosome(inputPath, output, fai, fasta, inputPrefix, inputExtension string, header *Header) (err error)

MergeSortedFilesSplitPerChromosome merges files that were split with SplitFilePerChromosome and sorted in coordinate order.

func MergeUnsortedFilesSplitPerChromosome

func MergeUnsortedFilesSplitPerChromosome(inputPath, output, fai, fasta, inputPrefix, inputExtension string, header *Header) (err error)

MergeUnsortedFilesSplitPerChromosome merges files that were split with SplitFilePerChromosome and are unsorted.

func ParseHeaderLineFromString

func ParseHeaderLineFromString(line string) (utils.StringMap, error)

ParseHeaderLineFromString parses a SAM header line from a string, except that entries are separated by white space instead of tabulators. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.

The @ record type code must have already been scanned. ParseHeaderLineFromString cannot be used for @CO lines.
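
A minimal sketch of this kind of parsing, assuming that each entry has the form TAG:VALUE and that a plain map[string]string stands in for utils.StringMap. The function parseHeaderFields is hypothetical, and rejecting duplicate tags is a choice of this sketch rather than documented behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// parseHeaderFields splits a header line body (without the leading @ record
// type code) on white space and then splits each entry into TAG and VALUE.
func parseHeaderFields(line string) (map[string]string, error) {
	record := make(map[string]string)
	for _, entry := range strings.Fields(line) {
		i := strings.IndexByte(entry, ':')
		if i <= 0 {
			return nil, fmt.Errorf("malformed header entry %q", entry)
		}
		tag, value := entry[:i], entry[i+1:]
		if _, seen := record[tag]; seen {
			return nil, fmt.Errorf("duplicate tag %q", tag)
		}
		record[tag] = value
	}
	return record, nil
}

func main() {
	record, err := parseHeaderFields("SN:chr1 LN:248956422")
	fmt.Println(record["SN"], record["LN"], err)
}
```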

func QNAMELess

func QNAMELess(aln1, aln2 *Alignment) bool

QNAMELess compares two alignments according to their query template name. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD, SO.

func SQLN

func SQLN(record utils.StringMap) (int32, error)

SQLN returns the LN field value, assuming that the given record represents an @SQ line in the header section of a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.

If the LN field is present, error is nil unless the value cannot be successfully parsed into an int32. If the LN field is not present, SQLN returns the maximum possible value for LN and a non-nil error value.
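
The two cases can be sketched as follows. The function sqLN is a hypothetical stand-in over a plain map[string]string. It assumes that the maximum possible value for LN is math.MaxInt32, and returning 0 on a parse failure is a choice of this sketch that the text above does not specify.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"strconv"
)

// sqLN returns the LN value of an @SQ record.
func sqLN(record map[string]string) (int32, error) {
	value, ok := record["LN"]
	if !ok {
		// Assumption: the largest LN value is the largest int32.
		return math.MaxInt32, errors.New("LN entry missing in @SQ line")
	}
	n, err := strconv.ParseInt(value, 10, 32)
	if err != nil {
		return 0, err
	}
	return int32(n), nil
}

func main() {
	fmt.Println(sqLN(map[string]string{"SN": "chr1", "LN": "248956422"}))
	fmt.Println(sqLN(map[string]string{"SN": "chr1"}))
}
```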

func SetSQLN

func SetSQLN(record utils.StringMap, value int32)

SetSQLN sets the LN field value, assuming that the given record represents an @SQ line in the header section of a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.

func SkipHeader

func SkipHeader(reader *bufio.Reader) (lines int, err error)

SkipHeader skips the complete header in a SAM file. This is more efficient than calling ParseHeader and ignoring its result.

Returns the number of header lines and a non-nil error value if an error occurred.

func SplitFilePerChromosome

func SplitFilePerChromosome(input, outputPath, outputPrefix, outputExtension, fai, fasta string) (err error)

SplitFilePerChromosome splits a SAM file into: a file containing all unmapped reads, a file containing all pairs where reads map to different chromosomes, and a file per chromosome containing all pairs where the reads map to that chromosome. There are no requirements on the input file for splitting.
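
The three kinds of output can be pictured as a classification of each read. The sketch below is an assumption about how such a classification could look, not the package's implementation: the function bucket and the bucket names "unmapped" and "spread" are hypothetical, and the sketch assumes paired-end input where RNEXT is either "=" or a reference name.

```go
package main

import "fmt"

const unmapped = 0x4 // the Unmapped FLAG bit

// Alignment is a local stand-in with only the fields this sketch needs.
type Alignment struct {
	FLAG         uint16
	RNAME, RNEXT string
}

// bucket names the output a read would go to under the stated assumptions.
func bucket(aln *Alignment) string {
	switch {
	case aln.FLAG&unmapped != 0:
		return "unmapped"
	case aln.RNEXT == "=" || aln.RNEXT == aln.RNAME:
		return aln.RNAME // both reads of the pair map to this chromosome
	default:
		return "spread" // the reads of the pair map to different chromosomes
	}
}

func main() {
	fmt.Println(bucket(&Alignment{FLAG: unmapped, RNAME: "*", RNEXT: "*"}))
	fmt.Println(bucket(&Alignment{RNAME: "chr1", RNEXT: "="}))
	fmt.Println(bucket(&Alignment{RNAME: "chr1", RNEXT: "chr2"}))
}
```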

func SplitSingleEndFilePerChromosome

func SplitSingleEndFilePerChromosome(input, outputPath, outputPrefix, outputExtension, fai, fasta string) (err error)

SplitSingleEndFilePerChromosome splits a SAM file containing single-end reads into a file for the unmapped reads, and a file per chromosome, containing all reads that map to that chromosome. There are no requirements on the input file for splitting.

func StringToAlignment

func StringToAlignment(p *pipeline.Pipeline, _ pipeline.NodeKind, _ *int) (receiver pipeline.Receiver, _ pipeline.Finalizer)

StringToAlignment returns a pargo pipeline.Receiver that parses slices of strings representing alignments according to the SAM file format into slices of pointers to freshly allocated Alignment values. See http://samtools.github.io/hts-specs/SAMv1.pdf - Sections 1.4 and 1.5.

Types

type Alignment

type Alignment struct {
	// The Query template NAME.
	QNAME string

	// The bitwise FLAG.
	FLAG uint16

	// The Reference sequence NAME.
	RNAME string

	// The 1-based leftmost mapping POSition.
	POS int32

	// The MAPping Quality.
	MAPQ byte

	// The CIGAR string.
	CIGAR string

	// The Reference sequence name of the mate/NEXT read.
	RNEXT string

	// The 1-based leftmost mapping Position of the mate/NEXT read.
	PNEXT int32

	// The observed Template LENgth.
	TLEN int32

	// The segment SEQuence.
	SEQ string

	// The ASCII of Phred-scaled base QUALity+33.
	QUAL string

	// The optional fields in a read alignment.
	TAGS utils.SmallMap

	// Additional optional fields which are not stored in SAM files, but
	// reserved for temporary values in filters.
	Temps utils.SmallMap
}

An Alignment represents a single read alignment with mandatory and optional fields that can be contained in a SAM file alignment line. See http://samtools.github.io/hts-specs/SAMv1.pdf - Sections 1.4 and 1.5.
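
To show how the mandatory fields map onto a SAM alignment line, here is a self-contained sketch that parses the eleven tab-separated mandatory columns into a local copy of the struct. The function parseMandatory is hypothetical and ignores optional fields; the package's own parsing goes through StringScanner.ParseAlignment.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Alignment is a local copy of the mandatory fields listed above.
type Alignment struct {
	QNAME string
	FLAG  uint16
	RNAME string
	POS   int32
	MAPQ  byte
	CIGAR string
	RNEXT string
	PNEXT int32
	TLEN  int32
	SEQ   string
	QUAL  string
}

// parseMandatory parses the first eleven tab-separated columns of line.
func parseMandatory(line string) (*Alignment, error) {
	f := strings.Split(line, "\t")
	if len(f) < 11 {
		return nil, fmt.Errorf("expected at least 11 fields, got %d", len(f))
	}
	flag, err := strconv.ParseUint(f[1], 10, 16)
	if err != nil {
		return nil, fmt.Errorf("FLAG: %v", err)
	}
	pos, err := strconv.ParseInt(f[3], 10, 32)
	if err != nil {
		return nil, fmt.Errorf("POS: %v", err)
	}
	mapq, err := strconv.ParseUint(f[4], 10, 8)
	if err != nil {
		return nil, fmt.Errorf("MAPQ: %v", err)
	}
	pnext, err := strconv.ParseInt(f[7], 10, 32)
	if err != nil {
		return nil, fmt.Errorf("PNEXT: %v", err)
	}
	tlen, err := strconv.ParseInt(f[8], 10, 32)
	if err != nil {
		return nil, fmt.Errorf("TLEN: %v", err)
	}
	return &Alignment{
		QNAME: f[0], FLAG: uint16(flag), RNAME: f[2], POS: int32(pos),
		MAPQ: byte(mapq), CIGAR: f[5], RNEXT: f[6], PNEXT: int32(pnext),
		TLEN: int32(tlen), SEQ: f[9], QUAL: f[10],
	}, nil
}

func main() {
	line := "r001\t99\tref\t7\t30\t8M2I4M1D3M\t=\t37\t39\tTTAGATAAAGGATACTG\t*"
	aln, err := parseMandatory(line)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(aln.QNAME, aln.FLAG, aln.RNAME, aln.POS, aln.CIGAR, aln.TLEN)
}
```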

func NewAlignment

func NewAlignment() *Alignment

NewAlignment allocates and initializes an empty alignment.

func (*Alignment) FlagEvery

func (aln *Alignment) FlagEvery(flag uint16) bool

FlagEvery reports whether every bit set in the given flag is also set in aln.FLAG.

func (*Alignment) FlagNotAny

func (aln *Alignment) FlagNotAny(flag uint16) bool

FlagNotAny reports whether none of the bits set in the given flag are set in aln.FLAG.

func (*Alignment) FlagNotEvery

func (aln *Alignment) FlagNotEvery(flag uint16) bool

FlagNotEvery reports whether at least one bit set in the given flag is not set in aln.FLAG.

func (*Alignment) FlagSome

func (aln *Alignment) FlagSome(flag uint16) bool

FlagSome reports whether at least one bit set in the given flag is also set in aln.FLAG.
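
The four predicates FlagEvery, FlagSome, FlagNotAny, and FlagNotEvery can be read as simple mask tests. The sketch below is inferred from their descriptions rather than taken from the package source; the type flags and its methods are hypothetical stand-ins for Alignment and its Flag methods.

```go
package main

import "fmt"

// Two FLAG bits from the constants listed earlier in this document.
const (
	Multiple = 0x1
	Proper   = 0x2
)

// flags is a stand-in for the FLAG field of an Alignment.
type flags uint16

// every: all bits of mask are set.
func (f flags) every(mask uint16) bool { return uint16(f)&mask == mask }

// some: at least one bit of mask is set.
func (f flags) some(mask uint16) bool { return uint16(f)&mask != 0 }

// notAny: no bit of mask is set.
func (f flags) notAny(mask uint16) bool { return uint16(f)&mask == 0 }

// notEvery: at least one bit of mask is missing.
func (f flags) notEvery(mask uint16) bool { return uint16(f)&mask != mask }

func main() {
	f := flags(Multiple) // Multiple is set, Proper is not
	mask := uint16(Multiple | Proper)
	fmt.Println(f.every(mask), f.some(mask), f.notAny(mask), f.notEvery(mask))
	// false true false true
}
```

Note that notAny is the negation of some, and notEvery is the negation of every.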

func (*Alignment) Format

func (aln *Alignment) Format(out []byte) ([]byte, error)

Format writes a SAM file read alignment line by appending its ASCII-string representation to out and returning the result. See http://samtools.github.io/hts-specs/SAMv1.pdf - Sections 1.4 and 1.5.

func (*Alignment) IsDuplicate

func (aln *Alignment) IsDuplicate() bool

IsDuplicate checks for PCR or optical duplicate. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsFirst

func (aln *Alignment) IsFirst() bool

IsFirst checks for being the first segment in the template. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsLast

func (aln *Alignment) IsLast() bool

IsLast checks for being the last segment in the template. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsMultiple

func (aln *Alignment) IsMultiple() bool

IsMultiple checks for template having multiple segments in sequencing. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsNextReversed

func (aln *Alignment) IsNextReversed() bool

IsNextReversed checks for SEQ of the next segment in the template being reverse complemented. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsNextUnmapped

func (aln *Alignment) IsNextUnmapped() bool

IsNextUnmapped checks for next segment in the template unmapped. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsProper

func (aln *Alignment) IsProper() bool

IsProper checks for each segment being properly aligned according to the aligner. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsQCFailed

func (aln *Alignment) IsQCFailed() bool

IsQCFailed checks for not passing filters, such as platform/vendor quality controls. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsReversed

func (aln *Alignment) IsReversed() bool

IsReversed checks for SEQ being reverse complemented. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsSecondary

func (aln *Alignment) IsSecondary() bool

IsSecondary checks for secondary alignment. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsSupplementary

func (aln *Alignment) IsSupplementary() bool

IsSupplementary checks for supplementary alignment. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) IsUnmapped

func (aln *Alignment) IsUnmapped() bool

IsUnmapped checks for segment unmapped. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.2.

func (*Alignment) LIBID

func (aln *Alignment) LIBID() interface{}

LIBID returns the LIBID temporary field.

func (*Alignment) REFID

func (aln *Alignment) REFID() int32

REFID returns the REFID temporary field.

If REFID field is not set, this will panic with a log message. The AddREFID filter can be used to avoid this situation. (The elPrep command line tool ensures that AddREFID is correctly used for its default pipelines.)

func (*Alignment) RG

func (aln *Alignment) RG() interface{}

RG returns the (potentially empty) RG optional field.

func (*Alignment) SetLIBID

func (aln *Alignment) SetLIBID(libid interface{})

SetLIBID sets the LIBID temporary field.

func (*Alignment) SetREFID

func (aln *Alignment) SetREFID(refid int32)

SetREFID sets the REFID temporary field.

func (*Alignment) SetRG

func (aln *Alignment) SetRG(rg interface{})

SetRG sets the RG optional field.

type AlignmentFilter

type AlignmentFilter func(*Alignment) bool

An AlignmentFilter receives an Alignment which it can modify. It returns true if the alignment should be kept, and false if the alignment should be removed.

type AlignmentSorter

type AlignmentSorter struct {
	// contains filtered or unexported fields
}

AlignmentSorter is a helper for sorting Alignment slices that implements https://godoc.org/github.com/ExaScience/pargo/sort#StableSorter

func (AlignmentSorter) Assign

func (s AlignmentSorter) Assign(p psort.StableSorter) func(i, j, len int)

Assign implements the method of the StableSorter interface.

func (AlignmentSorter) Len

func (s AlignmentSorter) Len() int

Len implements the method of the sort.Interface.

func (AlignmentSorter) Less

func (s AlignmentSorter) Less(i, j int) bool

Less implements the method of the sort.Interface.

func (AlignmentSorter) NewTemp

func (s AlignmentSorter) NewTemp() psort.StableSorter

NewTemp implements the method of the StableSorter interface.

func (AlignmentSorter) SequentialSort

func (s AlignmentSorter) SequentialSort(i, j int)

SequentialSort implements the method of the SequentialSorter interface.

type By

type By func(aln1, aln2 *Alignment) bool

By is a type for comparison predicates on Alignment pointers.

func (By) ParallelStableSort

func (by By) ParallelStableSort(alns []*Alignment)

ParallelStableSort sorts a slice of alignments according to the given comparison predicate.
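
The idea of a comparison predicate with a sort method attached can be sketched with the standard library. In the example below, StableSort is a sequential stand-in for ParallelStableSort built on sort.SliceStable, and byCoordinate is a hypothetical predicate that orders by a reference ID and then by POS; it is not claimed to match CoordinateLess exactly.

```go
package main

import (
	"fmt"
	"sort"
)

// Alignment is a local stand-in with only the fields this sketch needs.
type Alignment struct {
	QNAME string
	REFID int32 // stand-in for the REFID temporary field
	POS   int32
}

// By is a comparison predicate on Alignment pointers.
type By func(aln1, aln2 *Alignment) bool

// StableSort sorts alns with the predicate, keeping equal elements in order.
func (by By) StableSort(alns []*Alignment) {
	sort.SliceStable(alns, func(i, j int) bool { return by(alns[i], alns[j]) })
}

// byCoordinate orders by reference ID first and by position second.
func byCoordinate(a, b *Alignment) bool {
	if a.REFID != b.REFID {
		return a.REFID < b.REFID
	}
	return a.POS < b.POS
}

func main() {
	alns := []*Alignment{
		{QNAME: "r1", REFID: 1, POS: 50},
		{QNAME: "r2", REFID: 0, POS: 70},
		{QNAME: "r3", REFID: 0, POS: 10},
		{QNAME: "r4", REFID: 0, POS: 70},
	}
	By(byCoordinate).StableSort(alns)
	for _, aln := range alns {
		fmt.Println(aln.QNAME, aln.REFID, aln.POS)
	}
}
```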

type ByteArray

type ByteArray []byte

ByteArray is a representation for byte arrays as stored in optional fields of read alignment lines using type H. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.

type CigarOperation

type CigarOperation struct {
	Length    int32
	Operation byte // 'M', 'I', 'D', 'N', 'S', 'H', 'P', '=', or 'X'
}

CigarOperation represents a CIGAR operation. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.6.

func ScanCigarString

func ScanCigarString(cigar string) ([]CigarOperation, error)

ScanCigarString converts a CIGAR string to a slice of CigarOperation. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.6.

Uses an internal cache to reduce memory overhead. It is safe for multiple goroutines to call ScanCigarString concurrently.
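
A minimal sketch of the conversion itself, without the internal cache. The function scanCigar is hypothetical; it validates operations against the CigarOperations string from the constants section and, as an assumption based on the comment on the Operation field, normalizes lowercase operations to uppercase. Inputs such as "*" are rejected by this sketch, which says nothing about how ScanCigarString treats them.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

const cigarOperations = "MmIiDdNnSsHhPpXx="

type CigarOperation struct {
	Length    int32
	Operation byte
}

// scanCigar converts a CIGAR string such as "8M2I4M" into operations.
func scanCigar(cigar string) ([]CigarOperation, error) {
	var ops []CigarOperation
	var length int64
	digits := 0
	for i := 0; i < len(cigar); i++ {
		c := cigar[i]
		if c >= '0' && c <= '9' {
			length = length*10 + int64(c-'0')
			digits++
			if length > math.MaxInt32 {
				return nil, fmt.Errorf("length overflow in CIGAR string %q", cigar)
			}
			continue
		}
		// An operation must be valid and preceded by at least one digit.
		if digits == 0 || strings.IndexByte(cigarOperations, c) < 0 {
			return nil, fmt.Errorf("invalid CIGAR string %q", cigar)
		}
		if c >= 'a' && c <= 'z' {
			c -= 'a' - 'A' // assumption: store operations in uppercase
		}
		ops = append(ops, CigarOperation{Length: int32(length), Operation: c})
		length, digits = 0, 0
	}
	if digits != 0 {
		return nil, fmt.Errorf("CIGAR string %q ends with a length", cigar)
	}
	return ops, nil
}

func main() {
	ops, err := scanCigar("8M2I4M1D3M")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, op := range ops {
		fmt.Printf("%d%c ", op.Length, op.Operation)
	}
	fmt.Println()
}
```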

type FieldParser

type FieldParser func(*StringScanner, utils.Symbol) (utils.Symbol, interface{})

FieldParser is the signature for all parsers for optional fields in read alignment lines in SAM files.

type Filter

type Filter func(*Header) AlignmentFilter

A Filter receives a Header and returns an AlignmentFilter or nil.

type GroupingOrder

type GroupingOrder string

GroupingOrder represents the possible values for the GO tag stored in the @HD line of a header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.

const (
	None      GroupingOrder = "none"
	Query     GroupingOrder = "query"
	Reference GroupingOrder = "reference"
)

Grouping orders.

type Header

type Header struct {
	// The @HD line.
	HD utils.StringMap

	// The @SQ, @RG, and @PG lines, in the order they occur in the
	// header.
	SQ, RG, PG []utils.StringMap

	// The @CO lines in the order they occur in the header.
	CO []string

	// The lines with user-defined @ tags, for each tag in the order
	// they occur in the header.
	UserRecords map[string][]utils.StringMap
}

Header represents the information stored in the header section of a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.

Each line (except for @CO) is represented as a map[string]string, mapping string tags to string values.

The zero Header is valid and empty.

func NewHeader

func NewHeader() *Header

NewHeader allocates and initializes an empty header.

func ParseHeader

func ParseHeader(reader *bufio.Reader) (hdr *Header, lines int, err error)

ParseHeader parses a complete header in a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.

Returns a freshly allocated header, the number of header lines, and a non-nil error value if an error occurred during parsing.

func (*Header) AddUserRecord

func (hdr *Header) AddUserRecord(code string, record utils.StringMap)

AddUserRecord adds a header line for the given user-defined @ tag to the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.

func (*Header) EnsureHD

func (hdr *Header) EnsureHD() utils.StringMap

EnsureHD ensures that an @HD line is present in the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.

If an @HD line already exists, it is returned unchanged. Otherwise, the HD field is initialized with a default VN value.

func (*Header) EnsureUserRecords

func (hdr *Header) EnsureUserRecords() map[string][]utils.StringMap

EnsureUserRecords ensures that a map for user-defined @ tags exists in the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.

If the map already exists, it is returned unchanged. Otherwise, the UserRecords field is initialized with an empty map.

func (*Header) Format

func (hdr *Header) Format(out *bufio.Writer) (err error)

Format writes the header section of a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.

func (*Header) HDGO

func (hdr *Header) HDGO() GroupingOrder

HDGO returns the grouping order (GO) stored in the @HD line of the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.

If there is no @HD line, or the GO field is not set, returns "none".

func (*Header) HDSO

func (hdr *Header) HDSO() SortingOrder

HDSO returns the sorting order (SO) stored in the @HD line of the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.

If there is no @HD line, or the SO field is not set, returns "unknown".

func (*Header) SetHDGO

func (hdr *Header) SetHDGO(value GroupingOrder)

SetHDGO sets the grouping order (GO) stored in the @HD line of the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.

This also deletes the value for the SO field if it is set.

func (*Header) SetHDSO

func (hdr *Header) SetHDSO(value SortingOrder)

SetHDSO sets the sorting order (SO) stored in the @HD line of the given header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.

This also deletes the value for the GO field if it is set.
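
The interplay between the HDSO, HDGO, SetHDSO, SetHDGO, and EnsureHD methods can be sketched over a plain map. The type header and its methods below are hypothetical stand-ins. The sketch assumes that the setters create the @HD line when it is missing and that the default VN value is FileFormatVersion; any special handling of individual SortingOrder values is not modeled.

```go
package main

import "fmt"

const fileFormatVersion = "1.5"

// header is a stand-in for Header with only the @HD line.
type header struct{ HD map[string]string }

// ensureHD creates the @HD line with a default VN value if it is missing.
func (h *header) ensureHD() map[string]string {
	if h.HD == nil {
		h.HD = map[string]string{"VN": fileFormatVersion}
	}
	return h.HD
}

// so returns the sorting order, or "unknown" if it is not set.
func (h *header) so() string {
	if v, ok := h.HD["SO"]; ok {
		return v
	}
	return "unknown"
}

// goOrder returns the grouping order, or "none" if it is not set.
func (h *header) goOrder() string {
	if v, ok := h.HD["GO"]; ok {
		return v
	}
	return "none"
}

// setSO sets SO and deletes GO, so that only one of them is present.
func (h *header) setSO(v string) {
	hd := h.ensureHD()
	delete(hd, "GO")
	hd["SO"] = v
}

// setGO sets GO and deletes SO.
func (h *header) setGO(v string) {
	hd := h.ensureHD()
	delete(hd, "SO")
	hd["GO"] = v
}

func main() {
	var h header // the zero value is usable
	fmt.Println(h.so(), h.goOrder())
	h.setSO("coordinate")
	fmt.Println(h.so(), h.goOrder(), h.HD["VN"])
	h.setGO("query")
	fmt.Println(h.so(), h.goOrder())
}
```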

type InputFile

type InputFile struct {
	*bufio.Reader
	*exec.Cmd
	// contains filtered or unexported fields
}

InputFile represents a SAM, BAM, or CRAM file for input.

func Open

func Open(name string, headerOnly bool) (*InputFile, error)

Open a SAM file for input.

If the filename extension is .bam or .cram, use samtools view for input. Tell samtools view to only return the header section for input when headerOnly is true.

samtools must be visible in the directories named by the PATH environment variable for .bam or .cram input.

If the filename extension is not .bam or .cram, then .sam is always assumed.

If the name is "/dev/stdin", then the input is read from os.Stdin.
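
The extension-based dispatch can be sketched as a function that decides whether an external samtools process is needed. The function inputCommand is hypothetical. The samtools view options -h (include the header) and -H (header only) exist, but which exact options this package passes is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// inputCommand returns the command line for reading name through samtools,
// or nil if the file is treated as plain SAM and can be read directly.
func inputCommand(name string, headerOnly bool) []string {
	switch filepath.Ext(name) {
	case ".bam", ".cram":
		args := []string{"samtools", "view"}
		if headerOnly {
			args = append(args, "-H") // header section only
		} else {
			args = append(args, "-h") // header section and alignments
		}
		return append(args, name)
	default:
		return nil // any other extension is assumed to be .sam
	}
}

func main() {
	fmt.Println(inputCommand("reads.bam", false))
	fmt.Println(inputCommand("reads.cram", true))
	fmt.Println(inputCommand("reads.txt", false) == nil)
}
```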

func (*InputFile) Close

func (input *InputFile) Close() error

Close the SAM input file. If samtools view is used for input, wait for its process to finish.

func (*InputFile) SamReader

func (input *InputFile) SamReader() *Reader

SamReader returns the Reader for a SAM, BAM or CRAM InputFile.

type OutputFile

type OutputFile struct {
	*bufio.Writer
	*exec.Cmd
	// contains filtered or unexported fields
}

OutputFile represents a SAM, BAM, or CRAM file for output.

func Create

func Create(name, fai, fasta string) (*OutputFile, error)

Create a SAM file for output.

If the filename extension is .bam or .cram, use samtools view for output. If the filename extension is .cram, then either fai or fasta must be a filename, and the other must be "". If fai is a filename, it is passed as the -t option to samtools view. If fasta is a filename, it is passed as the -T option to samtools view.

samtools must be visible in the directories named by the PATH environment variable for .bam or .cram output.

If the filename extension is not .bam or .cram, then .sam is always assumed.

If the name is "/dev/stdout", then the output is written to os.Stdout.
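
The rule for fai and fasta can be sketched as argument validation. The function outputCommand is hypothetical. The -t and -T options come from the description above; -b, -C, -o, and reading from "-" are existing samtools view options that this sketch assumes, and returning an error for an invalid fai/fasta combination is also an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
)

// outputCommand returns the command line for writing name through samtools,
// or nil if the file is treated as plain SAM and can be written directly.
func outputCommand(name, fai, fasta string) ([]string, error) {
	switch filepath.Ext(name) {
	case ".bam":
		return []string{"samtools", "view", "-b", "-o", name, "-"}, nil
	case ".cram":
		// Exactly one of fai and fasta must be a filename.
		if (fai == "") == (fasta == "") {
			return nil, errors.New("cram output needs either fai or fasta, not both or neither")
		}
		args := []string{"samtools", "view", "-C"}
		if fai != "" {
			args = append(args, "-t", fai)
		} else {
			args = append(args, "-T", fasta)
		}
		return append(args, "-o", name, "-"), nil
	default:
		return nil, nil // any other extension is assumed to be .sam
	}
}

func main() {
	fmt.Println(outputCommand("out.cram", "", "ref.fasta"))
	fmt.Println(outputCommand("out.cram", "", ""))
	fmt.Println(outputCommand("out.sam", "", ""))
}
```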

func (*OutputFile) Close

func (output *OutputFile) Close() error

Close the SAM output file. If samtools view is used for output, wait for its process to finish.

func (*OutputFile) SamWriter

func (output *OutputFile) SamWriter() *Writer

SamWriter returns the Writer for a SAM, BAM or CRAM OutputFile.

type PipelineInput

type PipelineInput interface {
	RunPipeline(output PipelineOutput, filters []Filter, sortingOrder SortingOrder) error
}

A PipelineInput arranges for a pargo pipeline to be properly initialized, arranges for the pipeline to run the given filters, calls output.AddNodes(...), and eventually runs the pipeline. If RunPipeline doesn't encounter an error of its own, it should return the error of its pargo pipeline, if any.

type PipelineOutput

type PipelineOutput interface {
	AddNodes(p *pipeline.Pipeline, header *Header, sortingOrder SortingOrder)
}

A PipelineOutput can add nodes to the given pargo pipeline. AddNodes also receives a header that should be added to the output, and a sortingOrder. AddNodes should arrange for the alignments that it receives to be sorted according to that sortingOrder if possible, or report an error if it can't perform such a sort. Any error should be reported to the pipeline by calling p.Err(err) with a non-nil error value.

type Reader

type Reader bufio.Reader

Reader is a bufio.Reader for a SAM, BAM or CRAM InputFile.

func (*Reader) RunPipeline

func (input *Reader) RunPipeline(output PipelineOutput, hdrFilters []Filter, sortingOrder SortingOrder) error

RunPipeline implements the PipelineInput interface for Reader values that produce input in the SAM file format.

type Sam

type Sam struct {
	Header     *Header
	Alignments []*Alignment
}

Sam represents a complete SAM data set that can be contained in a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.

func NewSam

func NewSam() *Sam

NewSam allocates and initializes an empty SAM data set.

func (*Sam) AddNodes

func (sam *Sam) AddNodes(p *pipeline.Pipeline, header *Header, sortingOrder SortingOrder)

AddNodes implements the PipelineOutput interface for Sam values to represent complete SAM files in memory.

func (*Sam) Format

func (sam *Sam) Format(out *bufio.Writer) error

Format writes a complete SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.

func (*Sam) RunPipeline

func (sam *Sam) RunPipeline(output PipelineOutput, hdrFilters []Filter, sortingOrder SortingOrder) error

RunPipeline implements the PipelineInput interface for Sam values that represent complete SAM files in memory.

type SortingOrder

type SortingOrder string

SortingOrder represents the possible values for the SO tag stored in the @HD line of a header. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3, Tag @HD.

const (
	Keep       SortingOrder = "keep"
	Unknown    SortingOrder = "unknown"
	Unsorted   SortingOrder = "unsorted"
	Queryname  SortingOrder = "queryname"
	Coordinate SortingOrder = "coordinate"
)

Sorting orders.

type StringScanner

type StringScanner struct {
	// contains filtered or unexported fields
}

A StringScanner can be used to scan/parse ASCII strings representing lines in SAM files.

The zero StringScanner is valid and empty.

func (*StringScanner) Err

func (sc *StringScanner) Err() error

Err returns the error that occurred during scanning/parsing.

func (*StringScanner) Len

func (sc *StringScanner) Len() int

Len returns the number of ASCII characters that still need to be scanned/parsed. Returns 0 if Err() would return a non-nil value.

func (*StringScanner) ParseAlignment

func (sc *StringScanner) ParseAlignment() *Alignment

ParseAlignment parses a read alignment line in a SAM file and returns a freshly allocated alignment. See http://samtools.github.io/hts-specs/SAMv1.pdf - Sections 1.4 and 1.5.

func (*StringScanner) ParseByteArray

func (sc *StringScanner) ParseByteArray(tag utils.Symbol) (utils.Symbol, interface{})

ParseByteArray parses a byte array in the tab-delimited Hex format and returns it as a ByteArray. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.

func (*StringScanner) ParseChar

func (sc *StringScanner) ParseChar(tag utils.Symbol) (utils.Symbol, interface{})

ParseChar parses a single tab-delimited character and returns it as a byte. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.

func (*StringScanner) ParseFloat

func (sc *StringScanner) ParseFloat(tag utils.Symbol) (utils.Symbol, interface{})

ParseFloat parses a single tab-delimited float and returns it as a float32. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.

func (*StringScanner) ParseHeaderField

func (sc *StringScanner) ParseHeaderField() (tag, value string)

ParseHeaderField parses a field in a header line in a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.

func (*StringScanner) ParseHeaderLine

func (sc *StringScanner) ParseHeaderLine() utils.StringMap

ParseHeaderLine parses a header line in a SAM file. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.3.

The @ record type code must have already been scanned. ParseHeaderLine cannot be used for @CO lines.

func (*StringScanner) ParseInteger

func (sc *StringScanner) ParseInteger(tag utils.Symbol) (utils.Symbol, interface{})

ParseInteger parses a single tab-delimited integer and returns it as an int32. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.

func (*StringScanner) ParseMandatoryField

func (sc *StringScanner) ParseMandatoryField() string

ParseMandatoryField parses a single tab-delimited mandatory field in a SAM read alignment line and returns it as a string. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.4.

func (*StringScanner) ParseNumericArray

func (sc *StringScanner) ParseNumericArray(tag utils.Symbol) (utils.Symbol, interface{})

ParseNumericArray parses a typed, tab-delimited, and comma-separated integer or numeric array and returns it as a []int8, []uint8, []int16, []uint16, []int32, []uint32, or []float32. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.

func (*StringScanner) ParseOptionalField

func (sc *StringScanner) ParseOptionalField() (tag utils.Symbol, value interface{})

ParseOptionalField parses a single tab-delimited optional field in a SAM read alignment line and returns it as a tag/value pair. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.

The second return value is one of byte (representing an ASCII character), int32, float32, string, ByteArray, []int8, []uint8, []int16, []uint16, []int32, []uint32, or []float32.
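
The mapping from the TYPE character to a Go type can be sketched for the scalar cases. The function parseOptional is hypothetical: it takes one already isolated TAG:TYPE:VALUE field as a string, returns the tag as a string rather than a utils.Symbol, and covers only the types A, i, f, and Z.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseOptional parses one optional field of the form TAG:TYPE:VALUE.
func parseOptional(field string) (string, interface{}, error) {
	// Split into at most three parts so that Z values may contain colons.
	parts := strings.SplitN(field, ":", 3)
	if len(parts) != 3 {
		return "", nil, fmt.Errorf("malformed optional field %q", field)
	}
	tag, typ, text := parts[0], parts[1], parts[2]
	switch typ {
	case "A":
		if len(text) != 1 {
			return "", nil, fmt.Errorf("tag %s: expected one character, got %q", tag, text)
		}
		return tag, text[0], nil // a byte
	case "i":
		n, err := strconv.ParseInt(text, 10, 32)
		if err != nil {
			return "", nil, fmt.Errorf("tag %s: %v", tag, err)
		}
		return tag, int32(n), nil
	case "f":
		f, err := strconv.ParseFloat(text, 32)
		if err != nil {
			return "", nil, fmt.Errorf("tag %s: %v", tag, err)
		}
		return tag, float32(f), nil
	case "Z":
		return tag, text, nil
	default:
		return "", nil, fmt.Errorf("tag %s: type %q not handled in this sketch", tag, typ)
	}
}

func main() {
	for _, field := range []string{"NM:i:3", "RG:Z:group:1", "XT:A:U", "XS:f:0.5"} {
		tag, value, err := parseOptional(field)
		fmt.Printf("%s %T %v %v\n", tag, value, value, err)
	}
}
```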

func (*StringScanner) ParseString

func (sc *StringScanner) ParseString(tag utils.Symbol) (utils.Symbol, interface{})

ParseString parses a single tab-delimited string and returns it. See http://samtools.github.io/hts-specs/SAMv1.pdf - Section 1.5.

func (*StringScanner) Reset

func (sc *StringScanner) Reset(s string)

Reset resets the scanner, and initializes it with the given string.
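
The scanner's contract (a valid zero value, a sticky error, and Len reporting 0 after an error) can be sketched with a small tab-delimited scanner. The type scanner and its field method are hypothetical stand-ins; field plays the role of a method such as ParseMandatoryField, and reporting an error when reading past the end of the line is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// scanner is a stand-in for StringScanner. Its zero value is empty and valid.
type scanner struct {
	data  string
	index int
	err   error
}

// Reset initializes the scanner with s and clears any previous error.
func (sc *scanner) Reset(s string) { sc.data, sc.index, sc.err = s, 0, nil }

// Err returns the first error that occurred during scanning.
func (sc *scanner) Err() error { return sc.err }

// Len returns the number of unread characters, or 0 after an error.
func (sc *scanner) Len() int {
	if sc.err != nil {
		return 0
	}
	return len(sc.data) - sc.index
}

// field returns the text up to the next tab and skips that tab.
func (sc *scanner) field() string {
	if sc.err != nil {
		return ""
	}
	if sc.index >= len(sc.data) {
		sc.err = errors.New("unexpected end of line")
		return ""
	}
	rest := sc.data[sc.index:]
	if i := strings.IndexByte(rest, '\t'); i >= 0 {
		sc.index += i + 1
		return rest[:i]
	}
	sc.index = len(sc.data)
	return rest
}

func main() {
	var sc scanner
	sc.Reset("r001\t99\tref")
	for sc.Len() > 0 {
		fmt.Println(sc.field())
	}
	fmt.Println(sc.Err())
}
```

Checking Err once after a sequence of calls, instead of after every call, is the usage pattern that a sticky error makes possible.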

type Writer

type Writer bufio.Writer

Writer is a bufio.Writer for a SAM, BAM or CRAM OutputFile.

func (*Writer) AddNodes

func (output *Writer) AddNodes(p *pipeline.Pipeline, header *Header, sortingOrder SortingOrder)

AddNodes implements the PipelineOutput interface for Writer values to produce output in the SAM file format.
