fieldio

package
v0.0.0-...-9ba24aa Latest
Warning

This package is not in the latest version of its module.

Published: Mar 1, 2024 License: Apache-2.0 Imports: 24 Imported by: 0

Documentation

Overview

Package fieldio provides a reader and a writer for an individual column (field).

Code generated from " ../../../../base/gtl/generate.py --prefix=unsafe -DELEM=int32 --package=fieldio --output=unsafeint32.go ../../../../base/gtl/unsafe.go.tpl ". DO NOT EDIT.

Constants

const (
	// FieldIndexMagic is the value of FieldIndex.Magic.
	FieldIndexMagic = uint64(0xe360ac9026052aca)
)
const SizeofSliceHeader = int(unsafe.Sizeof(reflect.SliceHeader{}))

SizeofSliceHeader is the in-memory size of a slice header. Usually 3*(CPU word size), since reflect.SliceHeader holds a data pointer, a length, and a capacity.

Variables

This section is empty.

Functions

func SeekReaders

func SeekReaders(requestedRange biopb.CoordRange, coordReader *Reader, columns []ColumnSeeker) error

SeekReaders arranges the readers (coordReader and columns[]) to read the requestedRange. After a successful return, the read pointer of every reader will be at requestedRange.StartAddr.

func SeqBytes

func SeqBytes(n int) int

SeqBytes computes the size of a sam.Seq.Seq that stores n bases. It returns ⌈n/2⌉, since each base consumes 4 bits.
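The rounding can be illustrated with a minimal sketch. The function seqBytes below is a hypothetical stand-in written for this example, not the package's own code; it only assumes the 4-bits-per-base packing stated above.

```go
package main

import "fmt"

// seqBytes is a hypothetical stand-in for fieldio.SeqBytes. It returns the
// number of bytes needed to hold n bases at 4 bits per base, i.e. ceil(n/2).
func seqBytes(n int) int {
	// Adding 1 before the integer division rounds odd counts up.
	return (n + 1) / 2
}

func main() {
	for _, n := range []int{0, 1, 2, 5} {
		fmt.Printf("%d bases -> %d bytes\n", n, seqBytes(n))
	}
}
```

An odd base count leaves the low 4 bits of the last byte unused, which is why the result rounds up.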

Types

type AuxMetadata

type AuxMetadata struct {
	Tags []AuxTagHeader
}

type AuxTagHeader

type AuxTagHeader struct {
	// Two-letter tag name + datatype ('Z', 'H', 'i', etc)
	Name [3]byte
	// Length of the payload part (excluding the first three letters).
	Len int
}

type ColumnSeeker

type ColumnSeeker interface {
	// Seek arranges the reader to read the given range. If the reader has a
	// record in the range, it should move the read pointer to the block
	// containing r.StartAddr and return the start address of the block.  If the
	// reader has no record in the requested range, or on any error, it should
	// return false.
	Seek(r biopb.CoordRange) (biopb.Coord, bool)

	// Skip skips one record. It will be called repeatedly after Seek().
	Skip()
}

ColumnSeeker is an interface used by SeekReaders to seek PAM field files. Thread compatible.
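To make the Seek/Skip contract concrete, here is a rough sketch of a type with the same shape. Everything in it is an assumption made for illustration: coord and coordRange are simplified stand-ins for biopb.Coord and biopb.CoordRange, the range is treated as half-open, and blockSeeker keeps its records in memory rather than in a recordio file.

```go
package main

import (
	"fmt"
	"sort"
)

// coord and coordRange are simplified, hypothetical stand-ins for biopb.Coord
// and biopb.CoordRange. Here a coordinate is a single integer position.
type coord int

// coordRange is treated as half-open in this sketch: [Start, Limit).
type coordRange struct{ Start, Limit coord }

// columnSeeker mirrors the shape of the ColumnSeeker interface.
type columnSeeker interface {
	Seek(r coordRange) (coord, bool)
	Skip()
}

// blockSeeker holds sorted record coordinates grouped into fixed-size blocks.
type blockSeeker struct {
	records   []coord // sorted record coordinates
	blockSize int     // number of records per block
	next      int     // index of the next record to read
}

// Seek positions the reader at the start of the block that contains the first
// record in r, and returns that block's first coordinate.
func (s *blockSeeker) Seek(r coordRange) (coord, bool) {
	// Find the first record at or after r.Start.
	i := sort.Search(len(s.records), func(i int) bool { return s.records[i] >= r.Start })
	if i == len(s.records) || s.records[i] >= r.Limit {
		return 0, false // no record falls inside the requested range
	}
	// Round the index down to the block boundary.
	s.next = i - i%s.blockSize
	return s.records[s.next], true
}

// Skip advances past one record.
func (s *blockSeeker) Skip() { s.next++ }

func main() {
	var s columnSeeker = &blockSeeker{
		records:   []coord{10, 20, 30, 40, 50, 60, 70},
		blockSize: 3,
	}
	start, ok := s.Seek(coordRange{Start: 45, Limit: 65})
	fmt.Println(start, ok)
}
```

Because Seek lands on a block boundary, the caller generally has to call Skip until the read pointer reaches the first record inside the requested range.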

type Reader

type Reader struct {
	// contains filtered or unexported fields
}

Reader reads a sequence of values for one field type.

func NewReader

func NewReader(ctx context.Context, path, label string, coordField bool, fileOpts file.Opts, errp *errors.Once, opts ...ReaderOpt) (*Reader, error)

NewReader creates a new Reader that reads from the given path. Label is shown in log messages. coordField should be true if the file stores the genomic coordinate. Setting coordField=true enables the codepath that computes biopb.Coord.Seq values. If no file is found for this field, NewReader returns nil, nil.

func (*Reader) Close

func (fr *Reader) Close(ctx context.Context)

Close closes the reader. Errors are reported through fr.err.

func (*Reader) Label

func (fr *Reader) Label() string

Label returns the diagnostic label of the reader object.

func (*Reader) PeekCoordField

func (fr *Reader) PeekCoordField() (biopb.Coord, bool)

PeekCoordField reads the next coordinate value without advancing the read pointer. It returns false on EOF or any error.

func (*Reader) ReadAuxField

func (fr *Reader) ReadAuxField(md AuxMetadata, arena *UnsafeArena) []sam.Aux

ReadAuxField reads the next aux field. Arg "md" must be the value reported by ReadAuxMetadata.

func (*Reader) ReadAuxMetadata

func (fr *Reader) ReadAuxMetadata() (AuxMetadata, bool)

ReadAuxMetadata reads the number and the size information of the aux field.

func (*Reader) ReadBytesField

func (fr *Reader) ReadBytesField(n int, arena *UnsafeArena) []byte

ReadBytesField reads the next variable-length byteslice field. The arg "n" must be the value reported by ReadBytesMetadata.

func (*Reader) ReadBytesMetadata

func (fr *Reader) ReadBytesMetadata() (int, bool)

ReadBytesMetadata returns the size of the variable-length byteslice field.
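The Read...Metadata / Read...Field pairs follow a two-step pattern: first learn the size, then read the payload into space sized by the caller. The sketch below illustrates that pattern only. The fieldReader type and its uvarint length-prefix layout are assumptions for this example and are not the actual fieldio wire format.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// fieldReader is a hypothetical reader over length-prefixed byte fields:
// each field is a uvarint length followed by that many payload bytes.
type fieldReader struct {
	buf []byte
}

// readBytesMetadata consumes the length prefix of the next field.
// It returns false when the input is exhausted or malformed.
func (r *fieldReader) readBytesMetadata() (int, bool) {
	n, w := binary.Uvarint(r.buf)
	if w <= 0 {
		return 0, false
	}
	r.buf = r.buf[w:]
	return int(n), true
}

// readBytesField copies the next n payload bytes into dst. The caller must
// pass the n reported by readBytesMetadata and a dst of at least n bytes.
func (r *fieldReader) readBytesField(n int, dst []byte) []byte {
	copy(dst, r.buf[:n])
	r.buf = r.buf[n:]
	return dst[:n]
}

func main() {
	// Encode two fields, "ACGT" and "NN", each with a uvarint length prefix.
	var data []byte
	var tmp [binary.MaxVarintLen64]byte
	for _, s := range []string{"ACGT", "NN"} {
		k := binary.PutUvarint(tmp[:], uint64(len(s)))
		data = append(data, tmp[:k]...)
		data = append(data, s...)
	}

	r := &fieldReader{buf: data}
	for {
		n, ok := r.readBytesMetadata()
		if !ok {
			break
		}
		fmt.Printf("%q\n", r.readBytesField(n, make([]byte, n)))
	}
}
```

One plausible benefit of splitting the calls is that a caller can collect the sizes of all fields of a record first and then carve every payload out of a single preallocated buffer.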

func (*Reader) ReadCigarField

func (fr *Reader) ReadCigarField(nOp int, arena *UnsafeArena) sam.Cigar

ReadCigarField reads the next Cigar field. The arg "nOp" must be the value reported by ReadCigarMetadata.

func (*Reader) ReadCigarMetadata

func (fr *Reader) ReadCigarMetadata() (int, bool)

ReadCigarMetadata reads the number of cigar ops.

func (*Reader) ReadCoordField

func (fr *Reader) ReadCoordField() (biopb.Coord, bool)

ReadCoordField reads the next coordinate value. It returns false on EOF or any error.

func (*Reader) ReadFloat64Field

func (fr *Reader) ReadFloat64Field() (float64, bool)

ReadFloat64Field reads the next float64 value.

func (*Reader) ReadSeqField

func (fr *Reader) ReadSeqField(nBases int, arena *UnsafeArena) sam.Seq

ReadSeqField reads the Seq field. nBases must be obtained by calling ReadSeqMetadata.

func (*Reader) ReadSeqMetadata

func (fr *Reader) ReadSeqMetadata() (int, bool)

ReadSeqMetadata returns the length of the next seq field.

func (*Reader) ReadStringDeltaField

func (fr *Reader) ReadStringDeltaField(md StringDeltaMetadata, arena *UnsafeArena) string

ReadStringDeltaField reads a delta-encoded string. The arg "md" must be the value reported by ReadStringDeltaMetadata.

func (*Reader) ReadStringDeltaMetadata

func (fr *Reader) ReadStringDeltaMetadata() (StringDeltaMetadata, bool)

ReadStringDeltaMetadata reads the length information for a delta-encoded string. Pass the result to ReadStringDeltaField() to actually decode the string.

func (*Reader) ReadUint16Field

func (fr *Reader) ReadUint16Field() (uint16, bool)

ReadUint16Field reads a uint16 value. It returns false on EOF or any error.

func (*Reader) ReadUint8Field

func (fr *Reader) ReadUint8Field() (uint8, bool)

ReadUint8Field reads a uint8 value, such as a mapq value. It returns false on EOF or any error.

func (*Reader) ReadVarint32sField

func (fr *Reader) ReadVarint32sField(n int, arena *UnsafeArena) []int32

ReadVarint32sField reads the next varint slice field. The arg "n" must be the value reported by ReadVarint32sMetadata.

func (*Reader) ReadVarint32sMetadata

func (fr *Reader) ReadVarint32sMetadata() (int, bool)

ReadVarint32sMetadata returns the count of the varint slice field.

func (*Reader) ReadVarintDeltaField

func (fr *Reader) ReadVarintDeltaField() (int64, bool)

ReadVarintDeltaField reads a field containing a delta-encoded int. It returns false on EOF or any error.
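Delta encoding stores each value as its difference from the previous one, which keeps varints short when consecutive values are close. The following is a minimal sketch of the idea using encoding/binary. The helper names, the zigzag varint layout, and the initial previous value of 0 are assumptions for this example, not a description of the fieldio file format.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeVarintDeltas writes each value as a signed varint of its difference
// from the previous value. The first value is encoded relative to 0.
func encodeVarintDeltas(values []int64) []byte {
	var out []byte
	var tmp [binary.MaxVarintLen64]byte
	prev := int64(0)
	for _, v := range values {
		k := binary.PutVarint(tmp[:], v-prev)
		out = append(out, tmp[:k]...)
		prev = v
	}
	return out
}

// decodeVarintDeltas reverses encodeVarintDeltas by keeping a running sum.
// It stops at the end of the buffer or at the first malformed varint.
func decodeVarintDeltas(buf []byte) []int64 {
	var out []int64
	prev := int64(0)
	for len(buf) > 0 {
		d, k := binary.Varint(buf)
		if k <= 0 {
			break
		}
		prev += d
		out = append(out, prev)
		buf = buf[k:]
	}
	return out
}

func main() {
	positions := []int64{1000, 1005, 1005, 990}
	enc := encodeVarintDeltas(positions)
	fmt.Println(len(enc), decodeVarintDeltas(enc))
}
```

In this sketch only the first value needs two bytes; each later delta fits in one byte because the values stay close to their predecessors.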

func (*Reader) ReadVarintField

func (fr *Reader) ReadVarintField() (int64, bool)

ReadVarintField reads a field containing a varint. It returns false on EOF or any error.

func (*Reader) Seek

func (fr *Reader) Seek(requestedRange biopb.CoordRange) (biopb.Coord, bool)

Seek sets up the reader to read the requested coordinate range. Since the requestedRange may not be exactly aligned with recordio block boundaries, this method will typically arrange to read a slightly wider range than requested. It returns the start coordinate of the first recordio block to be read.

REQUIRES: maybeReadNextBlock has never been called.

func (*Reader) SkipAuxField

func (fr *Reader) SkipAuxField()

SkipAuxField skips the next aux field. It panics on EOF or any error.

func (*Reader) SkipBytesField

func (fr *Reader) SkipBytesField()

SkipBytesField skips the next variable-length byteslice field. It panics on EOF or any error.

func (*Reader) SkipCigarField

func (fr *Reader) SkipCigarField()

SkipCigarField skips the next cigar field.

func (*Reader) SkipFloat64Field

func (fr *Reader) SkipFloat64Field()

SkipFloat64Field skips the next float64 field. It panics on EOF or any error.

func (*Reader) SkipSeqField

func (fr *Reader) SkipSeqField()

SkipSeqField skips the next seq field. It panics on EOF or any error.

func (*Reader) SkipStringDeltaField

func (fr *Reader) SkipStringDeltaField()

SkipStringDeltaField skips a delta-encoded string. It panics on EOF or any error.

func (*Reader) SkipUint16Field

func (fr *Reader) SkipUint16Field()

SkipUint16Field skips the next uint16 value. It panics on EOF or any error.

func (*Reader) SkipUint8Field

func (fr *Reader) SkipUint8Field()

SkipUint8Field skips the next uint8 value. It panics on EOF or any error.

func (*Reader) SkipVarint32sField

func (fr *Reader) SkipVarint32sField()

SkipVarint32sField skips the next varint slice field. It panics on EOF or any error.

func (*Reader) SkipVarintField

func (fr *Reader) SkipVarintField()

SkipVarintField skips the next varint-encoded field. It panics on EOF or any error.

type ReaderOpt

type ReaderOpt func(*readerOpts)

ReaderOpt is an option to pass to NewReader.

func BufSize

func BufSize(size int) ReaderOpt

BufSize constructs a ReaderOpt for using a buffered reader to read the underlying file. A size of 0 disables buffering.
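ReaderOpt follows the functional-options pattern: each option is a function that mutates an options struct. The sketch below shows the mechanism in isolation. The field name bufSize, the applyOpts helper, and the unbuffered default are assumptions, since the real readerOpts struct is unexported.

```go
package main

import "fmt"

// readerOpts is a hypothetical stand-in for the unexported options struct.
type readerOpts struct {
	bufSize int // 0 means unbuffered in this sketch
}

// readerOpt mirrors the shape of fieldio.ReaderOpt.
type readerOpt func(*readerOpts)

// bufSize returns an option that sets the read buffer size.
func bufSize(size int) readerOpt {
	return func(o *readerOpts) { o.bufSize = size }
}

// applyOpts starts from the defaults and applies each option in order,
// roughly as a constructor accepting variadic options might.
func applyOpts(opts ...readerOpt) readerOpts {
	o := readerOpts{}
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	fmt.Println(applyOpts().bufSize)
	fmt.Println(applyOpts(bufSize(1 << 20)).bufSize)
}
```

The pattern lets a constructor such as NewReader gain new settings later without changing its signature.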

type StringDeltaMetadata

type StringDeltaMetadata struct {
	PrefixLen int // Prefix shared with the prev record
	DeltaLen  int // Length of the suffix that differs from the prev record.
}
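The two lengths describe a string relative to the previous record: keep the first PrefixLen bytes of the previous string, then append DeltaLen new bytes. Here is a small sketch of that scheme. The stringDelta type and both helpers are hypothetical and exist only to illustrate the idea; how fieldio lays the suffix bytes out on disk is not shown here.

```go
package main

import "fmt"

// stringDelta is a hypothetical record pairing the two lengths from
// StringDeltaMetadata with the suffix bytes they describe.
type stringDelta struct {
	PrefixLen int    // bytes shared with the previous string
	DeltaLen  int    // length of the differing suffix
	Suffix    string // the differing suffix itself
}

// encodeDelta computes the longest common prefix of prev and cur and
// returns the remainder of cur as the suffix.
func encodeDelta(prev, cur string) stringDelta {
	n := 0
	for n < len(prev) && n < len(cur) && prev[n] == cur[n] {
		n++
	}
	return stringDelta{PrefixLen: n, DeltaLen: len(cur) - n, Suffix: cur[n:]}
}

// decodeDelta rebuilds the current string from the previous one.
func decodeDelta(prev string, d stringDelta) string {
	return prev[:d.PrefixLen] + d.Suffix
}

func main() {
	prev := ""
	for _, name := range []string{"read/1001", "read/1002", "read/2000"} {
		d := encodeDelta(prev, name)
		fmt.Printf("prefix=%d delta=%d suffix=%q -> %q\n",
			d.PrefixLen, d.DeltaLen, d.Suffix, decodeDelta(prev, d))
		prev = name
	}
}
```

This pays off for fields such as read names, where neighbouring records often share a long prefix.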

type UnsafeArena

type UnsafeArena struct {
	// contains filtered or unexported fields
}

UnsafeArena is an arena allocator. It supports allocating []bytes quickly.

func NewUnsafeArena

func NewUnsafeArena(buf []byte) UnsafeArena

NewUnsafeArena creates an arena for filling the given buffer.

func (*UnsafeArena) Align

func (ub *UnsafeArena) Align()

Align rounds "ub.n" up so that it is a multiple of 8. The next Alloc() call returns a block aligned at an 8-byte boundary. Used when storing a pointer in []byte. 8-byte alignment is sufficient for all CPUs we care about.

func (*UnsafeArena) Alloc

func (ub *UnsafeArena) Alloc(size int) []byte

Alloc allocates a byte slice of "size" bytes.

Requires: ub must have at least size bytes of free space.
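An arena of this kind is essentially a bump allocator over a caller-supplied buffer. The sketch below is a simplified, hypothetical version that uses only safe slice operations. It aligns the offset within the buffer, which gives 8-byte-aligned addresses only if the buffer itself starts on an 8-byte boundary; the real UnsafeArena may differ in these details.

```go
package main

import "fmt"

// arena is a simplified, hypothetical bump allocator over a fixed buffer.
type arena struct {
	buf []byte
	n   int // offset of the next free byte
}

// newArena creates an arena that hands out pieces of buf.
func newArena(buf []byte) arena { return arena{buf: buf} }

// align rounds the offset up to the next multiple of 8.
func (a *arena) align() { a.n = (a.n + 7) &^ 7 }

// alloc returns the next size bytes of the buffer. The capacity is capped so
// that appending to the result cannot overwrite a later allocation. It panics
// (slice bounds out of range) if the arena lacks free space.
func (a *arena) alloc(size int) []byte {
	s := a.buf[a.n : a.n+size : a.n+size]
	a.n += size
	return s
}

func main() {
	a := newArena(make([]byte, 64))
	tag := a.alloc(3)
	a.align()
	payload := a.alloc(16)
	fmt.Println(len(tag), len(payload), a.n)
}
```

Allocation is a slice expression plus an addition, so many small values can be placed in one buffer without a separate heap allocation for each.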

type WriteBufPool

type WriteBufPool struct {
	// contains filtered or unexported fields
}

func NewBufPool

func NewBufPool(capacity int) *WriteBufPool

func (*WriteBufPool) Finish

func (p *WriteBufPool) Finish()

type Writer

type Writer struct {
	// contains filtered or unexported fields
}

Writer buffers values of one field and writes them to a recordio file.

func NewWriter

func NewWriter(path, label string, transformers []string, bufFreePool *WriteBufPool, opts file.Opts, errp *errors.Once) *Writer

NewWriter creates a new field writer that writes to the given path. Label is used for logging. Transformers is set as the recordio transformers.

func (*Writer) BufLen

func (fw *Writer) BufLen() int

BufLen returns the byte length of the buffer.

func (*Writer) Close

func (fw *Writer) Close()

Close closes the output file. It has no return value, so any error encountered so far is presumably reported through the errp passed to NewWriter. No method shall be called after fw.Close().

REQUIRES: All outstanding flushes have completed.

func (*Writer) FlushBuf

func (fw *Writer) FlushBuf()

FlushBuf starts flushing the buffer to the underlying recordio file. It returns before the data is flushed to the storage.

func (*Writer) NewBuf

func (fw *Writer) NewBuf()

NewBuf allocates a new buffer and sets it in fw.buf. It blocks the caller if too many flushes are already in progress.
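The interplay between NewBuf and FlushBuf (asynchronous flushes, with back-pressure when too many are in flight) can be sketched with a bounded pool of buffers. The types bufPool and writer, the wait method, and the use of a channel as the pool are assumptions for illustration. They are not the package's WriteBufPool or Writer implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// bufPool is a hypothetical pool that caps the number of buffers in flight.
type bufPool struct{ free chan []byte }

func newBufPool(capacity int) *bufPool {
	p := &bufPool{free: make(chan []byte, capacity)}
	for i := 0; i < capacity; i++ {
		p.free <- make([]byte, 0, 1024)
	}
	return p
}

// writer is a hypothetical stand-in for a field writer.
type writer struct {
	pool    *bufPool
	buf     []byte
	wg      sync.WaitGroup
	mu      sync.Mutex
	flushed int // total bytes "written" by background flushes
}

// newBuf takes a buffer from the pool. It blocks while every buffer is
// still being flushed, which bounds memory use.
func (w *writer) newBuf() { w.buf = <-w.pool.free }

// flushBuf hands the current buffer to a goroutine and returns immediately.
func (w *writer) flushBuf() {
	b := w.buf
	w.buf = nil
	w.wg.Add(1)
	go func() {
		defer w.wg.Done()
		w.mu.Lock()
		w.flushed += len(b) // stand-in for writing b to storage
		w.mu.Unlock()
		w.pool.free <- b[:0] // recycle the buffer
	}()
}

// wait blocks until all outstanding flushes have completed.
func (w *writer) wait() { w.wg.Wait() }

func main() {
	w := &writer{pool: newBufPool(2)}
	for i := 0; i < 5; i++ {
		w.newBuf()
		w.buf = append(w.buf, make([]byte, 10)...)
		w.flushBuf()
	}
	w.wait()
	fmt.Println(w.flushed)
}
```

With a pool of two buffers, the third newBuf call cannot proceed until one of the earlier flushes returns its buffer.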

func (*Writer) PutAuxField

func (fw *Writer) PutAuxField(addr biopb.Coord, aa []sam.Aux)

PutAuxField adds the aux field.

func (*Writer) PutBytesField

func (fw *Writer) PutBytesField(addr biopb.Coord, data []byte)

PutBytesField adds a field consisting of a variable-length byte slice.

func (*Writer) PutCigarField

func (fw *Writer) PutCigarField(addr biopb.Coord, cigar sam.Cigar)

PutCigarField adds the cigar field.

func (*Writer) PutCoordField

func (fw *Writer) PutCoordField(addr biopb.Coord, refID int, pos int)

PutCoordField adds a coordinate field to the buffer.

TODO(saito) we don't need (refid, pos). They can be derived from coord.

func (*Writer) PutFloat64Field

func (fw *Writer) PutFloat64Field(addr biopb.Coord, v float64)

PutFloat64Field adds a float64 field.

func (*Writer) PutSeqField

func (fw *Writer) PutSeqField(addr biopb.Coord, seq sam.Seq)

PutSeqField adds the seq field.

func (*Writer) PutStringDeltaField

func (fw *Writer) PutStringDeltaField(addr biopb.Coord, data string)

PutStringDeltaField adds a string-delta-encoded field.

func (*Writer) PutUint16Field

func (fw *Writer) PutUint16Field(addr biopb.Coord, v uint16)

PutUint16Field adds a uint16 field.

func (*Writer) PutUint8Field

func (fw *Writer) PutUint8Field(addr biopb.Coord, v byte)

PutUint8Field adds a byte field.

func (*Writer) PutVarint32sField

func (fw *Writer) PutVarint32sField(addr biopb.Coord, values []int32)

PutVarint32sField adds an array of varints.

func (*Writer) PutVarintDeltaField

func (fw *Writer) PutVarintDeltaField(addr biopb.Coord, value int64)

PutVarintDeltaField adds a varint-delta-encoded field.

func (*Writer) PutVarintField

func (fw *Writer) PutVarintField(addr biopb.Coord, value int64)

PutVarintField adds a varint-encoded field.
