bgzf

package
v0.0.0-...-d966d87

This package is not in the latest version of its module.
Published: Aug 18, 2020 License: Apache-2.0 Imports: 6 Imported by: 0

Documentation

Overview

Package bgzf provides a Writer for the .bgzf (block gzipped) file format. A .bgzf file consists of one or more complete gzip blocks concatenated together. Each gzip block must hold at most 64KB of uncompressed data, and its compressed size must also be at most 64KB. The payload of the .bgzf file is the uncompressed content of each block, concatenated in order. A valid .bgzf file ends with the 28-byte .bgzf terminator, which is itself a valid gzip block with an empty payload.
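As a minimal sketch of the terminator property, the 28 bytes can be decoded with the standard library's compress/gzip alone. The byte values are the end-of-file marker listed in the SAM/BAM specification; the names bgzfTerminator and decodeTerminator are illustrative and are not part of this package.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// bgzfTerminator is the 28-byte end-of-file marker from the SAM/BAM
// specification. The name is illustrative; this package does not export it.
var bgzfTerminator = []byte{
	// gzip header: magic, deflate, FEXTRA flag, mtime, XFL, OS.
	0x1f, 0x8b, 0x08, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff,
	// XLEN = 6, then the "BC" subfield with a 2-byte value of 27.
	0x06, 0x00, 0x42, 0x43, 0x02, 0x00, 0x1b, 0x00,
	// An empty deflate stream.
	0x03, 0x00,
	// CRC32 = 0 and uncompressed size = 0.
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
}

// decodeTerminator decompresses the terminator as an ordinary gzip stream
// and returns its payload.
func decodeTerminator() ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(bgzfTerminator))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	payload, err := decodeTerminator()
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("terminator: %d bytes, payload: %d bytes\n",
		len(bgzfTerminator), len(payload))
}
```

Because the terminator carries no payload, appending it to a stream of blocks does not change the decompressed content.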

The .bgzf format is used by .bam files and by Illumina .bcl.bgzf files from NextSeq instruments.

For more information about the .bgzf file format, see the SAM/BAM spec here: https://samtools.github.io/hts-specs/SAMv1.pdf

Example use with basic level parameter:

var bgzfFile bytes.Buffer
w, err := NewWriter(&bgzfFile, flate.DefaultCompression)
_, err = w.Write([]byte("Foo bar")) // Check err after each call.
err = w.Close()

Example use with more configuration parameters:

var bgzfFile bytes.Buffer
w, err := NewWriterParams(
  &bgzfFile,
  flate.DefaultCompression,
  DefaultUncompressedBlockSize,
  zlibng.RLEStrategy,
  0,
)
_, err = w.Write([]byte("Foo bar")) // Check err after each call.
err = w.Close()

Example use with multiple compression shards:

// In goroutine 1
var shard1 bytes.Buffer
w, err := NewWriter(&shard1, flate.DefaultCompression)
_, err = w.Write([]byte("Foo bar"))
err = w.CloseWithoutTerminator()

// In goroutine 2
var shard2 bytes.Buffer
w, err := NewWriter(&shard2, flate.DefaultCompression)
_, err = w.Write([]byte(" baz!"))
err = w.Close()  // Terminator goes at the end of the last shard.

// Merge shards into final .bgzfFile.
var bgzfFile bytes.Buffer
_, err := io.Copy(&bgzfFile, &shard1)
_, err = io.Copy(&bgzfFile, &shard2)
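Merging shards by plain concatenation works because of a property of gzip itself: a stream of complete gzip members decompresses to the concatenation of their payloads. The sketch below shows this using only compress/gzip. The helpers gzipMember and mergeAndDecode are hypothetical, and the plain gzip members stand in for shards; they lack the .bgzf Extra field and terminator that this package's Writer produces.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// gzipMember compresses data into one complete gzip member. It stands in
// for a shard; a real shard would contain .bgzf blocks.
func gzipMember(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(data); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// mergeAndDecode concatenates the shards in order and decompresses the
// result as a single multi-member gzip stream.
func mergeAndDecode(shards ...[]byte) (string, error) {
	merged := bytes.Join(shards, nil)
	zr, err := gzip.NewReader(bytes.NewReader(merged))
	if err != nil {
		return "", err
	}
	defer zr.Close()
	out, err := io.ReadAll(zr)
	return string(out), err
}

func main() {
	shard1, err := gzipMember([]byte("Foo bar"))
	if err != nil {
		fmt.Println("shard 1 failed:", err)
		return
	}
	shard2, err := gzipMember([]byte(" baz!"))
	if err != nil {
		fmt.Println("shard 2 failed:", err)
		return
	}
	text, err := mergeAndDecode(shard1, shard2)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(text) // prints "Foo bar baz!"
}
```

The order of concatenation determines the order of the payload, so shards must be merged in the same order as the data they hold.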

Constants

const (
	// DefaultUncompressedBlockSize is the default bgzf
	// uncompressedBlockSize chosen by both sambamba and biogo.  See
	// the SAM/BAM specification for details.
	DefaultUncompressedBlockSize = 0x0ff00

	// MaxUncompressedBlockSize is the largest legal value for
	// uncompressedBlockSize.  Illumina's NextSeq machines use this
	// value when creating .bcl.bgzf files.
	MaxUncompressedBlockSize = 0x10000
)
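To illustrate how the block size bounds the number of blocks, the sketch below cuts a buffer into chunks of at most a given size. The function splitBlocks and the lower-case constants are hypothetical copies made for this example. Whether the Writer splits its input exactly this way is an assumption; the sketch only shows the arithmetic.

```go
package main

import "fmt"

// Local copies of the package constants, so the example is self-contained.
const (
	defaultUncompressedBlockSize = 0x0ff00
	maxUncompressedBlockSize     = 0x10000
)

// splitBlocks cuts data into consecutive chunks of at most blockSize bytes.
// It returns nil if blockSize is not positive or data is empty.
func splitBlocks(data []byte, blockSize int) [][]byte {
	if blockSize <= 0 {
		return nil
	}
	var blocks [][]byte
	for len(data) > 0 {
		n := blockSize
		if len(data) < n {
			n = len(data)
		}
		blocks = append(blocks, data[:n])
		data = data[n:]
	}
	return blocks
}

func main() {
	data := make([]byte, 0x20000) // 128 KiB of input

	def := splitBlocks(data, defaultUncompressedBlockSize)
	fmt.Printf("default size: %d blocks, last block %d bytes\n",
		len(def), len(def[len(def)-1]))

	max := splitBlocks(data, maxUncompressedBlockSize)
	fmt.Printf("max size: %d blocks\n", len(max))
}
```

With 128 KiB of input, the default size yields three blocks (the last holding 512 bytes), while the maximum size yields two.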

Types

type Writer

type Writer struct {
	// contains filtered or unexported fields
}

Writer compresses data into the .bgzf format, which consists of gzip blocks concatenated together. Each gzip block has an uncompressed size of at most 64KB. The .bgzf format adds an Extra header field to each gzip header; per the SAM/BAM specification, this field holds the total size of the compressed block in bytes, minus 1. The payload of the .bgzf file is the in-order concatenation of the uncompressed payloads of the gzip blocks. A .bgzf file also ends with an EOF terminator.
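The Extra field can be read back with a few lines of Go. The sketch below follows the layout in the SAM/BAM specification, where the Extra field holds a subfield tagged "BC" whose 2-byte little-endian value (BSIZE) is the total size of the compressed block minus 1. The function bsizeFromExtra is a hypothetical helper, not part of this package.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// bsizeFromExtra scans the subfields of a gzip Extra field. Each subfield
// is laid out as SI1, SI2, a 2-byte little-endian length, then the data.
// It returns the value of the "BC" subfield defined by the SAM/BAM spec.
func bsizeFromExtra(extra []byte) (uint16, error) {
	for len(extra) >= 4 {
		si1, si2 := extra[0], extra[1]
		slen := int(binary.LittleEndian.Uint16(extra[2:4]))
		if len(extra) < 4+slen {
			return 0, errors.New("truncated extra subfield")
		}
		if si1 == 'B' && si2 == 'C' && slen == 2 {
			return binary.LittleEndian.Uint16(extra[4:6]), nil
		}
		extra = extra[4+slen:] // skip an unrelated subfield
	}
	return 0, errors.New("no BC subfield found")
}

func main() {
	// Extra field of the 28-byte .bgzf terminator block.
	extra := []byte{'B', 'C', 0x02, 0x00, 0x1b, 0x00}
	bsize, err := bsizeFromExtra(extra)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Printf("BSIZE = %d, total block size = %d bytes\n", bsize, int(bsize)+1)
}
```

A reader can use this value to find the start of the next block without decompressing the current one.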

func NewWriter

func NewWriter(w io.Writer, level int) (*Writer, error)

NewWriter returns a new .bgzf writer with the given compression level. It returns nil and an error if there is a problem.

func NewWriterParams

func NewWriterParams(w io.Writer, level, uncompressedBlockSize, gzipStrategy, gzipXFL int) (*Writer, error)

NewWriterParams returns a new .bgzf writer with the given configuration parameters. uncompressedBlockSize is the largest number of bytes to put into each .bgzf block. gzipStrategy is a strategy value from gzip; possible values are DefaultStrategy, FilteredStrategy, HuffmanOnlyStrategy, RLEStrategy, and FixedStrategy. gzipXFL is written to the XFL gzip header field of each gzip block in the output; if gzipXFL is -1, gzip will set XFL according to the other gzip configuration parameters. It returns nil and an error if there is a problem.

func (*Writer) Close

func (w *Writer) Close() error

Close closes the current .bgzf block and appends the .bgzf terminator.

func (*Writer) CloseWithoutTerminator

func (w *Writer) CloseWithoutTerminator() error

CloseWithoutTerminator closes the current .bgzf block, but does not append the .bgzf terminator. This output file is not a complete .bgzf file until the user calls Close().

func (*Writer) VOffset

func (w *Writer) VOffset() uint64

VOffset returns the virtual-offset of the next byte to be written.
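For reference, the SAM/BAM specification defines a virtual offset as a 64-bit value: the upper 48 bits hold the file offset of the start of a compressed block, and the lower 16 bits hold the offset within that block's uncompressed data. The sketch below packs and unpacks such a value. It assumes VOffset follows this layout, and packVOffset and unpackVOffset are hypothetical helpers, not part of this package.

```go
package main

import "fmt"

// packVOffset combines the file offset of a compressed block with an
// offset into that block's uncompressed data, using the layout from the
// SAM/BAM specification. blockStart must fit in 48 bits.
func packVOffset(blockStart uint64, within uint16) uint64 {
	return blockStart<<16 | uint64(within)
}

// unpackVOffset splits a virtual offset back into its two parts.
func unpackVOffset(v uint64) (blockStart uint64, within uint16) {
	return v >> 16, uint16(v & 0xffff)
}

func main() {
	v := packVOffset(1000, 42)
	blockStart, within := unpackVOffset(v)
	fmt.Printf("voffset=%d blockStart=%d within=%d\n", v, blockStart, within)
}
```

Because the block offset occupies the high bits, virtual offsets compare in the same order as positions in the uncompressed payload.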

func (*Writer) Write

func (w *Writer) Write(buf []byte) (int, error)

Write writes buf to the .bgzf payload. It returns the number of bytes consumed from buf and any error encountered.
