simplewarc

package module
v0.0.0-...-a8309ec
Published: Sep 11, 2022 License: MIT Imports: 9 Imported by: 2

Documentation

Index

Constants

const (
	ChunkSize                 = 4096
	RecordDelimitingLineCount = 2
)

Variables

This section is empty.

Functions

This section is empty.

Types

type CompressionType

type CompressionType int
const (
	// Compression Types
	NoCompression CompressionType = iota + 1
	BzipCompression
	GzipCompression
)
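
The constants start at `iota + 1`, so the zero value of `CompressionType` matches none of them. The page does not document how the package chooses a compression type; the sketch below is a self-contained illustration that redeclares the type locally and guesses the compression from the well-known gzip and bzip2 magic bytes. `detectCompression` is a hypothetical helper, not part of the documented API.

```go
package main

import (
	"bytes"
	"fmt"
)

// CompressionType is redeclared locally so the sketch compiles on its own.
type CompressionType int

const (
	NoCompression CompressionType = iota + 1
	BzipCompression
	GzipCompression
)

// detectCompression is a hypothetical helper (an assumption, not the
// package's API): it inspects the leading bytes of a stream and returns
// the matching compression type.
func detectCompression(head []byte) CompressionType {
	switch {
	case bytes.HasPrefix(head, []byte{0x1f, 0x8b}): // gzip magic number
		return GzipCompression
	case bytes.HasPrefix(head, []byte("BZh")): // bzip2 magic number
		return BzipCompression
	default:
		return NoCompression
	}
}

func main() {
	fmt.Println(NoCompression, BzipCompression, GzipCompression) // prints "1 2 3"
	fmt.Println(detectCompression([]byte{0x1f, 0x8b, 0x08}) == GzipCompression) // prints "true"
}
```
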

type Header

type Header map[string]string

Header represents the named-fields of a WARC record, keyed by field-name

func (Header) ContentLength

func (h Header) ContentLength() int64

ContentLength returns the content-length field-value

func (Header) Delete

func (h Header) Delete(key string)

Delete removes the named-field for the given field-name

func (Header) Get

func (h Header) Get(key string) string

Get returns the field-value set for the field-name

func (Header) Has

func (h Header) Has(key string) bool

Has returns true if the field-name exists as a named-field

func (Header) Set

func (h Header) Set(key, val string)

Set sets the field-value for the given field-name
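
The method set above can be illustrated with a local re-declaration of `Header`. This is a minimal sketch, not the package's source. It assumes field-names are stored exactly as given (no case folding), that the length lives under the key `Content-Length`, and that `ContentLength` returns 0 when the field is missing or not a number. The real package may differ on any of these points.

```go
package main

import (
	"fmt"
	"strconv"
)

// Header is a local stand-in with the same method signatures as above.
type Header map[string]string

// Get returns the field-value for key, or "" if it is absent.
func (h Header) Get(key string) string { return h[key] }

// Set stores val under key, replacing any previous value.
func (h Header) Set(key, val string) { h[key] = val }

// Has reports whether key is present.
func (h Header) Has(key string) bool { _, ok := h[key]; return ok }

// Delete removes key if present.
func (h Header) Delete(key string) { delete(h, key) }

// ContentLength parses the Content-Length field-value.
// Assumption: a missing or malformed value yields 0.
func (h Header) ContentLength() int64 {
	n, err := strconv.ParseInt(h["Content-Length"], 10, 64)
	if err != nil {
		return 0
	}
	return n
}

func main() {
	h := Header{}
	h.Set("WARC-Type", "response")
	h.Set("Content-Length", "1024")
	fmt.Println(h.Get("WARC-Type"), h.Has("WARC-Date"), h.ContentLength()) // prints "response false 1024"

	h.Delete("WARC-Type")
	fmt.Println(h.Has("WARC-Type")) // prints "false"
}
```
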

type Reader

type Reader interface {
	// Close closes the reader
	Close() error

	// Read reads up to len(p) bytes into p
	Read(p []byte) (n int, err error)

	// Next returns the next WARC record in the archive
	Next() (*Record, error)

	// Seek sets the reader to the next record in the archive
	Seek() error

	// ReadLine reads the next line in the current record
	ReadLine() (string, error)
}

Reader represents a WARC archive reader

func New

func New(source io.Reader) (Reader, error)

New wraps the given reader and returns a new WARC reader

type Record

type Record struct {
	Header  Header
	Content io.Reader
	Offset  int // Offset in archive where record starts (in bytes)
}

Record represents a WARC record
