chunker

package

v0.15.1 · Published: Apr 8, 2024 · License: Apache-2.0, MIT · Imports: 21 · Imported by: 1

Repository

github.com/ipni/index-provider

Documentation ¶

Overview ¶

Package chunker provides functionality for chunking ad entries generated from provider.MultihashIterator into an IPLD DAG. Given a multihash iterator, an EntriesChunker drains it, restructures the multihashes into an IPLD DAG, and returns the root link to that DAG. Two DAG data structures are currently implemented: ChainChunker and HamtChunker. Additionally, CachedEntriesChunker can wrap either chunker and provides LRU caching for the generated DAGs.

See: CachedEntriesChunker, ChainChunker, HamtChunker

Index ¶

type CachedEntriesChunker
- func NewCachedEntriesChunker(ctx context.Context, ds datastore.Batching, capacity int, ...) (*CachedEntriesChunker, error)
type ChainChunker
- func NewChainChunker(ls *ipld.LinkSystem, chunkSize int) (*ChainChunker, error)
- func (ls *ChainChunker) Chunk(ctx context.Context, mhi provider.MultihashIterator) (ipld.Link, error)
type EntriesChunker
type HamtChunker
- func NewHamtChunker(ls *ipld.LinkSystem, hashAlg multicodec.Code, bitWidth, bucketSize int) (*HamtChunker, error)
- func (h *HamtChunker) Chunk(ctx context.Context, iterator provider.MultihashIterator) (ipld.Link, error)
type NewChunkerFunc
- func NewChainChunkerFunc(chunkSize int) NewChunkerFunc
- func NewHamtChunkerFunc(hashAlg multicodec.Code, bitWidth, bucketSize int) NewChunkerFunc

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type CachedEntriesChunker ¶

type CachedEntriesChunker struct {
	// contains filtered or unexported fields
}

CachedEntriesChunker is an EntriesChunker that caches the generated chunks using an LRU cache. The chunks can be formatted as any DAG with two current implementations: HamtChunker and ChainChunker.

The DAGs are guaranteed to either be fully cached or not at all. If DAGs overlap, the smaller overlapping portion is not evicted unless all the DAGs that link to it are evicted.

The number of DAGs cached will be at most equal to the given capacity. The capacity is immutable. DAGs are evicted as needed if the capacity is reached.

See: NewCachedEntriesChunker.

func NewCachedEntriesChunker ¶

func NewCachedEntriesChunker(ctx context.Context, ds datastore.Batching, capacity int, newChunker NewChunkerFunc, purge bool) (*CachedEntriesChunker, error)

NewCachedEntriesChunker instantiates a new CachedEntriesChunker backed by a given datastore.

The DAGs are generated with the given newChunker and are stored in an LRU cache. Once stored, the individual DAGs that make up the entries chain are retrievable in their raw binary form via CachedEntriesChunker.GetRawCachedChunk.

The shape of the DAGs is dictated by the underlying chunking logic that is instantiated once via newChunker function. See: NewHamtChunkerFunc, NewChainChunkerFunc.

The growth of the LRU cache is limited by the given capacity. The capacity specifies the number of complete DAGs that are cached, not the number of individual chunks within each chain. The actual storage consumed by the cache is a function of: 1) the DAG shape determined by the underlying chunker, 2) the multihash length, and 3) the capacity. For example, a fully populated cache with a chunk size of 16384, multihashes of length 128 bits (16 bytes), and a capacity of 1024 will consume 256 MiB of space, i.e. 16384 * 1024 * 128 bits.
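The arithmetic in the example above can be checked with a short sketch. The helper name cacheBytes is hypothetical and not part of this package. It assumes every cached DAG holds exactly one full chunk of multihashes and ignores any per-node encoding overhead, so it is a rough estimate only.

```go
package main

import "fmt"

// cacheBytes estimates the multihash payload of a fully populated cache.
// Assumption: each cached DAG holds exactly one full chunk, and node
// encoding overhead is ignored. Hypothetical helper for illustration.
func cacheBytes(chunkSize, capacity, multihashLenBytes int64) int64 {
	return chunkSize * capacity * multihashLenBytes
}

func main() {
	// 128-bit multihashes are 16 bytes each.
	b := cacheBytes(16384, 1024, 128/8)
	fmt.Printf("%d bytes = %d MiB\n", b, b>>20) // prints "268435456 bytes = 256 MiB"
}
```

Chains that span several chunks, or HAMT-shaped DAGs, would consume more than this estimate for the same capacity.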

This implementation guarantees that for any given chain of entries, either the entire chain is cached, or it is not cached at all. When chains overlap, the overlapping portion of the chain is not evicted until the larger chain is evicted.
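One way to picture the all-or-nothing guarantee with overlapping chains is an LRU keyed by chain root, with a reference count per chunk. The sketch below is only an illustration under that assumption. It is not the package's implementation, and chainCache, put, evictOldest and hasChunk are hypothetical names.

```go
package main

import (
	"container/list"
	"fmt"
)

// chainCache is a toy LRU keyed by chain root. Chunks shared between chains
// are reference counted, so a shared chunk survives until every cached
// chain that links to it has been evicted.
type chainCache struct {
	capacity int
	order    *list.List               // front = most recently used root
	chains   map[string][]string      // root -> chunk IDs in that chain
	elems    map[string]*list.Element // root -> position in order
	refs     map[string]int           // chunk ID -> cached chains using it
}

func newChainCache(capacity int) *chainCache {
	if capacity < 1 {
		capacity = 1 // keep the sketch simple: always room for one chain
	}
	return &chainCache{
		capacity: capacity,
		order:    list.New(),
		chains:   map[string][]string{},
		elems:    map[string]*list.Element{},
		refs:     map[string]int{},
	}
}

// put caches a whole chain under root, evicting least recently used chains
// first when the cache is full.
func (c *chainCache) put(root string, chunks []string) {
	if e, ok := c.elems[root]; ok {
		c.order.MoveToFront(e)
		return
	}
	for c.order.Len() >= c.capacity {
		c.evictOldest()
	}
	c.elems[root] = c.order.PushFront(root)
	c.chains[root] = chunks
	for _, id := range chunks {
		c.refs[id]++
	}
}

// evictOldest removes the least recently used chain as a unit and drops
// only those chunks that no remaining chain references.
func (c *chainCache) evictOldest() {
	e := c.order.Back()
	if e == nil {
		return
	}
	root := c.order.Remove(e).(string)
	for _, id := range c.chains[root] {
		if c.refs[id]--; c.refs[id] == 0 {
			delete(c.refs, id)
		}
	}
	delete(c.chains, root)
	delete(c.elems, root)
}

// hasChunk reports whether a chunk is still held by some cached chain.
func (c *chainCache) hasChunk(id string) bool { return c.refs[id] > 0 }

func main() {
	c := newChainCache(2)
	c.put("A", []string{"a1", "shared"})
	c.put("B", []string{"b1", "shared"})
	c.put("C", []string{"c1"}) // evicts chain A as a whole

	fmt.Println(c.hasChunk("a1"))     // false: only chain A used it
	fmt.Println(c.hasChunk("shared")) // true: chain B still links to it
}
```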

Unless purge is set to true, upon instantiation the chunker restores its state from the datastore and prunes the datastore as needed. For example, if the given capacity is smaller than the number of chains present in the datastore, it evicts chains in no particular order until the capacity is respected.

The purge flag specifies whether any existing cache should be cleared on startup. If set, any existing cached chunks will be deleted from the datastore. Otherwise, the previously cached entries are restored.

Note that caching metadata of negligible size is persisted in addition to the chunks. This metadata is checked during restore to determine the roots of cached chains and the number of overlapping chunks.

The context is only used to cancel a call to this function while it is accessing the datastore.

See: CachedEntriesChunker.Chunk, CachedEntriesChunker.GetRawCachedChunk.

func (*CachedEntriesChunker) Cap ¶

func (ls *CachedEntriesChunker) Cap() int

Cap returns the maximum number of chained entries chunks this cache stores.

Note, the maximum number refers to the number of chains as a unit and not the total sum of individual chunks across chains.

func (*CachedEntriesChunker) Chunk ¶

func (ls *CachedEntriesChunker) Chunk(ctx context.Context, mhi provider.MultihashIterator) (ipld.Link, error)

Chunk chunks the multihashes supplied by the given mhi into a DAG and returns the link to its root.

func (*CachedEntriesChunker) Clear ¶

func (ls *CachedEntriesChunker) Clear(ctx context.Context) error

Clear purges all stored items from the CachedEntriesChunker.

func (*CachedEntriesChunker) Close ¶

func (ls *CachedEntriesChunker) Close() error

Close syncs the backing datastore but does not close it. This is because the cached entries chunker wraps an existing datastore rather than constructing it, and the wrapped datastore may be in use elsewhere.

func (*CachedEntriesChunker) GetRawCachedChunk ¶

func (ls *CachedEntriesChunker) GetRawCachedChunk(ctx context.Context, l ipld.Link) ([]byte, error)

GetRawCachedChunk gets the raw cached entry chunk for the given link, or nil if the chunk is not cached.

func (*CachedEntriesChunker) Len ¶

func (ls *CachedEntriesChunker) Len() int

Len returns the number of chained entries chunks that are currently stored in the cache.

Note, the number refers to the number of chains as a unit and not the total sum of individual chunks across chains.

type ChainChunker ¶

type ChainChunker struct {
	// contains filtered or unexported fields
}

ChainChunker chunks advertisement entries as a chained series of schema.EntryChunk nodes. See: NewChainChunker

func NewChainChunker ¶

func NewChainChunker(ls *ipld.LinkSystem, chunkSize int) (*ChainChunker, error)

NewChainChunker instantiates a new chain chunker. Given a provider.MultihashIterator, the chunker drains all its multihashes and stores them in the given link system as a chain of schema.EntryChunk nodes, where each chunk contains no more than chunkSize multihashes.

See: schema.EntryChunk.

func (*ChainChunker) Chunk ¶

func (ls *ChainChunker) Chunk(ctx context.Context, mhi provider.MultihashIterator) (ipld.Link, error)

Chunk chunks all the multihashes returned by the given iterator into a chain of schema.EntryChunk nodes, where each chunk contains no more than chunkSize multihashes, and returns the link to the root chunk node.

See: schema.EntryChunk.
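The chaining described above can be sketched with the standard library alone. This is not the package's code. The entryChunk, store, chunkChain, sliceIter and chainSizes names are hypothetical stand-ins, links are plain hex SHA-256 strings rather than IPLD links, and the sketch assumes each new chunk links to the chunk built before it, so that the last chunk written becomes the root.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// entryChunk is a simplified stand-in for schema.EntryChunk.
type entryChunk struct {
	Entries [][]byte
	Next    string // link to the previously built chunk; empty at the chain end
}

// store is a toy link system: link -> chunk.
type store map[string]entryChunk

// put stores a chunk and returns a content-derived link (toy encoding).
func (s store) put(c entryChunk) string {
	h := sha256.New()
	for _, e := range c.Entries {
		h.Write(e)
	}
	h.Write([]byte(c.Next))
	link := hex.EncodeToString(h.Sum(nil))
	s[link] = c
	return link
}

// chunkChain drains next and stores the multihashes as a chain of chunks of
// at most chunkSize entries. It returns the root link, or an empty link if
// the iterator yields nothing.
func chunkChain(s store, next func() ([]byte, bool), chunkSize int) (string, error) {
	if chunkSize < 1 {
		return "", errors.New("chunk size must be at least 1")
	}
	var root string
	var buf [][]byte
	for {
		mh, ok := next()
		if !ok {
			break
		}
		buf = append(buf, mh)
		if len(buf) == chunkSize {
			root = s.put(entryChunk{Entries: buf, Next: root})
			buf = nil
		}
	}
	if len(buf) > 0 { // flush the final, partially filled chunk
		root = s.put(entryChunk{Entries: buf, Next: root})
	}
	return root, nil
}

// sliceIter adapts a slice to the iterator shape used by chunkChain.
func sliceIter(items [][]byte) func() ([]byte, bool) {
	i := 0
	return func() ([]byte, bool) {
		if i >= len(items) {
			return nil, false
		}
		i++
		return items[i-1], true
	}
}

// chainSizes walks the chain from root and returns each chunk's entry count.
func chainSizes(s store, root string) []int {
	var sizes []int
	for link := root; link != ""; link = s[link].Next {
		sizes = append(sizes, len(s[link].Entries))
	}
	return sizes
}

func main() {
	s := store{}
	mhs := [][]byte{{1}, {2}, {3}, {4}, {5}}
	root, err := chunkChain(s, sliceIter(mhs), 2)
	if err != nil {
		panic(err)
	}
	fmt.Println(chainSizes(s, root)) // prints "[1 2 2]"
}
```

Walking from the root visits the most recently written chunk first, which is why the partially filled chunk appears at the head of the chain in this sketch.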

type EntriesChunker ¶

type EntriesChunker interface {
	// Chunk chunks multihashes supplied by a given provider.MultihashIterator into a chain of
	// schema.EntryChunk and returns the link of the chain root.
	// If the given iterator has no elements, this function returns a nil link with no error.
	Chunk(context.Context, provider.MultihashIterator) (ipld.Link, error)
}

EntriesChunker chunks multihashes supplied by a given provider.MultihashIterator into a chain of schema.EntryChunk.

type HamtChunker ¶

type HamtChunker struct {
	// contains filtered or unexported fields
}

HamtChunker chunks advertisement entries as an IPLD HAMT data structure. See: NewHamtChunker.

func NewHamtChunker ¶

func NewHamtChunker(ls *ipld.LinkSystem, hashAlg multicodec.Code, bitWidth, bucketSize int) (*HamtChunker, error)

NewHamtChunker instantiates a new HAMT chunker. Given a provider.MultihashIterator, the chunker drains all its multihashes and stores them in the given link system as an IPLD HAMT ADL.

Only multicodec.Identity, multicodec.Sha2_256 and multicodec.Murmur3X64_64 are supported as hash algorithm. The bit-width and bucket size must be at least 3 and 1 respectively.

See:

func (*HamtChunker) Chunk ¶

func (h *HamtChunker) Chunk(ctx context.Context, iterator provider.MultihashIterator) (ipld.Link, error)

Chunk drains all the multihashes in the given iterator, stores them as an IPLD HAMT ADL and returns the link to the root HAMT node.

The HAMT is used as a set where the keys in the map represent the multihashes and values are simply set to true.
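The set semantics can be pictured with a plain Go map standing in for the HAMT. This only illustrates the key/value convention described above (multihash as key, true as value). It does not reproduce the HAMT's node layout, hashing, or bucketing, and multihashSet and buildSet are hypothetical names.

```go
package main

import "fmt"

// multihashSet is a plain map standing in for the HAMT: keys are the raw
// multihash bytes and every value is simply true.
type multihashSet map[string]bool

// buildSet drains the given multihashes into the set. Inserting the same
// multihash twice leaves a single entry.
func buildSet(mhs [][]byte) multihashSet {
	set := make(multihashSet)
	for _, mh := range mhs {
		set[string(mh)] = true
	}
	return set
}

func main() {
	set := buildSet([][]byte{{0x12, 0x01}, {0x12, 0x02}, {0x12, 0x01}})
	fmt.Println(len(set))                        // prints "2"
	fmt.Println(set[string([]byte{0x12, 0x02})]) // prints "true"
	fmt.Println(set[string([]byte{0x12, 0x03})]) // prints "false"
}
```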

type NewChunkerFunc ¶

type NewChunkerFunc func(ls *ipld.LinkSystem) (EntriesChunker, error)

NewChunkerFunc instantiates the core EntriesChunker to use for generating advertisement entries DAG.

func NewChainChunkerFunc ¶

func NewChainChunkerFunc(chunkSize int) NewChunkerFunc

func NewHamtChunkerFunc ¶

func NewHamtChunkerFunc(hashAlg multicodec.Code, bitWidth, bucketSize int) NewChunkerFunc
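Judging by their signatures, NewChainChunkerFunc and NewHamtChunkerFunc capture the chunker configuration up front and defer construction until a link system is supplied. The sketch below shows that closure pattern with stand-in types. linkSystem, entriesChunker, newChunkerFunc and chainChunker are hypothetical simplifications, and the chunk size validation is an assumption rather than documented behaviour.

```go
package main

import (
	"errors"
	"fmt"
)

// linkSystem is a stand-in for ipld.LinkSystem.
type linkSystem struct{ name string }

// entriesChunker is a simplified stand-in for EntriesChunker.
type entriesChunker interface{ Describe() string }

// newChunkerFunc mirrors the shape of NewChunkerFunc: it builds a chunker
// once a link system is available.
type newChunkerFunc func(ls *linkSystem) (entriesChunker, error)

type chainChunker struct {
	ls        *linkSystem
	chunkSize int
}

func (c *chainChunker) Describe() string {
	return fmt.Sprintf("chain chunker (chunkSize=%d) on %s", c.chunkSize, c.ls.name)
}

// newChainChunkerFunc captures chunkSize in a closure; the caller supplies
// the link system later. The validation here is an assumption.
func newChainChunkerFunc(chunkSize int) newChunkerFunc {
	return func(ls *linkSystem) (entriesChunker, error) {
		if chunkSize < 1 {
			return nil, errors.New("chunk size must be at least 1")
		}
		return &chainChunker{ls: ls, chunkSize: chunkSize}, nil
	}
}

func main() {
	factory := newChainChunkerFunc(16384)
	c, err := factory(&linkSystem{name: "cache-backed link system"})
	if err != nil {
		panic(err)
	}
	fmt.Println(c.Describe())
}
```

Deferring construction this way lets a wrapper such as CachedEntriesChunker supply its own link system to whichever chunker the caller configured.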
