index

package
v0.1.2
Warning

This package is not in the latest version of its module.

Published: Nov 24, 2023 | License: MIT | Imports: 10 | Imported by: 0

Documentation

Overview

Package index provides an inverted index implementation and basic functionality for working with this data structure.

Constants

const IndexVersion = "v5.1"

IndexVersion is the version of the inverted index structure.

Variables

var (
	// ErrPostingListShouldBeNotNil occurs on an attempt to persist a nil posting list
	ErrPostingListShouldBeNotNil = errors.New("postingList should be not nil")
)

Functions

func NewEncoder

func NewEncoder() (compression.Encoder, error)

NewEncoder returns a new instance of Encoder

Types

type DocumentID

type DocumentID = uint32

DocumentID is a unique identifier of an indexed document

type Index

type Index = map[Term][]Position

Index is a low-level data structure that maps each term to its posting list
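
The shape of this map can be illustrated with a minimal sketch. It redeclares the package's aliases locally so that it compiles on its own, and addDocument is a hypothetical helper, not part of the package API; the package's own Writer may build its posting lists differently.

```go
package main

import "fmt"

// Local re-declarations that mirror the package's aliases, so the sketch
// compiles without importing the package itself.
type (
	DocumentID = uint32
	Position   = DocumentID
	Term       = string
	Index      = map[Term][]Position
)

// addDocument is a hypothetical helper (not part of the package API).
// It appends id to the posting list of every distinct term of the document.
func addDocument(idx Index, id DocumentID, terms []Term) {
	seen := make(map[Term]struct{}, len(terms))
	for _, t := range terms {
		if _, dup := seen[t]; dup {
			continue // list each document at most once per term
		}
		seen[t] = struct{}{}
		idx[t] = append(idx[t], id)
	}
}

func main() {
	idx := Index{}
	addDocument(idx, 1, []Term{"ab", "bc"})
	addDocument(idx, 2, []Term{"bc", "cd", "bc"})
	fmt.Println(idx["ab"], idx["bc"], idx["cd"]) // [1] [1 2] [2]
}
```

Adding documents in ascending ID order keeps every posting list sorted, which is what list-merging algorithms usually expect; whether this package requires that ordering is an assumption here.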

type Indices

type Indices = []Index

Indices is a list of Index values grouped by the size of a document's n-gram set

type InvertedIndex

type InvertedIndex interface {
	// Get returns the corresponding posting list for the given term
	Get(term Term) (PostingListContext, error)
	// Has reports whether the given term is in the inverted index
	Has(term Term) bool
}

InvertedIndex is an index data structure that contains a list of references to documents for each term

func NewInvertedIndex

func NewInvertedIndex(
	reader store.Input,
	table invertedIndexStructure,
) InvertedIndex

NewInvertedIndex returns a new instance of InvertedIndex that is stored on disk

type InvertedIndexIndices

type InvertedIndexIndices interface {
	// Get returns the InvertedIndex at the given index.
	// The index here is the n-gram cardinality of the documents
	Get(index int) InvertedIndex
	// Size returns the number of InvertedIndex values
	Size() int
	Size() int
}

InvertedIndexIndices is an array of InvertedIndex values in which the array index is the n-gram cardinality of the documents each one contains. Index 0 holds the inverted index that contains all documents, without separation by n-gram cardinality.
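
The cardinality layout can be sketched in memory with the plain Index and Indices aliases. This is an illustration under stated assumptions, not the package's on-disk implementation: the aliases are redeclared locally, add is a hypothetical helper, and "cardinality" is taken to mean the number of distinct terms in a document.

```go
package main

import "fmt"

// Local re-declarations that mirror the package's aliases.
type (
	DocumentID = uint32
	Position   = DocumentID
	Term       = string
	Index      = map[Term][]Position
	Indices    = []Index
)

// add is a hypothetical helper (not part of the package API). It records the
// document in slot 0 (all documents) and in the slot whose number equals the
// document's count of distinct terms, growing the slice as needed.
func add(indices Indices, id DocumentID, terms []Term) Indices {
	unique := make(map[Term]struct{}, len(terms))
	for _, t := range terms {
		unique[t] = struct{}{}
	}
	card := len(unique)
	for len(indices) <= card {
		indices = append(indices, Index{})
	}
	for t := range unique {
		indices[0][t] = append(indices[0][t], id)
		indices[card][t] = append(indices[card][t], id)
	}
	return indices
}

func main() {
	var indices Indices
	indices = add(indices, 1, []Term{"ab", "bc"})
	indices = add(indices, 2, []Term{"bc", "cd", "de"})

	fmt.Println(len(indices))     // 4 (slots 0 through 3)
	fmt.Println(indices[0]["bc"]) // [1 2]
	fmt.Println(indices[2]["bc"]) // [1]
	fmt.Println(indices[3]["bc"]) // [2]
}
```

Keeping a separate index per cardinality lets a caller restrict a query to documents of a particular size, while slot 0 still answers queries over the whole collection.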

func NewInvertedIndexIndices

func NewInvertedIndexIndices(indices []InvertedIndex) InvertedIndexIndices

NewInvertedIndexIndices returns new instance of InvertedIndexIndices

type Position

type Position = DocumentID

Position (posting) is a list item of PostingList

type PostingList

type PostingList interface {
	merger.ListIterator
	// Init initializes the given posting list with the provided context
	Init(context PostingListContext) error
}

PostingList represents a list of document IDs that belong to a given index term

type PostingListContext

type PostingListContext struct {
	ListSize int
	Reader   store.Input
}

PostingListContext is the entity that holds context information for the corresponding Posting List

type Reader

type Reader struct {
	// contains filtered or unexported fields
}

Reader provides access to a search index

func NewIndexReader

func NewIndexReader(
	directory store.Directory,
	config WriterConfig,
) *Reader

NewIndexReader returns a new instance of a search index reader

func (*Reader) Read

func (ir *Reader) Read() (InvertedIndexIndices, error)

Read reads the inverted index indices from the given directory

type Searcher

type Searcher interface {
	// Search searches the given index using the given terms and threshold
	Search(invertedIndex InvertedIndex, terms []Term, threshold int, collector merger.Collector) error
}

Searcher is responsible for searching
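
The page does not spell out what the threshold means. The sketch below assumes it is the minimum number of query terms a document must share with the query, and it uses simple counting over in-memory posting lists rather than the merger.ListMerger the package delegates to. The search function and the local aliases are illustrative only.

```go
package main

import (
	"fmt"
	"sort"
)

// Local re-declarations that mirror the package's aliases.
type (
	DocumentID = uint32
	Position   = DocumentID
	Term       = string
	Index      = map[Term][]Position
)

// search is a hypothetical stand-in for Searcher.Search. It returns, in
// ascending order, the IDs of documents that appear in the posting lists of
// at least threshold query terms. It assumes the query terms are distinct
// and that each posting list names a document at most once.
func search(idx Index, terms []Term, threshold int) []DocumentID {
	counts := make(map[DocumentID]int)
	for _, t := range terms {
		for _, id := range idx[t] {
			counts[id]++
		}
	}
	var hits []DocumentID
	for id, n := range counts {
		if n >= threshold {
			hits = append(hits, id)
		}
	}
	sort.Slice(hits, func(i, j int) bool { return hits[i] < hits[j] })
	return hits
}

func main() {
	idx := Index{
		"ab": {1},
		"bc": {1, 2},
		"cd": {2, 3},
	}
	query := []Term{"ab", "bc", "cd"}
	fmt.Println(search(idx, query, 2)) // [1 2]
	fmt.Println(search(idx, query, 1)) // [1 2 3]
}
```

Counting touches every posting of every query term; a dedicated list merger can usually skip parts of the lists, which is presumably why the package accepts one as a dependency.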

func NewSearcher

func NewSearcher(merger merger.ListMerger) Searcher

NewSearcher creates a new Searcher instance

type Term

type Term = string

Term represents an independent search element in a search document

type Writer

type Writer struct {
	// contains filtered or unexported fields
}

Writer creates and maintains an inverted index

func NewIndexWriter

func NewIndexWriter(
	directory store.Directory,
	config WriterConfig,
	encoder compression.Encoder,
) *Writer

NewIndexWriter returns a new instance of an index writer

func (*Writer) AddDocument

func (iw *Writer) AddDocument(id DocumentID, term []Term) error

AddDocument adds a new document with the given ID and terms

func (*Writer) Commit

func (iw *Writer) Commit() error

Commit commits all added documents to the index storage
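
A typical write path adds documents one by one and then commits. The sketch below shows that flow against a local interface that mirrors the two Writer methods listed above, because constructing a real store.Directory is outside what this page documents. The names documentWriter, indexAll and memoryWriter are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Local re-declarations that mirror the package's aliases.
type (
	DocumentID = uint32
	Term       = string
)

// documentWriter mirrors the method set of *Writer shown above.
type documentWriter interface {
	AddDocument(id DocumentID, term []Term) error
	Commit() error
}

// indexAll adds every document and commits once at the end.
func indexAll(w documentWriter, docs map[DocumentID][]Term) error {
	for id, terms := range docs {
		if err := w.AddDocument(id, terms); err != nil {
			return fmt.Errorf("add document %d: %w", id, err)
		}
	}
	if err := w.Commit(); err != nil {
		return fmt.Errorf("commit: %w", err)
	}
	return nil
}

// memoryWriter is a hypothetical in-memory stand-in for *Writer.
type memoryWriter struct {
	pending   map[DocumentID][]Term
	committed int
}

func (m *memoryWriter) AddDocument(id DocumentID, term []Term) error {
	if len(term) == 0 {
		return errors.New("document has no terms")
	}
	m.pending[id] = term
	return nil
}

func (m *memoryWriter) Commit() error {
	m.committed += len(m.pending)
	m.pending = map[DocumentID][]Term{}
	return nil
}

func main() {
	w := &memoryWriter{pending: map[DocumentID][]Term{}}
	err := indexAll(w, map[DocumentID][]Term{
		1: {"ab", "bc"},
		2: {"bc", "cd"},
	})
	fmt.Println(w.committed, err) // 2 <nil>
}
```

With the real package, the value returned by NewIndexWriter satisfies documentWriter, so indexAll could be called with it unchanged.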

type WriterConfig

type WriterConfig struct {
	HeaderFileName       string
	DocumentListFileName string
}

WriterConfig stores a set of file paths that are required for creating a search index
