hnsw

package
v0.0.12 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Mar 24, 2024 License: MIT Imports: 11 Imported by: 0

Documentation

Overview

Package hnsw implements the Hierarchical Navigable Small World (HNSW) graph for approximate nearest neighbor search.

Index

Constants

This section is empty.

Variables

View Source
var DefaultOptions = Options{
	M:            8,
	EF:           200,
	Heuristic:    true,
	DistanceType: index.DistanceTypeSquaredL2,
}

Functions

This section is empty.

Types

type HNSW

type HNSW struct {
	sync.RWMutex
	// contains filtered or unexported fields
}

HNSW represents the Hierarchical Navigable Small World graph

func New

func New(optFns ...func(o *Options)) *HNSW

New creates a new HNSW instance with the given options

func (*HNSW) BruteSearch

func (h *HNSW) BruteSearch(query []float32, k int, filter func(id uint32) bool) ([]index.SearchResult, error)

BruteSearch performs a brute-force search in the HNSW graph

func (*HNSW) GobDecode

func (h *HNSW) GobDecode(data []byte) error

GobDecode method for HNSW.

func (*HNSW) GobEncode

func (h *HNSW) GobEncode() ([]byte, error)

GobEncode method for HNSW.

func (*HNSW) Insert

func (h *HNSW) Insert(v []float32) (uint32, error)

Insert inserts a new element into the HNSW graph

func (*HNSW) KNNSearch

func (h *HNSW) KNNSearch(q []float32, k int, efSearch int, filter func(id uint32) bool) ([]index.SearchResult, error)

KNNSearch performs a k-nearest neighbor search in the HNSW graph

func (*HNSW) Stats

func (h *HNSW) Stats()

Stats prints statistics about the HNSW graph

type Node

type Node struct {
	Connections [][]uint32 // Links to other nodes
	Vector      []float32  // Vector (X dimensions)
	Layer       int        // Layer the node exists in the HNSW tree
	ID          uint32     // Unique identifier
}

Node represents a node in the HNSW graph

type Options

type Options struct {
	// M specifies the number of established connections for every new element during construction.
	// Reasonable range for M is 2-100. Higher M works better on datasets with high intrinsic dimensionality and/or high recall,
	// while low M works better for datasets with low intrinsic dimensionality and/or low recalls.
	// As an example, for dimension=4 random vectors, optimal M for search is somewhere around 6, while for high-dimensional datasets
	// (word embeddings, good face descriptors), higher M values are required (e.g., M=48-64) for optimal performance at high recall.
	// The range M=12-48 is ok for most use cases. When M is changed, one has to update the other parameters.
	// Nonetheless, ef and ef_construction parameters can be roughly estimated by assuming that M * ef_construction is a constant.
	M int

	// EF specifies the size of the dynamic candidate list.
	// EF is important for search time. Larger EF values can improve recall at the cost of increased search time.
	EF int

	// Heuristic indicates whether to use the heuristic algorithm (true) or the naive K-NN algorithm (false).
	// The heuristic algorithm is generally faster but may sacrifice some accuracy.
	Heuristic bool

	// DistanceType represents the type of distance function used for calculating distances between vectors.
	DistanceType index.DistanceType
}

Options represents the options for configuring HNSW.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL