hnsw

package
v1.0.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Sep 13, 2023 License: MIT-0 Imports: 12 Imported by: 0

Documentation

Index

Constants

View Source
const Efconstruction = 200 // Can be auto-configured using sample data
View Source
const M = 16

Set defaults for HNSW TODO: Auto select best values based on dataset and use ML model to determine

View Source
const Mmax = M
View Source
const Mmax0 = M * 2
View Source
const Version = 1.0

Variables

This section is empty.

Functions

This section is empty.

Types

type HNSW

type HNSW struct {
	Efconstruction int     // Size of the dynamic candidate list
	M              int     // Number of established connections (a reasonable range for M is from 5 to 48, smaller M generally produces better results for lower recalls and/or lower dimensional data. Bigger M is better for high recall and high dimensional data, and determines the memory consumption)
	Mmax           int     // Max number of connections per element/per layer (matches M)
	Mmax0          int     // Max for the 0 layer (Simulations suggest 2*M is a good choice, setting higher leads to performance degradation and excessive memory usage)
	Ml             float64 // Normalization factor for level generation
	Ep             int64   // Top layer of HNSW

	Maxlevel int // Track the current max level used

	Heuristic bool

	NodeList NodeList // Used to store the vectors within each node

	Wg sync.WaitGroup
	// contains filtered or unexported fields
}

func Load

func Load(filename string) (h HNSW, err error)

func New

func New(m int, mmax int, mmax0 int, efconstruction int, vecsize int) (h HNSW, err error)

func (*HNSW) AddConnections

func (h *HNSW) AddConnections(neighbourNode uint32, newNode uint32, level int)

Add links between nodes in the HNSW graph

func (*HNSW) BruteSearch

func (h *HNSW) BruteSearch(q *[]float32, K int) (topCandidates queue.PriorityQueue, err error)

Brute search

func (*HNSW) BruteSearchConcurrent

func (h *HNSW) BruteSearchConcurrent(size int, K int, numWorkers int) (resultChan chan SearchResults, jobs chan SearchQuery, err error)

Search concurrent

func (*HNSW) BruteSearchWorker

func (h *HNSW) BruteSearchWorker(id int, K int, jobs <-chan SearchQuery, resultChan chan<- SearchResults) error

func (*HNSW) FindEp

func (h *HNSW) FindEp(q *[]float32, currentObj *Node, layer int16) (match Node, currentDist float32, err error)

func (*HNSW) GetConnections

func (h *HNSW) GetConnections(ep *Node, level int) []uint32

Get links for a desired entry-point (ep) at a specified layer in the HNSW graph.

func (*HNSW) Insert

func (h *HNSW) Insert(q []float32) (uint32, error)

Input: Multi-layer graph hnsw, new element `q“, number of established connections `h.M“, max number of connections for each element per layer `h.Mmax“, size of the dynamic candidate list `h.efConstruction`, normalised factor for level generation `h.Ml` Output: update h inserting element q

func (*HNSW) InsertConcurrent

func (h *HNSW) InsertConcurrent(size int) (resultChan chan uint32, jobs chan []float32, err error)

Insert concurrent

func (*HNSW) InsertWorker

func (h *HNSW) InsertWorker(id int, jobs <-chan []float32, resultChan chan<- uint32) error

func (*HNSW) KnnSearch

func (h *HNSW) KnnSearch(q Node, K int, ef int) (nearestElements []Node)

Search layer Input: Query element `q`, number of nearest neighbours to return `K`, size of the dynamic candidate list `ef` Output: `nearestElements` to `q`

func (*HNSW) LegacySearchLayer

func (h *HNSW) LegacySearchLayer(q *[]float32, ep *[]float32, C *[]uint32, M int) queue.PriorityQueue

func (*HNSW) PeekNode

func (h *HNSW) PeekNode(id int) Node

Peek a node

func (*HNSW) Save

func (h *HNSW) Save(filename string) (err error)

func (*HNSW) Search

func (h *HNSW) Search(q *[]float32, topCandidates *queue.PriorityQueue, K int, efSearch int) (err error)

Find query point `q` and result `K` results (max-heap)

func (*HNSW) SearchConcurrent

func (h *HNSW) SearchConcurrent(size int, K int, efSearch int, numWorkers int) (resultChan chan SearchResults, jobs chan SearchQuery, err error)

Search concurrent

func (*HNSW) SearchLayer

func (h *HNSW) SearchLayer(q *[]float32, ep *queue.Item, topCandidates *queue.PriorityQueue, ef int, level uint) (err error)

Input: Query element `q`, enter point `ep`, `M` number of nearest to `q“ elements to return, layer number `layerNum` Output: `nearestElements` closest neighbours to `q`

func (*HNSW) SearchWorker

func (h *HNSW) SearchWorker(id int, K int, efSearch int, jobs <-chan SearchQuery, resultChan chan<- SearchResults) error

func (*HNSW) SelectNeighboursHeuristic

func (h *HNSW) SelectNeighboursHeuristic(topCandidates *queue.PriorityQueue, M int, order bool)

func (*HNSW) SelectNeighboursSimple

func (h *HNSW) SelectNeighboursSimple(topCandidates *queue.PriorityQueue, M int)

Input: Candidate elements `C`, number of neighbours to return `M` Output: `M` nearest elements in heap

func (*HNSW) Stats

func (h *HNSW) Stats()

type HNSW_Meta

type HNSW_Meta struct {
	Efconstruction int     // Size of the dynamic candidate list
	M              int     // Number of established connections (a reasonable range for M is from 5 to 48, smaller M generally produces better results for lower recalls and/or lower dimensional data. Bigger M is better for high recall and high dimensional data, and determines the memory consumption)
	Mmax           int     // Max number of connections per element/per layer (matches M)
	Mmax0          int     // Max for the 0 layer (Simulations suggest 2*M is a good choice, setting higher leads to performance degradation and excessive memory usage)
	Ml             float64 // Normalization factor for level generation
	Ep             int64   // Top layer of HNSW

	Maxlevel int // Track the current max level used

	Heuristic bool
}

type Node

type Node struct {
	Connections [][]uint32 // Links to other nodes
	Vectors     []float32  // Vector (X dimensions)
	Layer       int        // Layer the node exists in the HNSW tree
	Id          uint32     // Unique identifier
}

type NodeList

type NodeList struct {
	Nodes []Node
	// contains filtered or unexported fields
}

type SearchQuery

type SearchQuery struct {
	Id int
	Qp []float32
}

type SearchResults

type SearchResults struct {
	Id             int
	BestCandidates queue.PriorityQueue
}

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL