Documentation ¶
Index ¶
- type ArgMaxHeap
- type Bucket
- type Buckets
- type Float
- type Hash
- type HashingTreeLevel
- type LookupTable
- type Maddness
- type Vector
- func (v Vector[F]) Add(other Vector[F]) Vector[F]
- func (v Vector[F]) AddSquares(other Vector[F]) Vector[F]
- func (v Vector[_]) ArgMin() int
- func (v Vector[F]) Copy() Vector[F]
- func (v Vector[F]) DivScalar(x F) Vector[F]
- func (v Vector[F]) DotProduct(other Vector[F]) (y F)
- func (v Vector[F]) Reverse() Vector[F]
- func (v Vector[F]) Square() Vector[F]
- func (v Vector[F]) Sub(other Vector[F]) Vector[F]
- type Vectors
- func (vs Vectors[F]) ColumnWiseVariance() Vector[F]
- func (vs Vectors[F]) Copy() Vectors[F]
- func (vs Vectors[F]) CumulativeSSE() Vector[F]
- func (vs Vectors[F]) Mean() Vector[F]
- func (vs Vectors[F]) OptimalSplitThreshold(splitIndex int) (threshold, loss F)
- func (vs Vectors[F]) Reverse() Vectors[F]
- func (vs Vectors[F]) SortByColumn(index int) Vectors[F]
- func (vs Vectors[F]) SplitByThreshold(index int, threshold F) (lt, gte Vectors[F])
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type ArgMaxHeap ¶
type ArgMaxHeap[F Float] struct { // contains filtered or unexported fields }
ArgMaxHeap is a max-heap working on the indices of a Vector, without modifying the Vector itself.
func NewArgMaxHeap ¶
func NewArgMaxHeap[F Float](v Vector[F]) *ArgMaxHeap[F]
NewArgMaxHeap creates and initializes a new ArgMaxHeap object.
func (*ArgMaxHeap[_]) FirstArgsMax ¶
func (h *ArgMaxHeap[_]) FirstArgsMax(n int) []int
FirstArgsMax returns the first n arguments of the maxima (arg max) of v.
If n is greater than Len, only Len indices are returned.
func (*ArgMaxHeap[_]) Len ¶
func (h *ArgMaxHeap[_]) Len() int
Len is the number of indices in the collection.
func (*ArgMaxHeap[_]) Less ¶
func (h *ArgMaxHeap[_]) Less(i, j int) bool
Less reports whether the index at i must sort before the index of j. It returns true only if the vector's value at the i-th index is grater than the value at the j-th index.
func (*ArgMaxHeap[_]) Pop ¶
func (h *ArgMaxHeap[_]) Pop() any
Pop removes and returns the index of the maximum element (according to Less) from the heap.
func (*ArgMaxHeap[_]) Push ¶
func (h *ArgMaxHeap[_]) Push(any)
Push always panics: in order to simplify the implementation, it is not allowed to add new elements.
func (*ArgMaxHeap[_]) Swap ¶
func (h *ArgMaxHeap[_]) Swap(i, j int)
Swap swaps the i-th and j-th indices.
type Bucket ¶
A Bucket is a set of Vectors, supporting the learning process of MADDNESS hash function parameters.
The tree Level, and the NodeIndex within that level, are included as a convenient for debugging or logging purposes.
type Buckets ¶
Buckets is a set of Bucket items. Usually, they all belong to the same level of the hashing tree, built during the learning process.
func (Buckets[_]) HeuristicSelectIndices ¶
HeuristicSelectIndices selects a set of indices to be evaluated during the construction of a hashing-tree level.
It returns a maximum of top-four indices that contribute the most loss, summed across all Buckets.
func (Buckets[F]) Prototypes ¶
Prototypes creates the prototype Vectors.
For each Bucket, a prototype Vector is created by computing the mean of its sub-vectors.
type Hash ¶
type Hash[F Float] struct { TreeLevels []*HashingTreeLevel[F] Prototypes Vectors[F] }
Hash is the data structure for MADDNESS hash function. It holds the learned balanced binary regression tree and the prototype vectors.
type HashingTreeLevel ¶
HashingTreeLevel is one level of the binary tree from a Hash.
type LookupTable ¶
type LookupTable[F Float] struct { Bias F Scale F // Matrix, represented in row-major order, with row index = subspace, // and column index = prototype. Data []uint8 }
LookupTable holds a table of pre-computed dot products, quantized to 8 bits, and the parameters needed for de-quantization during the product quantization's aggregation step.
type Maddness ¶
type Maddness[F Float] struct { NumSubspaces int VectorSize int SubVectorSize int Hashes []*Hash[F] LookupTables []*LookupTable[F] }
Maddness is the primary structure that holds parameters and implements methods of the whole MADDNESS algorithm.
func TrainMaddness ¶
TrainMaddness runs the learning process for MADDNESS product quantization and hash functions parameters, returning a new trained Maddness object.
func (*Maddness[F]) DotProduct ¶
DotProduct computes the approximated dot product between a data vector, identified by the lookup-table indices obtained from the vector's quantization, and the query vector represented by queryVectorIndex.
func (*Maddness[F]) LookupTableIndices ¶
LookupTableIndices transforms a list of hash indices, as returned from Quantize, into a corresponding list of lookup-table indices, for accessing LookupTable.Data.
func (*Maddness[F]) Quantize ¶
Quantize splits the given vector into subspaces and returns a slice of hash indices, one for each subspace.
func (*Maddness[F]) Reconstruct ¶
Reconstruct builds a vector from a list of hash indices, reconstructed using the learned prototypes for each subspace.
type Vector ¶
type Vector[F Float] []F
A Vector is a slice of floating point values.
func (Vector[F]) Add ¶
Add modifies v adding to it the values from other, element-wise. It returns v.
func (Vector[F]) AddSquares ¶
AddSquares modifies v adding to it the squared values from other, element-wise. It returns v.
func (Vector[_]) ArgMin ¶
ArgMin returns the index of the minimum value from the vector.
If the identical minimum value is present more than once in the vector, the index of its first occurrence (i.e. the lowest index) is returned.
func (Vector[F]) DotProduct ¶
DotProduct computes the dot product between v and other.
type Vectors ¶
Vectors is a collection of Vector objects.
All Vector objects MUST be of the same size.
Given a collection of N vectors, each of size M, this can also be seen as a matrix with dimensions N×M (N rows, M columns).
func (Vectors[F]) ColumnWiseVariance ¶
ColumnWiseVariance returns a new Vector containing the column-wise variance computed across all vectors.
func (Vectors[F]) CumulativeSSE ¶
CumulativeSSE returns a new Vector, computing the error sum of squares (SSE) over the set of Vectors.
func (Vectors[F]) OptimalSplitThreshold ¶
OptimalSplitThreshold finds the optimal split threshold within the vectors, computed over values at the given split-index column.
func (Vectors[F]) SortByColumn ¶
SortByColumn sorts vs in place, according to the value at each vector's index (column), in ascending order. It returns vs.
func (Vectors[F]) SplitByThreshold ¶
SplitByThreshold splits the collection of vectors by a threshold value, compared with the elements at the given index.
Two new collections are returned:
- lt: vectors whose value at "index" is lower than "threshold"
- gte: vectors whose value at "index" is greater than or equal to "threshold"