fti

package
v0.0.0-...-7b96089 Latest
Warning

This package is not in the latest version of its module.

Published: Jun 30, 2023 License: AGPL-3.0 Imports: 25 Imported by: 0

Documentation


Constants

This section is empty.

Variables

var ErrAbort = errors.New("abort")
var ErrNoConfig = errors.New("no config file found")

Functions

This section is empty.

Types

type ChunkSpec

type ChunkSpec struct {
	MaxTokens int `json:"max_tokens"`
	Overlap   int `json:"overlap"`
}

type Config

type Config struct {
	Embedding struct {
		Provider string `json:"provider"`
		Model    string `json:"model"`
	} `json:"embedding"`

	ChunkSpecs []ChunkSpec `json:"chunk_specs"`
}
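
The struct tags above imply a JSON layout for the configuration file. As a minimal sketch, the program below decodes such a document into copies of the Config and ChunkSpec types. The sample values and the parseConfig helper are hypothetical placeholders, not the package's defaults or API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ChunkSpec and Config are copied from the declarations in this document.
type ChunkSpec struct {
	MaxTokens int `json:"max_tokens"`
	Overlap   int `json:"overlap"`
}

type Config struct {
	Embedding struct {
		Provider string `json:"provider"`
		Model    string `json:"model"`
	} `json:"embedding"`

	ChunkSpecs []ChunkSpec `json:"chunk_specs"`
}

// sampleConfig is a hypothetical document. The provider, model, and chunk
// sizes are placeholders chosen for illustration only.
const sampleConfig = `{
  "embedding": {"provider": "example-provider", "model": "example-model"},
  "chunk_specs": [
    {"max_tokens": 512, "overlap": 64},
    {"max_tokens": 1024, "overlap": 128}
  ]
}`

// parseConfig (hypothetical helper) decodes a JSON document into a Config.
func parseConfig(data []byte) (Config, error) {
	var cfg Config
	err := json.Unmarshal(data, &cfg)
	return cfg, err
}

func main() {
	cfg, err := parseConfig([]byte(sampleConfig))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(cfg.Embedding.Provider, cfg.Embedding.Model, len(cfg.ChunkSpecs))
}
```

Each entry in chunk_specs becomes one ChunkSpec, so a repository can presumably be indexed at several chunk sizes at once.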

type DocumentReference

type DocumentReference struct {
	Path string
}

type FileCursor

type FileCursor struct {
	fs.DirEntry

	Path string
	Err  error
}

type IndexedDocument

type IndexedDocument struct {
	psi.NodeBase

	Spec ChunkSpec
}

type Iterator

type Iterator[T any] interface {
	Next() bool
	Item() T
}

func Filter

func Filter[IT Iterator[T], T any](it IT, pred func(T) bool) Iterator[T]

Filter returns a new iterator that yields only the items of it for which pred returns true.
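
One way a function with this signature could work is sketched below. This is not the package's source: sliceIterator, filterIterator, and collect are hypothetical helpers, and only the Iterator interface and the Filter signature are taken from this page.

```go
package main

import "fmt"

// Iterator is copied from the declaration in this document.
type Iterator[T any] interface {
	Next() bool
	Item() T
}

// sliceIterator is a hypothetical iterator over a slice.
type sliceIterator[T any] struct {
	items []T
	pos   int // number of items already visited
}

func (s *sliceIterator[T]) Next() bool {
	if s.pos >= len(s.items) {
		return false
	}
	s.pos++
	return true
}

func (s *sliceIterator[T]) Item() T { return s.items[s.pos-1] }

// filterIterator wraps a source iterator and skips rejected items.
type filterIterator[T any] struct {
	src  Iterator[T]
	pred func(T) bool
	cur  T
}

// Next advances the source until the predicate accepts an item.
func (f *filterIterator[T]) Next() bool {
	for f.src.Next() {
		if item := f.src.Item(); f.pred(item) {
			f.cur = item
			return true
		}
	}
	return false
}

func (f *filterIterator[T]) Item() T { return f.cur }

// Filter uses the documented signature; the body is an assumption.
func Filter[IT Iterator[T], T any](it IT, pred func(T) bool) Iterator[T] {
	return &filterIterator[T]{src: it, pred: pred}
}

// collect (hypothetical helper) drains an iterator into a slice.
func collect[T any](it Iterator[T]) []T {
	var out []T
	for it.Next() {
		out = append(out, it.Item())
	}
	return out
}

func main() {
	src := &sliceIterator[int]{items: []int{1, 2, 3, 4, 5, 6}}
	even := Filter(src, func(n int) bool { return n%2 == 0 })
	fmt.Println(collect(even)) // prints [2 4 6]
}
```

The wrapper is lazy: the predicate runs only as the caller pulls items, so filtering a long directory walk does not buffer it.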

func IterateFiles

func IterateFiles(ctx context.Context, dirPath string) Iterator[FileCursor]

IterateFiles traverses the directory tree rooted at dirPath and returns an iterator that yields a FileCursor for each file found. Internally, each file's info is sent over a channel. Cancelling ctx stops the traversal.

type ObjectSnapshotImage

type ObjectSnapshotImage struct {
	Chunks     []chunkers.Chunk
	Embeddings []llm.Embedding
	Document   DocumentReference
}

func (*ObjectSnapshotImage) ReadFrom

func (osi *ObjectSnapshotImage) ReadFrom(r io.Reader) (int, error)

func (*ObjectSnapshotImage) WriteTo

func (osi *ObjectSnapshotImage) WriteTo(w io.Writer) (int, error)

WriteTo encodes the ObjectSnapshotImage as a PNG image and writes it to w. Pixel colors are derived from the embedding values.

type ObjectSnapshotMetadata

type ObjectSnapshotMetadata struct {
	Path       string `json:"path"`
	Hash       string `json:"hash"`
	ChunkCount []int  `json:"chunk_count"`
}

type OnlineIndex

type OnlineIndex struct {
	Repository *Repository
	// contains filtered or unexported fields
}

func NewOnlineIndex

func NewOnlineIndex(repo *Repository) (*OnlineIndex, error)

NewOnlineIndex creates an OnlineIndex for the given repository, starting with an empty entry mapping. It builds a Faiss index with faiss.NewIndexFlatIP using a dimension of 1536 and stores it in the idx field. If creating the index fails, it returns nil and the error; otherwise it returns the new OnlineIndex and a nil error.

func (*OnlineIndex) Add

func (oi *OnlineIndex) Add(img *ObjectSnapshotImage) error

Add adds the embeddings of an ObjectSnapshotImage to the online index.

The function acquires a write lock, which keeps the index safe for concurrent use, and takes the current number of entries in the index as the base index. For each embedding in the image, it creates an OnlineIndexEntry holding the entry's index, chunk, and embedding, stores it in the repository with putEntry(), and adds the embedding to the Faiss index. It returns the first error encountered, or nil on success.

func (*OnlineIndex) Query

func (oi *OnlineIndex) Query(q llm.Embedding, k int64) ([]OnlineIndexQueryHit, error)

Query performs a search in the online index using the given query embedding and returns a list of hits. Each hit contains the corresponding entry from the index and the distance between the query and the entry embedding.
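
The Add and Query descriptions above can be illustrated with a small in-memory stand-in. This sketch does not use Faiss: flatIndex, entry, and hit are hypothetical types, and the brute-force scan only mimics what a flat inner-product index computes. With an inner-product index the returned value is best read as a similarity score, where larger means closer, which is likely what the Distance field carries here.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// entry and hit are simplified stand-ins for OnlineIndexEntry and
// OnlineIndexQueryHit; Text replaces the chunk for brevity.
type entry struct {
	Index     int64
	Text      string
	Embedding []float32
}

type hit struct {
	Entry *entry
	Score float32 // inner product with the query; larger is more similar
}

// flatIndex is a hypothetical brute-force index guarded by a RWMutex.
type flatIndex struct {
	mu      sync.RWMutex
	entries []*entry
}

// Add appends embeddings under a write lock. New entries are numbered
// from the current size of the index, as the Add description suggests.
func (fi *flatIndex) Add(texts []string, embeddings [][]float32) error {
	if len(texts) != len(embeddings) {
		return fmt.Errorf("got %d texts and %d embeddings", len(texts), len(embeddings))
	}
	fi.mu.Lock()
	defer fi.mu.Unlock()
	base := int64(len(fi.entries))
	for i, emb := range embeddings {
		fi.entries = append(fi.entries, &entry{Index: base + int64(i), Text: texts[i], Embedding: emb})
	}
	return nil
}

// dot returns the inner product over the shared prefix of a and b.
func dot(a, b []float32) float32 {
	var sum float32
	for i := 0; i < len(a) && i < len(b); i++ {
		sum += a[i] * b[i]
	}
	return sum
}

// Query returns up to k entries with the largest inner product.
func (fi *flatIndex) Query(q []float32, k int) []hit {
	fi.mu.RLock()
	defer fi.mu.RUnlock()
	hits := make([]hit, 0, len(fi.entries))
	for _, e := range fi.entries {
		hits = append(hits, hit{Entry: e, Score: dot(q, e.Embedding)})
	}
	sort.SliceStable(hits, func(i, j int) bool { return hits[i].Score > hits[j].Score })
	if k < 0 {
		k = 0
	}
	if k < len(hits) {
		hits = hits[:k]
	}
	return hits
}

func main() {
	fi := &flatIndex{}
	err := fi.Add([]string{"a", "b", "c"}, [][]float32{{1, 0}, {0, 1}, {0.5, 0.5}})
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, h := range fi.Query([]float32{1, 0}, 2) {
		fmt.Println(h.Entry.Index, h.Entry.Text, h.Score)
	}
}
```

A linear scan costs time proportional to the number of stored embeddings per query, which is also true of a flat index; the difference in a real library is mostly in optimized vector math.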

type OnlineIndexEntry

type OnlineIndexEntry struct {
	Index     int64
	Chunk     chunkers.Chunk
	Embedding llm.Embedding
	Document  DocumentReference
}

OnlineIndexEntry represents a single entry in the online index. It holds the entry's position in the index, its chunk and embedding, and a reference to the source document.

type OnlineIndexQueryHit

type OnlineIndexQueryHit struct {
	Entry    *OnlineIndexEntry
	Distance float32
}

OnlineIndexQueryHit represents a single search hit in the online index.

type Repository

type Repository struct {
	// contains filtered or unexported fields
}

func NewRepository

func NewRepository(repoPath string) (r *Repository, err error)

NewRepository creates a new Repository with the given repository path. It initializes the repository by loading the configuration and ignore file, creating a new online index, and loading the index if it exists.

func (*Repository) FileExists

func (r *Repository) FileExists(filePath string) bool

func (*Repository) Init

func (r *Repository) Init() error

Init initializes the repository by creating the necessary directories and configuration file. It creates the .fti directory and writes the default configuration to the config.json file.

func (*Repository) IsIgnored

func (r *Repository) IsIgnored(name string) bool

func (*Repository) IterateFiles

func (r *Repository) IterateFiles(ctx context.Context) Iterator[FileCursor]

IterateFiles returns an iterator over the files in the repository. It skips directories, files outside the repository path, and files matched by the repository's ignore file. Cancelling ctx stops the iteration.

func (*Repository) OpenFile

func (r *Repository) OpenFile(filePath string) (fs.File, error)

func (*Repository) Query

func (r *Repository) Query(ctx context.Context, query string, k int64) ([]OnlineIndexQueryHit, error)

Query searches the repository for content similar to the given query string. It takes a context for cancellation, the query string, and the maximum number of results (k) to return. It returns a slice of OnlineIndexQueryHit describing the matches, and an error if the search fails.

func (*Repository) RelativeToRoot

func (r *Repository) RelativeToRoot(name string) string

func (*Repository) RepoPath

func (r *Repository) RepoPath() string

func (*Repository) ResolveDbPath

func (r *Repository) ResolveDbPath(p ...string) string

func (*Repository) ResolvePath

func (r *Repository) ResolvePath(p ...string) string

func (*Repository) Update

func (r *Repository) Update(ctx context.Context) error

Update iterates over the files in the repository and calls UpdateFile on each one, using ctx to handle cancellation. After all files are updated, it writes the index to the index.faiss file.

func (*Repository) UpdateFile

func (r *Repository) UpdateFile(ctx context.Context, f FileCursor) error

UpdateFile updates a single file in the repository. It takes a context for cancellation and a FileCursor representing the file. The function reads the file, computes its hash, and creates a directory named after the hash in the objects directory. It then calls updateFileWithSpec for each chunk specification in the repository's configuration, records the chunk count for each specification in the metadata, and writes the metadata to a JSON file. It returns an error if any step fails, or nil on success.
