autocomplete

package module
v0.0.0-...-457066c Latest Latest
Published: Aug 13, 2023 License: MIT Imports: 18 Imported by: 0

README ¶

Autocomplete

🚧 Under construction. - Coming soon 🚧

A project to explore tries, ternary search trees, DFS, and more!

Overview

Autocomplete provides an in-memory autocompletion engine. It is designed to be used in conjunction with a data store to provide autocompletion suggestions for a given prefix. This is useful for things like search bars, command line interfaces, and more.

WIP
  • Create some sort of saveSnapshot locally or to buckets even???
  • Enable data sources that are responsible for populating the keyword list.
  • Add configuration option to choose the type of tree to use.
  • Complete Default Formatter tests
  • Complete KeywordList Formatter tests
  • Complete LocalFileProvider tests
  • Complete GoogleStorageBucketProvider tests
  • Complete GithubProvider tests
  • Complete AutoCompleteService tests
  • Add any missing New methods.
  • Work on complete sample service
  • Benchmarks w/ Examples
  • Cleanup unused properties/settings.
  • Setup cmd/autocompleter cli tool to run an AutoCompleteService
FUTURE
  • Profile to try and improve memory, performance and GC time.
  • Sharding Trees across mutexes
  • Export with binary instead of plain text
  • Extend by adding an additional Data structure
  • What would parallel computing look like?

Documentation ¶

Index ¶

Constants ¶

const SERVICE_NAME = "autocomplete"

Variables ¶

This section is empty.

Functions ¶

func WithAutomaticUpdates ¶

func WithAutomaticUpdates(c *ServiceConfig)

func WithLoadDataSourcesOnStart ¶

func WithLoadDataSourcesOnStart(c *ServiceConfig)

func WithLowMemoryMode ¶

func WithLowMemoryMode(c *ServiceConfig)

func WithSnapshotsEnabled ¶

func WithSnapshotsEnabled(c *ServiceConfig)

Types ¶

type AutocompleteService ¶

type AutocompleteService struct {
	Config *ServiceConfig

	Errors      []error
	LastUpdated int64
	// contains filtered or unexported fields
}

AutocompleteService is the main object you will interact with. It manages the autocompleter, data sources, and snapshots. It also provides a direct interface to the autocompleter, so you do not have to access the store to use its functionality.

func New ¶

func New(opts *ServiceConfig, keywords []string) (*AutocompleteService, error)

New creates a new AutocompleteService instance and performs all of the setup. This makes a call to LoadDataSources(). If you wish to skip this, set the LoadDataSourcesOnStart option to false.

You can also pass a slice of keywords to this function to seed the service store.

func (*AutocompleteService) Add ¶

func (a *AutocompleteService) Add(word string)

func (*AutocompleteService) AddSnapshotDest ¶

func (a *AutocompleteService) AddSnapshotDest(dest DataSource)

func (*AutocompleteService) Clear ¶

func (a *AutocompleteService) Clear(runGC bool)

Clear removes all data from the store when you want to start fresh. There are two ways to approach this. The safe way is to set an empty node as the root and let the GC reclaim the old one.

The alternative is to trigger a GC cycle manually. This is strongly discouraged, but may be required during a memory shortage.

Pass the runGC flag to this function if you wish to trigger the GC cycle manually. Note that, per the runtime.GC() godocs, running GC manually can:

  • Block the caller until the garbage collection is complete.
  • Block the entire program.
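
The two approaches can be sketched as follows. This is a minimal sketch under assumptions: the store type here is a hypothetical map-backed stand-in, not the package's tree, and only runtime.GC() is a real standard library call.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// store is a hypothetical stand-in for the service's internal store.
type store struct {
	mu   sync.Mutex
	root map[string]struct{}
}

func newStore() *store { return &store{root: make(map[string]struct{})} }

// Add records a word in the store.
func (s *store) Add(word string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.root[word] = struct{}{}
}

// Len reports how many words are stored.
func (s *store) Len() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.root)
}

// Clear swaps in an empty root. The old root becomes unreachable and is
// reclaimed by a later GC cycle, or immediately if runGC is true.
func (s *store) Clear(runGC bool) {
	s.mu.Lock()
	s.root = make(map[string]struct{})
	s.mu.Unlock()
	if runGC {
		runtime.GC() // blocks the caller until the collection is complete
	}
}

func main() {
	s := newStore()
	s.Add("alpha")
	s.Add("beta")
	s.Clear(false)       // safe path: let the GC run on its own schedule
	fmt.Println(s.Len()) // prints 0
}
```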

func (*AutocompleteService) Close ¶

func (a *AutocompleteService) Close() error

Close checks for the SnapshotDest and DataSources and closes the providers associated with each. This is useful for a graceful shutdown to make sure all writes/reads are complete before exiting.

Note: I chose to use a composite error here for error handling, so that the caller doesn't have to solve one problem in order to get to the next (if one exists). So instead of returning on the first error we receive, we make our way through all data sources first, then generate a composite error with all errors we received along the way, append it to the AutocompleteService.Errors list, and return it.

With this approach we no longer need a complex management system in place for the Errors slice on our service.
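
The composite-error idea can be sketched with the standard library's errors.Join (Go 1.20+). This is an illustration only: closer, fakeProvider, and closeAll are hypothetical names, and the package may build its composite error differently.

```go
package main

import (
	"errors"
	"fmt"
)

// closer is the minimal behavior this sketch needs from a provider.
type closer interface{ Close() error }

// fakeProvider is a hypothetical provider that can be told to fail.
type fakeProvider struct {
	name string
	fail bool
}

func (p fakeProvider) Close() error {
	if p.fail {
		return fmt.Errorf("close %s: connection reset", p.name)
	}
	return nil
}

// closeAll closes every provider, collecting errors instead of stopping
// at the first one. errors.Join returns nil when there are no errors.
func closeAll(providers []closer) error {
	var errs []error
	for _, p := range providers {
		if err := p.Close(); err != nil {
			errs = append(errs, err)
		}
	}
	return errors.Join(errs...)
}

func main() {
	err := closeAll([]closer{
		fakeProvider{name: "a", fail: true},
		fakeProvider{name: "b"},
		fakeProvider{name: "c", fail: true},
	})
	fmt.Println(err)
	// prints:
	// close a: connection reset
	// close c: connection reset
}
```

The caller sees every failure in one pass and can still inspect individual errors with errors.Is or errors.As.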

func (*AutocompleteService) Complete ¶

func (a *AutocompleteService) Complete(prefix string) []string

I am giving these functions different names to avoid implementing the internal autocompleter interface on the service itself. This also provides quick access instead of having to go through the store, and gives us room to add more functionality later.

func (*AutocompleteService) CreateSnapshot ¶

func (a *AutocompleteService) CreateSnapshot() error

func (*AutocompleteService) DisplayGraph ¶

func (a *AutocompleteService) DisplayGraph() ([]byte, error)

TODO: Add future functionality to allow the user to pass in a data source instead. This requires a redesign of the formatter and provider interfaces, mainly the formatter. It was designed specifically around keywords, however it's probably going to need to grow later on as we support various data structures.

func (*AutocompleteService) Exists ¶

func (a *AutocompleteService) Exists(word string) bool

func (*AutocompleteService) ExportToDataSource ¶

func (a *AutocompleteService) ExportToDataSource(dest DataSource) error

func (*AutocompleteService) GetContents ¶

func (a *AutocompleteService) GetContents() []string

func (*AutocompleteService) LoadDataSource ¶

func (a *AutocompleteService) LoadDataSource(src DataSource) error

func (*AutocompleteService) LoadDataSources ¶

func (a *AutocompleteService) LoadDataSources() error

func (*AutocompleteService) RestoreFromSnapshot ¶

func (a *AutocompleteService) RestoreFromSnapshot() error

type BitbucketProvider ¶

type BitbucketProvider struct{}

TODO: Future provider

type ConfigFn ¶

type ConfigFn func(*ServiceConfig)

ConfigFn is a functional option: a function that modifies a ServiceConfig. Pass ConfigFn values to NewServiceConfig() to build the config that is handed to the New() function.

func WithDataSources ¶

func WithDataSources(sources []DataSource) ConfigFn

func WithMaxResults ¶

func WithMaxResults(max int) ConfigFn

WithMaxResults sets the maximum number of results to return. Leave this as 0 for unlimited.
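
A minimal sketch of the "0 means unlimited" convention follows. limitResults is a hypothetical helper written for this example; how the service applies MaxResults internally is not shown in these docs.

```go
package main

import "fmt"

// limitResults truncates results to at most maxResults entries.
// A value of 0 (or less) is treated as unlimited.
func limitResults(results []string, maxResults int) []string {
	if maxResults <= 0 || len(results) <= maxResults {
		return results
	}
	return results[:maxResults]
}

func main() {
	words := []string{"go", "golang", "gopher"}
	fmt.Println(limitResults(words, 2)) // prints [go golang]
	fmt.Println(limitResults(words, 0)) // prints [go golang gopher]
}
```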

func WithServiceName ¶

func WithServiceName(name string) ConfigFn

func WithSnapshotDest ¶

func WithSnapshotDest(dest DataSource) ConfigFn

func WithSnapshotInterval ¶

func WithSnapshotInterval(interval int) ConfigFn

type DataProvider ¶

type DataProvider interface {
	ReadData(fileName string, store PublicProviderStore, fmtr Formatter) error
	DumpData(fileName string, store PublicProviderStore, fmtr Formatter) error
	Close() error
}

DataProvider is an interface that allows a DataSource of some kind to be used either to update the data inside the AutocompleteService store or to export the data from the store to the DataSource.
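
To show the shape of a custom provider, here is a sketch of an in-memory implementation. The three interfaces are local copies of the ones documented on this page so the example compiles on its own; memoryProvider, lineFormat, and sliceStore are hypothetical helpers, not part of the package.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Local copies of the documented interfaces.
type PublicProviderStore interface {
	Insert(word string)
	ListContents() []string
}

type Formatter interface {
	FormatRead(data []byte, fileName string) ([]string, error)
	FormatWrite(keywords []string, fileName string) ([]byte, error)
}

type DataProvider interface {
	ReadData(fileName string, store PublicProviderStore, fmtr Formatter) error
	DumpData(fileName string, store PublicProviderStore, fmtr Formatter) error
	Close() error
}

// memoryProvider is a hypothetical provider that keeps "files" in a map.
type memoryProvider struct {
	files  map[string][]byte
	closed bool
}

var _ DataProvider = (*memoryProvider)(nil) // compile-time interface check

// ReadData decodes the named file and inserts each keyword into the store.
func (m *memoryProvider) ReadData(fileName string, store PublicProviderStore, fmtr Formatter) error {
	if m.closed {
		return errors.New("provider closed")
	}
	data, ok := m.files[fileName]
	if !ok {
		return fmt.Errorf("file %q not found", fileName)
	}
	words, err := fmtr.FormatRead(data, fileName)
	if err != nil {
		return err
	}
	for _, w := range words {
		store.Insert(w)
	}
	return nil
}

// DumpData encodes the store contents and writes them under fileName.
func (m *memoryProvider) DumpData(fileName string, store PublicProviderStore, fmtr Formatter) error {
	if m.closed {
		return errors.New("provider closed")
	}
	data, err := fmtr.FormatWrite(store.ListContents(), fileName)
	if err != nil {
		return err
	}
	m.files[fileName] = data
	return nil
}

func (m *memoryProvider) Close() error { m.closed = true; return nil }

// lineFormat is a hypothetical whitespace-delimited formatter.
type lineFormat struct{}

func (lineFormat) FormatRead(data []byte, _ string) ([]string, error) {
	return strings.Fields(string(data)), nil
}

func (lineFormat) FormatWrite(keywords []string, _ string) ([]byte, error) {
	return []byte(strings.Join(keywords, "\n")), nil
}

// sliceStore is a hypothetical store backed by a slice.
type sliceStore struct{ words []string }

func (s *sliceStore) Insert(w string)        { s.words = append(s.words, w) }
func (s *sliceStore) ListContents() []string { return s.words }

func main() {
	p := &memoryProvider{files: map[string][]byte{"in.txt": []byte("alpha\nbeta\n")}}
	st := &sliceStore{}
	if err := p.ReadData("in.txt", st, lineFormat{}); err != nil {
		panic(err)
	}
	if err := p.DumpData("out.txt", st, lineFormat{}); err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", p.files["out.txt"]) // prints "alpha\nbeta"
}
```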

type DataSource ¶

type DataSource struct {
	Provider  DataProvider
	Formatter Formatter
	Filepath  string
	Url       string
}

DataSource is either a source or a destination for reads and exports. Use it when you have a resource that should populate keywords into the AutocompleteService store. You will also use it when exporting data and snapshots from the service.

NOTICE: We have placed the formatter on the DataSource and not the Provider. This allows flexible data formatting based on the source. You may have one file in JSON format and another in CSV format; if the formatter were tied to the provider and both files came from the same provider, one of them would fail.

We did not think it was necessary to add user CLI input to this process. That can be handled when generating the autocomplete service.

func NewDataSource ¶

func NewDataSource(provider DataProvider, fmtr Formatter, filepath string, url string) *DataSource

NewDataSource adds a default formatter if nil is provided for fmtr. See DefaultFormatter in formatter/formatter.go for more information.

type DefaultFormat ¶

type DefaultFormat []string

DefaultFormat requires that your file decode into a slice of strings. Basically a non-nested JSON array of strings.

TYPE: type DefaultFormat []string

Example: keywords.json

[
  "keyword1",
  "keyword2",
  "keyword3"
]

Example: keywords.txt

keyword1
keyword2
keyword3

Example: keywords.csv

keyword1,keyword2,keyword3

Example: keywords.yaml

- keyword1
- keyword2
- keyword3
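
For the JSON case, decoding into this type can be sketched with encoding/json. decodeKeywords is a hypothetical helper for illustration; it is not the package's FormatRead, which also handles the other file types shown above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// DefaultFormat mirrors the documented type: a flat list of keywords.
type DefaultFormat []string

// decodeKeywords parses a non-nested JSON array of strings.
func decodeKeywords(data []byte) ([]string, error) {
	var kws DefaultFormat
	if err := json.Unmarshal(data, &kws); err != nil {
		return nil, fmt.Errorf("decode keywords: %w", err)
	}
	return kws, nil
}

func main() {
	kws, err := decodeKeywords([]byte(`["keyword1", "keyword2", "keyword3"]`))
	if err != nil {
		panic(err)
	}
	fmt.Println(kws) // prints [keyword1 keyword2 keyword3]

	// A top-level object does not decode into a slice, so this fails.
	_, err = decodeKeywords([]byte(`{"keywords": ["keyword1"]}`))
	fmt.Println(err != nil) // prints true
}
```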

func (DefaultFormat) FormatRead ¶

func (f DefaultFormat) FormatRead(data []byte, fileName string) ([]string, error)

func (DefaultFormat) FormatWrite ¶

func (f DefaultFormat) FormatWrite(keywords []string, fileName string) ([]byte, error)

type Formatter ¶

type Formatter interface {
	FormatRead(data []byte, fileName string) ([]string, error)
	FormatWrite(keywords []string, fileName string) ([]byte, error)
}

The Formatter interface is used to define formatters to assign to data sources. This allows us to provide a stable API with default options while also offering the flexibility to define custom formatters if you see fit.

TIP: Verify you have implemented the Formatter interface correctly by adding a compile-time assertion:

var _ formatter.Formatter = (*YourTypeHere)(nil)

NOTE: Though it is not required to satisfy the interface, it is standard to define a named type if you're not using a user-defined struct. For example: `type DefaultFormat []string`

Implementing the Formatter interface requires two methods. FormatRead takes the file data and returns a slice of strings (keywords). FormatWrite takes a slice of keywords and returns the bytes to write.
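
A custom formatter might look like the sketch below. The Formatter interface is copied locally from this page so the example is self-contained; PipeFormat and its pipe-delimited file layout are hypothetical, chosen only to illustrate the two methods.

```go
package main

import (
	"fmt"
	"strings"
)

// Formatter is a local copy of the documented interface.
type Formatter interface {
	FormatRead(data []byte, fileName string) ([]string, error)
	FormatWrite(keywords []string, fileName string) ([]byte, error)
}

// PipeFormat is a hypothetical formatter for "a|b|c" style files.
type PipeFormat struct{}

// Compile-time check that PipeFormat satisfies Formatter.
var _ Formatter = (*PipeFormat)(nil)

// FormatRead splits on "|", trims whitespace, and skips empty entries.
func (PipeFormat) FormatRead(data []byte, fileName string) ([]string, error) {
	var out []string
	for _, part := range strings.Split(string(data), "|") {
		if kw := strings.TrimSpace(part); kw != "" {
			out = append(out, kw)
		}
	}
	if len(out) == 0 {
		return nil, fmt.Errorf("%s: no keywords found", fileName)
	}
	return out, nil
}

// FormatWrite joins the keywords with "|".
func (PipeFormat) FormatWrite(keywords []string, fileName string) ([]byte, error) {
	return []byte(strings.Join(keywords, "|")), nil
}

func main() {
	var f Formatter = PipeFormat{}
	kws, err := f.FormatRead([]byte("keyword1 | keyword2|keyword3\n"), "keywords.psv")
	if err != nil {
		panic(err)
	}
	out, _ := f.FormatWrite(kws, "keywords.psv")
	fmt.Println(string(out)) // prints keyword1|keyword2|keyword3
}
```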

type GithubOpts ¶

type GithubOpts struct {
	AuthorName  string
	AuthorEmail string
	Repository  string
	BaseBranch  string
}

type GithubProvider ¶

type GithubProvider struct {
	Config     GithubOpts
	SourceOnly bool
	// contains filtered or unexported fields
}

func NewGithubProvider ¶

func NewGithubProvider(authToken string, opts GithubOpts) *GithubProvider

NewGithubProvider is a factory method for creating a new GithubProvider. NOTE: If you wish to use the SourceOnly feature of the GithubProvider, you must pass an empty string for the authToken.

func (*GithubProvider) Close ¶

func (g *GithubProvider) Close() error

Close does little more than set the closed flag to true and remove any references to the client. If the github.Client implements a Transport with a CloseIdleConnections() method, any idle connections are closed; otherwise this method is a no-op.

func (*GithubProvider) DumpData ¶

func (g *GithubProvider) DumpData(msg, fileName string, store PublicProviderStore, fmtr Formatter) error

func (*GithubProvider) ReadData ¶

func (g *GithubProvider) ReadData(fileName string, store PublicProviderStore, fmtr Formatter) error

type GoogleStorageBucketProvider ¶

type GoogleStorageBucketProvider struct {
	BucketName string
	// DefaultTimeout will be 5 minutes.
	DefaultTimeout time.Duration
	// contains filtered or unexported fields
}

GoogleStorageBucketProvider is a provider for reading and writing data to a Google Storage Bucket. It works for both private and public buckets as long as your GOOGLE_APPLICATION_CREDENTIALS environment variable or GoogleStorageBucketProvider.credentials is set to a valid service account JSON file.

func NewGoogleStorageBucketProvider ¶

func NewGoogleStorageBucketProvider(name string, timeout time.Duration, creds *google.Credentials) (*GoogleStorageBucketProvider, error)

Pass 0 for timeout if you wish to use a default timeout.

func (*GoogleStorageBucketProvider) Close ¶

Close only closes the client, instead of tracking whether a read or write operation is in progress and closing that reader or writer. This might have to change.

func (*GoogleStorageBucketProvider) DumpData ¶

func (g *GoogleStorageBucketProvider) DumpData(fileName string, store PublicProviderStore, fmtr Formatter) error

func (*GoogleStorageBucketProvider) ReadData ¶

func (g *GoogleStorageBucketProvider) ReadData(fileName string, store PublicProviderStore, fmtr Formatter) error

type KeywordObjectListFormat ¶

type KeywordObjectListFormat struct {
	Keywords []string `json:"keywords" yaml:"keywords"`
}

KeywordObjectListFormat requires a top-level object with a "keywords" key whose value is a list of strings.

TYPE: type KeywordObjectListFormat struct {
	Keywords []string `json:"keywords" yaml:"keywords"`
}

Example: keywords.json

{
  "keywords": [
    "keyword1",
    "keyword2",
    "keyword3"
  ]
}

Example: keywords.yaml

keywords:
  - keyword1
  - keyword2
  - keyword3

Example: keywords.csv

keywords
keyword1,keyword2,keyword3

Example: keywords.txt

keywords
keyword1
keyword2
keyword3
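
The JSON layout above maps onto the struct tags directly. The sketch below uses only encoding/json; decodeObject and encodeObject are hypothetical helpers, not the package's FormatRead and FormatWrite, and YAML, CSV, and TXT handling is left out.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// KeywordObjectListFormat mirrors the documented struct.
type KeywordObjectListFormat struct {
	Keywords []string `json:"keywords" yaml:"keywords"`
}

// decodeObject reads {"keywords": [...]} and returns the list.
func decodeObject(data []byte) ([]string, error) {
	var k KeywordObjectListFormat
	if err := json.Unmarshal(data, &k); err != nil {
		return nil, fmt.Errorf("decode keyword object: %w", err)
	}
	return k.Keywords, nil
}

// encodeObject wraps the keywords in a top-level "keywords" object.
func encodeObject(keywords []string) ([]byte, error) {
	return json.Marshal(KeywordObjectListFormat{Keywords: keywords})
}

func main() {
	kws, err := decodeObject([]byte(`{"keywords": ["keyword1", "keyword2"]}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(kws) // prints [keyword1 keyword2]

	out, err := encodeObject(kws)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // prints {"keywords":["keyword1","keyword2"]}
}
```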

func (KeywordObjectListFormat) FormatRead ¶

func (k KeywordObjectListFormat) FormatRead(data []byte, fileName string) ([]string, error)

func (KeywordObjectListFormat) FormatWrite ¶

func (k KeywordObjectListFormat) FormatWrite(keywords []string, fileName string) ([]byte, error)

type LocalFileProvider ¶

type LocalFileProvider struct {
	*os.File

	Filename string
	// contains filtered or unexported fields
}

Use a local file to read and write data.

func NewLocalFileProvider ¶

func NewLocalFileProvider(fileName string) (*LocalFileProvider, error)

func (*LocalFileProvider) Close ¶

func (l *LocalFileProvider) Close() error

My thought here is that if AutocompleteService.Close() is called while a read or write operation is in progress, we can go ahead and shut it down.

func (*LocalFileProvider) DumpData ¶

func (l *LocalFileProvider) DumpData(fileName string, store PublicProviderStore, fmtr Formatter) error

func (*LocalFileProvider) ReadData ¶

func (l *LocalFileProvider) ReadData(fileName string, store PublicProviderStore, fmtr Formatter) error

type PublicProviderStore ¶

type PublicProviderStore interface {
	Insert(word string)
	ListContents() []string
}

By implementing this interface, users can mock their store when testing their custom providers. This allows us to keep the autocomplete interface private. At the same time, it is satisfied by the AutocompleteService store, which is what the service passes into the provider methods when it executes them.
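
A mock store for provider tests could be sketched like this. The interface is copied locally from this page; mockStore and loadWords are hypothetical names, with loadWords standing in for whatever provider logic is under test.

```go
package main

import (
	"fmt"
	"strings"
)

// PublicProviderStore is a local copy of the documented interface.
type PublicProviderStore interface {
	Insert(word string)
	ListContents() []string
}

// mockStore records every inserted word so a test can assert on it.
type mockStore struct{ inserted []string }

// Compile-time check that mockStore satisfies the interface.
var _ PublicProviderStore = (*mockStore)(nil)

func (m *mockStore) Insert(word string) { m.inserted = append(m.inserted, word) }

// ListContents returns a copy so callers cannot mutate the recording.
func (m *mockStore) ListContents() []string {
	return append([]string(nil), m.inserted...)
}

// loadWords is hypothetical provider logic: it inserts each
// whitespace-separated word and reports how many were inserted.
func loadWords(store PublicProviderStore, raw string) int {
	words := strings.Fields(raw)
	for _, w := range words {
		store.Insert(w)
	}
	return len(words)
}

func main() {
	m := &mockStore{}
	n := loadWords(m, "alpha beta\ngamma")
	fmt.Println(n, m.ListContents()) // prints 3 [alpha beta gamma]
}
```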

type ServiceConfig ¶

type ServiceConfig struct {
	ServiceName string
	// Leave 0 for unlimited.
	MaxResults       int
	SnapshotsEnabled bool
	SnapshotInterval int

	AutomaticUpdates       bool
	LoadDataSourcesOnStart bool
	LowMemoryMode          bool

	SnapshotDest *DataSource
	DataSources  []DataSource
}

ServiceConfig contains all of the configurable options for initializing a new autocomplete service.

You can use the NewServiceConfig() function to create a new instance of this type.

func NewServiceConfig ¶

func NewServiceConfig(opts ...ConfigFn) *ServiceConfig

NewServiceConfig creates a new ServiceConfig instance with the default values. Then performs any updates based on the config functions passed in.

Example:

config := NewServiceConfig(
  WithServiceName("my-service"),
  WithMaxResults(10),
  WithSnapshotsEnabled, // already has the ConfigFn signature, so pass it without calling it
)

This creates a new ServiceConfig with the default values, then updates ServiceName and MaxResults and enables snapshots.
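
The functional-options pattern behind NewServiceConfig can be sketched as below. This is a trimmed, self-contained reconstruction from the signatures on this page, not the package source: the struct has only three of the documented fields, and the default service name is an assumption based on the SERVICE_NAME constant. Because WithSnapshotsEnabled is documented as func(c *ServiceConfig), it is passed as a value rather than called.

```go
package main

import "fmt"

// ServiceConfig is a trimmed stand-in for the documented struct.
type ServiceConfig struct {
	ServiceName      string
	MaxResults       int
	SnapshotsEnabled bool
}

// ConfigFn mirrors the documented option type.
type ConfigFn func(*ServiceConfig)

// WithServiceName returns an option that sets the service name.
func WithServiceName(name string) ConfigFn {
	return func(c *ServiceConfig) { c.ServiceName = name }
}

// WithMaxResults returns an option that caps results; 0 means unlimited.
func WithMaxResults(n int) ConfigFn {
	return func(c *ServiceConfig) { c.MaxResults = n }
}

// WithSnapshotsEnabled has the ConfigFn signature itself.
func WithSnapshotsEnabled(c *ServiceConfig) { c.SnapshotsEnabled = true }

// NewServiceConfig starts from defaults, then applies each option in order.
func NewServiceConfig(opts ...ConfigFn) *ServiceConfig {
	c := &ServiceConfig{ServiceName: "autocomplete"} // assumed default
	for _, opt := range opts {
		opt(c)
	}
	return c
}

func main() {
	cfg := NewServiceConfig(
		WithServiceName("my-service"),
		WithMaxResults(10),
		WithSnapshotsEnabled,
	)
	fmt.Printf("%+v\n", *cfg)
	// prints {ServiceName:my-service MaxResults:10 SnapshotsEnabled:true}
}
```

Options are applied in order, so a later option overrides an earlier one that touches the same field.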

Directories ¶

Path Synopsis
cmd
