Documentation ¶
Overview ¶
Package botanic provides functions to grow a regression tree (tree.Tree)
Index ¶
- func BranchOut(ctx context.Context, task *queue.Task, t *tree.Tree, ps *PruningStrategy) (tasks []*queue.Task, e error)
- func Seed(ctx context.Context, classFeature feature.Feature, features []feature.Feature, ...) (*tree.Tree, error)
- func Work(ctx context.Context, t *tree.Tree, q queue.Queue, ps *PruningStrategy, ...) error
- type Partition
- type Pruner
- type PrunerFunc
- type PruningStrategy
Functions ¶
func BranchOut ¶
func BranchOut(ctx context.Context, task *queue.Task, t *tree.Tree, ps *PruningStrategy) (tasks []*queue.Task, e error)
BranchOut takes a context, a task, a tree and a pruning strategy. It develops the node in the task, using the task's set and available features to predict the tree's class feature, and returns either a set of tasks to develop the resulting child nodes or an error.
func Seed ¶
func Seed(ctx context.Context, classFeature feature.Feature, features []feature.Feature, s set.Set, q queue.Queue, ns tree.NodeStore) (*tree.Tree, error)
Seed takes a context, a class feature, a slice of features, a set of data, a queue and a node store and sets everything up so that workers that subsequently consume from the queue grow a tree that predicts the given class feature using the features in the given slice, according to the training data in the given set. Specifically, it creates the root node of the tree on the node store and pushes a task to branch it out onto the queue. The function returns the tree to be grown, or an error if the node cannot be created on the store or the task cannot be pushed to the queue in the amount of time allowed by the given context.
func Work ¶
func Work(ctx context.Context, t *tree.Tree, q queue.Queue, ps *PruningStrategy, emptyQueueSleep time.Duration) error
Work takes a context, a tree, a queue, a pruning strategy and an emptyQueueSleep duration and enters a loop in which it:
- pulls a task from the queue,
- branches its node out into new subnodes using BranchOut,
- pushes the tasks for the new subnodes into the queue,
- marks the task as completed on the queue.
If at some point no task can be pulled from the queue and the sum of tasks running and pending on the queue is 0, the worker ends returning nil. If no task can be pulled but the sum is not 0, then the worker will sleep for the given emptyQueueSleep duration and then retry.
Work will return a non-nil error if the given context times out or is cancelled, if BranchOut returns a non-nil error or if an operation with the given queue returns a non-nil error.
Types ¶
type Partition ¶
type Partition struct {
	Feature feature.Feature
	Tasks   []*queue.Task
	// contains filtered or unexported fields
}
Partition represents a partition of a set according to a feature into subtrees, along with its information gain for predicting the class feature.
func NewContinuousPartition ¶
func NewContinuousPartition(ctx context.Context, s set.Set, f *feature.ContinuousFeature, classFeature feature.Feature, p Pruner) (*Partition, error)
NewContinuousPartition takes a context.Context, a set, a continuous feature, a class feature and a pruner and returns a partition of the set for the given feature. The result may be nil if the obtained information gain is considered insufficient.
func NewDiscretePartition ¶
func NewDiscretePartition(ctx context.Context, s set.Set, f *feature.DiscreteFeature, classFeature feature.Feature, p Pruner) (*Partition, error)
NewDiscretePartition takes a context.Context, a set, a discrete feature, a class feature and a pruner and returns a partition of the set for the given feature. The result may be nil if the obtained information gain is considered insufficient.
type Pruner ¶
type Pruner interface {
Prune(ctx context.Context, s set.Set, p *Partition, classFeature feature.Feature) (bool, error)
}
Pruner is an interface wrapping the Prune method, which can be used to decide whether a partition is good enough to become part of a tree or must be pruned instead.
The Prune method takes a context, a set, a partition and a class feature and returns a boolean: true to indicate the partition must be pruned, false to allow it to be added to the tree and developed further.
func DefaultPruner ¶
func DefaultPruner() Pruner
DefaultPruner returns a Pruner whose Prune method computes a minimum information gain for the partition and returns true if the partition's information gain is below this minimum and false otherwise. This minimum is calculated as
(1/N) x log2(N-1) + (1/N) x [ log2(3k-2) - (k x Entropy(S) - k1 x Entropy(S1) - k2 x Entropy(S2) ... - ki x Entropy(Si)) ]
with
- N being the number of elements in the set S
- k being the number of different values for the class feature on the set
- k1, k2, ... ki being the number of different values for the class feature on the subset for the partition subtree 1, 2, ... i
- S1, S2, ... Si being the subset of data for the partition subtree 1, 2, ... i
func FixedInformationGainPruner ¶
FixedInformationGainPruner takes an informationGainThreshold float64 value and returns a Pruner whose Prune method returns whether the informationGainThreshold is greater than or equal to the received partition's information gain.
type PrunerFunc ¶
type PrunerFunc func(ctx context.Context, s set.Set, p *Partition, classFeature feature.Feature) (bool, error)
PrunerFunc wraps a function with the Prune method signature to implement the Pruner interface
type PruningStrategy ¶
type PruningStrategy struct {
	// Pruner is applied during the partition
	// of a node's set with a feature to determine
	// if the result is worth incorporating
	// into the tree.
	Pruner
	// MinimumEntropy is the maximum value of
	// entropy for a node that prevents it from
	// being branched out at all. In other words,
	// nodes whose training set of data has an
	// entropy equal or below this will not be
	// developed.
	MinimumEntropy float64
}
PruningStrategy holds the configuration for when a node should not be partitioned further, or at all.
Directories ¶
Path | Synopsis |
---|---|
cmd | |
botanic | Botanic is a tool to grow and use regression trees from sets of samples |
feature | Package feature defines features and criteria for those features |
yaml | Package yaml provides methods to parse feature.Feature specifications, also known as metadata, from YAML documents. |
queue | Package queue defines tasks to be performed to grow a tree as well as an interface for a Queue to manage them. |
set | Package set defines interfaces for sets as samples. |
csv | Package csv provides functions to read/write a set.Set as CSV |
inputsample | Package inputsample provides an implementation of set.Sample that is read from an io.Reader. |
sqlset | Package sqlset provides implementations of set.Set that use SQL databases as backends. |
sqlset/pgadapter | Package pgadapter provides an implementation of the Adapter interface in the sqlset package that works over a PostgreSQL database. |
sqlset/sqlite3adapter | Package sqlite3adapter provides an implementation of the Adapter interface in the sqlset package that works over a SQLite3 database. |
tree | Package tree defines a regression tree |
json | Package json provides functions that marshal/unmarshal a tree.Tree as/from JSON |