id3

Version: v0.1.0
Published: Oct 7, 2021 | License: MIT | Imports: 5 | Imported by: 0

README

github.com/gbkr-com/id3

An implementation of the ID3 decision tree algorithm, which learns from CSV-conformant data.

The code is organised as follows:

  • views.go provides an interface, and implementations of it, through which ID3 inspects CSV data.
  • decisions.go defines the internal representation of the decision tree, including writing that tree to JSON and reading it back.
  • learn.go is the ID3 algorithm itself.

Documentation

Functions

func AverageEntropy

func AverageEntropy(view View, attribute, class string) (h float64)

AverageEntropy returns the average entropy of the class column over each distinct value of the attribute column.

func Entropy

func Entropy(p float64) float64

Entropy returns the Shannon entropy for the given probability. The edge cases of probability zero and probability one both return zero entropy.

func TotalEntropy

func TotalEntropy(view View, class string) (h float64)

TotalEntropy returns the total entropy of the class column in the view.

Types

type Case

type Case struct {
	Value  string    // The distinct column value.
	Class  string    // The decided class value, or "" if further decision(s) are needed.
	Decide *Decision // The subsequent decision, or nil.
}

A Case pairs a distinct column value with its associated action: either a decided class value or a subsequent decision.

type Decision

type Decision struct {
	Column string  // The name of the data column.
	Cases  []*Case // The cases for that column.
}

Decision represents a decision within the decision tree for a single column. Each distinct value in that column is a case. The cases are ordered by decreasing probability.

func FromJSON

func FromJSON(b []byte) (*Decision, error)

FromJSON translates the given JSON-formatted byte slice into a decision.

func Learn

func Learn(view View, class string) *Decision

Learn runs the ID3 algorithm on the given view using the named class column.
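For orientation, the following is a compact sketch of textbook ID3 on in-memory rows. It is not the code in learn.go: the types caseNode and decision only mirror the documented Case and Decision, and the helpers split, entropy and learn, the usable mask, and the majority-class fallback are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// caseNode and decision mirror the documented Case and Decision types.
type caseNode struct {
	Value  string
	Class  string
	Decide *decision
}

type decision struct {
	Column string
	Cases  []*caseNode
}

// split groups rows by their value in column col. The returned values are
// ordered by decreasing group size, with ties kept in first-seen order.
func split(rows [][]string, col int) ([]string, map[string][][]string) {
	var order []string
	groups := map[string][][]string{}
	for _, r := range rows {
		if _, seen := groups[r[col]]; !seen {
			order = append(order, r[col])
		}
		groups[r[col]] = append(groups[r[col]], r)
	}
	sort.SliceStable(order, func(i, j int) bool {
		return len(groups[order[i]]) > len(groups[order[j]])
	})
	return order, groups
}

// entropy returns -sum(p*log2(p)) over the distinct values in column col.
func entropy(rows [][]string, col int) float64 {
	order, groups := split(rows, col)
	h := 0.0
	for _, v := range order {
		p := float64(len(groups[v])) / float64(len(rows))
		h -= p * math.Log2(p)
	}
	return h
}

// learn picks the usable column with the lowest average class entropy, then
// recurses into each impure subset with that column marked unusable.
func learn(header []string, rows [][]string, class int, usable []bool) *decision {
	best, bestH := -1, math.Inf(1)
	for col := range header {
		if col == class || !usable[col] {
			continue
		}
		order, groups := split(rows, col)
		h := 0.0
		for _, v := range order {
			h += float64(len(groups[v])) / float64(len(rows)) * entropy(groups[v], class)
		}
		if h < bestH {
			best, bestH = col, h
		}
	}
	if best < 0 {
		return nil // no attribute left to split on
	}
	rest := append([]bool(nil), usable...)
	rest[best] = false
	d := &decision{Column: header[best]}
	order, groups := split(rows, best)
	for _, v := range order {
		classes, _ := split(groups[v], class)
		c := &caseNode{Value: v}
		if len(classes) > 1 {
			c.Decide = learn(header, groups[v], class, rest)
		}
		if c.Decide == nil {
			c.Class = classes[0] // the only class, or the most frequent one
		}
		d.Cases = append(d.Cases, c)
	}
	return d
}

func main() {
	header := []string{"outlook", "windy", "play"}
	rows := [][]string{
		{"sunny", "true", "no"}, {"sunny", "false", "no"}, {"sunny", "false", "no"},
		{"rain", "true", "no"}, {"rain", "false", "yes"},
	}
	tree := learn(header, rows, 2, []bool{true, true, false})
	fmt.Println(tree.Column, len(tree.Cases))
}
```

Minimising the average entropy is equivalent to maximising information gain, because the total entropy is the same for every candidate column at a given node.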

func (*Decision) Decide

func (d *Decision) Decide(data [][]string) (result []string)

Decide applies this decision to the given CSV-conformant data. The first row must contain the column headings.
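One plausible way to walk such a tree over headed rows is sketched below. The types mirror the documented Case and Decision; the functions decide and walk are hypothetical, and returning "" for a row that matches no case is an assumption, since the source does not say what the package does in that situation.

```go
package main

import "fmt"

// caseNode and decision mirror the documented Case and Decision types so
// that the sketch compiles on its own.
type caseNode struct {
	Value  string
	Class  string
	Decide *decision
}

type decision struct {
	Column string
	Cases  []*caseNode
}

// decide returns one class per data row. The first row of data holds the
// column headings and produces no result.
func decide(d *decision, data [][]string) []string {
	if len(data) == 0 {
		return nil
	}
	index := map[string]int{} // heading name -> column index
	for i, name := range data[0] {
		index[name] = i
	}
	result := make([]string, 0, len(data)-1)
	for _, row := range data[1:] {
		result = append(result, walk(d, index, row))
	}
	return result
}

// walk follows matching cases until one carries a class. It returns "" when
// the column is missing or no case matches (assumed behaviour).
func walk(d *decision, index map[string]int, row []string) string {
	for d != nil {
		i, ok := index[d.Column]
		if !ok || i >= len(row) {
			return ""
		}
		var match *caseNode
		for _, c := range d.Cases {
			if c.Value == row[i] {
				match = c
				break
			}
		}
		if match == nil {
			return ""
		}
		if match.Decide == nil {
			return match.Class
		}
		d = match.Decide
	}
	return ""
}

func main() {
	tree := &decision{Column: "outlook", Cases: []*caseNode{
		{Value: "sunny", Class: "no"},
		{Value: "rain", Decide: &decision{Column: "windy", Cases: []*caseNode{
			{Value: "true", Class: "no"},
			{Value: "false", Class: "yes"},
		}}},
	}}
	data := [][]string{
		{"outlook", "windy"},
		{"sunny", "false"},
		{"rain", "false"},
		{"rain", "true"},
	}
	fmt.Println(decide(tree, data)) // prints [no yes no]
}
```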

func (*Decision) ToJSON

func (d *Decision) ToJSON(indent bool) ([]byte, error)

ToJSON returns this decision as a JSON-formatted byte slice.
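A round trip through JSON can be sketched with encoding/json on mirrored types. The functions toJSON and fromJSON are hypothetical stand-ins, and the field names that appear in the output are Go's defaults; the JSON layout the package actually writes is not documented here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// caseNode and decision mirror the documented Case and Decision types.
type caseNode struct {
	Value  string
	Class  string
	Decide *decision
}

type decision struct {
	Column string
	Cases  []*caseNode
}

// toJSON encodes the tree, indented with two spaces when indent is true.
func toJSON(d *decision, indent bool) ([]byte, error) {
	if indent {
		return json.MarshalIndent(d, "", "  ")
	}
	return json.Marshal(d)
}

// fromJSON decodes a tree previously written by toJSON.
func fromJSON(b []byte) (*decision, error) {
	d := &decision{}
	if err := json.Unmarshal(b, d); err != nil {
		return nil, err
	}
	return d, nil
}

func main() {
	tree := &decision{Column: "outlook", Cases: []*caseNode{
		{Value: "sunny", Class: "no"},
		{Value: "rain", Decide: &decision{Column: "windy"}},
	}}
	b, err := toJSON(tree, true)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))

	back, err := fromJSON(b)
	if err != nil {
		panic(err)
	}
	fmt.Println(back.Column, len(back.Cases))
}
```

Because the tree is a plain recursive structure of exported fields, no custom marshalling is needed in this sketch.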

type Distinct

type Distinct struct {
	Value       string
	Probability float64
}

Distinct is a distinct column value and its associated probability.

func Likelihood

func Likelihood(view View, column string) []Distinct

Likelihood returns the probability of each distinct value in the named column of the view. The slice is sorted by decreasing probability.
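The computation can be sketched on a plain slice of column values. Here likelihood is a hypothetical stand-in that does not use a View, and the alphabetical tie-break is an assumption added only to make the order deterministic; the source does not specify one.

```go
package main

import (
	"fmt"
	"sort"
)

// distinct mirrors the documented Distinct type for this standalone sketch.
type distinct struct {
	Value       string
	Probability float64
}

// likelihood counts each distinct value and converts the counts into
// probabilities, sorted by decreasing probability.
func likelihood(values []string) []distinct {
	counts := map[string]int{}
	for _, v := range values {
		counts[v]++
	}
	out := make([]distinct, 0, len(counts))
	for v, n := range counts {
		out = append(out, distinct{v, float64(n) / float64(len(values))})
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Probability != out[j].Probability {
			return out[i].Probability > out[j].Probability
		}
		return out[i].Value < out[j].Value // assumed tie-break
	})
	return out
}

func main() {
	for _, d := range likelihood([]string{"rain", "sunny", "rain", "overcast"}) {
		fmt.Printf("%s %.2f\n", d.Value, d.Probability)
	}
}
```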

type View

type View interface {

	// Columns returns the column names in this view. Columns with names of ""
	// have been hidden from this view - see Drop method.
	//
	Columns() []string

	// First returns to before the first row in this view.
	//
	First()

	// Next returns the next row in the view, or nil if there are no more
	// rows.
	//
	Next() []string

	// Select returns a view that shows only rows having the given value in the
	// column.
	//
	Select(column, value string) View

	// Drop returns a view which 'hides' the named column.
	//
	Drop(column string) View
}

View is the interface through which ID3 inspects CSV-conformant data. It provides a cursor-like mechanism for reading the data, through the First() and Next() methods.
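To make the cursor and the Select and Drop semantics concrete, here is a minimal in-memory sketch. The interface view is a local copy of the documented View, and memView is a hypothetical implementation; in particular, leaving the hidden column's cells in each row is an assumption, as the source only says the column name becomes "".

```go
package main

import "fmt"

// view is a local copy of the documented View interface.
type view interface {
	Columns() []string
	First()
	Next() []string
	Select(column, value string) view
	Drop(column string) view
}

// memView is a hypothetical in-memory implementation of view.
type memView struct {
	columns []string // "" marks a hidden column
	rows    [][]string
	cursor  int
}

func (m *memView) Columns() []string { return m.columns }

// First rewinds the cursor to before the first row.
func (m *memView) First() { m.cursor = 0 }

// Next returns the row under the cursor and advances, or nil at the end.
func (m *memView) Next() []string {
	if m.cursor >= len(m.rows) {
		return nil
	}
	row := m.rows[m.cursor]
	m.cursor++
	return row
}

// indexOf returns the position of a visible column, or -1.
func (m *memView) indexOf(column string) int {
	if column == "" {
		return -1
	}
	for i, name := range m.columns {
		if name == column {
			return i
		}
	}
	return -1
}

// Select keeps only the rows whose cell in column equals value.
func (m *memView) Select(column, value string) view {
	out := &memView{columns: m.columns}
	if i := m.indexOf(column); i >= 0 {
		for _, r := range m.rows {
			if r[i] == value {
				out.rows = append(out.rows, r)
			}
		}
	}
	return out
}

// Drop hides a column by blanking its name in a copy of the column list.
func (m *memView) Drop(column string) view {
	cols := append([]string(nil), m.columns...)
	if i := m.indexOf(column); i >= 0 {
		cols[i] = ""
	}
	return &memView{columns: cols, rows: m.rows}
}

func main() {
	var v view = &memView{
		columns: []string{"outlook", "play"},
		rows:    [][]string{{"sunny", "no"}, {"rain", "yes"}, {"rain", "no"}},
	}
	sub := v.Select("outlook", "rain").Drop("outlook")
	fmt.Println(sub.Columns())
	sub.First()
	for row := sub.Next(); row != nil; row = sub.Next() {
		fmt.Println(row)
	}
}
```

Returning a new view from Select and Drop, rather than mutating the receiver, lets a recursive learner narrow the data at each level without disturbing the caller's view.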

func Read

func Read(reader io.Reader) (View, error)

Read reads CSV-conformant data from the given reader and returns a View on it.
