markov

package module
v1.0.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Sep 15, 2018 License: Apache-2.0 Imports: 10 Imported by: 3

README

Markov GoDoc

markov is a Markov chain implementation in Go.

It supports in-memory and disk based Markov chains.

Example

This example generates a nonsense sentence from sample text.

package main

import (
	"fmt"
	"math/rand"

	"github.com/pboyd/markov"
)

const text = `Listen, strange women lying in ponds distributing swords is no basis for a system of government. Supreme executive power derives from a mandate from the masses, not from some farcical aquatic ceremony.`

func main() {
	chain := markov.NewMemoryChain(0)

	// Feed each rune into the chain
	markov.Feed(chain, split(text))

	// Walk the chain with randomly but with weighted probabilities.
	walker := markov.RandomWalker(chain, 0)
	for {
		r, _ := walker.Next()
		fmt.Printf(string(r.(rune)))

		// Stop after the first period.
		if r == '.' {
			break
		}
	}
}

func split(text string) chan interface{} {
	runes := make(chan interface{})
	go func() {
		defer close(runes)

		// Start at the beginning of a word
		runes <- ' '

		for _, r := range text {
			runes <- r
		}
	}()
	return runes
}

For more in-depth examples see the cmd/markov-ngram and cmd/markov-walk programs.

License

This package is release under the terms of the Apache 2.0 license. See LICENSE.TXT.

Documentation

Overview

Package markov builds and traverses Markov chains

For background, see: https://en.wikipedia.org/wiki/Markov_chain

Example

This example generates a nonsense sentence from sample text.

package main

import (
	"fmt"
	"math/rand"

	"github.com/pboyd/markov"
)

const text = `Listen, strange women lying in ponds distributing swords is no basis for a system of government. Supreme executive power derives from a mandate from the masses, not from some farcical aquatic ceremony.`

// This example generates a nonsense sentence from sample text.
func main() {
	chain := markov.NewMemoryChain(0)

	// Feed each rune into the chain
	markov.Feed(chain, split(text))

	// Reset the random seed for consistent output.
	rand.Seed(0)

	// Walk the chain with randomly but with weighted probabilities.
	walker := markov.RandomWalker(chain, 0)
	for {
		r, _ := walker.Next()
		fmt.Printf(string(r.(rune)))

		// Stop after the first period.
		if r == '.' {
			break
		}
	}

}

func split(text string) chan interface{} {
	runes := make(chan interface{})
	go func() {
		defer close(runes)

		// Start at the beginning of a word
		runes <- ' '

		for _, r := range text {
			runes <- r
		}
	}()
	return runes
}
Output:

tera po me maswofove ndsting ng syinonds ssweny.

Index

Examples

Constants

This section is empty.

Variables

View Source
var (
	// ErrNotFound is returned by Find or Links when an item doesn't exist.
	ErrNotFound error = errors.New("markov: not found")

	// ErrBrokenChain is returned when the chain ends.
	ErrBrokenChain error = errors.New("markov: broken chain")
)

Functions

func Copy

func Copy(dest WriteChain, src Chain) error

Copy copies src to dest.

If the destination supports CopyFrom, it will be used.

func Feed

func Feed(wc WriteChain, channels ...<-chan interface{}) error

Feed reads values from the channels and writes them to the WriteChain.

Blocks until all the channels have been closed.

If the WriteChain returns an error Feed returns it immediately, leaving unread values on the channels.

func Random added in v1.0.1

func Random(chain Chain) (interface{}, error)

Random pseudo-randomly picks a value from the chain.

If the chain implements RandomChain, it's Random function will be used.

Types

type Chain

type Chain interface {
	// Get returns a value by it's ID. Returns nil if the ID doesn't exist.
	Get(id int) (interface{}, error)

	// Links returns the items linked to the given item.
	//
	// Returns ErrNotFound if the ID doesn't exist.
	Links(id int) ([]Link, error)

	// Find returns the ID for the given value.
	//
	// Returns ErrNotFound if the value doesn't exist.
	Find(value interface{}) (id int, err error)
}

Chain is a read-only Markov chain.

type CopyFrom

type CopyFrom interface {
	CopyFrom(src Chain) error
}

CopyFrom is optionally implemented by a WriteChain to improve performance over the generic Copy implementation.

type DiskChain

type DiskChain struct {
	// contains filtered or unexported fields
}

DiskChain is a read-only Chain implementation for file-based chains.

func ReadDiskChain

func ReadDiskChain(fh *os.File) (*DiskChain, error)

ReadDiskChain reads a chain from a file.

func (*DiskChain) Find

func (c *DiskChain) Find(value interface{}) (id int, err error)

Find returns the ID for the given value.

Returns ErrNotFound if the value doesn't exist.

func (*DiskChain) Get

func (c *DiskChain) Get(id int) (interface{}, error)

Get returns a value by it's ID. Returns nil if the ID doesn't exist.

func (c *DiskChain) Links(id int) ([]Link, error)

Links returns the items linked to the given item.

Returns ErrNotFound if the ID doesn't exist.

func (*DiskChain) Next

func (c *DiskChain) Next(id int) (int, error)

Next returns the id after the given id. Satisfies the IterativeChain interface.

func (*DiskChain) Random added in v1.0.1

func (c *DiskChain) Random() (interface{}, error)

Random pseudo-randomly picks a value and returns it. Satisfies the RandomChain interface.

type DiskChainWriter

type DiskChainWriter struct {
	// contains filtered or unexported fields
}

DiskChainWriter is a ReadWriteChain implementation for file-based chains.

The chain operates on-disk as much as possible, which makes it much slower than an in-memory chain. Building a disk chain by feeding values is particularly inefficient (but possible, and sometimes necessary). Building a MemoryChain and copying to a DiskChainWriter is likely faster.

Values can be strings, runes or any builtin numeric type.

func NewDiskChainWriter

func NewDiskChainWriter(file *os.File) (*DiskChainWriter, error)

NewDiskChainWriter creates a new DiskChainWriter. File must be writable. Any existing data in the file will be lost.

func OpenDiskChainWriter

func OpenDiskChainWriter(file *os.File) (*DiskChainWriter, error)

OpenDiskChainWriter reads an existing disk chain. If file is a read/write handle the disk chain can be updated.

func (*DiskChainWriter) Add

func (c *DiskChainWriter) Add(value interface{}) (int, error)

Add conditionally inserts a new value to the chain.

If the value exists it's ID is returned.

func (*DiskChainWriter) CopyFrom

func (c *DiskChainWriter) CopyFrom(src Chain) error

CopyFrom satisfies the CopyFrom interface. It's faster than the generic Copy algorithm implemented by Copy.

func (*DiskChainWriter) Find

func (c *DiskChainWriter) Find(value interface{}) (int, error)

Find returns the ID for the given value.

Returns ErrNotFound if the value doesn't exist.

func (*DiskChainWriter) Get

func (c *DiskChainWriter) Get(id int) (interface{}, error)

Get returns a value by it's ID. Returns nil if the ID doesn't exist.

func (c *DiskChainWriter) Links(id int) ([]Link, error)

Links returns the items linked to the given item.

Returns ErrNotFound if the ID doesn't exist.

func (*DiskChainWriter) Next

func (c *DiskChainWriter) Next(id int) (int, error)

Next returns the id after the given id. Satisfies the IterativeChain interface.

func (*DiskChainWriter) Random added in v1.0.1

func (c *DiskChainWriter) Random() (interface{}, error)

Random pseudo-randomly picks a value and returns it. Satisfies the RandomChain interface.

func (*DiskChainWriter) Relate

func (c *DiskChainWriter) Relate(parent, child int, delta int) error

Relate increases the number of times child occurs after parent.

type IterativeChain

type IterativeChain interface {
	Chain

	// Next returns the ID of the entry which follows the input ID.
	//
	// If the ID is at the end of the chain, ErrBrokenChain will be
	// returned.
	Next(id int) (int, error)
}

IterativeChain is a chain that can be iterated efficiently.

type Link struct {
	ID          int
	Probability float64
}

Link describes a child item.

type MemoryChain

type MemoryChain struct {
	// contains filtered or unexported fields
}

MemoryChain is a ReadWriteChain kept in memory.

func NewMemoryChain

func NewMemoryChain(capacity int) *MemoryChain

NewMemoryChain creates a new MemoryChain.

capacity is the number of items to initially allocate space for. If capacity is unknown, set it to 0.

func (*MemoryChain) Add

func (c *MemoryChain) Add(value interface{}) (int, error)

Add conditionally inserts a new value to the chain.

If the value exists it's ID is returned.

func (*MemoryChain) Find

func (c *MemoryChain) Find(value interface{}) (int, error)

Find returns the ID for the given value.

Returns ErrNotFound if the value doesn't exist.

func (*MemoryChain) Get

func (c *MemoryChain) Get(id int) (interface{}, error)

Get returns a value by it's ID. Returns nil if the ID doesn't exist.

func (c *MemoryChain) Links(id int) ([]Link, error)

Links returns the items linked to the given item.

Returns ErrNotFound if the ID doesn't exist.

func (*MemoryChain) Next

func (c *MemoryChain) Next(last int) (int, error)

Next returns the id after the given id. Satisfies the IterativeChain interface.

func (*MemoryChain) Random added in v1.0.1

func (c *MemoryChain) Random() (interface{}, error)

Random pseudo-randomly picks a value and returns it. Satisfies the RandomChain interface.

func (*MemoryChain) Relate

func (c *MemoryChain) Relate(parent, child int, delta int) error

Relate increases the number of times child occurs after parent.

type RandomChain added in v1.0.1

type RandomChain interface {
	Random() (interface{}, error)
}

RandomChain is a chain that can return a random value.

type ReadWriteChain

type ReadWriteChain interface {
	Chain
	WriteChain
}

ReadWriteChain is a chain that supports reading and writing.

type Walker

type Walker interface {
	// Next returns the next value in the chain.
	//
	// If the chain has no further nodes, Next() returns ErrBrokenChain.
	Next() (interface{}, error)
}

Walker walks through the chain.

func IterativeWalker

func IterativeWalker(chain Chain) Walker

IterativeWalker returns a Walker that traverses every item in the Chain.

If the chain implements IterativeChain, it will be used.

func RandomWalker

func RandomWalker(chain Chain, startID int) Walker

RandomWalker traverses a chain. Items are chosen randomly, but each possible item is weighted by it's probability.

type WriteChain

type WriteChain interface {
	// Add conditionally inserts a new value to the chain.
	//
	// If the value exists it's ID is returned.
	Add(value interface{}) (id int, err error)

	// Relate increases the number of times child occurs after parent.
	Relate(parent, child int, delta int) error
}

WriteChain is a Markov chain that can only be written.

Directories

Path Synopsis
cmd
markov-ngram
markov-ngram builds a Markov chain from a text file by splitting it into N-grams (words, bigrams, trigrams, etc.) It is included as an example, not as a useful program.
markov-ngram builds a Markov chain from a text file by splitting it into N-grams (words, bigrams, trigrams, etc.) It is included as an example, not as a useful program.
markov-optimize
markov-optimize optimizes a chain file for reading.
markov-optimize optimizes a chain file for reading.
markov-walk
markov-walk reads a Markov chain from disk and outputs items proportionally.
markov-walk reads a Markov chain from disk and outputs items proportionally.
internal

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL