Documentation
Overview
Package randtxt contains a random text generator.
Variables
var PennTreebankTagSet = pennTreebankTagSet{}
PennTreebankTagSet is a TagSet for the English Penn Treebank tagset, as used by the Stanford POS tagger.
More details:
https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html
https://nlp.stanford.edu/software/tagger.shtml
Types
type Generator
type Generator struct {
	// TagSet is the language and tagset specific rules. This should match
	// the TagSet used when the model was built.
	TagSet TagSet
	// contains filtered or unexported fields
}
Generator generates random text from a model built by ModelBuilder.
func NewGenerator
NewGenerator returns a new Generator. It returns an error if the chain has an unrecognized format.
type ModelBuilder
type ModelBuilder struct {
	TagSet TagSet
	// contains filtered or unexported fields
}
ModelBuilder builds a model that Generator can use.
func NewModelBuilder
func NewModelBuilder(chain markov.WriteChain, ngramSize int) *ModelBuilder
NewModelBuilder creates a ModelBuilder instance.
The model will be written to "chain".
ngramSize is the number of words to include in each ngram. It must be greater than 1.
See cmd/readtsv for an example.
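To make the ngramSize parameter concrete, the following sketch lists every run of n consecutive words in a short sentence. The ngrams helper is hypothetical and is not part of randtxt. How the package stores ngrams in the chain is not described here, so this only illustrates what "the number of words in each ngram" means.

```go
package main

import "fmt"

// ngrams returns every run of n consecutive words in words.
// It is an illustrative helper, not part of randtxt. It mirrors the
// documented constraint by returning nil when n is not greater than 1.
func ngrams(words []string, n int) [][]string {
	if n < 2 || len(words) < n {
		return nil
	}
	out := make([][]string, 0, len(words)-n+1)
	for i := 0; i+n <= len(words); i++ {
		out = append(out, words[i:i+n])
	}
	return out
}

func main() {
	words := []string{"the", "quick", "brown", "fox"}
	// With n = 2 the windows are: [the quick], [quick brown], [brown fox].
	for _, g := range ngrams(words, 2) {
		fmt.Println(g)
	}
}
```

A larger n gives each window more context, at the cost of needing more input text before windows repeat.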
func (*ModelBuilder) Feed
func (b *ModelBuilder) Feed(sources ...<-chan Tag) error
Feed reads tags from one or more channels and writes them to the output chain.
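Because Feed takes a variadic list of receive-only channels, each input is typically produced by a goroutine that sends tags and then closes its channel. The sketch below shows one way a caller might build such sources. It uses a local stand-in Tag type whose Text and POS fields are assumptions, and a hypothetical collect function that only mimics the parameter shape of Feed. Whether Feed reads its sources in order or concurrently is not stated in this documentation.

```go
package main

import "fmt"

// Tag is a local stand-in for the package's Tag type. The Text and POS
// fields are assumptions made for this sketch.
type Tag struct {
	Text string
	POS  string
}

// tagSource sends each tag on a new channel and closes it when done.
func tagSource(tags []Tag) <-chan Tag {
	ch := make(chan Tag)
	go func() {
		defer close(ch)
		for _, t := range tags {
			ch <- t
		}
	}()
	return ch
}

// collect is a hypothetical consumer with the same parameter shape as
// Feed. It drains the sources one after another, which is a choice made
// for this demonstration only.
func collect(sources ...<-chan Tag) []Tag {
	var out []Tag
	for _, src := range sources {
		for t := range src {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	a := tagSource([]Tag{{"The", "DT"}, {"dog", "NN"}})
	b := tagSource([]Tag{{"barks", "VBZ"}, {".", "."}})
	for _, t := range collect(a, b) {
		fmt.Printf("%s/%s\n", t.Text, t.POS)
	}
}
```

Closing each channel when its input is exhausted is what lets a range loop over the channel terminate.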
type TagSet
type TagSet interface {
	// Join returns the text from "tag" prepended with the separator that
	// should be between "prev" and "tag".
	//
	// "prev" is the zero tag at the beginning of the text.
	Join(tag, prev Tag) string
	// Normalize converts "tag" to a consistent form. If the returned tag
	// text is blank the tag is ignored.
	Normalize(tag, prev Tag) Tag
}
TagSet contains code specific to a language and tagset.
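As an illustration of the two methods, here is a minimal sketch of a custom TagSet. It redeclares Tag and TagSet locally so that it compiles on its own. The Tag fields (Text and POS) are assumptions, and simpleTagSet and render are hypothetical names that are not part of randtxt. The rules shown (no space before "." or ",", lowercase everything except proper nouns) are simplifications chosen for the example, not the behavior of PennTreebankTagSet.

```go
package main

import (
	"fmt"
	"strings"
)

// Tag is a local stand-in; the Text and POS fields are assumptions.
type Tag struct {
	Text string
	POS  string
}

// TagSet repeats the interface from the documentation above.
type TagSet interface {
	Join(tag, prev Tag) string
	Normalize(tag, prev Tag) Tag
}

// simpleTagSet is a hypothetical TagSet with deliberately simple rules.
type simpleTagSet struct{}

// Join omits the separator at the start of the text (prev is the zero
// tag) and before "." or "," tags; otherwise it prepends one space.
func (simpleTagSet) Join(tag, prev Tag) string {
	if prev == (Tag{}) || tag.POS == "." || tag.POS == "," {
		return tag.Text
	}
	return " " + tag.Text
}

// Normalize trims whitespace and lowercases everything except proper
// nouns (POS starting with "NNP"). Blank text signals "ignore this tag".
func (simpleTagSet) Normalize(tag, prev Tag) Tag {
	tag.Text = strings.TrimSpace(tag.Text)
	if !strings.HasPrefix(tag.POS, "NNP") {
		tag.Text = strings.ToLower(tag.Text)
	}
	return tag
}

// render normalizes and joins tags the way the interface comments
// describe: blank tags are skipped, prev starts as the zero tag.
func render(ts TagSet, tags []Tag) string {
	var sb strings.Builder
	var prev Tag
	for _, t := range tags {
		t = ts.Normalize(t, prev)
		if t.Text == "" {
			continue
		}
		sb.WriteString(ts.Join(t, prev))
		prev = t
	}
	return sb.String()
}

func main() {
	tags := []Tag{{" The ", "DT"}, {"Dog", "NN"}, {"barks", "VBZ"}, {".", "."}}
	fmt.Println(render(simpleTagSet{}, tags))
}
```

Keeping separator logic in Join lets the same model produce correctly spaced text for tagsets with different punctuation conventions.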