Package chunk implements functions for finding useful chunks in text that has previously been tagged with parts of speech.
    txt := "Go is a open source programming language created at Google."
    words := tokenize.TextToWords(txt)
    tagger := tag.NewPerceptronTagger()
    fmt.Println(Chunk(tagger.Tag(words), TreebankNamedEntities))
var TreebankNamedEntities = regexp.MustCompile(
	`((CD__)*(NNP.)+(CD__|NNP.)*)+` +
		`((IN__)*(CD__)*(NNP.)+(CD__|NNP.)*)*`)
TreebankNamedEntities matches proper names, excluding any preceding adjectives. A match may include numbers and a linking preposition or subordinating conjunction (for example, "Bank of England").
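The `CD__` and `IN__` literals in the pattern suggest that it is meant to run over a string of part-of-speech tags, each padded with underscores to four characters, rather than over the text itself. The sketch below illustrates that reading using only the standard library. The `padTags` helper and the `namedEntities` variable name are hypothetical and introduced here for illustration; the padding scheme is an assumption inferred from the pattern, not a documented part of the package.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// namedEntities is a copy of the TreebankNamedEntities pattern.
var namedEntities = regexp.MustCompile(
	`((CD__)*(NNP.)+(CD__|NNP.)*)+` +
		`((IN__)*(CD__)*(NNP.)+(CD__|NNP.)*)*`)

// padTags is a hypothetical helper (not part of the package API). It pads
// each tag with underscores, or truncates it, to exactly four characters
// and concatenates the results. The four-character width is an assumption.
func padTags(tags []string) string {
	var b strings.Builder
	for _, t := range tags {
		if len(t) > 4 {
			t = t[:4]
		}
		b.WriteString(t)
		b.WriteString(strings.Repeat("_", 4-len(t)))
	}
	return b.String()
}

func main() {
	// Penn Treebank tags for "the Bank of England".
	s := padTags([]string{"DT", "NNP", "IN", "NNP"})
	fmt.Println(s)                           // DT__NNP_IN__NNP_
	fmt.Println(namedEntities.FindString(s)) // NNP_IN__NNP_
}
```

The determiner `DT` is skipped, while the two proper nouns and the preposition between them are matched as a single span, which corresponds to "Bank of England".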
Chunk returns a slice containing the chunks of interest according to the regexp.
This is a convenience wrapper around Locate, which should be used if you need access to the in-text locations of each chunk.
Locate finds the chunks of interest according to the regexp.
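To make the relationship between Chunk and Locate concrete, here is a minimal, standard-library-only sketch of one way the two could fit together. It is not the package's implementation: the `token` type, the lowercase `locate`, `chunk`, and `quads` functions, the four-character tag padding, and the choice to report locations as half-open token index ranges are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// token is a hypothetical stand-in for a tagged word.
type token struct {
	Text string
	Tag  string
}

// namedEntities is a copy of the TreebankNamedEntities pattern.
var namedEntities = regexp.MustCompile(
	`((CD__)*(NNP.)+(CD__|NNP.)*)+` +
		`((IN__)*(CD__)*(NNP.)+(CD__|NNP.)*)*`)

// quads concatenates the tags, each padded or truncated to four characters
// (an assumed encoding inferred from the pattern).
func quads(tagged []token) string {
	var b strings.Builder
	for _, t := range tagged {
		tag := t.Tag
		if len(tag) > 4 {
			tag = tag[:4]
		}
		b.WriteString(tag + strings.Repeat("_", 4-len(tag)))
	}
	return b.String()
}

// locate returns half-open [start, end) token index pairs, one per match.
func locate(tagged []token, rx *regexp.Regexp) [][]int {
	rx.Longest() // prefer the longest match at each starting position
	spans := rx.FindAllStringIndex(quads(tagged), -1)
	for _, s := range spans {
		s[0] /= 4 // byte offsets are four times the token index
		s[1] /= 4
	}
	return spans
}

// chunk joins the text of the tokens inside each located span.
func chunk(tagged []token, rx *regexp.Regexp) []string {
	var out []string
	for _, s := range locate(tagged, rx) {
		words := make([]string, 0, s[1]-s[0])
		for _, t := range tagged[s[0]:s[1]] {
			words = append(words, t.Text)
		}
		out = append(out, strings.Join(words, " "))
	}
	return out
}

func main() {
	// Hand-tagged input; a real tagger would supply these tags.
	tagged := []token{
		{"Go", "NNP"}, {"is", "VBZ"}, {"a", "DT"}, {"language", "NN"},
		{"created", "VBN"}, {"at", "IN"}, {"Google", "NNP"}, {".", "."},
	}
	fmt.Println(locate(tagged, namedEntities)) // [[0 1] [6 7]]
	fmt.Println(chunk(tagged, namedEntities))  // [Go Google]
}
```

Keeping the index computation in `locate` and the string assembly in `chunk` mirrors the split described above: callers that only want the matched text use the wrapper, while callers that need positions work with the spans directly.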