snowballstem

v0.9.0
Published: Apr 4, 2020 | License: BSD-3-Clause | Imports: 5 | Imported by: 440

README

snowballstem

This repository contains the Go stemmers generated by the Snowball project. They are maintained outside of the core bleve package so that they can be more easily reused in other contexts.

Usage

Each stemmer package exports a single Stem() function, which operates on a snowball Env structure. The Env structure maintains all state for the stemmer. A new Env is created with NewEnv(), pointed at an initial string. After stemming, the result of the Stem() operation can be retrieved with the Current() method. The same Env can be reused for subsequent words by calling the SetCurrent() method.

Example

package main

import (
	"fmt"

	"github.com/blevesearch/snowballstem"
	"github.com/blevesearch/snowballstem/english"
)

func main() {

	// words to stem
	words := []string{
		"running",
		"jumping",
	}

	// build new environment
	env := snowballstem.NewEnv("")

	for _, word := range words {
		// set up environment for word
		env.SetCurrent(word)
		// invoke stemmer
		english.Stem(env)
		// print results
		fmt.Printf("%s stemmed to %s\n", word, env.Current())
	}
}

This produces the following output:

$ ./snowtest
running stemmed to run
jumping stemmed to jump

Testing

The test harness for these stemmers is hosted in the main Snowball repository. There are functional tests built around the separate snowballstem-data repository, and there is support for fuzz-testing the stemmers there as well.

Generating the Stemmers

$ export SNOWBALL=/path/to/github.com/snowballstem/snowball/after/snowball/built
$ go generate

Updating the Go Generate Commands

A simple tool is provided to regenerate the go generate commands from the Snowball algorithms directory:

$ go run gengen.go /path/to/github.com/snowballstem/snowball/algorithms

Documentation

Constants

const MaxInt = math.MaxInt32
const MinInt = math.MinInt32

Functions

func RuneCountInString

func RuneCountInString(str string) int

RuneCountInString is a wrapper around utf8.RuneCountInString. It lets the generated stemmers avoid conditionally importing the utf8 package in some stemmers and not in others.
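As a rough illustration of why a rune count is needed at all, the sketch below compares byte length with rune count for an ASCII word and a non-ASCII one. The helper name runeCount is hypothetical and only mirrors the documented behavior (delegating to utf8.RuneCountInString); it is not the package's own source.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeCount is a hypothetical stand-in for the documented wrapper:
// it simply delegates to the standard library.
func runeCount(s string) int {
	return utf8.RuneCountInString(s)
}

func main() {
	for _, w := range []string{"running", "größer"} {
		// len reports bytes; runeCount reports Unicode code points.
		fmt.Printf("%q: %d bytes, %d runes\n", w, len(w), runeCount(w))
	}
}
```

For "running" both counts are 7; for "größer" the byte length is 8 while the rune count is 6, because ö and ß each take two bytes in UTF-8.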

Types

type Among

type Among struct {
	Str string
	A   int32
	B   int32
	F   AmongF
}

func (*Among) String

func (a *Among) String() string

type AmongF

type AmongF func(env *Env, ctx interface{}) bool

type Env

type Env struct {
	Cursor        int
	Limit         int
	LimitBackward int
	Bra           int
	Ket           int
	// contains filtered or unexported fields
}

Env represents the Snowball execution environment
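The exported fields are not described on this page. The sketch below shows one plausible reading, assuming the usual Snowball convention that Bra and Ket are byte offsets marking the start and end of the slice currently being edited. The miniEnv type and its sliceFrom and stripSuffix helpers are hypothetical and written only for illustration; they are not the package's implementation, which also tracks Cursor, Limit, and LimitBackward.

```go
package main

import (
	"fmt"
	"strings"
)

// miniEnv is a hypothetical, simplified stand-in for Env.
type miniEnv struct {
	current string
	Bra     int // assumed: byte offset where the marked slice starts
	Ket     int // assumed: byte offset just past the marked slice
}

// sliceFrom replaces current[Bra:Ket] with s and moves Ket to the
// end of the replacement.
func (e *miniEnv) sliceFrom(s string) {
	e.current = e.current[:e.Bra] + s + e.current[e.Ket:]
	e.Ket = e.Bra + len(s)
}

// stripSuffix marks a trailing suffix as the slice and deletes it,
// reporting whether the suffix was present.
func stripSuffix(word, suffix string) (string, bool) {
	if !strings.HasSuffix(word, suffix) {
		return word, false
	}
	e := &miniEnv{current: word, Bra: len(word) - len(suffix), Ket: len(word)}
	e.sliceFrom("")
	return e.current, true
}

func main() {
	out, ok := stripSuffix("jumping", "ing")
	fmt.Println(out, ok)
}
```

Under these assumptions the program prints "jump true". A real stemmer applies many such slice edits in sequence, which is why all of the state lives in a single structure.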

func NewEnv

func NewEnv(val string) *Env

NewEnv creates a new Snowball execution environment on the provided string

func (*Env) AssignTo

func (env *Env) AssignTo() string

func (*Env) ByteIndexForHop

func (env *Env) ByteIndexForHop(delta int32) int32

func (*Env) Clone

func (env *Env) Clone() *Env

func (*Env) Current

func (env *Env) Current() string

func (*Env) Debug

func (env *Env) Debug(count, lineNumber int)

func (*Env) EqS

func (env *Env) EqS(s string) bool

func (*Env) EqSB

func (env *Env) EqSB(s string) bool

func (*Env) FindAmong

func (env *Env) FindAmong(amongs []*Among, ctx interface{}) int32

func (*Env) FindAmongB

func (env *Env) FindAmongB(amongs []*Among, ctx interface{}) int32

func (*Env) InGrouping

func (env *Env) InGrouping(chars []byte, min, max int32) bool

func (*Env) InGroupingB

func (env *Env) InGroupingB(chars []byte, min, max int32) bool

func (*Env) Insert

func (env *Env) Insert(bra, ket int, s string)

func (*Env) NextChar

func (env *Env) NextChar()

func (*Env) OutGrouping

func (env *Env) OutGrouping(chars []byte, min, max int32) bool

func (*Env) OutGroupingB

func (env *Env) OutGroupingB(chars []byte, min, max int32) bool

func (*Env) PrevChar

func (env *Env) PrevChar()

func (*Env) ReplaceS

func (env *Env) ReplaceS(bra, ket int, s string) int32

func (*Env) SetCurrent

func (env *Env) SetCurrent(s string)

func (*Env) SliceDel

func (env *Env) SliceDel() bool

func (*Env) SliceFrom

func (env *Env) SliceFrom(s string) bool

func (*Env) SliceTo

func (env *Env) SliceTo() string
