concordance

package
v0.0.4
Warning: This package is not in the latest version of its module.
Published: Mar 22, 2024 | License: GPL-3.0 | Imports: 5 | Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Line (added in v0.0.2)

type Line struct {

	// Text contains positional text data (= tokens)
	Text TokenSlice `json:"text"`

	// Ref contains structural metadata related to the line
	Ref string `json:"ref"`

	// ErrMsg is an error message in case problems occurred
	// with parsing related to the line. The policy here is
	// to always return a line with value replaced by a placeholder
	// in case of an error.
	ErrMsg string `json:"errMsg,omitempty"`
}
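
The struct tags above determine how a Line is serialized to JSON. The following is a minimal sketch of that encoding. It redeclares the documented types locally so that it compiles on its own (real code would import the package instead); the helper encodeLine and the sample values are hypothetical. Note that errMsg is left out of the output when empty because of omitempty.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Token, TokenSlice and Line mirror the documented types so that the
// sketch is self-contained. They are local copies, not the package's own.
type Token struct {
	Word   string            `json:"word"`
	Strong bool              `json:"strong"`
	Attrs  map[string]string `json:"attrs"`
	ErrMsg string            `json:"errMsg,omitempty"`
}

type TokenSlice []*Token

type Line struct {
	Text   TokenSlice `json:"text"`
	Ref    string     `json:"ref"`
	ErrMsg string     `json:"errMsg,omitempty"`
}

// encodeLine is a hypothetical helper that returns the JSON form of a Line.
func encodeLine(l Line) (string, error) {
	b, err := json.Marshal(l)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// A line without problems: no "errMsg" key appears in the output.
	ok := Line{
		Text: TokenSlice{{Word: "dog", Strong: true, Attrs: map[string]string{"lemma": "dog"}}},
		Ref:  "doc1", // sample value, chosen for illustration only
	}
	// A line with a parsing problem: "errMsg" is included.
	bad := Line{Text: TokenSlice{}, Ref: "doc2", ErrMsg: "failed to parse line"}

	for _, l := range []Line{ok, bad} {
		s, err := encodeLine(l)
		if err != nil {
			panic(err)
		}
		fmt.Println(s)
	}
}
```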

type LineParser

type LineParser struct {
	// contains filtered or unexported fields
}

LineParser parses Manatee-encoded concordance lines and converts them into (more structured) MQuery format.

func NewLineParser

func NewLineParser(attrs []string) *LineParser

NewLineParser is a recommended factory function to instantiate a `LineParser` value.

func (*LineParser) Parse

func (lp *LineParser) Parse(lines []string) []Line

Parse converts the provided Manatee-encoded concordance lines into Line values. It also escapes strings to make them usable in XML documents.
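
The escaping routine used by Parse is not shown in this documentation. As an illustration of what making a string usable in an XML document involves, here is a minimal sketch that uses the standard library's encoding/xml.EscapeText as a stand-in. The helper name escapeForXML is hypothetical, and the package's own escaping may differ in detail.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
)

// escapeForXML is a hypothetical helper that replaces characters with a
// special meaning in XML (such as <, > and &) with character references.
func escapeForXML(s string) (string, error) {
	var buf bytes.Buffer
	if err := xml.EscapeText(&buf, []byte(s)); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := escapeForXML("a < b & c")
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // prints "a &lt; b &amp; c"
}
```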

type Token

type Token struct {
	Word string `json:"word"`

	// Strong is a general flag for emphasizing the token
	Strong bool `json:"strong"`

	// Attrs store additional attributes (e.g. PoS, lemma, syntax node parent)
	// of a respective position.
	Attrs map[string]string `json:"attrs"`

	// ErrMsg is an error message in case problems occurred
	// with parsing related to the token. The policy here is
	// to always return a token with value replaced by a placeholder
	// in case of an error.
	ErrMsg string `json:"errMsg,omitempty"`
}

Token is a single text position in a corpus text.

func (*Token) HasError

func (t *Token) HasError() bool
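
HasError has no description here. Given the ErrMsg policy described for Token, a plausible reading is that it reports whether ErrMsg is non-empty. The sketch below is written under that assumption, with local copies of the types and a hypothetical countErrors helper. It shows how a caller might detect placeholder tokens in a parsed line.

```go
package main

import "fmt"

// Token and TokenSlice are local copies of the documented types.
type Token struct {
	Word   string
	Strong bool
	Attrs  map[string]string
	ErrMsg string
}

type TokenSlice []*Token

// HasError is an assumed implementation: it treats a non-empty ErrMsg
// as the sign of a parsing problem. The real method may differ.
func (t *Token) HasError() bool {
	return t.ErrMsg != ""
}

// countErrors is a hypothetical helper that counts tokens flagged as erroneous.
func countErrors(ts TokenSlice) int {
	n := 0
	for _, t := range ts {
		if t != nil && t.HasError() {
			n++
		}
	}
	return n
}

func main() {
	ts := TokenSlice{
		{Word: "dog"},
		{Word: "?", ErrMsg: "invalid token"}, // placeholder value with an error message
	}
	fmt.Println(countErrors(ts)) // prints 1
}
```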

type TokenSlice

type TokenSlice []*Token
