Documentation ¶
Overview ¶
Package lex contains base types to tokenize text for the linter. See subpackages for specific implementations. Folx uses the GLexer in package ggl by default.
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Document ¶
type Document struct {
	// IANA name for the content encoding
	Encoding string
	// Text to be analysed by the Lexer
	Content string
}
Document describes the content for lexical analysis
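A Document pairs the raw text with its encoding. A minimal sketch of constructing one (the `lex.` qualifier is implied; a stand-in struct is used here so the snippet is self-contained):

```go
package main

import "fmt"

// Document is a local stand-in mirroring the struct documented above.
type Document struct {
	Encoding string
	Content  string
}

func main() {
	doc := Document{
		Encoding: "utf-8", // IANA name for the content encoding
		Content:  "The quick brown fox jumps over the lazy dog.",
	}
	fmt.Println(doc.Encoding, len(doc.Content))
}
```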
type EntityType ¶
type EntityType int
EntityType describes the type of entity represented by the token. TODO: add further entities as required e.g. Organisation.
const (
	// UnknownEntity identifies a token with no entity.
	UnknownEntity EntityType = iota
	// Person identifies a person entity.
	Person
)
type Lexer ¶
type Lexer interface {
	Init(context.Context, *Document) error
	Next() (*Token, error)
	GetExecTime() time.Duration
	GetDocument() *Document
}
Lexer performs a lexical analysis on a source text to return tokens.
type PartOfSpeechType ¶
type PartOfSpeechType int
PartOfSpeechType describes the lexical tag of a word. TODO: add further tags as required e.g. Verb.
const (
	// UnknownPOS identifies a token with no part of speech tag.
	UnknownPOS PartOfSpeechType = iota
	// Noun identifies a word token that is a noun.
	Noun
	// Adjective identifies a word token that is an adjective.
	Adjective
)
type Token ¶
type Token struct {
	Type         TokenType
	PartOfSpeech PartOfSpeechType
	Entity       EntityType
	Offset       int
	Text         string
	Lemma        string
	Sentence     Sentence
	Adjectives   []*Token
}
Token describes a discrete lexical element in a source text, returned by Lexer.Next.
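The Adjectives field links a noun token to the adjective tokens that modify it, which a linter rule can inspect directly. A hedged sketch, using local stand-ins for the documented types (TokenType, EntityType, and Sentence are omitted for brevity, and `adjectiveCount` is a hypothetical helper, not part of the package):

```go
package main

import "fmt"

// PartOfSpeechType mirrors the enum documented above.
type PartOfSpeechType int

const (
	UnknownPOS PartOfSpeechType = iota
	Noun
	Adjective
)

// Token is a simplified stand-in for the documented struct.
type Token struct {
	PartOfSpeech PartOfSpeechType
	Offset       int
	Text         string
	Lemma        string
	Adjectives   []*Token
}

// adjectiveCount is a hypothetical linter helper: a rule might flag
// nouns modified by too many adjectives.
func adjectiveCount(t *Token) int {
	if t.PartOfSpeech != Noun {
		return 0
	}
	return len(t.Adjectives)
}

func main() {
	big := &Token{PartOfSpeech: Adjective, Text: "big", Lemma: "big"}
	red := &Token{PartOfSpeech: Adjective, Text: "red", Lemma: "red"}
	dogs := &Token{
		PartOfSpeech: Noun,
		Text:         "dogs",
		Lemma:        "dog",
		Offset:       8,
		Adjectives:   []*Token{big, red},
	}
	fmt.Println(adjectiveCount(dogs)) // 2
}
```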