tokenize

package

Version: v0.0.1 (latest) Published: Nov 13, 2023 License: BSD-2-Clause Imports: 5 Imported by: 0

Repository

github.com/Jumpaku/sqanner

Documentation ¶

Index ¶

func IsDecimalDigit(r rune) bool
func IsKeyword(r []rune) bool
type ScanState
type Token
- func Tokenize(input []rune) ([]Token, error)
- func (t Token) IsValid() bool
type TokenKind
- func (i TokenKind) String() string
type TokenScanner
- func (s *TokenScanner) Init(input []rune)
- func (s *TokenScanner) ScanNext() (Token, error)

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func IsDecimalDigit ¶

func IsDecimalDigit(r rune) bool

func IsKeyword ¶

func IsKeyword(r []rune) bool

Types ¶

type ScanState ¶

type ScanState struct {
	Input  []rune
	Cursor int
}

ScanState represents a state for scanning and processing a sequence of runes. It contains the Input slice, which holds the runes to be scanned, and the Cursor indicating the current position in the Input slice.

func (ScanState) CountWhile ¶

func (s ScanState) CountWhile(begin int, satisfy func(rune) bool) int

CountWhile counts the consecutive runes that satisfy the given function 'satisfy', starting at offset 'begin' relative to the current Cursor position. Counting stops at the first rune that does not satisfy the condition. 'begin' must be in [0, s.Len()).
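
The counting rule can be sketched as follows. This is a minimal illustration, not the package's source: `scanState`, `countWhile`, and `isDigit` are local stand-ins written for this example, and reading 'begin' as an offset from Cursor is taken from the description above.

```go
package main

import "fmt"

// scanState is a local stand-in for the documented ScanState
// (hypothetical; it is not the package's type).
type scanState struct {
	Input  []rune
	Cursor int
}

// Len returns the number of runes remaining from Cursor.
func (s scanState) Len() int { return len(s.Input) - s.Cursor }

// countWhile counts consecutive runes that satisfy the predicate, starting at
// offset begin relative to Cursor, and stops at the first rune that fails.
func (s scanState) countWhile(begin int, satisfy func(rune) bool) int {
	n := 0
	for i := begin; i < s.Len() && satisfy(s.Input[s.Cursor+i]); i++ {
		n++
	}
	return n
}

// isDigit reports whether r is an ASCII decimal digit.
func isDigit(r rune) bool { return '0' <= r && r <= '9' }

func main() {
	// Cursor sits on the '1' of "123".
	s := scanState{Input: []rune("x=123;"), Cursor: 2}
	fmt.Println(s.countWhile(0, isDigit)) // counts '1', '2', '3'
}
```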

func (ScanState) FindFirst ¶

func (s ScanState) FindFirst(begin int, patternSize int, pattern func([]rune) bool) (int, bool)

FindFirst searches for the first occurrence of a pattern of 'patternSize' runes, as recognized by the 'pattern' function, starting at offset 'begin' relative to the current Cursor position. It returns the index of the first occurrence and a boolean indicating whether the pattern was found. If the pattern is not found, it returns 's.Len()' and 'false'. 'begin' must be in [0, s.Len()).
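
A sliding-window search of this kind might look like the sketch below. The names `scanState`, `findFirst`, and `isBlockEnd` are hypothetical, and the assumption that the returned index is relative to Cursor is inferred from the "not found" value of 's.Len()', not confirmed by the source.

```go
package main

import "fmt"

// scanState is a local stand-in for the documented ScanState
// (hypothetical; it is not the package's type).
type scanState struct {
	Input  []rune
	Cursor int
}

// findFirst slides a window of patternSize runes over the input remaining
// after Cursor, starting at offset begin, and returns the first offset at
// which pattern reports a match. Without a match it returns the remaining
// length and false.
func (s scanState) findFirst(begin, patternSize int, pattern func([]rune) bool) (int, bool) {
	rest := s.Input[s.Cursor:]
	for i := begin; i+patternSize <= len(rest); i++ {
		if pattern(rest[i : i+patternSize]) {
			return i, true
		}
	}
	return len(rest), false
}

// isBlockEnd reports whether the window is the two-rune sequence "*/".
func isBlockEnd(w []rune) bool {
	return len(w) == 2 && w[0] == '*' && w[1] == '/'
}

func main() {
	s := scanState{Input: []rune("/* c */ x")}
	// Skip the opening "/*" by starting at offset 2.
	idx, ok := s.findFirst(2, 2, isBlockEnd)
	fmt.Println(idx, ok)
}
```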

func (ScanState) Len ¶

func (s ScanState) Len() int

Len returns the remaining number of runes in the Input slice from the current Cursor position.

func (ScanState) PeekAt ¶

func (s ScanState) PeekAt(offset int) rune

PeekAt returns the rune at a given relative position 'offset' from the current Cursor position. 'offset' must be in [0, s.Len()).

func (ScanState) PeekSlice ¶

func (s ScanState) PeekSlice(begin int, endExclusive int) []rune

PeekSlice returns the runes from 'begin' up to, but not including, 'endExclusive', both given as offsets relative to the current Cursor position. 'begin' must be in [0, s.Len()). 'endExclusive' must be in [0, s.Len()].
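
Len, PeekAt, and PeekSlice all address the input relative to Cursor. The sketch below illustrates that addressing with local stand-ins (`scanState`, `length`, `peekAt`, `peekSlice` are hypothetical names, not the package's code) and performs no bounds checking beyond what Go slices already enforce.

```go
package main

import "fmt"

// scanState is a local stand-in for the documented ScanState
// (hypothetical; it is not the package's type).
type scanState struct {
	Input  []rune
	Cursor int
}

// length returns the number of runes remaining from Cursor.
func (s scanState) length() int { return len(s.Input) - s.Cursor }

// peekAt returns the rune at the given offset from Cursor.
func (s scanState) peekAt(offset int) rune { return s.Input[s.Cursor+offset] }

// peekSlice returns the runes in [begin, endExclusive), both relative to Cursor.
func (s scanState) peekSlice(begin, endExclusive int) []rune {
	return s.Input[s.Cursor+begin : s.Cursor+endExclusive]
}

func main() {
	s := scanState{Input: []rune("SELECT 1"), Cursor: 7}
	fmt.Println(s.length(), string(s.peekAt(0))) // one rune left: '1'

	s.Cursor = 0
	fmt.Println(string(s.peekSlice(0, 6))) // the first six runes
}
```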

type Token ¶

type Token struct {
	Kind    TokenKind
	Content []rune

	Begin  int
	End    int
	Line   int
	Column int
}

Token represents a single token identified by the token scanner, together with its type and its location in the source. A valid token has its Kind set to a specific type (not TokenUnspecified). It contains the following fields:
- Kind: The TokenKind representing the type of the token.
- Content: The content or value of the token.
- Begin: The starting position (index) of the token in the input sequence.
- End: The ending position (index) of the token in the input sequence.
- Line: The line number where the token starts in the input source.
- Column: The column number where the token starts in the input source.

func Tokenize ¶

func Tokenize(input []rune) ([]Token, error)

Tokenize returns a slice of Token representing the tokens identified in the input sequence. If tokenization fails, it returns an error describing the failure.

func (Token) IsValid ¶

func (t Token) IsValid() bool

IsValid checks if the token is valid, i.e., its TokenKind is not TokenUnspecified. It returns true if the token is valid and false otherwise.

type TokenKind ¶

type TokenKind int

TokenKind represents the type of token identified by the token scanner.

const (
	// TokenUnspecified represents an unspecified or unknown token.
	TokenUnspecified TokenKind = iota
	// TokenEOF represents the end of the file (EOF) token.
	TokenEOF
	// TokenSpace represents a space token.
	TokenSpace
	// TokenComment represents a comment token.
	TokenComment
	// TokenIdentifier represents an identifier token.
	TokenIdentifier
	// TokenIdentifierQuoted represents a quoted identifier token.
	TokenIdentifierQuoted
	// TokenLiteralQuoted represents a quoted literal (string) token.
	TokenLiteralQuoted
	// TokenLiteralInteger represents an integer literal token.
	TokenLiteralInteger
	// TokenLiteralFloat represents a floating-point literal token.
	TokenLiteralFloat
	// TokenKeyword represents a keyword token.
	TokenKeyword
	// TokenSpecialChar represents a special character token.
	TokenSpecialChar
)

func Comment ¶

func Comment(s *ScanState) (int, TokenKind, error)

Comment scans the input sequence represented by the ScanState 's' to identify and handle comments. It returns the count of runes in the scanned comment token and the corresponding TokenKind.
- If no comment is found at the current Cursor position, it returns 0 for the count and TokenUnspecified for the TokenKind.
- If the comment starts with '#', '//', or '--', it extends to the end of the line, and the function returns the count of runes up to the newline character.
- If the comment starts with '/*' and ends with '*/', it returns the count of runes up to the closing '*/' sequence.
- If a comment that starts with '/*' is not terminated with '*/', it returns an error indicating an incomplete comment.
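
The comment rules can be illustrated with the sketch below. It is not the package's implementation: `scanComment`, `hasPrefix`, and the local `kind` constants are hypothetical, and two details are assumptions made for the example, namely that a line comment's count excludes the newline and that a block comment's count includes the closing '*/'.

```go
package main

import (
	"errors"
	"fmt"
)

type kind int

const (
	unspecified kind = iota
	comment
)

// hasPrefix reports whether in starts with the runes of prefix.
func hasPrefix(in []rune, prefix string) bool {
	p := []rune(prefix)
	if len(in) < len(p) {
		return false
	}
	for i, r := range p {
		if in[i] != r {
			return false
		}
	}
	return true
}

// scanComment returns the rune count of a comment at the start of in,
// or (0, unspecified, nil) when in does not start with a comment.
func scanComment(in []rune) (int, kind, error) {
	switch {
	case hasPrefix(in, "#"), hasPrefix(in, "//"), hasPrefix(in, "--"):
		// Line comment: count runes until the newline (assumed excluded).
		n := 0
		for n < len(in) && in[n] != '\n' {
			n++
		}
		return n, comment, nil
	case hasPrefix(in, "/*"):
		// Block comment: look for the closing "*/" after the opener.
		for i := 2; i+1 < len(in); i++ {
			if in[i] == '*' && in[i+1] == '/' {
				return i + 2, comment, nil // closer assumed included
			}
		}
		return 0, unspecified, errors.New("incomplete comment")
	}
	return 0, unspecified, nil
}

func main() {
	for _, src := range []string{"-- hi\nSELECT", "/* a */ 1", "/* a", "SELECT"} {
		n, k, err := scanComment([]rune(src))
		fmt.Println(n, k, err)
	}
}
```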

func IdentifierOrKeyword ¶

func IdentifierOrKeyword(s *ScanState) (int, TokenKind, error)

IdentifierOrKeyword scans the input sequence represented by the ScanState 's' to identify and handle identifiers or keywords. It returns the count of runes in the scanned identifier or keyword, the corresponding TokenKind, and an error if any occurs during processing. If no identifier or keyword is found at the current Cursor position, the function returns 0 for the count, TokenUnspecified for the TokenKind, and nil for the error. If the scanned token is a keyword, the function returns the count of runes in the scanned keyword and TokenKind TokenKeyword. If the scanned token is an identifier, the function returns the count of runes in the scanned identifier and TokenKind TokenIdentifier.
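
A rough sketch of the identifier-versus-keyword decision follows. Everything in it is an assumption made for illustration: the three-entry `keywords` set, the ASCII-only identifier alphabet, and case-insensitive keyword matching are not stated by the source, and `scanWord` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

type kind int

const (
	unspecified kind = iota
	identifier
	keyword
)

// keywords is a tiny hypothetical set; the package's real keyword list
// is not given in this documentation.
var keywords = map[string]bool{"SELECT": true, "FROM": true, "WHERE": true}

func isLetter(r rune) bool {
	return r == '_' || ('a' <= r && r <= 'z') || ('A' <= r && r <= 'Z')
}

func isDigit(r rune) bool { return '0' <= r && r <= '9' }

// scanWord counts the runes of an identifier-shaped word at the start of in
// and classifies it as a keyword or an identifier.
func scanWord(in []rune) (int, kind, error) {
	if len(in) == 0 || !isLetter(in[0]) {
		return 0, unspecified, nil
	}
	n := 1
	for n < len(in) && (isLetter(in[n]) || isDigit(in[n])) {
		n++
	}
	// Keyword lookup is case-insensitive in this sketch (an assumption).
	if keywords[strings.ToUpper(string(in[:n]))] {
		return n, keyword, nil
	}
	return n, identifier, nil
}

func main() {
	for _, src := range []string{"select x", "user_id = 1", "1abc"} {
		n, k, err := scanWord([]rune(src))
		fmt.Println(n, k, err)
	}
}
```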

func IdentifierQuoted ¶

func IdentifierQuoted(s *ScanState) (int, TokenKind, error)

IdentifierQuoted scans the input sequence represented by the ScanState 's' to identify and handle quoted identifiers enclosed within backticks (`). It returns the count of runes in the scanned quoted identifier token and the corresponding TokenKind. If no quoted identifier is found at the current Cursor position, the function returns 0 for the count and TokenUnspecified for the TokenKind. If the quoted identifier is empty (two consecutive backticks), the function returns an error indicating an empty quoted identifier. If the quoted identifier is not properly enclosed within backticks, the function returns an error indicating an invalid quoted identifier.
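
The three outcomes described above (no match, empty identifier, unterminated identifier) can be sketched like this. `scanQuotedIdentifier` and the local `kind` constants are hypothetical, and the sketch deliberately ignores escape sequences, whose handling is not described by the source.

```go
package main

import (
	"errors"
	"fmt"
)

type kind int

const (
	unspecified kind = iota
	identifierQuoted
)

// scanQuotedIdentifier counts the runes of a backtick-quoted identifier at
// the start of in, including both backticks.
func scanQuotedIdentifier(in []rune) (int, kind, error) {
	if len(in) == 0 || in[0] != '`' {
		return 0, unspecified, nil // not a quoted identifier
	}
	for i := 1; i < len(in); i++ {
		if in[i] == '`' {
			if i == 1 {
				return 0, unspecified, errors.New("empty quoted identifier")
			}
			return i + 1, identifierQuoted, nil
		}
	}
	return 0, unspecified, errors.New("invalid quoted identifier: missing closing backtick")
}

func main() {
	for _, src := range []string{"`my col` x", "``", "`abc", "abc"} {
		n, k, err := scanQuotedIdentifier([]rune(src))
		fmt.Println(n, k, err)
	}
}
```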

func LiteralQuoted ¶

func LiteralQuoted(s *ScanState) (int, TokenKind, error)

LiteralQuoted scans the input sequence represented by the ScanState 's' to identify and handle quoted literals (strings or bytes) with optional prefixes. It returns the count of runes in the scanned quoted literal, the corresponding TokenKind, and an error if any occurs during processing. If no quoted literal is found at the current Cursor position, the function returns 0 for the count, TokenUnspecified for the TokenKind, and nil for the error.

func NumberOrDot ¶

func NumberOrDot(s *ScanState) (int, TokenKind, error)

NumberOrDot scans the input sequence represented by the ScanState 's' to identify and handle numbers or the dot (.) operator. It returns the count of runes in the scanned number or dot operator, the corresponding TokenKind, and an error if any occurs during processing. The function recognizes hexadecimal integers (starting with "0x"), decimals (with or without a decimal point), and floating-point numbers (with or without an exponent using 'e' or 'E').
- If no number or dot operator is found at the current Cursor position, it returns 0 for the count, TokenUnspecified for the TokenKind, and nil for the error.
- If the scanned token is the dot (.) operator, it returns 1 for the count and TokenKind TokenSpecialChar.
- If the scanned token is an integer (either decimal or hexadecimal), it returns the count of runes in the scanned integer and TokenKind TokenLiteralInteger.
- If the scanned token is a floating-point number, it returns the count of runes in the scanned number and TokenKind TokenLiteralFloat.
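
One way to arrive at these classifications is sketched below. It is an illustration under assumptions, not the package's code: `scanNumberOrDot` and its helpers are hypothetical, and the error cases ("0x" without hex digits, an exponent marker without digits) are guesses, since the source does not say when NumberOrDot returns an error.

```go
package main

import (
	"errors"
	"fmt"
)

type kind int

const (
	unspecified kind = iota
	literalInteger
	literalFloat
	specialChar
)

func isDigit(r rune) bool { return '0' <= r && r <= '9' }

func isHexDigit(r rune) bool {
	return isDigit(r) || ('a' <= r && r <= 'f') || ('A' <= r && r <= 'F')
}

// countFrom counts consecutive runes satisfying pred, starting at index begin.
func countFrom(in []rune, begin int, pred func(rune) bool) int {
	n := 0
	for i := begin; i < len(in) && pred(in[i]); i++ {
		n++
	}
	return n
}

// scanNumberOrDot classifies a number or a lone dot at the start of in.
func scanNumberOrDot(in []rune) (int, kind, error) {
	// Hexadecimal integer: "0x" followed by at least one hex digit.
	if len(in) >= 2 && in[0] == '0' && in[1] == 'x' {
		n := countFrom(in, 2, isHexDigit)
		if n == 0 {
			return 0, unspecified, errors.New("hexadecimal digits expected") // assumed error case
		}
		return 2 + n, literalInteger, nil
	}
	intDigits := countFrom(in, 0, isDigit)
	pos, isFloat := intDigits, false
	if pos < len(in) && in[pos] == '.' {
		frac := countFrom(in, pos+1, isDigit)
		if intDigits == 0 && frac == 0 {
			return 1, specialChar, nil // a dot with no digits around it
		}
		isFloat = true
		pos += 1 + frac
	}
	if intDigits == 0 && !isFloat {
		return 0, unspecified, nil // neither a number nor a dot
	}
	// Optional exponent: 'e' or 'E', optional sign, then digits.
	if pos < len(in) && (in[pos] == 'e' || in[pos] == 'E') {
		p := pos + 1
		if p < len(in) && (in[p] == '+' || in[p] == '-') {
			p++
		}
		exp := countFrom(in, p, isDigit)
		if exp == 0 {
			return 0, unspecified, errors.New("exponent digits expected") // assumed error case
		}
		isFloat = true
		pos = p + exp
	}
	if isFloat {
		return pos, literalFloat, nil
	}
	return pos, literalInteger, nil
}

func main() {
	for _, src := range []string{"0x1F)", "42 ", "3.14e-2,", ". x", ".5", "abc"} {
		n, k, err := scanNumberOrDot([]rune(src))
		fmt.Println(n, k, err)
	}
}
```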

func Spaces ¶

func Spaces(s *ScanState) (int, TokenKind, error)

Spaces scans the input sequence represented by the ScanState 's' to find the number of consecutive space runes at the current Cursor position. It returns the count of runes in the scanned space token and the corresponding TokenKind. If no spaces are found at the current Cursor position, the function returns 0 for the count and TokenUnspecified for the TokenKind. The third return value is an error, which is always nil in this implementation.

func SpecialChar ¶

func SpecialChar(s *ScanState) (int, TokenKind, error)

SpecialChar scans the input sequence represented by the ScanState 's' to identify and handle special characters. It returns the count of runes in the scanned special character, the corresponding TokenKind, and an error if any occurs during processing. If no special character is found at the current Cursor position, the function returns 0 for the count, TokenUnspecified for the TokenKind, and nil for the error. If the scanned token is a dot (.) character followed by a decimal digit, the function returns 0 for the count, TokenUnspecified for the TokenKind, and nil. For all other cases, where the current rune represents a standalone special character, the function returns 1 for the count and TokenKind TokenSpecialChar.
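
The dot handling described above can be sketched as follows. The `specialChars` set is hypothetical (the documentation does not list which runes count as special characters), as are `scanSpecialChar` and the local `kind` constants.

```go
package main

import (
	"fmt"
	"strings"
)

type kind int

const (
	unspecified kind = iota
	specialChar
)

// specialChars is a hypothetical set chosen for this example; the package's
// actual set is not listed in this documentation.
const specialChars = ".,;()[]{}<>=+-*/"

func isDigit(r rune) bool { return '0' <= r && r <= '9' }

// scanSpecialChar returns 1 for a standalone special character at the start
// of in, and 0 when there is none or when a dot begins a number such as ".5".
func scanSpecialChar(in []rune) (int, kind, error) {
	if len(in) == 0 || !strings.ContainsRune(specialChars, in[0]) {
		return 0, unspecified, nil
	}
	if in[0] == '.' && len(in) > 1 && isDigit(in[1]) {
		// A dot followed by a digit is declined here, per the description above.
		return 0, unspecified, nil
	}
	return 1, specialChar, nil
}

func main() {
	for _, src := range []string{".5", ".x", "(", "a"} {
		n, k, err := scanSpecialChar([]rune(src))
		fmt.Println(n, k, err)
	}
}
```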

func (TokenKind) String ¶

func (i TokenKind) String() string

type TokenScanner ¶

type TokenScanner struct {
	ScanState
	// contains filtered or unexported fields
}

TokenScanner provides a tokenizer for processing a sequence of runes and identifying different types of tokens.

func (*TokenScanner) Init ¶

func (s *TokenScanner) Init(input []rune)

func (*TokenScanner) ScanNext ¶

func (s *TokenScanner) ScanNext() (Token, error)

ScanNext scans the next token in the input sequence and returns the Token and an error if any occurs during processing. If the end of the input sequence is reached, the method returns a special Token with TokenKind TokenEOF to indicate the end of the file.
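
A scan loop of this shape can be sketched with local stand-ins. Nothing below is the package's code: `scanner`, `scanFunc`, `scanNext`, and `tokenizeAll` are hypothetical, only two toy token kinds are handled, and three points are assumptions made for the example: scanning functions are tried in a fixed order, End is exclusive, and the EOF token is appended to the result.

```go
package main

import (
	"errors"
	"fmt"
	"unicode"
)

type kind int

const (
	unspecified kind = iota
	eof
	space
	word
)

type token struct {
	Kind       kind
	Content    []rune
	Begin, End int // End is exclusive in this sketch (an assumption)
}

// scanFunc mirrors the documented (count, kind, error) shape.
type scanFunc func(rest []rune) (int, kind, error)

type scanner struct {
	input     []rune
	cursor    int
	scanFuncs []scanFunc
}

// scanNext tries each scanning function at the cursor and emits a token for
// the first one that consumes at least one rune.
func (s *scanner) scanNext() (token, error) {
	if s.cursor >= len(s.input) {
		return token{Kind: eof, Begin: s.cursor, End: s.cursor}, nil
	}
	for _, f := range s.scanFuncs {
		n, k, err := f(s.input[s.cursor:])
		if err != nil {
			return token{}, err
		}
		if n == 0 {
			continue // this function did not match; try the next one
		}
		t := token{Kind: k, Content: s.input[s.cursor : s.cursor+n], Begin: s.cursor, End: s.cursor + n}
		s.cursor += n
		return t, nil
	}
	return token{}, errors.New("no scanning function matched")
}

// countPrefix counts leading runes that satisfy pred.
func countPrefix(in []rune, pred func(rune) bool) int {
	n := 0
	for n < len(in) && pred(in[n]) {
		n++
	}
	return n
}

func scanSpaces(rest []rune) (int, kind, error) {
	if n := countPrefix(rest, unicode.IsSpace); n > 0 {
		return n, space, nil
	}
	return 0, unspecified, nil
}

func scanWords(rest []rune) (int, kind, error) {
	if n := countPrefix(rest, unicode.IsLetter); n > 0 {
		return n, word, nil
	}
	return 0, unspecified, nil
}

// tokenizeAll calls scanNext until the EOF token is produced.
func tokenizeAll(input []rune) ([]token, error) {
	s := &scanner{input: input, scanFuncs: []scanFunc{scanSpaces, scanWords}}
	var out []token
	for {
		t, err := s.scanNext()
		if err != nil {
			return nil, fmt.Errorf("failed to tokenize: %w", err)
		}
		out = append(out, t)
		if t.Kind == eof {
			return out, nil
		}
	}
}

func main() {
	toks, err := tokenizeAll([]rune("a  b"))
	fmt.Println(len(toks), err)
}
```

Trying the scanning functions in a fixed order keeps each one small, at the cost of making the order part of the grammar: a function that declines to match, by returning a count of 0, hands the input to the next one.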
