Documentation ¶
Overview ¶
Package lex contains the lexer of RSQL.
The Lexer parses a SQL batch, and returns a Lexeme at each call to l.Eat_next_lexeme().
A lexeme is just a string with some more information. It contains a type and subtype, e.g. LEX_KEYWORD/LEX_KEYWORD_ALTER for keywords, or LEX_IDENTPART/LEX_IDENTPART_SUB for identifiers. See static_tables.go for details.
The returned lexemes are lowercase for LEX_KEYWORD, LEX_OPERATOR, LEX_VARIABLE, LEX_ATFUNC, LEX_IDENTPART.
Sample program for using lex package
    package main

    import (
        "fmt"
        "os"
        "io/ioutil"

        "rsql"
        "rsql/lex"
    )

    func main() {
        var (
            err      error
            text     []byte
            lexeme   lex.Lexeme
            rsql_err *rsql.Error
        )

        filename := os.Args[1]

        if text, err = ioutil.ReadFile(filename); err != nil {
            fmt.Println(err)
            os.Exit(1)
        }

        //=== init Lexer ===

        l := lex.New_lexer()

        if rsql_err = l.Attach_batch(text); rsql_err != nil {
            fmt.Println(rsql_err)
            os.Exit(1)
        }

        //=== get all Lexemes ===

        lexeme = l.Current_lexeme
        fmt.Printf("%5d:%3d %-30s %-30s\n", l.Current_line.No, l.Current_line.Pos, lexeme.Type_string(), "<"+lexeme.Lex_word+">")

        for {
            if rsql_err = l.Eat_next_lexeme(); rsql_err != nil {
                fmt.Println(rsql_err)
                os.Exit(1)
            }

            lexeme = l.Current_lexeme
            fmt.Printf("%5d:%3d %-30s %-30s\n", l.Current_line.No, l.Current_line.Pos, lexeme.Type_string(), "<"+lexeme.Lex_word+">")

            if lexeme.Lex_type == lex.LEX_END_OF_BATCH {
                break
            }
        }
    }
Index ¶
- Constants
- Variables
- func Uc_hexa_digit_value(uc rune) (uint8, *rsql.Error)
- type Auxword_t
- type Coord_t
- type Lex_subtype_t
- type Lex_type_t
- type Lexeme
- type Lexer
- func (l *Lexer) Attach_batch(batch_text []byte) *rsql.Error
- func (l *Lexer) Eat_next_lexeme() *rsql.Error
- func (l *Lexer) Info_line_put_old()
- func (l *Lexer) Info_line_update()
- func (l *Lexer) Memorize_to(lbak *Lexer_bak)
- func (l *Lexer) Print_Info_line()
- func (l *Lexer) Restore_from(lbak *Lexer_bak)
- func (l *Lexer) Set_option_quoted_identifier(opt_quot bool)
- type Lexer_bak
- type Txtdim_t
Constants ¶
    const (
        SPEC_VARIABLE_MAX_LENGTH   = 128
        SPEC_ATFUNC_MAX_LENGTH     = 128
        SPEC_IDENTPART_MAX_LENGTH  = 128
        SPEC_NUMBER_MAX_LENGTH     = 128
        SPEC_HEXASTRING_MAX_LENGTH = 18000 // in bytes. There are 2 hexa digits per byte. This length includes also the "0x" prefix. Note: the parser will later also limit the max length of a VARBINARY, in byte count. PROD:18000
        SPEC_STRING_MAX_LENGTH     = 32000 // in bytes (utf8 byte count). Note: the parser will later also limit the max length of a VARCHAR, in rune count.
    )
SPEC_xxx constants are used by the lexer to limit the max length of lexemes IN BYTES, that is, variable names, identifiers, numbers, strings, etc.
    const (
        ASCII_PLAINSPACE                = 1  // \t and space
        ASCII_CR_LF                     = 2  // \r and \n
        ASCII_OPERATOR                  = 4  // ! % & * + - / < = > ? ^ | ~
        ASCII_HASH_DOLLAR_AT_UNDERSCORE = 8  // # $ @ _
        ASCII_A_Z                       = 16 // a-z A-Z
        ASCII_0_9                       = 32 // 0-9
        ASCII_HEXA                      = 64 // 0-9 a-f A-F

        // ASCII_STOP_SKIP_WS is used by function lexer_skip_whitespace().
        // It contains all graphical (that is, not space, tab, cr, lf ...) characters, except '/' and '-' because they can be the start of a SQL comment.
        ASCII_STOP_SKIP_WS = 128 // @ # $ ? " ' ( ) [ ] { } * + % & | ~ ! < = > \ ^ , . : ; a-z A-Z 0-9 _ (note: '/' and '-' are not in this list)

        ASCII_XID_START    = (ASCII_HASH_DOLLAR_AT_UNDERSCORE | ASCII_A_Z)             // # $ @ _ a-z A-Z      Used by uc_is_ascii_xid_start(), uc_is_variable_start() etc, which can exclude some more characters of this set.
        ASCII_XID_CONTINUE = (ASCII_HASH_DOLLAR_AT_UNDERSCORE | ASCII_A_Z | ASCII_0_9) // # $ @ _ a-z A-Z 0-9  Used by uc_is_ascii_xid_continue(), uc_is_variable_continue() etc, which can exclude some more characters of this set.
        ASCII_WHITESPACE   = (ASCII_PLAINSPACE | ASCII_CR_LF)
    )
Variables ¶
var G_ASCII_PROPERTIES = [128]byte{}/* 128 elements not displayed */
Array of 128 elements holding the properties of each ASCII character (0...127), for quick property lookup.
Used to speed up property retrieval for ascii characters, so that unicode function calls can be avoided for them.
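The lookup technique can be sketched as follows. This is an illustration only: the flag values are copied from the ASCII_xxx constants above, but `buildTable` and `hasProp` are hypothetical helpers, and the table covers only three of the flags rather than reproducing the real G_ASCII_PROPERTIES contents.

```go
package main

import "fmt"

// Flag values copied from the ASCII_xxx constants above.
const (
	asciiAZ   = 16 // a-z A-Z
	ascii09   = 32 // 0-9
	asciiHexa = 64 // 0-9 a-f A-F
)

// buildTable fills a 128-entry property table for a few character classes.
// It is a simplified stand-in for G_ASCII_PROPERTIES.
func buildTable() [128]byte {
	var t [128]byte
	for c := 'a'; c <= 'z'; c++ {
		t[c] |= asciiAZ
		t[c-'a'+'A'] |= asciiAZ
	}
	for c := '0'; c <= '9'; c++ {
		t[c] |= ascii09 | asciiHexa
	}
	for c := 'a'; c <= 'f'; c++ {
		t[c] |= asciiHexa
		t[c-'a'+'A'] |= asciiHexa
	}
	return t
}

// hasProp tests a property with one array access and one bitwise AND.
// Non-ASCII runes never match; a real lexer would fall back to the unicode package.
func hasProp(t *[128]byte, r rune, mask byte) bool {
	return r >= 0 && r < 128 && t[r]&mask != 0
}

func main() {
	tbl := buildTable()
	fmt.Println(hasProp(&tbl, 'f', asciiHexa), hasProp(&tbl, 'g', asciiHexa), hasProp(&tbl, '7', ascii09|asciiAZ))
	// prints: true false true
}
```

Combining flags with `|` in the mask tests several classes in one lookup, which is presumably why composite constants such as ASCII_XID_CONTINUE are defined.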
var G_KEYWORDS_OPERATORS = make(map[string]Lexeme, aLL_LEXEMES_ARRAY_SIZE) // in fact, the map will contain less than aLL_LEXEMES_ARRAY_SIZE elements, but it is a good approximation
G_KEYWORDS_OPERATORS map contains all keywords, and also operators like "and", "between", etc, in lowercase. Each string key stores the corresponding Lexeme.
E.g.
    "select" -> LEXEME_KEYWORD_SELECT
    "and"    -> LEXEME_OPERATOR_LOGICAL_AND
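A lookup against such a map might look like the sketch below. The `lexeme` struct, the three-entry map and the `lookup` helper are simplified, hypothetical stand-ins; lowercasing the word before the lookup is an assumption based on the keys being stored in lowercase.

```go
package main

import (
	"fmt"
	"strings"
)

// lexeme is a simplified stand-in for lex.Lexeme.
type lexeme struct {
	kind string // "keyword" or "operator"
	word string // lowercase word
}

// keywordsOperators is a tiny stand-in for G_KEYWORDS_OPERATORS.
var keywordsOperators = map[string]lexeme{
	"select":  {"keyword", "select"},
	"and":     {"operator", "and"},
	"between": {"operator", "between"},
}

// lookup lowercases the word and searches the map.
// A miss means the word is neither a keyword nor a word operator,
// so the caller would treat it as an identifier part.
func lookup(word string) (lexeme, bool) {
	lx, ok := keywordsOperators[strings.ToLower(word)]
	return lx, ok
}

func main() {
	for _, w := range []string{"SELECT", "And", "mytable"} {
		lx, ok := lookup(w)
		fmt.Println(w, ok, lx.kind)
	}
}
```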
Functions ¶
Types ¶
type Auxword_t ¶
type Auxword_t string
const ( AUXWORD_NOCOUNT Auxword_t = "nocount" AUXWORD_DATEFIRST Auxword_t = "datefirst" AUXWORD_DATEFORMAT Auxword_t = "dateformat" AUXWORD_ENCODING Auxword_t = "encoding" AUXWORD_FIELDTERMINATOR Auxword_t = "fieldterminator" AUXWORD_ROWTERMINATOR Auxword_t = "rowterminator" AUXWORD_CODEPAGE Auxword_t = "codepage" AUXWORD_FIRSTROW Auxword_t = "firstrow" AUXWORD_LASTROW Auxword_t = "lastrow" AUXWORD_KEEPIDENTITY Auxword_t = "keepidentity" AUXWORD_NEWIDENTITY Auxword_t = "newidentity" AUXWORD_KEEPNULLS Auxword_t = "keepnulls" AUXWORD_RTRIM Auxword_t = "rtrim" AUXWORD_DATE_FORMAT Auxword_t = "date_format" AUXWORD_TIME_FORMAT Auxword_t = "time_format" AUXWORD_DATETIME_FORMAT Auxword_t = "datetime_format" AUXWORD_TABLE_LOCK_READ Auxword_t = "table_lock_read" AUXWORD_TABLE_LOCK_EXCLUSIVE Auxword_t = "table_lock_exclusive" AUXWORD_SLEEP Auxword_t = "sleep" AUXWORD_COLLATOR Auxword_t = "collator" AUXWORD_CALENDAR Auxword_t = "calendar" AUXWORD_LOCALES Auxword_t = "locales" AUXWORD_PASSWORD Auxword_t = "password" AUXWORD_DEFAULT_DATABASE Auxword_t = "default_database" AUXWORD_DEFAULT_LANGUAGE Auxword_t = "default_language" AUXWORD_MEMBER Auxword_t = "member" AUXWORD_NAME Auxword_t = "name" AUXWORD_LOGIN Auxword_t = "login" AUXWORD_PRIVILEGES Auxword_t = "privileges" AUXWORD_SA Auxword_t = "sa" AUXWORD_DBO Auxword_t = "dbo" AUXWORD_TRASHDB Auxword_t = "trashdb" AUXWORD_MODIFY Auxword_t = "modify" AUXWORD_ENABLE Auxword_t = "enable" AUXWORD_DISABLE Auxword_t = "disable" AUXWORD_ONLINE Auxword_t = "online" AUXWORD_OFFLINE Auxword_t = "offline" AUXWORD_CORRUPTED Auxword_t = "corrupted" AUXWORD_READ_ONLY Auxword_t = "read_only" AUXWORD_READ_WRITE Auxword_t = "read_write" AUXWORD_RESTRICTED_USER Auxword_t = "restricted_user" AUXWORD_MULTI_USER Auxword_t = "multi_user" AUXWORD_QUOTED_IDENTIFIER Auxword_t = "quoted_identifier" AUXWORD_PARSEONLY Auxword_t = "parseonly" AUXWORD_NOEXEC Auxword_t = "noexec" AUXWORD_ANSI_NULL_DFLT_ON Auxword_t = "ansi_null_dflt_on" AUXWORD_ANSI_NULLS 
Auxword_t = "ansi_nulls" AUXWORD_SERVER Auxword_t = "server" AUXWORD_PARAMETER Auxword_t = "parameter" AUXWORD_SERVER_DEFAULT_COLLATION Auxword_t = "server_default_collation" AUXWORD_SERVER_SERVERNAME Auxword_t = "server_servername" AUXWORD_SERVER_WORKERS_MAX Auxword_t = "server_workers_max" AUXWORD_SERVER_GLOBAL_PAGE_CACHE_MEMORY Auxword_t = "server_global_page_cache_memory" AUXWORD_SERVER_DEFAULT_DATABASE Auxword_t = "server_default_database" AUXWORD_SERVER_DEFAULT_LANGUAGE Auxword_t = "server_default_language" AUXWORD_SERVER_QUOTED_IDENTIFIER Auxword_t = "server_quoted_identifier" AUXWORD_SERVER_BULK_DIR Auxword_t = "server_bulk_dir" AUXWORD_SERVER_DUMP_DIR Auxword_t = "server_dump_dir" AUXWORD_SERVER_READ_TIMEOUT Auxword_t = "server_read_timeout" AUXWORD_SERVER_LOCK_TICKER_INTERVAL Auxword_t = "server_lock_ticker_interval" AUXWORD_SERVER_LOCK_TIMEOUT_TICKS_COUNT Auxword_t = "server_lock_timeout_ticks_count" AUXWORD_SERVER_LOGGING_MAX_SIZE Auxword_t = "server_logging_max_size" AUXWORD_SERVER_LOGGING_MAX_COUNT Auxword_t = "server_logging_max_count" AUXWORD_SERVER_LOGGING_LOCALTIME Auxword_t = "server_logging_localtime" AUXWORD_SERVER_WCACHE_MEMORY_MAX Auxword_t = "server_wcache_memory_max" AUXWORD_SERVER_WCACHE_MODIF_MAX Auxword_t = "server_wcache_modif_max" AUXWORD_SERVER_BATCH_TEXT_MAX_SIZE Auxword_t = "server_batch_text_max_size" AUXWORD_SERVER_BATCH_INSERTS_MAX_COUNT Auxword_t = "server_batch_inserts_max_count" AUXWORD_EXPORT Auxword_t = "export" AUXWORD_SHRINK_FILE Auxword_t = "shrink_file" AUXWORD_NOWAIT Auxword_t = "nowait" AUXWORD_NORMAL Auxword_t = "normal" AUXWORD_ID Auxword_t = "id" AUXWORD_SQL Auxword_t = "sql" AUXWORD_TEMPLATE Auxword_t = "template" AUXWORD_TABLES Auxword_t = "tables" AUXWORD_TBL Auxword_t = "tbl" AUXWORD_TBLS Auxword_t = "tbls" AUXWORD_T Auxword_t = "t" AUXWORD_DATABASES Auxword_t = "databases" AUXWORD_DB Auxword_t = "db" AUXWORD_DBS Auxword_t = "dbs" AUXWORD_D Auxword_t = "d" AUXWORD_LOGINS Auxword_t = "logins" AUXWORD_L Auxword_t 
= "l" AUXWORD_PARAMETERS Auxword_t = "parameters" AUXWORD_PARAM Auxword_t = "param" AUXWORD_PARAMS Auxword_t = "params" AUXWORD_P Auxword_t = "p" AUXWORD_USERS Auxword_t = "users" AUXWORD_U Auxword_t = "u" AUXWORD_ROLES Auxword_t = "roles" AUXWORD_R Auxword_t = "r" AUXWORD_PERM Auxword_t = "perm" AUXWORD_PERMS Auxword_t = "perms" AUXWORD_PERMISSION Auxword_t = "permission" AUXWORD_PERMISSIONS Auxword_t = "permissions" AUXWORD_INFO Auxword_t = "info" AUXWORD_I Auxword_t = "i" AUXWORD_DDL_ONLY Auxword_t = "ddl_only" AUXWORD_COLL Auxword_t = "coll" AUXWORD_COLLATION Auxword_t = "collation" AUXWORD_COLLATIONS Auxword_t = "collations" AUXWORD_LANG Auxword_t = "lang" AUXWORD_LANGUAGE Auxword_t = "language" AUXWORD_LANGUAGES Auxword_t = "languages" AUXWORD_LOCK Auxword_t = "lock" AUXWORD_LOCKS Auxword_t = "locks" AUXWORD_WORKER Auxword_t = "worker" AUXWORD_WORKERS Auxword_t = "workers" AUXWORD_REPLACE Auxword_t = "replace" AUXWORD_NO_USER Auxword_t = "no_user" AUXWORD_VERBOSE Auxword_t = "verbose" )
Miscellaneous auxiliary words, which are not keywords but LEX_IDENTPART. They are used in a SQL batch, e.g. as option names, or after the SET keyword as in SET LANGUAGE.
type Lex_subtype_t ¶
type Lex_subtype_t uint32 // LEX_KEYWORD_ADD, LEX_OPERATOR_MINUS, LEX_IDENTPART_SUB, etc
const ( LEX_INVALID_SUB Lex_subtype_t = 0 // used only when an internal lexer function returns an error. It is only used internally by the lexer, and the caller never sees it. LEX_KEYWORD_ASSERT_ Lex_subtype_t = iota + 1 // Lexeme Lex_subtype for Lex_type LEX_KEYWORD LEX_KEYWORD_ASSERT_NULL_ LEX_KEYWORD_ASSERT_ERROR_ LEX_KEYWORD_ADD LEX_KEYWORD_ALTER LEX_KEYWORD_AS LEX_KEYWORD_ASC LEX_KEYWORD_AUTHORIZATION LEX_KEYWORD_BACKUP LEX_KEYWORD_BEGIN LEX_KEYWORD_BREAK LEX_KEYWORD_BROWSE LEX_KEYWORD_BULK LEX_KEYWORD_BY LEX_KEYWORD_CASCADE LEX_KEYWORD_CASE LEX_KEYWORD_CAST LEX_KEYWORD_CHECK LEX_KEYWORD_CHECKPOINT LEX_KEYWORD_CLOSE LEX_KEYWORD_CLUSTERED // COALESCE no need to have it as a keyword. It is a normal sysfunc. LEX_KEYWORD_COLLATE LEX_KEYWORD_COLUMN LEX_KEYWORD_COMMIT LEX_KEYWORD_COMPUTE LEX_KEYWORD_CONSTRAINT LEX_KEYWORD_CONTAINS LEX_KEYWORD_CONTAINSTABLE LEX_KEYWORD_CONTINUE LEX_KEYWORD_CONVERT LEX_KEYWORD_CREATE LEX_KEYWORD_CROSS LEX_KEYWORD_CURRENT LEX_KEYWORD_CURRENT_TIMESTAMP // sysfunc without parentheses LEX_KEYWORD_CURRENT_USER // sysfunc without parentheses LEX_KEYWORD_CURSOR LEX_KEYWORD_DATABASE LEX_KEYWORD_DBCC LEX_KEYWORD_DEALLOCATE LEX_KEYWORD_DEBUG // I add this keyword for rsql. It is not a MS SQL Server keyword. LEX_KEYWORD_DECLARE LEX_KEYWORD_DEFAULT LEX_KEYWORD_DELETE LEX_KEYWORD_DENY LEX_KEYWORD_DESC LEX_KEYWORD_DISK LEX_KEYWORD_DISTINCT LEX_KEYWORD_DISTRIBUTED // DOUBLE I don't put it in keyword, because I want it in specialword as datatype. LEX_KEYWORD_DROP LEX_KEYWORD_DUMP LEX_KEYWORD_ELSE LEX_KEYWORD_END LEX_KEYWORD_ERRLVL LEX_KEYWORD_EXCEPT LEX_KEYWORD_EXEC LEX_KEYWORD_EXECUTE LEX_KEYWORD_EXISTS LEX_KEYWORD_EXIT LEX_KEYWORD_EXTERNAL LEX_KEYWORD_FALSE // I add this keyword for rsql. It is not a MS SQL Server keyword. 
LEX_KEYWORD_FETCH LEX_KEYWORD_FILE LEX_KEYWORD_FILLFACTOR LEX_KEYWORD_FOR LEX_KEYWORD_FOREIGN LEX_KEYWORD_FREETEXT LEX_KEYWORD_FREETEXTTABLE LEX_KEYWORD_FROM LEX_KEYWORD_FULL LEX_KEYWORD_FUNCTION LEX_KEYWORD_GOTO LEX_KEYWORD_GRANT LEX_KEYWORD_GROUP LEX_KEYWORD_HAVING LEX_KEYWORD_HOLDLOCK LEX_KEYWORD_IDENTITY LEX_KEYWORD_IDENTITY_INSERT LEX_KEYWORD_IDENTITYCOL LEX_KEYWORD_IF LEX_KEYWORD_INDEX LEX_KEYWORD_INNER LEX_KEYWORD_INSERT LEX_KEYWORD_INTERSECT LEX_KEYWORD_INTO LEX_KEYWORD_JOIN LEX_KEYWORD_KEY LEX_KEYWORD_KILL LEX_KEYWORD_LEFT // for LEFT JOIN, but is also the sysfunc LEFT() LEX_KEYWORD_LINENO LEX_KEYWORD_LOAD LEX_KEYWORD_NATIONAL LEX_KEYWORD_NOCHECK LEX_KEYWORD_NONCLUSTERED LEX_KEYWORD_NULL LEX_KEYWORD_OF LEX_KEYWORD_OFF LEX_KEYWORD_OFFSETS LEX_KEYWORD_ON LEX_KEYWORD_OPEN LEX_KEYWORD_OPENDATASOURCE LEX_KEYWORD_OPENQUERY LEX_KEYWORD_OPENROWSET LEX_KEYWORD_OPENXML LEX_KEYWORD_OPTION LEX_KEYWORD_ORDER LEX_KEYWORD_OUTER LEX_KEYWORD_OVER LEX_KEYWORD_PERCENT LEX_KEYWORD_PIVOT LEX_KEYWORD_PLAN LEX_KEYWORD_PRECISION LEX_KEYWORD_PRIMARY LEX_KEYWORD_PRINT LEX_KEYWORD_PROC LEX_KEYWORD_PROCEDURE // LEX_KEYWORD_PUBLIC // I want it to be a normal group name, which is an identifier LEX_KEYWORD_RAISERROR LEX_KEYWORD_READ LEX_KEYWORD_READTEXT LEX_KEYWORD_RECONFIGURE LEX_KEYWORD_REFERENCES LEX_KEYWORD_REPLICATION LEX_KEYWORD_RESTORE LEX_KEYWORD_RESTRICT LEX_KEYWORD_RETURN LEX_KEYWORD_REVERT LEX_KEYWORD_REVOKE LEX_KEYWORD_RIGHT // for RIGHT JOIN, but is also the sysfunc RIGHT() LEX_KEYWORD_ROLE // I add this keyword for rsql. It is not a MS SQL Server keyword. LEX_KEYWORD_ROLLBACK LEX_KEYWORD_ROWCOUNT LEX_KEYWORD_ROWGUIDCOL LEX_KEYWORD_RULE LEX_KEYWORD_SAVE LEX_KEYWORD_SCHEMA LEX_KEYWORD_SECURITYAUDIT LEX_KEYWORD_SELECT LEX_KEYWORD_SESSION_USER // sysfunc without parentheses LEX_KEYWORD_SET LEX_KEYWORD_SETUSER LEX_KEYWORD_SHOW // I add this keyword for rsql. It is not a MS SQL Server keyword. LEX_KEYWORD_SLEEP // I add this keyword for rsql. It is not a MS SQL Server keyword. 
LEX_KEYWORD_SHRINK // I add this keyword for rsql. It is not a MS SQL Server keyword. LEX_KEYWORD_SHUTDOWN LEX_KEYWORD_STATISTICS LEX_KEYWORD_SYSTEM_USER // sysfunc without parentheses LEX_KEYWORD_TABLE LEX_KEYWORD_TABLESAMPLE LEX_KEYWORD_TEXTSIZE LEX_KEYWORD_THEN LEX_KEYWORD_THROW LEX_KEYWORD_TO LEX_KEYWORD_TOP LEX_KEYWORD_TRAN LEX_KEYWORD_TRANSACTION LEX_KEYWORD_TRIGGER LEX_KEYWORD_TRUE // I add this keyword for rsql. It is not a MS SQL Server keyword. LEX_KEYWORD_TRUNCATE LEX_KEYWORD_TSEQUAL LEX_KEYWORD_UNION LEX_KEYWORD_UNIQUE LEX_KEYWORD_UNPIVOT LEX_KEYWORD_UPDATE LEX_KEYWORD_UPDATETEXT LEX_KEYWORD_USE LEX_KEYWORD_USER // sysfunc without parentheses LEX_KEYWORD_VALUES LEX_KEYWORD_VARYING LEX_KEYWORD_VIEW LEX_KEYWORD_WAITFOR LEX_KEYWORD_WHEN LEX_KEYWORD_WHERE LEX_KEYWORD_WHILE LEX_KEYWORD_WITH LEX_KEYWORD_WRITETEXT LEX_OPERATOR_LOGICAL_AND LEX_OPERATOR_LOGICAL_OR LEX_OPERATOR_LOGICAL_NOT LEX_OPERATOR_IS LEX_OPERATOR_LIKE LEX_OPERATOR_ESCAPE // pseudo-operator with no precedence, used in combination with LIKE operator LEX_OPERATOR_BETWEEN LEX_OPERATOR_IN LEX_OPERATOR_SOME LEX_OPERATOR_ANY LEX_OPERATOR_ALL LEX_OPERATOR_PLUS LEX_OPERATOR_MINUS LEX_OPERATOR_MULT LEX_OPERATOR_DIV LEX_OPERATOR_MOD LEX_OPERATOR_BIT_AND LEX_OPERATOR_BIT_OR LEX_OPERATOR_BIT_XOR LEX_OPERATOR_BIT_NOT LEX_OPERATOR_COMP_EQUAL LEX_OPERATOR_COMP_GREATER LEX_OPERATOR_COMP_LESS LEX_OPERATOR_COMP_GREATER_EQUAL LEX_OPERATOR_COMP_LESS_EQUAL LEX_OPERATOR_COMP_NOT_EQUAL LEX_OPERATOR_ASSIGN_PLUS LEX_OPERATOR_ASSIGN_MINUS LEX_OPERATOR_ASSIGN_MULT LEX_OPERATOR_ASSIGN_DIV LEX_OPERATOR_ASSIGN_MOD LEX_OPERATOR_ASSIGN_BIT_AND LEX_OPERATOR_ASSIGN_BIT_OR LEX_OPERATOR_ASSIGN_BIT_XOR LEX_LPAREN_SUB // Lexeme Lex_subtype for Lex_type LEX_LPAREN LEX_RPAREN_SUB // Lexeme Lex_subtype for Lex_type LEX_RPAREN LEX_LCURLY_SUB // Lexeme Lex_subtype for Lex_type LEX_LCURLY LEX_RCURLY_SUB // Lexeme Lex_subtype for Lex_type LEX_RCURLY LEX_COMMA_SUB // Lexeme Lex_subtype for Lex_type LEX_COMMA LEX_DOT_SUB // Lexeme 
Lex_subtype for Lex_type LEX_DOT LEX_SEMICOLON_SUB // Lexeme Lex_subtype for Lex_type LEX_SEMICOLON LEX_COLON_SUB // Lexeme Lex_subtype for Lex_type LEX_COLON LEX_QUESTIONMARK_SUB // Lexeme Lex_subtype for Lex_type LEX_QUESTIONMARK LEX_EXCLAMATIONMARK_SUB // Lexeme Lex_subtype for Lex_type LEX_EXCLAMATIONMARK LEX_START_OF_BATCH_SUB // Lexeme Lex_subtype for Lex_type LEX_START_OF_BATCH LEX_END_OF_BATCH_SUB // Lexeme Lex_subtype for Lex_type LEX_END_OF_BATCH LEX_DISPLAYWORD_EMPTY_STRING // Lexeme Lex_subtype for Lex_type LEX_DISPLAYWORD LEX_DISPLAYWORD_IS_NULL // Lexeme Lex_subtype for Lex_type LEX_DISPLAYWORD LEX_DISPLAYWORD_IS_NOT_NULL // Lexeme Lex_subtype for Lex_type LEX_DISPLAYWORD LEX_DISPLAYWORD_IN_LIST // Lexeme Lex_subtype for Lex_type LEX_DISPLAYWORD LEX_DISPLAYWORD_NOT_IN_LIST // Lexeme Lex_subtype for Lex_type LEX_DISPLAYWORD LEX_DISPLAYWORD_NOT_LIKE // Lexeme Lex_subtype for Lex_type LEX_DISPLAYWORD LEX_DISPLAYWORD_NOT_BETWEEN // Lexeme Lex_subtype for Lex_type LEX_DISPLAYWORD LEX_DISPLAYWORD_SUBQUERY_ONE // Lexeme Lex_subtype for Lex_type LEX_DISPLAYWORD LEX_DISPLAYWORD_SUBQUERY_MANY // Lexeme Lex_subtype for Lex_type LEX_DISPLAYWORD LEX_DISPLAYWORD_TRUE // Lexeme Lex_subtype for Lex_type LEX_DISPLAYWORD LEX_DISPLAYWORD_FALSE // Lexeme Lex_subtype for Lex_type LEX_DISPLAYWORD LEX_DISPLAYWORD_FIRSTDAYOFWEEK // Lexeme Lex_subtype for Lex_type LEX_DISPLAYWORD LEX_DISPLAYWORD_DMY // Lexeme Lex_subtype for Lex_type LEX_DISPLAYWORD LEX_DISPLAYWORD_UTC_TO_LOCAL // Lexeme Lex_subtype for Lex_type LEX_DISPLAYWORD LEX_DISPLAYWORD_COMPILE_VARCHAR_TO_REGEXPLIKE // Lexeme Lex_subtype for Lex_type LEX_DISPLAYWORD LEX_DISPLAYWORD_FIT_NO_TRUNCATE_VARBINARY // Lexeme Lex_subtype for Lex_type LEX_DISPLAYWORD LEX_DISPLAYWORD_FIT_NO_TRUNCATE_VARCHAR // Lexeme Lex_subtype for Lex_type LEX_DISPLAYWORD LEX_LITERAL_HEXASTRING_SUB // Lexeme Lex_subtype for Lex_type LEX_LITERAL_HEXASTRING. It will be converted to VARBINARY by the decorator. 
LEX_LITERAL_STRING_SUB // Lexeme Lex_subtype for Lex_type LEX_LITERAL_STRING. It will be converted to CHAR or VARCHAR by the decorator. LEX_LITERAL_NUMBER_INTEGRAL // Lexeme Lex_subtype for Lex_type LEX_LITERAL_NUMBER. It will be converted to BIT, TINYINT, SMALLINT, INT, BIGINT, or NUMERIC by the decorator. LEX_LITERAL_NUMBER_MONEY // Lexeme Lex_subtype for Lex_type LEX_LITERAL_NUMBER. It will be converted to MONEY by the decorator. LEX_LITERAL_NUMBER_NUMERIC // Lexeme Lex_subtype for Lex_type LEX_LITERAL_NUMBER. It will be converted to NUMERIC by the decorator. LEX_LITERAL_NUMBER_FLOAT // Lexeme Lex_subtype for Lex_type LEX_LITERAL_NUMBER. It will be converted to FLOAT by the decorator. LEX_VARIABLE_SUB // Lexeme Lex_subtype for Lex_type LEX_VARIABLE LEX_ATFUNC_SUB // Lexeme Lex_subtype for Lex_type LEX_ATFUNC LEX_IDENTPART_SUB // Lexeme Lex_subtype for Lex_type LEX_IDENTPART )
All Lex_subtype_t constants. The values of the subtypes don't overlap, so there is no need to also compare Lex_types when comparing Lex_subtypes.
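The consequence of non-overlapping values can be sketched with a shortened constant list. The names below are hypothetical lowercase counterparts of a few of the real constants, declared with the same `iota + 1` pattern.

```go
package main

import "fmt"

type subtype uint32

// A shortened list using the same declaration pattern as Lex_subtype_t:
// one iota sequence runs through keywords, operators and punctuation.
const (
	invalidSub    subtype = 0
	keywordAdd    subtype = iota + 1 // 2
	keywordAlter                     // 3
	operatorPlus                     // 4
	operatorMinus                    // 5
	lparenSub                        // 6
)

// isAdditive checks the subtype alone. No type check is needed, because
// no keyword or punctuation subtype shares a value with these two.
func isAdditive(s subtype) bool {
	return s == operatorPlus || s == operatorMinus
}

func main() {
	fmt.Println(isAdditive(operatorPlus), isAdditive(keywordAdd), isAdditive(lparenSub))
	// prints: true false false
}
```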
type Lex_type_t ¶
type Lex_type_t uint32 // LEX_KEYWORD, LEX_OPERATOR, LEX_IDENTPART, etc
    const (
        LEX_INVALID Lex_type_t = 0 // used only when an internal lexer function returns an error. It is only used internally by the lexer, and the caller never sees it.

        LEX_KEYWORD  Lex_type_t = 1 << iota // Lexeme Lex_type. Lex_subtypes are LEX_KEYWORD_ADD, etc.
        LEX_OPERATOR                        // Lexeme Lex_type. Lex_subtypes are LEX_OPERATOR_LOGICAL_AND, LEX_OPERATOR_PLUS, etc

        LEX_LPAREN          // Lexeme Lex_type. Lex_subtype is LEX_LPAREN_SUB.
        LEX_RPAREN          // Lexeme Lex_type. Lex_subtype is LEX_RPAREN_SUB.
        LEX_LCURLY          // Lexeme Lex_type. Lex_subtype is LEX_LCURLY_SUB.
        LEX_RCURLY          // Lexeme Lex_type. Lex_subtype is LEX_RCURLY_SUB.
        LEX_COMMA           // Lexeme Lex_type. Lex_subtype is LEX_COMMA_SUB.
        LEX_DOT             // Lexeme Lex_type. Lex_subtype is LEX_DOT_SUB.
        LEX_SEMICOLON       // Lexeme Lex_type. Lex_subtype is LEX_SEMICOLON_SUB.
        LEX_COLON           // Lexeme Lex_type. Lex_subtype is LEX_COLON_SUB.
        LEX_QUESTIONMARK    // Lexeme Lex_type. Lex_subtype is LEX_QUESTIONMARK_SUB.
        LEX_EXCLAMATIONMARK // Lexeme Lex_type. Lex_subtype is LEX_EXCLAMATIONMARK_SUB.

        LEX_START_OF_BATCH // Lexeme Lex_type. Lex_subtype is LEX_START_OF_BATCH_SUB.
        LEX_END_OF_BATCH   // Lexeme Lex_type. Lex_subtype is LEX_END_OF_BATCH_SUB.

        LEX_DISPLAYWORD // Lexeme Lex_type. Lex_subtype is LEX_DISPLAYWORD_IS_NULL, etc. They are used to attach a lexeme to some tokens, just for display when displaying the AST tree (e.g. "in_list", "is_null").

        LEX_LITERAL_NUMBER     // Lexeme Lex_type. Lex_subtypes are LEX_LITERAL_NUMBER_INTEGRAL, etc.
        LEX_LITERAL_HEXASTRING // Lexeme Lex_type. Lex_subtype is LEX_LITERAL_HEXASTRING_SUB.
        LEX_LITERAL_STRING     // Lexeme Lex_type. Lex_subtype is LEX_LITERAL_STRING_SUB.

        LEX_VARIABLE  // Lexeme Lex_type. Lex_subtype is LEX_VARIABLE_SUB.
        LEX_ATFUNC    // Lexeme Lex_type. Lex_subtype is LEX_ATFUNC_SUB.
        LEX_IDENTPART // Lexeme Lex_type. Lex_subtype is LEX_IDENTPART_SUB.
    )
All Lex_type_t constants. NOTE: some words like "and", "between", etc, are considered to be operators, not keywords.
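Since the types are declared with `1 << iota`, each one occupies a single bit, which makes it possible to test a lexeme type against a set of types with one AND. The sketch below shows the idea with hypothetical lowercase names. The source does not state whether the lexer or parser actually uses masks this way, so treat it as an illustration of what the encoding allows.

```go
package main

import "fmt"

type lexType uint32

// Same declaration pattern as Lex_type_t, shortened.
const (
	lexInvalid  lexType = 0
	lexKeyword  lexType = 1 << iota // 2
	lexOperator                     // 4
	lexLparen                       // 8
	lexRparen                       // 16
)

// isOneOf reports whether t belongs to the set of types OR-ed into mask.
func isOneOf(t, mask lexType) bool {
	return t&mask != 0
}

func main() {
	words := lexKeyword | lexOperator
	fmt.Println(isOneOf(lexOperator, words), isOneOf(lexLparen, words), isOneOf(lexInvalid, words))
	// prints: true false false
}
```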
type Lexeme ¶
    type Lexeme struct {
        Lex_type    Lex_type_t    // LEX_KEYWORD, LEX_OPERATOR, LEX_IDENTPART, etc
        Lex_subtype Lex_subtype_t // LEX_KEYWORD_ADD, LEX_OPERATOR_MINUS, LEX_IDENTPART_SUB, etc
        Lex_word    string        // string containing the literal word. It is in lowercase for LEX_KEYWORD, LEX_OPERATOR, LEX_VARIABLE, LEX_ATFUNC, LEX_IDENTPART.
    }
Lexemes are used by the lexer. Lexemes contain keywords, identifiers, operators, etc. The successive lexemes are read by the lexer, and returned by each call of l.Eat_next_lexeme().
A Lexeme is a string with some type information. It contains a Lex_type and Lex_subtype indicating if it is a keyword, identifier, etc. If it is LEX_KEYWORD, LEX_OPERATOR, LEX_VARIABLE, LEX_ATFUNC, LEX_IDENTPART, the Lex_word string is lowercase.
var ( LEXEME_INVALID Lexeme = Lexeme{} // it contains zero values: Lexeme{0, 0, ""}. It is only used internally by the lexer, and the caller never sees it, except in l.Current_lexeme_original for non identpart lexemes. LEXEME_KEYWORD_ASSERT_ Lexeme LEXEME_KEYWORD_ASSERT_NULL_ Lexeme LEXEME_KEYWORD_ASSERT_ERROR_ Lexeme LEXEME_KEYWORD_ADD Lexeme LEXEME_KEYWORD_ALTER Lexeme LEXEME_KEYWORD_AS Lexeme LEXEME_KEYWORD_ASC Lexeme LEXEME_KEYWORD_AUTHORIZATION Lexeme LEXEME_KEYWORD_BACKUP Lexeme LEXEME_KEYWORD_BEGIN Lexeme LEXEME_KEYWORD_BREAK Lexeme LEXEME_KEYWORD_BROWSE Lexeme LEXEME_KEYWORD_BULK Lexeme LEXEME_KEYWORD_BY Lexeme LEXEME_KEYWORD_CASCADE Lexeme LEXEME_KEYWORD_CASE Lexeme LEXEME_KEYWORD_CAST Lexeme LEXEME_KEYWORD_CHECK Lexeme LEXEME_KEYWORD_CHECKPOINT Lexeme LEXEME_KEYWORD_CLOSE Lexeme LEXEME_KEYWORD_CLUSTERED Lexeme LEXEME_KEYWORD_COLLATE Lexeme LEXEME_KEYWORD_COLUMN Lexeme LEXEME_KEYWORD_COMMIT Lexeme LEXEME_KEYWORD_COMPUTE Lexeme LEXEME_KEYWORD_CONSTRAINT Lexeme LEXEME_KEYWORD_CONTAINS Lexeme LEXEME_KEYWORD_CONTAINSTABLE Lexeme LEXEME_KEYWORD_CONTINUE Lexeme LEXEME_KEYWORD_CONVERT Lexeme LEXEME_KEYWORD_CREATE Lexeme LEXEME_KEYWORD_CROSS Lexeme LEXEME_KEYWORD_CURRENT Lexeme LEXEME_KEYWORD_CURRENT_TIMESTAMP Lexeme LEXEME_KEYWORD_CURRENT_USER Lexeme LEXEME_KEYWORD_CURSOR Lexeme LEXEME_KEYWORD_DATABASE Lexeme LEXEME_KEYWORD_DBCC Lexeme LEXEME_KEYWORD_DEALLOCATE Lexeme LEXEME_KEYWORD_DEBUG Lexeme LEXEME_KEYWORD_DECLARE Lexeme LEXEME_KEYWORD_DEFAULT Lexeme LEXEME_KEYWORD_DELETE Lexeme LEXEME_KEYWORD_DENY Lexeme LEXEME_KEYWORD_DESC Lexeme LEXEME_KEYWORD_DISK Lexeme LEXEME_KEYWORD_DISTINCT Lexeme LEXEME_KEYWORD_DISTRIBUTED Lexeme LEXEME_KEYWORD_DROP Lexeme LEXEME_KEYWORD_DUMP Lexeme LEXEME_KEYWORD_ELSE Lexeme LEXEME_KEYWORD_END Lexeme LEXEME_KEYWORD_ERRLVL Lexeme LEXEME_KEYWORD_EXCEPT Lexeme LEXEME_KEYWORD_EXEC Lexeme LEXEME_KEYWORD_EXECUTE Lexeme LEXEME_KEYWORD_EXISTS Lexeme LEXEME_KEYWORD_EXIT Lexeme LEXEME_KEYWORD_EXTERNAL Lexeme LEXEME_KEYWORD_FALSE Lexeme 
LEXEME_KEYWORD_FETCH Lexeme LEXEME_KEYWORD_FILE Lexeme LEXEME_KEYWORD_FILLFACTOR Lexeme LEXEME_KEYWORD_FOR Lexeme LEXEME_KEYWORD_FOREIGN Lexeme LEXEME_KEYWORD_FREETEXT Lexeme LEXEME_KEYWORD_FREETEXTTABLE Lexeme LEXEME_KEYWORD_FROM Lexeme LEXEME_KEYWORD_FULL Lexeme LEXEME_KEYWORD_FUNCTION Lexeme LEXEME_KEYWORD_GOTO Lexeme LEXEME_KEYWORD_GRANT Lexeme LEXEME_KEYWORD_GROUP Lexeme LEXEME_KEYWORD_HAVING Lexeme LEXEME_KEYWORD_HOLDLOCK Lexeme LEXEME_KEYWORD_IDENTITY Lexeme LEXEME_KEYWORD_IDENTITY_INSERT Lexeme LEXEME_KEYWORD_IDENTITYCOL Lexeme LEXEME_KEYWORD_IF Lexeme LEXEME_KEYWORD_INDEX Lexeme LEXEME_KEYWORD_INNER Lexeme LEXEME_KEYWORD_INSERT Lexeme LEXEME_KEYWORD_INTERSECT Lexeme LEXEME_KEYWORD_INTO Lexeme LEXEME_KEYWORD_JOIN Lexeme LEXEME_KEYWORD_KEY Lexeme LEXEME_KEYWORD_KILL Lexeme LEXEME_KEYWORD_LEFT Lexeme LEXEME_KEYWORD_LINENO Lexeme LEXEME_KEYWORD_LOAD Lexeme LEXEME_KEYWORD_NATIONAL Lexeme LEXEME_KEYWORD_NOCHECK Lexeme LEXEME_KEYWORD_NONCLUSTERED Lexeme LEXEME_KEYWORD_NULL Lexeme LEXEME_KEYWORD_OF Lexeme LEXEME_KEYWORD_OFF Lexeme LEXEME_KEYWORD_OFFSETS Lexeme LEXEME_KEYWORD_ON Lexeme LEXEME_KEYWORD_OPEN Lexeme LEXEME_KEYWORD_OPENDATASOURCE Lexeme LEXEME_KEYWORD_OPENQUERY Lexeme LEXEME_KEYWORD_OPENROWSET Lexeme LEXEME_KEYWORD_OPENXML Lexeme LEXEME_KEYWORD_OPTION Lexeme LEXEME_KEYWORD_ORDER Lexeme LEXEME_KEYWORD_OUTER Lexeme LEXEME_KEYWORD_OVER Lexeme LEXEME_KEYWORD_PERCENT Lexeme LEXEME_KEYWORD_PIVOT Lexeme LEXEME_KEYWORD_PLAN Lexeme LEXEME_KEYWORD_PRECISION Lexeme LEXEME_KEYWORD_PRIMARY Lexeme LEXEME_KEYWORD_PRINT Lexeme LEXEME_KEYWORD_PROC Lexeme LEXEME_KEYWORD_PROCEDURE Lexeme LEXEME_KEYWORD_RAISERROR Lexeme LEXEME_KEYWORD_READ Lexeme LEXEME_KEYWORD_READTEXT Lexeme LEXEME_KEYWORD_RECONFIGURE Lexeme LEXEME_KEYWORD_REFERENCES Lexeme LEXEME_KEYWORD_REPLICATION Lexeme LEXEME_KEYWORD_RESTORE Lexeme LEXEME_KEYWORD_RESTRICT Lexeme LEXEME_KEYWORD_RETURN Lexeme LEXEME_KEYWORD_REVERT Lexeme LEXEME_KEYWORD_REVOKE Lexeme LEXEME_KEYWORD_RIGHT Lexeme LEXEME_KEYWORD_ROLE 
Lexeme LEXEME_KEYWORD_ROLLBACK Lexeme LEXEME_KEYWORD_ROWCOUNT Lexeme LEXEME_KEYWORD_ROWGUIDCOL Lexeme LEXEME_KEYWORD_RULE Lexeme LEXEME_KEYWORD_SAVE Lexeme LEXEME_KEYWORD_SCHEMA Lexeme LEXEME_KEYWORD_SECURITYAUDIT Lexeme LEXEME_KEYWORD_SELECT Lexeme LEXEME_KEYWORD_SESSION_USER Lexeme LEXEME_KEYWORD_SET Lexeme LEXEME_KEYWORD_SETUSER Lexeme LEXEME_KEYWORD_SHOW Lexeme LEXEME_KEYWORD_SLEEP Lexeme LEXEME_KEYWORD_SHRINK Lexeme LEXEME_KEYWORD_SHUTDOWN Lexeme LEXEME_KEYWORD_STATISTICS Lexeme LEXEME_KEYWORD_SYSTEM_USER Lexeme LEXEME_KEYWORD_TABLE Lexeme LEXEME_KEYWORD_TABLESAMPLE Lexeme LEXEME_KEYWORD_TEXTSIZE Lexeme LEXEME_KEYWORD_THEN Lexeme LEXEME_KEYWORD_THROW Lexeme LEXEME_KEYWORD_TO Lexeme LEXEME_KEYWORD_TOP Lexeme LEXEME_KEYWORD_TRAN Lexeme LEXEME_KEYWORD_TRANSACTION Lexeme LEXEME_KEYWORD_TRIGGER Lexeme LEXEME_KEYWORD_TRUE Lexeme LEXEME_KEYWORD_TRUNCATE Lexeme LEXEME_KEYWORD_TSEQUAL Lexeme LEXEME_KEYWORD_UNION Lexeme LEXEME_KEYWORD_UNIQUE Lexeme LEXEME_KEYWORD_UNPIVOT Lexeme LEXEME_KEYWORD_UPDATE Lexeme LEXEME_KEYWORD_UPDATETEXT Lexeme LEXEME_KEYWORD_USE Lexeme LEXEME_KEYWORD_USER Lexeme LEXEME_KEYWORD_VALUES Lexeme LEXEME_KEYWORD_VARYING Lexeme LEXEME_KEYWORD_VIEW Lexeme LEXEME_KEYWORD_WAITFOR Lexeme LEXEME_KEYWORD_WHEN Lexeme LEXEME_KEYWORD_WHERE Lexeme LEXEME_KEYWORD_WHILE Lexeme LEXEME_KEYWORD_WITH Lexeme LEXEME_KEYWORD_WRITETEXT Lexeme LEXEME_OPERATOR_LOGICAL_AND Lexeme LEXEME_OPERATOR_LOGICAL_OR Lexeme LEXEME_OPERATOR_LOGICAL_NOT Lexeme LEXEME_OPERATOR_IS Lexeme LEXEME_OPERATOR_LIKE Lexeme LEXEME_OPERATOR_ESCAPE Lexeme LEXEME_OPERATOR_BETWEEN Lexeme LEXEME_OPERATOR_IN Lexeme LEXEME_OPERATOR_SOME Lexeme LEXEME_OPERATOR_ANY Lexeme LEXEME_OPERATOR_ALL Lexeme LEXEME_OPERATOR_PLUS Lexeme LEXEME_OPERATOR_MINUS Lexeme LEXEME_OPERATOR_MULT Lexeme LEXEME_OPERATOR_DIV Lexeme LEXEME_OPERATOR_MOD Lexeme LEXEME_OPERATOR_BIT_AND Lexeme LEXEME_OPERATOR_BIT_OR Lexeme LEXEME_OPERATOR_BIT_XOR Lexeme LEXEME_OPERATOR_BIT_NOT Lexeme LEXEME_OPERATOR_COMP_EQUAL Lexeme 
LEXEME_OPERATOR_COMP_GREATER Lexeme LEXEME_OPERATOR_COMP_LESS Lexeme LEXEME_OPERATOR_COMP_GREATER_EQUAL Lexeme LEXEME_OPERATOR_COMP_LESS_EQUAL Lexeme LEXEME_OPERATOR_COMP_NOT_EQUAL Lexeme LEXEME_OPERATOR_ASSIGN_PLUS Lexeme LEXEME_OPERATOR_ASSIGN_MINUS Lexeme LEXEME_OPERATOR_ASSIGN_MULT Lexeme LEXEME_OPERATOR_ASSIGN_DIV Lexeme LEXEME_OPERATOR_ASSIGN_MOD Lexeme LEXEME_OPERATOR_ASSIGN_BIT_AND Lexeme LEXEME_OPERATOR_ASSIGN_BIT_OR Lexeme LEXEME_OPERATOR_ASSIGN_BIT_XOR Lexeme LEXEME_LPAREN Lexeme LEXEME_RPAREN Lexeme LEXEME_LCURLY Lexeme LEXEME_RCURLY Lexeme LEXEME_COMMA Lexeme LEXEME_DOT Lexeme LEXEME_SEMICOLON Lexeme LEXEME_COLON Lexeme LEXEME_QUESTIONMARK Lexeme LEXEME_EXCLAMATIONMARK Lexeme LEXEME_START_OF_BATCH Lexeme LEXEME_END_OF_BATCH Lexeme LEXEME_DISPLAYWORD_EMPTY_STRING Lexeme LEXEME_DISPLAYWORD_IS_NULL Lexeme LEXEME_DISPLAYWORD_IS_NOT_NULL Lexeme LEXEME_DISPLAYWORD_IN_LIST Lexeme LEXEME_DISPLAYWORD_NOT_IN_LIST Lexeme LEXEME_DISPLAYWORD_NOT_LIKE Lexeme LEXEME_DISPLAYWORD_NOT_BETWEEN Lexeme LEXEME_DISPLAYWORD_SUBQUERY_ONE Lexeme LEXEME_DISPLAYWORD_SUBQUERY_MANY Lexeme LEXEME_DISPLAYWORD_TRUE Lexeme LEXEME_DISPLAYWORD_FALSE Lexeme LEXEME_DISPLAYWORD_FIRSTDAYOFWEEK Lexeme LEXEME_DISPLAYWORD_DMY Lexeme LEXEME_DISPLAYWORD_UTC_TO_LOCAL Lexeme LEXEME_DISPLAYWORD_COMPILE_VARCHAR_TO_REGEXPLIKE Lexeme LEXEME_DISPLAYWORD_FIT_NO_TRUNCATE_VARBINARY Lexeme LEXEME_DISPLAYWORD_FIT_NO_TRUNCATE_VARCHAR Lexeme LEXEME_SYSVAR_CURRENT_LANGUAGE Lexeme LEXEME_SYSVAR_SYSTEM_USER_NAME Lexeme LEXEME_SYSVAR_SYSTEM_USER_ID Lexeme LEXEME_SYSVAR_CURRENT_DB_NAME Lexeme LEXEME_SYSVAR_CURRENT_DB_ID Lexeme LEXEME_SYSVAR_CURRENT_SCHEMA_NAME Lexeme LEXEME_SYSVAR_CURRENT_SCHEMA_ID Lexeme LEXEME_SYSVAR_CURRENT_USER_NAME Lexeme LEXEME_SYSVAR_CURRENT_USER_ID Lexeme LEXEME_SYSVAR_CURRENT_TIMESTAMP Lexeme )
Constant Lexemes (for keywords, operators, punctuation, etc)
func Create_lexeme ¶
func Create_lexeme(lex_type Lex_type_t, lex_subtype Lex_subtype_t, word string) Lexeme
Create_lexeme returns a lexeme. Argument 'word' is the string in original case. For LEX_OPERATOR, LEX_VARIABLE, LEX_ATFUNC, LEX_IDENTPART, the string stored in lexeme will be lowercased. You cannot create a LEX_KEYWORD lexeme, because they already exist in G_KEYWORDS_OPERATORS map.
func Create_lexeme_original ¶
func Create_lexeme_original(lex_type Lex_type_t, lex_subtype Lex_subtype_t, word string) Lexeme
Create_lexeme_original is the same as Create_lexeme, but does not lowercase the word. It can only be used for LEX_IDENTPART; for any other type, it panics.
func (*Lexeme) Is_auxword ¶
func (*Lexeme) Is_identpart_max ¶
func (*Lexeme) Type_string ¶
type Lexer ¶
    type Lexer struct {
        Current_line_old Coord_t // last value of Current_line.
        Current_line     Coord_t // Will be read by parser. Line no (starting from 1) and position (starting from 1) of the current lexeme. Eat_next_lexeme() updates this value.

        Current_lexeme          Lexeme // Will be read by parser. Current lexeme. Eat_next_lexeme() updates this value. Contains LEXEME_END_OF_BATCH when EOS.
        Current_lexeme_original Lexeme // for LEX_IDENTPART, contains lexeme in original case. Else, LEXEME_INVALID.

        Info_line Coord_t // line no and pos used for error localization. This information is updated by Eat_next_lexeme() or Info_line_update().

        EOS   bool // end-of-batch flag. If true, Rune0, Rune1, Rune2 == 0.
        Rune0 rune // Next rune to parse. Rune0, Rune1 and Rune2 make up a sliding window.
        Rune1 rune // lookahead rune. 0 when EOS.
        Rune2 rune // lookahead rune. next_rune() will read a new rune at rune3_offset and put it here. 0 when EOS.
        // contains filtered or unexported fields
    }
Lexer takes a sql batch as input, and calling Eat_next_lexeme() returns the successive Lexemes in l.Current_lexeme.
For the user, the important fields are :
- Current_line: line no and position of current lexeme.
- Current_lexeme: lexeme that has just been read, in lowercase except for LEX_LITERAL_STRING, LEX_LITERAL_HEXASTRING, LEX_LITERAL_NUMBER.
- Rune0: next character to parse. A call to Eat_next_lexeme() will begin to parse starting at this position. Always valid, contains 0 if Current_lexeme==LEXEME_END_OF_BATCH.
- Rune1: lookahead character. Always valid, contains 0 if Current_lexeme==LEXEME_END_OF_BATCH.
- Rune2: lookahead character. Always valid, contains 0 if Current_lexeme==LEXEME_END_OF_BATCH.
Rune0, Rune1 and Rune2 act like a sliding window, which moves exactly one rune forward at each call to next_rune().
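A three-rune sliding window over UTF-8 text can be sketched as below. The `window` type and its `next` method are hypothetical and only loosely modelled on the fields described here; the real Lexer keeps more state, and its unexported `next_rune()` may differ.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// window holds the current rune and two lookahead runes.
type window struct {
	text       []byte
	offset     int // byte offset of the next rune to load into r2
	r0, r1, r2 rune
}

// newWindow primes the window by advancing three times,
// so that r0, r1 and r2 hold the first three runes of text.
func newWindow(text []byte) *window {
	w := &window{text: text}
	for i := 0; i < 3; i++ {
		w.next()
	}
	return w
}

// next shifts the window one rune forward and loads a new rune into r2,
// or 0 once the end of the text has been reached.
func (w *window) next() {
	w.r0, w.r1 = w.r1, w.r2
	if w.offset >= len(w.text) {
		w.r2 = 0
		return
	}
	r, size := utf8.DecodeRune(w.text[w.offset:])
	w.r2 = r
	w.offset += size
}

func main() {
	w := newWindow([]byte("a-b"))
	fmt.Printf("%q %q %q\n", w.r0, w.r1, w.r2) // prints: 'a' '-' 'b'
	w.next()
	fmt.Printf("%q %q %q\n", w.r0, w.r1, w.r2) // prints: '-' 'b' '\x00'
}
```

Two runes of lookahead are enough to recognize two-character openers such as `/*` or `--` without backing up.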
IF YOU ADD A NEW FIELD TO Lexer STRUCT, YOU MUST MODIFY Memorize_to() and Restore_from() ACCORDINGLY.
func (*Lexer) Attach_batch ¶
func (l *Lexer) Attach_batch(batch_text []byte) *rsql.Error
Attach_batch normalizes to NFC and attaches a batch to the lexer, and initializes the lexer state. The batch text is then ready to be parsed.
It returns an error if the batch begins with a star comment /* that has no ending mark */
func (*Lexer) Eat_next_lexeme ¶
func (l *Lexer) Eat_next_lexeme() *rsql.Error
Eat_next_lexeme reads the lexeme at the current position in the SQL batch. This function is the core function of the lexer.
It reads one or two characters at the l.rune0_offset position, and calls a specialized lexeme reading function accordingly, acting like a multiplexer.
This function stores a Lexeme at each call in l.Current_lexeme, in lowercase for identifiers, keywords, operators, variables, atfuncs. It can contain successively :
- LEXEME_INVALID, before a batch has been attached.
- LEXEME_START_OF_BATCH, if l.Eat_next_lexeme() has not been called yet.
- a lexeme for keyword, operator, identpart, literal, or punctuations.
- LEXEME_END_OF_BATCH
For error display, the field l.Current_line contains the line no and position of the lexeme. l.Current_lexeme contains the last lexeme correctly read.
For LEX_IDENTPART, l.Current_lexeme_original contains lexeme in original case. For other lex_types, it contains LEXEME_INVALID.
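The multiplexer idea, choosing a specialized reader from the first one or two characters, can be sketched as follows. The `classify` function and the category names it returns are hypothetical. The dispatch rules (`@@` for atfuncs, `@` for variables, `0x` for hexa strings) are assumptions in the style of T-SQL, not a transcription of the actual rsql code.

```go
package main

import (
	"fmt"
	"unicode"
)

// classify picks the kind of lexeme reader to call,
// looking only at the next rune r0 and one lookahead rune r1.
func classify(r0, r1 rune) string {
	switch {
	case r0 == 0:
		return "end_of_batch"
	case r0 == '@' && r1 == '@':
		return "atfunc"
	case r0 == '@':
		return "variable"
	case r0 == '\'':
		return "string"
	case r0 == '0' && (r1 == 'x' || r1 == 'X'):
		return "hexastring"
	case r0 >= '0' && r0 <= '9':
		return "number"
	case unicode.IsLetter(r0) || r0 == '_' || r0 == '#':
		return "identpart_or_keyword"
	default:
		return "operator_or_punctuation"
	}
}

func main() {
	fmt.Println(classify('@', '@'), classify('0', 'x'), classify('s', 'e'), classify('(', 'a'))
	// prints: atfunc hexastring identpart_or_keyword operator_or_punctuation
}
```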
func (*Lexer) Info_line_put_old ¶
func (l *Lexer) Info_line_put_old()
Info_line_put_old updates line information by setting it to the previous lexeme line/pos, which is displayed in case of error.
If you want to have the previous line/pos in error message, which is necessary in some cases to have a better line coordinate, you must call Info_line_put_old() just after calling Eat_next_lexeme(). You can then use Info_line_update() to manually refresh it, or wait until the next call to Eat_next_lexeme().
func (*Lexer) Info_line_update ¶
func (l *Lexer) Info_line_update()
Info_line_update updates line information, which is displayed in case of error. Info_line is updated each time a new lexeme is read. But if Info_line_put_old() has been called, Info_line is set to the previous lexeme line/pos, until Info_line_update() or Eat_next_lexeme() is used.
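The interplay between the three coordinates can be modelled with a small stand-in. This sketch assumes that Eat_next_lexeme() saves the current coordinate as the old one before moving on, which matches the field descriptions but is not confirmed by the source. The `lexer` type and its methods are hypothetical.

```go
package main

import "fmt"

// coord mirrors the No and Pos fields used in the sample program above.
type coord struct{ No, Pos int }

// lexer is a stand-in that models only the coordinate bookkeeping.
type lexer struct {
	currentLineOld coord // coordinate of the previous lexeme
	currentLine    coord // coordinate of the current lexeme
	infoLine       coord // coordinate reported in error messages
}

// eat models what reading a new lexeme does to the coordinates.
func (l *lexer) eat(next coord) {
	l.currentLineOld = l.currentLine
	l.currentLine = next
	l.infoLine = next
}

// infoLinePutOld points error reporting at the previous lexeme.
func (l *lexer) infoLinePutOld() { l.infoLine = l.currentLineOld }

// infoLineUpdate points error reporting back at the current lexeme.
func (l *lexer) infoLineUpdate() { l.infoLine = l.currentLine }

func main() {
	l := &lexer{}
	l.eat(coord{1, 1})
	l.eat(coord{2, 5})
	fmt.Println(l.infoLine) // prints: {2 5}
	l.infoLinePutOld()
	fmt.Println(l.infoLine) // prints: {1 1}
	l.infoLineUpdate()
	fmt.Println(l.infoLine) // prints: {2 5}
}
```

Reporting the previous coordinate is useful when an error is only detected after the lexer has already moved past the offending lexeme.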
func (*Lexer) Memorize_to ¶
func (l *Lexer) Memorize_to(lbak *Lexer_bak)
func (*Lexer) Print_Info_line ¶
func (l *Lexer) Print_Info_line()
func (*Lexer) Restore_from ¶
func (l *Lexer) Restore_from(lbak *Lexer_bak)
func (*Lexer) Set_option_quoted_identifier ¶
func (l *Lexer) Set_option_quoted_identifier(opt_quot bool)
Set_option_quoted_identifier sets the corresponding lexer option. You should call this function before your first call to l.Eat_next_lexeme(). You can call it before or after l.Attach_batch().
type Lexer_bak ¶
    type Lexer_bak struct {
        Current_line_old        Coord_t
        Current_line            Coord_t
        Current_lexeme          Lexeme
        Current_lexeme_original Lexeme
        Info_line               Coord_t
        EOS                     bool
        Rune0                   rune
        Rune1                   rune
        Rune2                   rune
        // contains filtered or unexported fields
    }
Lexer_bak is used to save the state of the Lexer (current lexing position in batch). Used by l.Memorize_to() and l.Restore_from().
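The save-and-restore pattern is typically used for speculative lookahead: remember the position, read ahead, then rewind if the guess was wrong. The sketch below shows the pattern on a toy word scanner. The `scanner` and `scannerBak` types are hypothetical and far simpler than Lexer and Lexer_bak.

```go
package main

import "fmt"

// scanner is a toy lexer that returns pre-split words one at a time.
type scanner struct {
	words []string
	pos   int
}

// scannerBak holds everything needed to rewind a scanner.
type scannerBak struct {
	pos int
}

// memorizeTo copies the scanner state into bak.
func (s *scanner) memorizeTo(bak *scannerBak) { bak.pos = s.pos }

// restoreFrom rewinds the scanner to the state saved in bak.
func (s *scanner) restoreFrom(bak *scannerBak) { s.pos = bak.pos }

// next returns the next word, or "" at the end of input.
func (s *scanner) next() string {
	if s.pos >= len(s.words) {
		return ""
	}
	w := s.words[s.pos]
	s.pos++
	return w
}

func main() {
	s := &scanner{words: []string{"select", "a", "from", "t"}}
	var bak scannerBak

	s.next()           // consume "select"
	s.memorizeTo(&bak) // remember the position just before "a"
	s.next()           // speculatively read "a"
	s.next()           // speculatively read "from"
	s.restoreFrom(&bak)
	fmt.Println(s.next()) // prints: a
}
```

Every field that affects lexing has to be copied in both directions, which is why the Lexer documentation warns that new fields must be added to Memorize_to() and Restore_from().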