sqlp

v0.3.0
Published: Mar 27, 2023 License: Unlicense Imports: 5 Imported by: 1

README

Overview

SQL Parse: parser and formatter for rewriting foreign code embedded in SQL queries, such as parameter placeholders: $1 or :ident, or code encased in delimiters: () [] {}. Anything the parser doesn't recognize is preserved as text.

API docs: https://pkg.go.dev/github.com/mitranim/sqlp.

Changelog

v0.3.0

Renamed method .Append in various types to .AppendTo for consistency with other libraries.

v0.2.0

Various optimizations.

  • Added Type, Region, Token for use by the tokenizer; see below.

  • Tokenization is now allocation-free and around 2x faster in benchmarks. Instead of generating Node instances, the tokenizer generates stack-allocated Token instances.

v0.1.4

Added NodeWhitespace. This is emitted for any non-zero amount of whitespace. NodeText now contains only non-whitespace. The performance impact seems negligible.

v0.1.3

Support incremental parsing via Tokenizer. Added a few utility functions related to tree traversal. Minor breaking renaming.

v0.1.2

Added missing (*Error).Unwrap.

v0.1.1

Replaced []rune with string. When parsing, we treat the input string as UTF-8, decoding on the fly.

v0.1.0

First tagged release.

License

https://unlicense.org

Misc

I'm receptive to suggestions. If this library almost satisfies you but needs changes, open an issue or chat me up. Contacts: https://mitranim.com/#contacts

Documentation

Overview

Parser and formatter for rewriting foreign code embedded in SQL queries, such as parameter placeholders: `$1` or `:ident`, or code encased in delimiters: `()` `[]` `{}`. It supports the following SQL features:

• '' : single quotes.

• "" : double quotes.

• `` : grave quotes (non-standard).

• -- : line comments.

• /* : block comments.

• :: : Postgres-style cast operator (non-standard).

In addition, it supports the following:

• () : content in parens.

• [] : content in brackets.

• {} : content in braces.

• $1 $2 ... : ordinal parameter placeholders.

• :identifier : named parameter placeholders.

Supporting SQL quotes and comments allows us to correctly ignore text inside special delimiters that happens to be part of a string, quoted identifier, or comment.

Tokenization vs Parsing

This library supports incremental parsing token by token, via `Tokenizer`. It also lets you convert a sequence of tokens into a fully-built AST via `Parser`. Choose the approach that better suits your use case.
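
The difference between a flat token stream and a tree can be sketched without the package. In the example below, `node`, `build`, and `render` are hypothetical names, tokens are plain strings, and only parentheses create nesting. It shows one way to fold a token sequence into nested groups, not how `Parser` is actually implemented.

```go
package main

import (
	"fmt"
	"strings"
)

// node is either a leaf (Text set) or a parenthesized group (Group true).
type node struct {
	Text  string
	Group bool
	Kids  []node
}

// build consumes tokens starting at index i. It returns the nodes parsed up
// to the matching ")" or the end of input, plus the index of the next unread
// token. depth tracks how many groups are currently open.
func build(toks []string, i, depth int) ([]node, int, error) {
	var out []node
	for i < len(toks) {
		switch toks[i] {
		case "(":
			kids, next, err := build(toks, i+1, depth+1)
			if err != nil {
				return nil, 0, err
			}
			out = append(out, node{Group: true, Kids: kids})
			i = next
		case ")":
			if depth == 0 {
				return nil, 0, fmt.Errorf("unexpected ) at token %d", i)
			}
			return out, i + 1, nil
		default:
			out = append(out, node{Text: toks[i]})
			i++
		}
	}
	if depth > 0 {
		return nil, 0, fmt.Errorf("missing closing )")
	}
	return out, i, nil
}

// render turns the tree back into text, restoring the delimiters.
func render(nodes []node) string {
	var b strings.Builder
	for _, n := range nodes {
		if n.Group {
			b.WriteString("(" + render(n.Kids) + ")")
		} else {
			b.WriteString(n.Text)
		}
	}
	return b.String()
}

func main() {
	toks := []string{"a", "(", "b", "(", "c", ")", ")", "d"}
	nodes, _, err := build(toks, 0, 0)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(nodes), render(nodes))
}
```

Working on the flat stream avoids building the tree at all, which is usually enough for simple rewrites; the tree form is more convenient when edits depend on nesting.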

Usage

Oversimplified example:

nodes, err := Parse(`select * from some_table where :ident::uuid = id`)
if err != nil {
	panic(err)
}

// Shallow walk over the top-level nodes; pointers allow editing in place.
nodes.WalkNodePtr(func(ptr *Node) {
	switch node := (*ptr).(type) {
	case NodeNamedParam:
		*ptr = node + `_renamed`
	}
})

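The AST now looks like this (simplified: whitespace is shown merged into the surrounding text, although since v0.1.4 the parser emits it as separate `NodeWhitespace` nodes):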

nodes := Nodes{
	NodeText(`select * from some_table where `),
	NodeNamedParam(`ident_renamed`),
	NodeDoubleColon{},
	NodeText(`uuid = id`),
}

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func DeepWalkNode added in v0.2.0

func DeepWalkNode(val Node, fun func(Node))

Similar to `WalkNode`, but performs a deep walk, invoking the function only for "leaf nodes" that don't implement `Walker`.

func WalkNode added in v0.2.0

func WalkNode(val Node, fun func(Node))

Walks the node, invoking the given function for each non-nil node that doesn't implement `Walker`. Nodes that implement `Walker` receive the function as input, with implementation-specific behavior. All `Walker` implementations in this package perform a shallow walk, invoking a given function once for each immediate child.

func WalkNodePtr added in v0.2.0

func WalkNodePtr(val *Node, fun func(*Node))

Similar to `WalkNode`, but invokes the function for node pointers rather than node values. Allows AST editing.
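
Why pointers are needed for editing can be shown with a small standalone sketch. The types below (`Node`, `Text`, `Named`, `Ordinal`, `Nodes`) and the helpers `walkPtr` and `renumber` are simplified stand-ins invented for this example, not the package's own definitions. The callback receives the address of each slice element, so assigning through it replaces the node in the sequence.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Node is a reduced stand-in for the package's node interface.
type Node interface{ String() string }

type Text string

func (t Text) String() string { return string(t) }

// Named is a stand-in for a `:ident` placeholder.
type Named string

func (n Named) String() string { return ":" + string(n) }

// Ordinal is a stand-in for a `$N` placeholder.
type Ordinal int

func (o Ordinal) String() string { return "$" + strconv.Itoa(int(o)) }

type Nodes []Node

func (ns Nodes) String() string {
	var b strings.Builder
	for _, n := range ns {
		if n != nil {
			b.WriteString(n.String())
		}
	}
	return b.String()
}

// walkPtr calls fun with a pointer to each non-nil element of ns.
func walkPtr(ns Nodes, fun func(*Node)) {
	for i := range ns {
		if ns[i] != nil {
			fun(&ns[i])
		}
	}
}

// renumber replaces every named placeholder with the next ordinal one.
func renumber(ns Nodes) {
	count := 0
	walkPtr(ns, func(ptr *Node) {
		if _, ok := (*ptr).(Named); ok {
			count++
			*ptr = Ordinal(count)
		}
	})
}

func main() {
	ns := Nodes{Text("a = "), Named("x"), Text(" and b = "), Named("y")}
	renumber(ns)
	fmt.Println(ns.String())
}
```

A value-based walk would hand the callback a copy of each node, so the same assignment would have no effect on the sequence.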

Types

type BraceNodes added in v0.2.0

type BraceNodes Nodes

Nodes enclosed in braces: {}.

func (BraceNodes) AppendTo added in v0.3.0

func (self BraceNodes) AppendTo(buf []byte) []byte

Implement `Node`.

func (BraceNodes) CopyNode added in v0.2.0

func (self BraceNodes) CopyNode() Node

Implement `Copier` by calling `Nodes.CopyNodes`.

func (BraceNodes) Nodes added in v0.2.0

func (self BraceNodes) Nodes() Nodes

Implement `Coll`. Free cast with no allocation.

func (BraceNodes) String added in v0.2.0

func (self BraceNodes) String() string

Implement `Node`. Also implements `fmt.Stringer` for debug purposes.

func (BraceNodes) WalkNode added in v0.2.0

func (self BraceNodes) WalkNode(fun func(Node))

Implement `Walker` by calling `Nodes.WalkNode`.

func (BraceNodes) WalkNodePtr added in v0.2.0

func (self BraceNodes) WalkNodePtr(fun func(*Node))

Implement `PtrWalker` by calling `Nodes.WalkNodePtr`.

type BracketNodes added in v0.2.0

type BracketNodes Nodes

Nodes enclosed in brackets: [].

func (BracketNodes) AppendTo added in v0.3.0

func (self BracketNodes) AppendTo(buf []byte) []byte

Implement `Node`.

func (BracketNodes) CopyNode added in v0.2.0

func (self BracketNodes) CopyNode() Node

Implement `Copier` by calling `Nodes.CopyNodes`.

func (BracketNodes) Nodes added in v0.2.0

func (self BracketNodes) Nodes() Nodes

Implement `Coll`. Free cast with no allocation.

func (BracketNodes) String added in v0.2.0

func (self BracketNodes) String() string

Implement `Node`. Also implements `fmt.Stringer` for debug purposes.

func (BracketNodes) WalkNode added in v0.2.0

func (self BracketNodes) WalkNode(fun func(Node))

Implement `Walker` by calling `Nodes.WalkNode`.

func (BracketNodes) WalkNodePtr added in v0.2.0

func (self BracketNodes) WalkNodePtr(fun func(*Node))

Implement `PtrWalker` by calling `Nodes.WalkNodePtr`.

type Coll added in v0.2.0

type Coll interface{ Nodes() Nodes }

Implemented by collection types such as `Nodes` and `ParenNodes`.

type Copier added in v0.2.0

type Copier interface{ CopyNode() Node }

Implemented by collection types such as `Nodes` and `ParenNodes`. Used by the global `CopyNode` function.

type Node

type Node interface {
	// Implement `fmt.Stringer`. Must return the SQL representation of the node,
	// matching the source it was parsed from.
	String() string

	// Must append the same text representation as `.String`. Allows more
	// efficient text encoding for large AST.
	AppendTo([]byte) []byte
}

AST node. May be a primitive token or a structure. `Tokenizer` emits only primitive tokens.
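
One way to keep `.String` and `.AppendTo` in agreement for a custom node type is to define the former through the latter. The sketch below is standalone: `Node` repeats the two-method interface locally, and `Quoted` is a hypothetical node type invented for this example.

```go
package main

import "fmt"

// Node repeats the two-method interface locally so the sketch compiles alone.
type Node interface {
	String() string
	AppendTo([]byte) []byte
}

// Quoted is a hypothetical node holding the text between single quotes.
type Quoted string

// AppendTo writes the text form directly into buf, without building an
// intermediate string.
func (q Quoted) AppendTo(buf []byte) []byte {
	buf = append(buf, '\'')
	buf = append(buf, string(q)...)
	return append(buf, '\'')
}

// String is defined through AppendTo, so the two outputs can't drift apart.
func (q Quoted) String() string { return string(q.AppendTo(nil)) }

func main() {
	var node Node = Quoted("hello")

	buf := []byte("select ")
	buf = node.AppendTo(buf)

	fmt.Println(string(buf))
	fmt.Println(node.String())
}
```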

func CopyNode added in v0.2.0

func CopyNode(node Node) Node

Makes a copy that should be safe to modify without affecting the original.

type NodeCommentBlock

type NodeCommentBlock string

Content of a block comment: /* */.

func (NodeCommentBlock) AppendTo added in v0.3.0

func (self NodeCommentBlock) AppendTo(buf []byte) []byte

func (NodeCommentBlock) String

func (self NodeCommentBlock) String() string

type NodeCommentLine

type NodeCommentLine string

Content of a line comment: --, including the newline.

func (NodeCommentLine) AppendTo added in v0.3.0

func (self NodeCommentLine) AppendTo(buf []byte) []byte

func (NodeCommentLine) String

func (self NodeCommentLine) String() string

type NodeDoubleColon

type NodeDoubleColon struct{}

Postgres cast operator: ::. Recognizing it lets the parser disambiguate casts from named params.

func (NodeDoubleColon) AppendTo added in v0.3.0

func (self NodeDoubleColon) AppendTo(buf []byte) []byte

func (NodeDoubleColon) String

func (self NodeDoubleColon) String() string

type NodeNamedParam

type NodeNamedParam string

Named parameter preceded by colon: :identifier

func (NodeNamedParam) AppendTo added in v0.3.0

func (self NodeNamedParam) AppendTo(buf []byte) []byte

func (NodeNamedParam) String

func (self NodeNamedParam) String() string

type NodeOrdinalParam

type NodeOrdinalParam int

Postgres-style ordinal parameter placeholder: $1, $2, $3, ...

func (NodeOrdinalParam) AppendTo added in v0.3.0

func (self NodeOrdinalParam) AppendTo(buf []byte) []byte

func (NodeOrdinalParam) Index added in v0.1.4

func (self NodeOrdinalParam) Index() int

Convenience method that returns the corresponding Go index (starts at zero).
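
The conversion matters when looking up arguments in a Go slice, since `$1` refers to the first argument. A minimal sketch, using a stand-in type `OrdinalParam` and a hypothetical helper `argFor`; the subtraction in `Index` is an assumption about what the real method does:

```go
package main

import "fmt"

// OrdinalParam is a stand-in for the package's `NodeOrdinalParam`.
type OrdinalParam int

// Index converts the 1-based placeholder number to a 0-based slice index.
// Assumption: the real method performs the same subtraction.
func (p OrdinalParam) Index() int { return int(p) - 1 }

// argFor returns the argument bound to the placeholder. The boolean is false
// when the placeholder points outside the argument list.
func argFor(p OrdinalParam, args []string) (string, bool) {
	i := p.Index()
	if i < 0 || i >= len(args) {
		return "", false
	}
	return args[i], true
}

func main() {
	args := []string{"alice", "bob"}

	val, ok := argFor(OrdinalParam(2), args)
	fmt.Println(val, ok)

	_, ok = argFor(OrdinalParam(3), args)
	fmt.Println(ok)
}
```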

func (NodeOrdinalParam) String

func (self NodeOrdinalParam) String() string

type NodeQuoteDouble

type NodeQuoteDouble string

Text inside double quotes: "". Escape sequences are not supported yet.

func (NodeQuoteDouble) AppendTo added in v0.3.0

func (self NodeQuoteDouble) AppendTo(buf []byte) []byte

func (NodeQuoteDouble) String

func (self NodeQuoteDouble) String() string

type NodeQuoteGrave

type NodeQuoteGrave string

Text inside grave quotes: ``. Escape sequences are not supported yet.

func (NodeQuoteGrave) AppendTo added in v0.3.0

func (self NodeQuoteGrave) AppendTo(buf []byte) []byte

func (NodeQuoteGrave) String

func (self NodeQuoteGrave) String() string

type NodeQuoteSingle

type NodeQuoteSingle string

Text inside single quotes: ''. Escape sequences are not supported yet.

func (NodeQuoteSingle) AppendTo added in v0.3.0

func (self NodeQuoteSingle) AppendTo(buf []byte) []byte

func (NodeQuoteSingle) String

func (self NodeQuoteSingle) String() string

type NodeText

type NodeText string

Arbitrary non-whitespace text that wasn't recognized by the parser. When generated by the parser, the node is always non-empty and consists entirely of non-whitespace characters.

func (NodeText) AppendTo added in v0.3.0

func (self NodeText) AppendTo(buf []byte) []byte

func (NodeText) String

func (self NodeText) String() string

type NodeWhitespace added in v0.1.4

type NodeWhitespace string

Whitespace. When generated by the parser, the node is always non-empty and consists entirely of whitespace characters.

func (NodeWhitespace) AppendTo added in v0.3.0

func (self NodeWhitespace) AppendTo(buf []byte) []byte

func (NodeWhitespace) Node added in v0.2.0

func (self NodeWhitespace) Node() Node

func (NodeWhitespace) String added in v0.1.4

func (self NodeWhitespace) String() string

type Nodes

type Nodes []Node

Arbitrary sequence of AST nodes. When serializing, doesn't print any start or end delimiters.

func Parse

func Parse(src string) (Nodes, error)

Parses SQL text and returns the resulting AST. For the AST structure, see `Node` and the various node types. Also see `Tokenizer` and `Tokenizer.Token` for incremental parsing.

Example:

nodes, err := Parse(`select * from some_table where id = :ident`)
if err != nil {
	panic(err)
}

// Replace every top-level named parameter with the ordinal parameter `$1`.
nodes.WalkNodePtr(func(ptr *Node) {
	switch (*ptr).(type) {
	case NodeNamedParam:
		*ptr = NodeOrdinalParam(1)
	}
})

func (Nodes) AppendTo added in v0.3.0

func (self Nodes) AppendTo(buf []byte) []byte

Implement the `Node` interface. Simply concatenates the stringified representations of the inner nodes, skipping any nil nodes.

`Nodes` can be arbitrarily nested without affecting the output. For example, both `Nodes{}` and `Nodes{Nodes{}}` will print "".
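
The behavior described above can be reproduced with a few lines of standalone Go. The types below are local stand-ins reduced to `AppendTo`, not the package's definitions; they show why nesting and nil entries leave the output unchanged.

```go
package main

import "fmt"

// Node is a stand-in reduced to the one method this sketch needs.
type Node interface{ AppendTo([]byte) []byte }

type Text string

func (t Text) AppendTo(buf []byte) []byte { return append(buf, string(t)...) }

// Nodes is a stand-in sequence that prints no delimiters of its own.
type Nodes []Node

// AppendTo appends each non-nil child in order. A nested Nodes value simply
// recurses, so the nesting depth never shows up in the output.
func (ns Nodes) AppendTo(buf []byte) []byte {
	for _, n := range ns {
		if n != nil {
			buf = n.AppendTo(buf)
		}
	}
	return buf
}

func main() {
	flat := Nodes{Text("select "), Text("1")}
	nested := Nodes{Nodes{Text("select ")}, nil, Nodes{Nodes{Text("1")}}}

	fmt.Println(string(flat.AppendTo(nil)) == string(nested.AppendTo(nil)))
	fmt.Printf("%q\n", Nodes{Nodes{}}.AppendTo(nil))
}
```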

func (Nodes) CopyNode added in v0.2.0

func (self Nodes) CopyNode() Node

Implements `Copier` by calling `Nodes.CopyNodes`.

func (Nodes) CopyNodes added in v0.2.0

func (self Nodes) CopyNodes() Nodes

Makes a deep copy whose mutations won't affect the original.

func (Nodes) Nodes added in v0.2.0

func (self Nodes) Nodes() Nodes

func (Nodes) Procure added in v0.2.0

func (self Nodes) Procure(fun func(Node) Node) Node

func (Nodes) ProcureLast added in v0.2.0

func (self Nodes) ProcureLast(fun func(Node) Node) Node

func (Nodes) String

func (self Nodes) String() string

func (Nodes) WalkNode added in v0.2.0

func (self Nodes) WalkNode(fun func(Node))

Implement `Walker`. Calls `fun` for each non-nil node in the sequence.

func (Nodes) WalkNodePtr added in v0.2.0

func (self Nodes) WalkNodePtr(fun func(*Node))

Implement `PtrWalker`. Calls `fun` for each non-nil node in the sequence.

type ParenNodes added in v0.2.0

type ParenNodes Nodes

Nodes enclosed in parentheses: ().

func (ParenNodes) AppendTo added in v0.3.0

func (self ParenNodes) AppendTo(buf []byte) []byte

Implement `Node`.

func (ParenNodes) CopyNode added in v0.2.0

func (self ParenNodes) CopyNode() Node

Implement `Copier` by calling `Nodes.CopyNodes`.

func (ParenNodes) Nodes added in v0.2.0

func (self ParenNodes) Nodes() Nodes

Implement `Coll`. Free cast with no allocation.

func (ParenNodes) String added in v0.2.0

func (self ParenNodes) String() string

Implement `Node`. Also implements `fmt.Stringer` for debug purposes.

func (ParenNodes) WalkNode added in v0.2.0

func (self ParenNodes) WalkNode(fun func(Node))

Implement `Walker` by calling `Nodes.WalkNode`.

func (ParenNodes) WalkNodePtr added in v0.2.0

func (self ParenNodes) WalkNodePtr(fun func(*Node))

Implement `PtrWalker` by calling `Nodes.WalkNodePtr`.

type Parser added in v0.2.0

type Parser struct{ Tokenizer }

See `Parse`.

func (*Parser) Parse added in v0.2.0

func (self *Parser) Parse() (nodes Nodes, err error)

See `Parse`.

type PtrWalker added in v0.2.0

type PtrWalker interface{ WalkNodePtr(func(*Node)) }

Implemented by collection types such as `Nodes` and `ParenNodes`. Used by the global function `WalkNodePtr`.

type Region added in v0.2.0

type Region [2]int

Represents a region in source text. Part of `Token`. The regions generated by this package are either all-zero, or have non-negative indexes corresponding to valid positions in source text.

func (Region) HasLen added in v0.2.0

func (self Region) HasLen() bool

Same as having a positive length.

func (Region) IsEmpty added in v0.2.0

func (self Region) IsEmpty() bool

Same as having no length.

func (Region) Len added in v0.2.0

func (self Region) Len() int

Difference between end and start.

func (Region) Slice added in v0.2.0

func (self Region) Slice(val string) string

Returns a substring corresponding to the given region. Permissive: if the string is too short on either side, this will adjust the positions instead of panicking.
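
A permissive slice of this kind can be sketched by clamping both offsets before indexing. The code below is a standalone approximation: `Region` is redefined locally, `clamp` is a hypothetical helper, and treating the two elements as start and end byte offsets is an assumption based on the descriptions above.

```go
package main

import "fmt"

// Region is a local stand-in: element 0 is the start offset, element 1 the
// end offset (exclusive). Byte offsets are an assumption of this sketch.
type Region [2]int

// Len is the difference between end and start.
func (r Region) Len() int { return r[1] - r[0] }

// HasLen reports whether the region has a positive length.
func (r Region) HasLen() bool { return r.Len() > 0 }

// IsEmpty is the opposite of HasLen.
func (r Region) IsEmpty() bool { return !r.HasLen() }

// clamp limits val to the range [lo, hi].
func clamp(val, lo, hi int) int {
	if val < lo {
		return lo
	}
	if val > hi {
		return hi
	}
	return val
}

// Slice clamps both offsets into the bounds of src before slicing, so an
// out-of-range region yields a shorter (possibly empty) substring instead of
// a panic.
func (r Region) Slice(src string) string {
	start := clamp(r[0], 0, len(src))
	end := clamp(r[1], start, len(src))
	return src[start:end]
}

func main() {
	src := "select $1"
	fmt.Printf("%q\n", Region{7, 9}.Slice(src))   // exact region
	fmt.Printf("%q\n", Region{7, 100}.Slice(src)) // end past the source
	fmt.Printf("%q\n", Region{-5, 6}.Slice(src))  // start before the source
}
```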

type Token added in v0.2.0

type Token struct {
	Region
	Type
}

Region of source text generated by `Tokenizer`.

func (Token) Node added in v0.2.0

func (self Token) Node(src string) Node

Takes full source text and attempts to parse an atomic node corresponding to the region and type of the current token. The output is always non-nil, but if the source text doesn't match the token, or if the token's type can't be converted to a single atomic node, this will panic. This is used internally by `Parser`.

func (Token) NodeCommentBlock added in v0.2.0

func (self Token) NodeCommentBlock(src string) NodeCommentBlock

Used by `Token.Node`.

func (Token) NodeCommentLine added in v0.2.0

func (self Token) NodeCommentLine(src string) NodeCommentLine

Used by `Token.Node`.

func (Token) NodeDoubleColon added in v0.2.0

func (self Token) NodeDoubleColon(src string) NodeDoubleColon

Used by `Token.Node`.

func (Token) NodeNamedParam added in v0.2.0

func (self Token) NodeNamedParam(src string) NodeNamedParam

Used by `Token.Node`.

func (Token) NodeOrdinalParam added in v0.2.0

func (self Token) NodeOrdinalParam(src string) NodeOrdinalParam

Used by `Token.Node`.

func (Token) NodeQuoteDouble added in v0.2.0

func (self Token) NodeQuoteDouble(src string) NodeQuoteDouble

Used by `Token.Node`.

func (Token) NodeQuoteGrave added in v0.2.0

func (self Token) NodeQuoteGrave(src string) NodeQuoteGrave

Used by `Token.Node`.

func (Token) NodeQuoteSingle added in v0.2.0

func (self Token) NodeQuoteSingle(src string) NodeQuoteSingle

Used by `Token.Node`.

func (Token) NodeText added in v0.2.0

func (self Token) NodeText(src string) NodeText

Used by `Token.Node`.

func (Token) NodeWhitespace added in v0.2.0

func (self Token) NodeWhitespace(src string) NodeWhitespace

Used by `Token.Node`.

type Tokenizer added in v0.1.4

type Tokenizer struct {
	Source string
	// contains filtered or unexported fields
}

Incremental parser. Example usage:

tokenizer := Tokenizer{Source: `select * from some_table where some_col = $1`}

for {
	// `Token` returns the zero `Token{}` at EOF, which reports as invalid.
	tok := tokenizer.Token()
	if tok.IsInvalid() {
		break
	}
	fmt.Printf("%#v\n", tok)
}

Tokenization is allocation-free, but parsing is always slow, and should be amortized by caching whenever possible.

func (*Tokenizer) Token added in v0.2.0

func (self *Tokenizer) Token() Token

Returns the next token. Upon reaching EOF, returns `Token{}`. Use `Token.IsInvalid` to detect end of iteration.

type Type added in v0.2.0

type Type byte

Type of a `Token` generated by `Tokenizer`.

const (
	TypeInvalid Type = iota
	TypeText
	TypeWhitespace
	TypeQuoteSingle
	TypeQuoteDouble
	TypeQuoteGrave
	TypeCommentLine
	TypeCommentBlock
	TypeDoubleColon
	TypeOrdinalParam
	TypeNamedParam
	TypeParenOpen
	TypeParenClose
	TypeBracketOpen
	TypeBracketClose
	TypeBraceOpen
	TypeBraceClose
)

func (Type) IsInvalid added in v0.2.0

func (self Type) IsInvalid() bool

True if zero. Used to detect end of tokenization.
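
Because `TypeInvalid` is the zero value, a zero `Token{}` doubles as an end-of-input marker and no separate EOF flag or error is needed. The standalone sketch below illustrates the pattern with a toy tokenizer. Everything in it is a stand-in: `scanner`, `next`, and `countTokens` are hypothetical, the `Token` here carries plain `Start` and `End` fields rather than a `Region`, and it only distinguishes spaces from other text.

```go
package main

import "fmt"

type Type byte

const (
	TypeInvalid Type = iota // zero value: marks "no token"
	TypeText
	TypeWhitespace
)

// IsInvalid is true for the zero value only.
func (t Type) IsInvalid() bool { return t == TypeInvalid }

// Token embeds Type, so tok.IsInvalid() is available on the token itself.
type Token struct {
	Start, End int
	Type
}

// scanner is a toy tokenizer that splits runs of spaces from runs of
// anything else.
type scanner struct {
	src string
	pos int
}

// next returns the following token, or the zero Token at end of input.
func (s *scanner) next() Token {
	if s.pos >= len(s.src) {
		return Token{}
	}
	start := s.pos
	isSpace := s.src[start] == ' '
	for s.pos < len(s.src) && (s.src[s.pos] == ' ') == isSpace {
		s.pos++
	}
	typ := TypeText
	if isSpace {
		typ = TypeWhitespace
	}
	return Token{Start: start, End: s.pos, Type: typ}
}

// countTokens drains the scanner, stopping at the first invalid token.
func countTokens(src string) int {
	sc := scanner{src: src}
	count := 0
	for {
		tok := sc.next()
		if tok.IsInvalid() {
			return count
		}
		count++
	}
}

func main() {
	fmt.Println(countTokens("select  1"))
}
```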

type Walker added in v0.2.0

type Walker interface{ WalkNode(func(Node)) }

Implemented by collection types such as `Nodes` and `ParenNodes`. Used by the global function `WalkNode`.
