expr

package module
v0.0.0-...-6f549d6 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: May 21, 2024 License: Apache-2.0 Imports: 23 Imported by: 2

README

Aggregate expression engines

This repo contains Inngest's aggregate expression engine service, turning O(n^2) expression matching into O(n).

It does this by:

  1. Parsing each expression whilst lifting literals out of expressions
  2. Breaking expressions down into subgroups (matching && comparators)
  3. Storing each group's comparator in a matching engine for fast lookups

When an event is received, instead of matching the event against every expression, we instead inspect each matching engine to filter out invalid expressions. This leaves us with a subset of expressions that are almost always matching for the event, simplifying the number of expressions to match against.

Copyright Inngest 2024.

Documentation

Index

Constants

View Source
const (
	EngineTypeNone = iota

	EngineTypeStringHash
	EngineTypeNullMatch
	EngineTypeBTree // TODO

)
View Source
const (
	// VarPrefix is the lifted variable name used when extracting idents from an
	// expression.
	VarPrefix = "vars"
)

Variables

View Source
var (
	ErrEvaluableNotFound      = fmt.Errorf("Evaluable instance not found in aggregator")
	ErrInvalidType            = fmt.Errorf("invalid type for tree")
	ErrExpressionPartNotFound = fmt.Errorf("expression part not found")
)
View Source
var (
	CacheTime = time.Hour
)

Functions

This section is empty.

Types

type AggregateEvaluator

type AggregateEvaluator interface {
	// Add adds an expression to the tree evaluator.  This returns true
	// if the expression is aggregateable, or false if the expression will be
	// evaluated each time an event is received.
	Add(ctx context.Context, eval Evaluable) (bool, error)

	// Remove removes an expression from the aggregate evaluator
	Remove(ctx context.Context, eval Evaluable) error

	// Evaluate checks input data against all exrpesssions in the aggregate in an optimal
	// manner, only evaluating expressions when necessary (based off of tree matching).
	//
	// Note that any expressions added that cannot be evaluated optimally by trees
	// are evaluated every time this function is called.
	//
	// Evaluate returns all matching Evaluables, plus the total number of evaluations
	// executed.
	Evaluate(ctx context.Context, data map[string]any) ([]Evaluable, int32, error)

	// AggregateMatch returns all expression parts which are evaluable given the input data.
	AggregateMatch(ctx context.Context, data map[string]any) ([]*StoredExpressionPart, error)

	// Len returns the total number of aggregateable and constantly matched expressions
	// stored in the evaluator.
	Len() int

	// AggregateableLen returns the number of expressions being matched by aggregated trees.
	AggregateableLen() int

	// ConstantLen returns the total number of expressions that must constantly
	// be matched due to non-aggregateable clauses in their expressions.
	ConstantLen() int
}

AggregateEvaluator represents a group of expressions that must be evaluated for a single event received.

An AggregateEvaluator instance exists for every event name being matched.

func NewAggregateEvaluator

func NewAggregateEvaluator(
	parser TreeParser,
	eval ExpressionEvaluator,
	evalLoader EvaluableLoader,
) AggregateEvaluator

type CELCompiler

type CELCompiler interface {
	// Compile calls Compile on the expression, parsing and validating the AST.
	// This returns the AST, issues during validation, and args lifted.
	Compile(expr string) (*cel.Ast, *cel.Issues, LiftedArgs)
	// Parse calls Parse on an expression, but does not check the expression
	// for valid variable names etc. within the env.
	Parse(expr string) (*cel.Ast, *cel.Issues, LiftedArgs)
}

CELCompiler represents a CEL compiler which takes an expression string and returns a CEL AST, any issues during parsing, and any lifted and replaced from the expression.

By default, *cel.Env fulfils this interface. In production, it's common to provide a caching layer on top of *cel.Env to optimize parsing, as it's the slowest part of the expression process.

func EnvCompiler

func EnvCompiler(env *cel.Env) CELCompiler

EnvCompiler turns a *cel.Env into a CELParser.

func NewCachingCompiler

func NewCachingCompiler(env *cel.Env, cache *ccache.Cache) CELCompiler

NewCachingCompiler returns a CELCompiler which lifts quoted literals out of the expression as variables and uses caching to cache expression parsing, resulting in improved performance when parsing expressions.

type EngineType

type EngineType int

type Evaluable

type Evaluable interface {
	// GetID returns a unique identifier for the evaluable item.  If there are
	// two instances of the same expression, the identifier should return a unique
	// string for each instance of the expression (eg. for two pauses).
	//
	// It has the Get prefix to reduce collisions with implementations who expose an
	// ID member.
	GetID() uuid.UUID

	// GetExpression returns an expression as a raw string.
	//
	// It has the Get prefix to reduce collisions with implementations who expose an
	// Expression member.
	GetExpression() string
}

Evaluable represents an evaluable expression with a unique identifier.

type EvaluableLoader

type EvaluableLoader func(ctx context.Context, evaluableIDs ...uuid.UUID) ([]Evaluable, error)

EvaluableLoader returns one or more evaluable items given IDs.

type ExpressionEvaluator

type ExpressionEvaluator func(ctx context.Context, e Evaluable, input map[string]any) (bool, error)

ExpressionEvaluator is a function which evalues an expression given input data, returning a boolean and error.

type ExpressionPart

type ExpressionPart struct {
	// GroupID represents a group ID for the expression part.
	//
	// Within an expression, multiple predicates may be chained with &&.  Each
	// of these must evaluate to `true` for an expression to match.  Group IDs
	// are shared amongst each predicate within an expression.
	//
	// This lets us determine whether the entire group has been matched.
	GroupID   groupID
	Predicate *Predicate
	Parsed    *ParsedExpression
}

ExpressionPart represents a predicate group which is part of an expression. All parts for the given group ID must evaluate to true for the predicate to be matched.

func (ExpressionPart) Equals

func (p ExpressionPart) Equals(n ExpressionPart) bool

func (ExpressionPart) EqualsStored

func (p ExpressionPart) EqualsStored(n *StoredExpressionPart) bool

func (ExpressionPart) Hash

func (p ExpressionPart) Hash() uint64

func (ExpressionPart) ToStored

func (p ExpressionPart) ToStored() *StoredExpressionPart

type Leaf

type Leaf struct {
	Evals []*ExpressionPart
}

Leaf represents the leaf within a tree. This stores all expressions which match the given expression.

For example, adding two expressions each matching "event.data == 'foo'" in an ART creates a leaf node with both evaluable expressions stored in Evals

Note that there are many sub-clauses which need to be matched. Each leaf is a subset of a full expression. Therefore,

type LiftedArgs

type LiftedArgs interface {
	// Get a lifted variable argument from the parsed expression.
	Get(val string) (any, bool)
	// Return all lifted variables as a map.
	Map() map[string]any
}

LiftedArgs represents a set of variables that have been lifted from expressions and replaced with identifiers, eg `id == "foo"` becomes `id == vars.a`, with "foo" lifted as "vars.a".

type MatchingEngine

type MatchingEngine interface {
	// Type returns the EngineType
	Type() EngineType
	// Match takes an input event, containing key:value pairs of data, and
	// matches the given data to any ExpressionParts stored in the engine.
	//
	// Each implementation of the engine may differ on granularity of
	// expression parts received.  Some may return false positives, but
	// each MatchingEngine should NEVER omit ExpressionParts which match
	// the given input.
	Match(ctx context.Context, input map[string]any) ([]*StoredExpressionPart, error)
	// Add adds a new expression part to the matching engine for future matches.
	Add(ctx context.Context, p ExpressionPart) error
	// Remove removes an expression part from the matching engine, ensuring that the
	// ExpressionPart will not be matched in the future.
	Remove(ctx context.Context, p ExpressionPart) error

	// Search searches for a given variable<>value match, returning any expression
	// parts that match.
	//
	// Similar to match, each implementation of the engine may differ on
	// granularity of expression parts received.  Some may return false positives by
	// ignoring the variable name.  Note that each MatchingEngine should NEVER
	// omit ExpressionParts which match the given input;  false positives are okay,
	// but not returning valid matches must be impossible.
	Search(ctx context.Context, variable string, input any) []*StoredExpressionPart
}

MatchingEngine represents an engine (such as a b-tree, radix trie, or simple hash map) which matches a predicate over many expressions.

type Node

type Node struct {
	GroupID groupID

	// Ands contains predicates at this level of the expression that are joined together
	// with an && operator.  All nodes in this set must evaluate to true in order for this
	// node in the expression to be truthy.
	//
	// Note that if any on of the Ors nodes evaluates to true, this node is truthy, regardless
	// of whether the Ands set evaluates to true.
	Ands []*Node `json:"and,omitempty"`

	// Ors represents matching OR groups within this expression.  For example, in
	// the expression `a == b && (c == 1 || d == 1)` the top-level predicate group will
	// have a child group containing the parenthesis sub-expression.
	//
	// At least one of the Or node's sub-trees must evaluate to true for the node to
	// be truthy, alongside all Ands.
	Ors []*Node `json:"or,omitempty"`

	// Predicate represents the predicate for this node.  This must evaluate to true in order
	// for the expression to be truthy.
	//
	// If this is nil, this is a parent container for a series of AND or Or checks.
	// a == b
	Predicate *Predicate
}

PredicateGroup represents a group of predicates that must all pass in order to execute the given expression. For example, this might contain two predicates representing an expression with two operators combined with "&&".

MATCHING & EVALUATION

A node evaluates to true if ALL of the following conditions are met:

- All of the ANDS are truthy. - One or more of the ORs are truthy

In essence, if there are ANDs and ORs, the ORs are implicitly added to ANDs:

(A && (B || C))

This requres A *and* either B or C, and so we require all ANDs plus at least one node from OR to evaluate to true

func (Node) HasPredicate

func (n Node) HasPredicate() bool

func (*Node) String

func (n *Node) String() string

type ParsedExpression

type ParsedExpression struct {
	Root Node

	// Vars represents rewritten literals within the expression.
	//
	// This allows us to rewrite eg. `event.data.id == "foo"` into
	// `event.data.id == vars.a` such that multiple different literals
	// share the same expression.  Using the same expression allows us
	// to cache and skip CEL parsing, which is the slowest aspect of
	// expression matching.
	Vars LiftedArgs

	// Evaluable stores the original evaluable interface that was parsed.
	EvaluableID uuid.UUID

	HasMacros bool
}

ParsedExpression represents a parsed CEL expression into our higher-level AST.

Expressions are simplified and canonicalized, eg. !(a == b) becomes a != b and !(b <= a) becomes (a > b).

func (ParsedExpression) RootGroups

func (p ParsedExpression) RootGroups() []*Node

RootGroups returns the top-level matching groups within an expression. This is a small utility to check the number of matching groups easily.

type Predicate

type Predicate struct {
	// Literal represents the literal value that the operator compares against.  If two
	// variable are being compared, this is nil and LiteralIdent holds a pointer to the
	// name of the second variable.
	Literal any

	// Ident is the ident we're comparing to, eg. the variable.
	Ident string

	// LiteralIdent represents the second literal that we're comparing against,
	// eg. in the expression "event.data.a == event.data.b" this stores event.data.b
	LiteralIdent *string

	// Operator is the binary operator being used.  NOTE:  This always assumes that the
	// ident is to the left of the operator, eg "event.data.value > 100".  If the value
	// is to the left of the operator, the operator will be switched
	// (ie. 100 > event.data.value becomes event.data.value < 100)
	Operator string
}

Predicate represents a predicate that must evaluate to true in order for an expression to be considered as viable when checking an event.

This is equivalent to a CEL overload/function/macro.

func (Predicate) LiteralAsFloat64

func (p Predicate) LiteralAsFloat64() (float64, error)

func (Predicate) LiteralAsString

func (p Predicate) LiteralAsString() string

func (Predicate) String

func (p Predicate) String() string

type RandomReader

type RandomReader func(p []byte) (n int, err error)

type StoredExpressionPart

type StoredExpressionPart struct {
	GroupID     groupID
	PredicateID uint64
	Parsed      *ParsedExpression
	Ident       *string
}

StoredExpressionPart is a lightweight expression part which only stores a hash of the predicate to reduce memory usage.

type StringExpression

type StringExpression string

StringExpression is a string type that implements Evaluable, useful for basic ephemeral expressions that have no lifetime.

func (StringExpression) GetExpression

func (s StringExpression) GetExpression() string

func (StringExpression) GetID

func (s StringExpression) GetID() uuid.UUID

type TreeParser

type TreeParser interface {
	Parse(ctx context.Context, eval Evaluable) (*ParsedExpression, error)
}

TreeParser parses an expression into a tree, with a root node and branches for each subsequent OR or AND expression.

func NewTreeParser

func NewTreeParser(ep CELCompiler) TreeParser

NewTreeParser returns a new tree parser for a given *cel.Env

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL