regexache

v0.23.0
Published: Sep 7, 2023 License: MPL-2.0 Imports: 9 Imported by: 3

README

regexache is a thread-safe regular expression cache that provides a drop-in replacement for regexp.MustCompile() (regexache calls regexp.MustCompile() on your behalf to populate the cache). This special-purpose cache specifically addresses regular expressions, which use a lot of memory. In a project with about 4,500 regexes, using regexache reduced total memory use by nearly 20%.

Unlike general-purpose caches such as go-cache or memcached (both excellent), regexache does not require the calling code to know anything about the cache or to instantiate it. You simply drop in regexache.MustCompile() in place of regexp.MustCompile(). This approach has drawbacks, but for a large existing project they may be outweighed by not needing to rework existing code (other than the drop-in replacement).

For projects with few regular expressions, caching is unlikely to improve memory use, so stick with static use of regexp.MustCompile(). For projects with thousands of regular expressions, and especially untracked duplicates, using regexache can save significant memory.

Potential problems with using regexache include cache contention and preventing garbage collection of regular expressions. Cache contention results from the cache map being read-locked for reads and fully locked for updates. As for garbage collection: without regexache, if you instantiate a regular expression locally and it goes out of scope with no remaining references, Go may reclaim the memory. However, regexache keeps pointers to the regular expressions in the cache, so they cannot be garbage collected until the entry expires and is cleaned out of the cache. Benchmark various expiration settings to see what works best.
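The locking pattern described above can be sketched with a map guarded by sync.RWMutex. This is only an illustration of the general technique, not regexache's actual implementation: the regexCache type and its methods are hypothetical names, and expiration is omitted entirely.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// regexCache is a hypothetical, simplified cache for illustration only.
// It is not regexache's implementation and has no expiration.
type regexCache struct {
	mu      sync.RWMutex
	entries map[string]*regexp.Regexp
}

func newRegexCache() *regexCache {
	return &regexCache{entries: make(map[string]*regexp.Regexp)}
}

// mustCompile returns the cached regex for pattern, compiling it with
// regexp.MustCompile on a miss.
func (c *regexCache) mustCompile(pattern string) *regexp.Regexp {
	// Read path: any number of goroutines may hold the read lock at once.
	c.mu.RLock()
	re, ok := c.entries[pattern]
	c.mu.RUnlock()
	if ok {
		return re
	}

	// Update path: the write lock blocks all other readers and writers,
	// which is where contention comes from.
	c.mu.Lock()
	defer c.mu.Unlock()
	// Re-check, because another goroutine may have stored the entry
	// between releasing the read lock and acquiring the write lock.
	if re, ok := c.entries[pattern]; ok {
		return re
	}
	re = regexp.MustCompile(pattern)
	// The map now holds a pointer to the regex, so the garbage collector
	// cannot reclaim it while the entry remains in the map.
	c.entries[pattern] = re
	return re
}

func main() {
	cache := newRegexCache()
	a := cache.mustCompile(`^[a-z]+\[[0-9]+\]$`)
	b := cache.mustCompile(`^[a-z]+\[[0-9]+\]$`)
	fmt.Println(a == b)                    // true: both calls share one compiled regex
	fmt.Println(a.MatchString("adam[23]")) // true
}
```

Duplicate patterns resolve to a single shared *regexp.Regexp, which is where the memory savings come from. The same shared pointer is also what keeps the regex alive until it is removed from the map.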

Using regexache

Using regexache is simple: replace calls to regexp.MustCompile() with regexache.MustCompile(). The two listings below show the same program before and after the change.

Before regexache:

package main

import (
	"fmt"
	"regexp"
)

func main() {
	var validID = regexp.MustCompile(`^[a-z]+\[[0-9]+\]$`)

	fmt.Println(validID.MatchString("adam[23]"))
	fmt.Println(validID.MatchString("eve[7]"))
	fmt.Println(validID.MatchString("Job[48]"))
	fmt.Println(validID.MatchString("snakey"))
}

After regexache:

package main

import (
	"fmt"

	"github.com/YakDriver/regexache"
)

func main() {
	var validID = regexache.MustCompile(`^[a-z]+\[[0-9]+\]$`)

	fmt.Println(validID.MatchString("adam[23]"))
	fmt.Println(validID.MatchString("eve[7]"))
	fmt.Println(validID.MatchString("Job[48]"))
	fmt.Println(validID.MatchString("snakey"))
}

Environment Variables

| Env Var | Description |
| --- | --- |
| REGEXACHE_OFF | Any value turns regexache off completely. Useful for testing with and without caching. When off, regexache.MustCompile() is equivalent to regexp.MustCompile(). By default, regexache caches entries. |
| REGEXACHE_OUTPUT | File to write the cache contents to. Default: Empty (don't output the cache). |
| REGEXACHE_OUTPUT_MIN | Minimum number of lookups an entry needs in order to be included when listing cache entries. Default: 1. |
| REGEXACHE_OUTPUT_INTERVAL | If outputting the cache, output every X milliseconds. Default: 1000 (1 second). |
| REGEXACHE_STANDARDIZE | Standardize expressions before caching. Default: Empty (don't standardize). |

Tests

Control (not using the cache):
Results: Single VPC: 6.76 GB, Two AppRunner: 17.89 GB

export REGEXACHE_OFF=1

Example of running a memory profile of a single VPC acceptance test:

TF_ACC=1 go test \
    ./internal/service/ec2/... \
    -v -parallel 1 \
    -run='^TestAccVPC_basic$' \
    -cpuprofile cpu.prof \
    -memprofile mem.prof \
    -bench \
    -timeout 60m
pprof -http=localhost:4599 mem.prof

Example of running a memory profile of two parallel AppRunner acceptance tests:

TF_ACC=1 go test \
    ./internal/service/apprunner/... \
    -v -parallel 2 \
    -run='TestAccAppRunnerService_ImageRepository_autoScaling|TestAccAppRunnerService_ImageRepository_basic' \
    -cpuprofile cpu.prof \
    -memprofile mem.prof \
    -bench \
    -timeout 60m
pprof -http=localhost:4599 mem.prof

Documentation

Constants

const (
	REGEXACHE_OFF             = "REGEXACHE_OFF"
	REGEXACHE_OUTPUT          = "REGEXACHE_OUTPUT"
	REGEXACHE_OUTPUT_INTERVAL = "REGEXACHE_OUTPUT_INTERVAL"
	REGEXACHE_OUTPUT_MIN      = "REGEXACHE_OUTPUT_MIN"
	REGEXACHE_PRELOAD_OFF     = "REGEXACHE_PRELOAD_OFF"
	REGEXACHE_STANDARDIZE     = "REGEXACHE_STANDARDIZE"
)

Variables

This section is empty.

Functions

func MustCompile

func MustCompile(str string) *regexp.Regexp
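
Because this signature is identical to that of regexp.MustCompile, either function can be assigned to the same function-typed variable. The sketch below uses a hypothetical package-level compile variable and only the standard library, so it runs without the module. Assigning regexache.MustCompile instead would be the one-line swap.

```go
package main

import (
	"fmt"
	"regexp"
)

// compile is a hypothetical hook for illustration. Since
// regexache.MustCompile has the type func(string) *regexp.Regexp, it could
// be assigned here in place of regexp.MustCompile without touching callers.
var compile func(string) *regexp.Regexp = regexp.MustCompile

func main() {
	validID := compile(`^[a-z]+\[[0-9]+\]$`)
	fmt.Println(validID.MatchString("eve[7]"))  // true
	fmt.Println(validID.MatchString("Job[48]")) // false: uppercase J is outside [a-z]
}
```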

Types

This section is empty.
