gomempool

Version: v1.0.1
Published: Jun 30, 2022 License: MIT Imports: 3 Imported by: 0

README

gomempool

A []byte pool implementation for Go.

Go network programs can get into a bit of garbage collection trouble if they are constantly allocating buffers of []byte to process network requests. This library gives you an interface you can program to in order to easily re-use []bytes without too much additional work. It also provides statistics so you can query how well the Pool is working for you.

This is fully covered with godoc, including examples, motivation, and everything else you might otherwise expect from a README.md on GitHub. (DRY.)

This is currently at version 1.0.0. Semantic versioning will be used for version numbers.

Code Signing

Starting with commit f94a124, I will be signing this repository with the "jerf" keybase account. If you are viewing this repository through GitHub, you should see the commits as showing as "verified" in the commit view.

(Bear in mind that due to the nature of how git commit signing works, there may be runs of unverified commits; what matters is that the top one is signed.)

sync.Pool

"What about sync.Pool?" sync.Pool efficiently pools a homogeneous collection of objects, whereas the focus of gomempool here is explicitly on heterogeneous []bytes. They don't overlap much.

Documentation

Overview

Package gomempool implements a simple memory pool for byte slices.

When To Use gomempool

The Go garbage collector runs more often when more bytes are allocated. (For full details, see the runtime package documentation on the GOGC variable.) Avoiding allocations can help the stop-the-world GC run much less often.

To determine if you should use this, first deploy your code into as realistic an environment as possible. Extract a runtime.MemStats structure from your running code. Examine the various Pause fields in that structure to determine if you have a GC problem, preferably in conjunction with some other monitoring of real performance. (Remember the numbers are in nanoseconds.) If the numbers you have are OK for your use case, STOP HERE. Do not proceed.
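As a rough sketch of that first check, the standard library's runtime.ReadMemStats fills in the structure, and the Pause fields can be read from it directly. The helper name lastPauseNs is made up for this illustration; the indexing follows the runtime documentation, which describes PauseNs as a circular buffer whose most recent entry is at (NumGC+255)%256.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// lastPauseNs returns the most recent GC pause in nanoseconds, or 0 if
// no collection has completed yet. pauseNs is the circular buffer from
// runtime.MemStats; the latest entry lives at index (numGC+255)%256.
func lastPauseNs(numGC uint32, pauseNs *[256]uint64) uint64 {
	if numGC == 0 {
		return 0
	}
	return pauseNs[(numGC+255)%256]
}

func main() {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)

	fmt.Println("collections:", ms.NumGC)
	fmt.Println("total pause:", time.Duration(ms.PauseTotalNs))
	fmt.Println("last pause: ", time.Duration(lastPauseNs(ms.NumGC, &ms.PauseNs)))
}
```

In a long-running service you would more likely sample these numbers periodically and feed them to whatever monitoring you already have, rather than print them once.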

If you are generating a lot of garbage collection pauses, the next question is why. Profile the heap. If the answer is anything other than []byte slices, STOP HERE. Do not proceed.
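One possible way to capture a heap profile from inside the program is runtime/pprof from the standard library. This is only a sketch: the helper names writeHeapProfile and heapProfileSize are invented here, and a real program would normally write to a file or expose the profile over HTTP instead of an in-memory buffer.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"runtime"
	"runtime/pprof"
)

// writeHeapProfile forces a collection first, so the profile reflects
// what is live right now, then writes the heap profile to w.
func writeHeapProfile(w io.Writer) error {
	runtime.GC()
	return pprof.WriteHeapProfile(w)
}

// heapProfileSize writes a profile into memory and reports its length.
// It exists only to keep this sketch free of file system access.
func heapProfileSize() (int, error) {
	var buf bytes.Buffer
	if err := writeHeapProfile(&buf); err != nil {
		return 0, err
	}
	return buf.Len(), nil
}

func main() {
	n, err := heapProfileSize()
	if err != nil {
		fmt.Println("heap profile failed:", err)
		return
	}
	fmt.Println("heap profile bytes:", n)
}
```

The resulting profile can be inspected with go tool pprof to see which allocation sites dominate.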

Finally, if you are indeed seeing a lot of allocations of []bytes, you may wish to proceed with using this library. gomempool is a power tool; it can save your program, but it can blow your feet off, too. (I've done both.)

That said, despite the narrow use case of this library, it can have an effect in certain situations, which I happened to encounter. I suspect the biggest use case is a network application that often allocates large messages, which is what I have, causing an otherwise relatively memory-svelte program to allocate dozens of megabytes per second of []byte buffers to process messages.

Using Gomempool

To use the pool, there are three basic steps:

  1. Create a pool
  2. Obtain byte slices
  3. Optionally return the byte slices

The following prose documentation will cover each step at length. See the Example below for concise usage examples and things you can copy & paste.

Create A Pool

First, create a pool.

pool := gomempool.New(
    64,        // minimum sized slice to store and hand out
    1024*1024, // maximum sized slice to store and hand out
    10,        // maximum number of buffers to save
)

A *Pool is obtained by calling gomempool.New. All methods on the *Pool are threadsafe. The pool can be configured via the New call.

var pool *gomempool.Pool

A nil pointer of type *gomempool.Pool is also a valid pool. This will use normal make() to create byte slices and simply discard the slice when asked to .Return() it. This is convenient for testing whether you've got a memory error in your code, because you can swap in a nil Pool pointer without changing any other code. If an error goes away when you do that, you have a memory error. (Probably .Return()ing something too soon.)
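The reason a nil pointer can act as a working pool is that Go allows methods to be called on nil pointer receivers. The following is a toy illustration of that pattern only; toyPool, get, and put are hypothetical names, the type is not threadsafe, and none of it is gomempool's actual implementation.

```go
package main

import "fmt"

// toyPool is a hypothetical stand-in used to show the nil-receiver
// pattern. It is NOT how gomempool is implemented.
type toyPool struct {
	free [][]byte
}

// get hands out a slice of length n. A nil *toyPool never touches its
// fields and simply falls back to make().
func (p *toyPool) get(n int) []byte {
	if p != nil {
		for i, b := range p.free {
			if cap(b) >= n {
				p.free = append(p.free[:i], p.free[i+1:]...)
				return b[:n]
			}
		}
	}
	return make([]byte, n)
}

// put keeps b for reuse. On a nil *toyPool the slice is dropped and
// left to the garbage collector.
func (p *toyPool) put(b []byte) {
	if p == nil {
		return
	}
	p.free = append(p.free, b)
}

func main() {
	var off *toyPool // nil: plain make() and GC
	on := &toyPool{} // real pooling

	// The calling code is identical for both pools.
	for _, p := range []*toyPool{off, on} {
		b := p.get(16)
		p.put(b)
		fmt.Println(len(p.get(8)))
	}
}
```

Because the calling code does not change, swapping the pointer for nil is a cheap way to rule the pool in or out when chasing a memory bug.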

Obtain byte slices

[]bytes in the pool come with an "Allocator" that is responsible for returning them correctly.

allocator := pool.GetNewAllocator()
bytes := allocator.Allocate(453)

// use the bytes

To obtain an allocator, you call GetNewAllocator() on the pool. This returns an allocator that is not yet used. You may then call .Allocate(uint64) on it to assign a []byte from the pool. This []byte is then associated with that Allocator until you call .Return() on the allocator, after which the []byte goes back into the pool.

Allocations never fail (barring a complete out-of-memory situation, of course). If the pool does not have a correctly-sized []byte on hand, it will create one.

If you ask for more bytes than the pool is configured to store, the Allocator will create a transient []byte, which it will not manage. You can check whether you are invoking this case by calling .MaxSize on the pool.

You MUST NOT call .Return() until you are entirely done with the []byte. This includes shared slices you may have created; this is by far the easiest way to get in trouble with a []byte pool, as it is easy to accidentally introduce sharing without realizing it.
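The sharing problem can be reproduced without any pool at all. In this sketch the overwrite of buf stands in for the buffer being returned and handed to another user; header and headerCopy are names invented for the illustration.

```go
package main

import "fmt"

// header returns the first n bytes of buf WITHOUT copying. The result
// shares buf's backing array.
func header(buf []byte, n int) []byte {
	return buf[:n]
}

// headerCopy returns an independent copy of the first n bytes of buf.
func headerCopy(buf []byte, n int) []byte {
	out := make([]byte, n)
	copy(out, buf[:n])
	return out
}

func main() {
	buf := []byte("GET /index")
	shared := header(buf, 3)
	safe := headerCopy(buf, 3)

	// Simulate the buffer going back to a pool and being reused by
	// someone else, who overwrites its contents.
	copy(buf, "PUT /other")

	fmt.Printf("%s %s\n", shared, safe) // prints "PUT GET"
}
```

Anything that must outlive the .Return() call needs to be copied out first, as headerCopy does.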

You must also make sure not to do anything with your []byte that might cause another []byte to be created instead; for instance, using your pooled []byte in an "append" call is dangerous, because the runtime might decide to give you back a []byte that backs to an entirely different array. In this case your []byte and your Allocator will cease to be related. If the Allocator is correctly managed, your code will not fail, but you won't be getting any benefit, either.
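A small sketch of the append hazard, using only built-in slices. The helper sameBacking is made up for this example; it compares the address of the first element to tell whether two slices still start in the same array.

```go
package main

import "fmt"

// sameBacking reports whether a and b begin at the same array element.
func sameBacking(a, b []byte) bool {
	return len(a) > 0 && len(b) > 0 && &a[0] == &b[0]
}

func main() {
	full := make([]byte, 4)     // len 4, cap 4: no room to grow
	roomy := make([]byte, 4, 8) // len 4, cap 8: room for four more bytes

	// Appending past capacity forces a new backing array, so the result
	// is no longer the memory an Allocator would be tracking.
	fmt.Println(sameBacking(full, append(full, 'x'))) // false

	// Appending within capacity stays in the original array.
	fmt.Println(sameBacking(roomy, append(roomy, 'x'))) // true
}
```

If you need to grow a pooled buffer, it is safer to ask for the larger size up front than to rely on append.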

allocator2 := pool.GetNewAllocator()
allocator2.Allocate(738)
newBytes := allocator2.Bytes()

You may retrieve the []byte slice corresponding to an Allocator at any time by calling .Bytes(). This means that if you do need to pass the []byte and Allocator around, it suffices to pass just the Allocator. (Though, be aware that the Allocator's []byte will be of the original size you asked for. This can not be changed, as you can not change the original slice itself.)

Allocators can be reused freely, as long as they are used correctly. However, an individual Allocator is not threadsafe. Its interaction with the Pool is, but its internal values are not; do not use the same Allocator from more than one goroutine at a time.

// using the same allocator as above
allocator2.Allocate(723) // PANIC!

Once allocated, an allocator will PANIC with ErrBytesAlreadyAllocated if you try to allocate with it again.

Once .Return() has been called, an allocator will PANIC if you try to .Return() the []byte again.

If no []byte is currently allocated, .Bytes() will PANIC if called.

This is fully on purpose. All of these situations represent profound errors in your code. This sort of error is just as dangerous in Go as it is in any other language. Go may not segfault, but memory management issues can still dish out the pain; better to find out earlier rather than later.

thirdBytes, newAllocator := pool.Allocate(23314)

You can combine obtaining an Allocator and a particular sized []byte by calling .Allocate() on the pool.

The Allocators returned by the nil *Pool use make() to create new slices every time, and simply discard the []byte when done, but they enforce the exact same rules as the "real" Allocators, and panic in all the same places. This is so there is as little difference as possible between the two types of pools.

Optionally return the byte slices

bytes, allocator := pool.Allocate(29348)
defer allocator.Return()
// use bytes

Calling .Return() on an allocator is optional. If a []byte is not returned to the pool, you do not get any further benefit from the pool for that []byte, but the garbage collector will still clean it up normally. This means using a pool is still feasible even if some of your code paths may need to retain a []byte for a long or complicated period of time.

Best Usage

If you have a byte slice that is getting passed through some goroutines, I recommend creating a structure that holds all the relevant data about the object bound together with the allocator:

type UsesBytesFromPool struct {
    alloc         gomempool.Allocator
    ParsedMessage ParsedMessage
}

which makes it easy to keep the two bound together, and pass them around, with only the last user finally performing the deallocation.

This library does not provide this structure since all I could give you is basically the above struct, with an interface{} in it.
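To make the idea concrete, here is one way such a wrapper might look, with a single Release method for the last user to call. Everything in it is an assumption made for illustration: the Allocator interface is redeclared locally to mirror the documented one so the sketch compiles on its own, and fakeAllocator, pooledMessage, newMessage, and Release are hypothetical names.

```go
package main

import "fmt"

// Allocator mirrors the interface documented by gomempool. It is
// redeclared here only so this sketch is self-contained.
type Allocator interface {
	Allocate(uint64) []byte
	Bytes() []byte
	Return()
}

// fakeAllocator is a hypothetical test double that counts Return calls.
type fakeAllocator struct {
	buf     []byte
	returns int
}

func (f *fakeAllocator) Allocate(n uint64) []byte { f.buf = make([]byte, n); return f.buf }
func (f *fakeAllocator) Bytes() []byte            { return f.buf }
func (f *fakeAllocator) Return()                  { f.buf = nil; f.returns++ }

// pooledMessage binds a view of the pooled bytes to the allocator that
// owns them. Payload must not be used after Release.
type pooledMessage struct {
	alloc   Allocator
	Payload []byte
}

// newMessage allocates a buffer from alloc and copies body into it.
func newMessage(alloc Allocator, body string) *pooledMessage {
	buf := alloc.Allocate(uint64(len(body)))
	copy(buf, body)
	return &pooledMessage{alloc: alloc, Payload: buf}
}

// Release hands the buffer back. Only the last user should call it.
func (m *pooledMessage) Release() {
	m.alloc.Return()
}

func main() {
	alloc := &fakeAllocator{}
	msg := newMessage(alloc, "hello")

	done := make(chan struct{})
	go func() {
		defer close(done)
		defer msg.Release() // the last user performs the deallocation
		fmt.Printf("%s\n", msg.Payload)
	}()
	<-done

	fmt.Println("returns:", alloc.returns)
}
```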

Additional Functionality

You can query the pool for its cache statistics by calling Stats(), which will return a structure describing how the individual buckets are performing.

Example
pool := New(4096, 10*1024*1024, 20)

// simple usage
bytes, alloc := pool.Allocate(12345)
bytes[0] = 0

// use bytes
alloc.Return()
// stop using bytes now, it could become anything

// the nil Pool pointer has an implementation as well,
// which uses normal garbage collection.
pool = nil
bytes, alloc = pool.Allocate(12345)
alloc.Return()

// You can also disconnect the act of obtaining an Allocator
// from actually allocating. Presumably, one might pass this to
// a function or something.
alloc = pool.GetNewAllocator()
bytes = alloc.Allocate(12345)
defer alloc.Return()

// ... use bytes...

Variables

var ErrBytesAlreadyAllocated = errors.New("can't allocate again, because this already has allocated bytes")

This is "panic"ed out when attempting to Allocate() with an Allocator that is currently still allocated.

var ErrBytesAlreadyReturned = errors.New("can't perform operation because the bytes were already returned")

This is "panic"ed out when attempting to Return() an Allocator whose bytes have already been returned.

Types

type Allocator

type Allocator interface {
	// Allocate allocates the requested number of bytes and returns the
	// correct []byte. Panics if this allocator is currently allocated.
	Allocate(uint64) []byte

	// Bytes returns the currently-allocated bytes, or panics if there
	// aren't any. This will represent the same range of memory as what
	// Allocate returned.
	Bytes() []byte

	// This deallocates the allocator. If a pool is being used, this
	// returns the []byte to the pool; if the nil pool is being used, this
	// simply drops the []byte reference and lets the GC pick it up.
	Return()
}

An Allocator wraps some allocated bytes.

type Pool

type Pool struct {
	// contains filtered or unexported fields
}

A Pool implements the memory pool.

func New

func New(minSize, maxSize, maxBufs uint64) *Pool

New returns a new Pool.

maxSize indicates the maximum size we're willing to allocate. The minSize is the min we're willing to allocate. maxBufs is the maximum number of buffers we're willing to save at a time. Max memory consumed by unused buffers is approx maxBufs * maxSize * 2.
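As a worked instance of that rule of thumb, the arithmetic for the two configurations used elsewhere in this documentation comes out as follows. The function name approxMaxIdleBytes is made up here; it only restates the formula and says nothing about how the pool accounts for memory internally.

```go
package main

import "fmt"

// approxMaxIdleBytes applies the rule of thumb from the New
// documentation: maxBufs * maxSize * 2.
func approxMaxIdleBytes(maxSize, maxBufs uint64) uint64 {
	return maxBufs * maxSize * 2
}

func main() {
	// New(64, 1024*1024, 10): roughly 20 MiB of idle buffers at most.
	fmt.Println(approxMaxIdleBytes(1024*1024, 10)) // 20971520

	// New(4096, 10*1024*1024, 20): roughly 400 MiB at most.
	fmt.Println(approxMaxIdleBytes(10*1024*1024, 20)) // 419430400
}
```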

func (*Pool) Allocate

func (p *Pool) Allocate(size uint64) ([]byte, Allocator)

Allocate creates an Allocator, and also immediately obtains and returns a byte slice of the given size, associated with the returned Allocator.

func (*Pool) GetNewAllocator

func (p *Pool) GetNewAllocator() Allocator

GetNewAllocator returns an Allocator that can be used to obtain byte slices.

func (*Pool) MaxSize

func (p *Pool) MaxSize() uint64

MaxSize returns the maximum size of byte slices the pool will manage.

func (*Pool) MinSize

func (p *Pool) MinSize() uint64

MinSize returns the minimum size of byte slices the pool will return.

func (*Pool) Stats

func (p *Pool) Stats() []Stat

Stats atomically fetches the stats, and returns you an independent copy. It automatically filters out any buckets that have seen no activity.

type Stat

type Stat struct {
	Size      uint64
	Hit       uint64
	Miss      uint64
	Returned  uint64
	Discarded uint64
	Depth     uint64
}

A Stat records statistics for memory allocations of the given .Size.

A Hit is when a user requested an allocation of a certain size, and the Pool handed out a suitable memory chunk from what it had on hand.

A Miss is when a user requested an allocation of a certain size, and the Pool had to create a new []byte due to not having anything on hand.

A Returned count is the number of buffers that have been .Return()ed.

A Discarded count means that the user .Return()ed, but the Pool already had its maximum number on hand of []bytes of the given size, so it has left the returned []byte to the tender mercies of the standard GC.

The Depth is a snapshot of how deep the linked list of unused buffers currently is for this bucket.
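One way these counters might be turned into a single "is the pool helping" number is a per-bucket hit rate. In this sketch the Stat struct is redeclared locally to mirror the documented fields so the code compiles alone, the sample values are invented, and hitRate is a hypothetical helper, not part of the package.

```go
package main

import "fmt"

// Stat mirrors the documented gomempool.Stat fields. It is redeclared
// here only so this sketch is self-contained.
type Stat struct {
	Size      uint64
	Hit       uint64
	Miss      uint64
	Returned  uint64
	Discarded uint64
	Depth     uint64
}

// hitRate returns the fraction of allocations in this bucket that were
// served from the pool, or 0 if the bucket saw no allocations.
func hitRate(s Stat) float64 {
	total := s.Hit + s.Miss
	if total == 0 {
		return 0
	}
	return float64(s.Hit) / float64(total)
}

func main() {
	// Invented sample data standing in for the result of pool.Stats().
	stats := []Stat{
		{Size: 4096, Hit: 90, Miss: 10, Returned: 95, Discarded: 5, Depth: 3},
		{Size: 8192, Hit: 1, Miss: 3},
	}
	for _, s := range stats {
		fmt.Printf("size=%d hit rate=%.2f discarded=%d\n", s.Size, hitRate(s), s.Discarded)
	}
}
```

A low hit rate combined with a high Discarded count would suggest that maxBufs is too small for the workload; a low hit rate with few returns suggests buffers are not being .Return()ed.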
