defaultdict

package module
v0.2.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Mar 18, 2022 License: BSD-3-Clause Imports: 1 Imported by: 1

README

Go Reference Go Report Card

go-defaultdict

Go implementation of Python's defaultdict, in a way that's both thread-safe and memory efficient.

Overview

Underneath it pairs a sync.Map with a sync.Pool, and removed all direct store/write accesses to the map. As a result, the only way to mutate the map is through Load/Get, (which either create a new value for you if this is the first access to the key, or return the value created by a previous Load/Get), then mutate the value returned directly (in a thread-safe way).

Here are 2 example usages:

  1. To implement a rowlock. See my rowlock package for detailed example.

  2. To implement a concurrent counter-by-key. See package example or below for details.

Example Code

Here's a step-by-step example to create a concurrent counter-by-key.

First, create a generator, which simply returns an *int64 so it can be used by atomic int64 functions:

generator := func() defaultdict.Pointer {
  return new(int64)
}

Then, create the map:

m := defaultdict.New(generator)

When you need to add the counter, get by key then use atomic.AddInt64:

atomic.AddInt64(m.Get(key).(*int64), 1)

When you need to get the counter value, just get by key then use atomic.LoadInt64:

fmt.Printf(
  "Key %v was added %d times\n",
  key,
  atomic.LoadInt64(
    m.Get(key).(*int64),
  ),
)

Or use Range:

m.Range(func(key defaultdict.Comparable, value defaultdict.Pointer) bool {
  fmt.Printf("Key %v was added %d times\n", key, atomic.LoadInt64(value.(*int64)))
  return true
})

License

BSD License.

Documentation

Overview

Package defaultdict implements Python's defaultdict, in a way that's both thread-safe and memory-efficient.

There are two example use cases for it:

1. To implement a row lock, that every row (key) has its own lock.

2. To implement a concurrent counter, that every key has its own atomic int as the counter.

Example

This example demonstrates how to use defaultdict to implement a thread-safe counter.

package main

import (
	"fmt"
	"sync"
	"sync/atomic"

	"go.yhsif.com/defaultdict"
)

func main() {
	generator := func() *int64 {
		// Just create a new *int64 so it can be used as atomic int64.
		return new(int64)
	}
	m := defaultdict.New[string](generator)
	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		for j := 0; j < i; j++ {
			wg.Add(1)
			go func(key string) {
				defer wg.Done()
				atomic.AddInt64(m.Get(key), 1)
			}(fmt.Sprintf("key-%d", i))
		}
	}

	wg.Wait()
	m.Range(func(key string, value *int64) bool {
		fmt.Printf("Key %v was added %d times\n", key, atomic.LoadInt64(value))
		return true
	})

}
Output:


Key key-1 was added 1 times
Key key-2 was added 2 times
Key key-3 was added 3 times
Key key-4 was added 4 times
Key key-5 was added 5 times
Key key-6 was added 6 times
Key key-7 was added 7 times
Key key-8 was added 8 times
Key key-9 was added 9 times

Index

Examples

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Generator

type Generator[T Pointer] func() T

Generator defines the function used to generate the default value of the map.

func SharedPoolMapGenerator

func SharedPoolMapGenerator[K comparable, V Pointer](g Generator[V]) Generator[Map[K, V]]

SharedPoolMapGenerator creates a Generator that returns a Map using g as the Generator.

It's different from just:

func generator[K comparable, V defaultdict.Pointer](g defaultdict.Generator[V]) Map[K, V] {
  return defaultdict.New[K](g)
}

That the Map returned by the same SharedPoolMapGenerator shares the same underlying pool, so it's more memory efficient when used as the second (or third, etc.) layer of a Map.

This is an example of running the benchmark test with go1.15.6:

$ go test -bench .
goos: linux
goarch: amd64
pkg: go.yhsif.com/defaultdict
cpu: Intel(R) Core(TM) i5-7260U CPU @ 2.20GHz
BenchmarkSharedPoolMapGenerator/shared-4                    9691            117636 ns/op            1093 B/op          5 allocs/op
BenchmarkSharedPoolMapGenerator/naive-4                     9882            121344 ns/op            3305 B/op         15 allocs/op
PASS
ok      go.yhsif.com/defaultdict        2.368s
Example

This example demonstrates how to use SharedPoolGenerator to implement a thread-safe counter with 2 layers of keys.

package main

import (
	"fmt"
	"sync"
	"sync/atomic"

	"go.yhsif.com/defaultdict"
)

func main() {
	generator := defaultdict.SharedPoolMapGenerator[string](func() *int64 {
		// Just create a new *int64 so it can be used as atomic int64.
		return new(int64)
	})
	m := defaultdict.New[string](generator)
	var wg sync.WaitGroup
	for i := 1; i < 4; i++ {
		for j := 1; j < 4; j++ {
			for k := 0; k < i*j; k++ {
				wg.Add(1)
				go func(key1, key2 string) {
					defer wg.Done()
					atomic.AddInt64(m.Get(key1).Get(key2), 1)
				}(fmt.Sprintf("key1-%d", i), fmt.Sprintf("key2-%d", j))
			}
		}
	}

	wg.Wait()
	m.Range(func(key1 string, value1 defaultdict.Map[string, *int64]) bool {
		value1.Range(func(key2 string, value2 *int64) bool {
			fmt.Printf("%v/%v was added %d times\n", key1, key2, atomic.LoadInt64(value2))
			return true
		})
		fmt.Println()
		return true
	})

}
Output:


key1-1/key2-1 was added 1 times
key1-1/key2-2 was added 2 times
key1-1/key2-3 was added 3 times

key1-2/key2-1 was added 2 times
key1-2/key2-2 was added 4 times
key1-2/key2-3 was added 6 times

key1-3/key2-1 was added 3 times
key1-3/key2-2 was added 6 times
key1-3/key2-3 was added 9 times

func (Generator[T]) ToPool

func (g Generator[T]) ToPool() *sync.Pool

ToPool creates a *sync.Pool from this Generator.

type Map

type Map[K comparable, V Pointer] interface {
	// Same as in sync.Map, see above for notes about the bool returns.
	Delete(key K)
	Load(key K) (V, bool)
	LoadAndDelete(key K) (V, bool)
	Range(f func(key K, value V) bool)

	// Same as Load, just without the bool return.
	Get(key K) V
}

Map defines a map.

There are a few slight differences in Load and LoadAndDelete comparing to sync.Map:

1. The value type is guaranteed to be the same as the type returned by the Generator used to create this DefaultDict, never nil, even if this is a new key.

2. The bool return being false means that the value is directly from the Generator.

func New

func New[K comparable, V Pointer](g Generator[V]) Map[K, V]

New creates a new DefaultDict.

It pairs a sync.Map with a sync.Pool under the hood.

type Pointer

type Pointer = any

Pointer is the value type of a Map.

It's used for documentation purpose only.

In a Map, all values should be pointers, and those pointers should be safe to be mutated concurrently, for the following reasons:

1, A Map is for concurrent mutations (if you only need concurrent read-only access, a builtin map would suffice)

2. There's no Store function provided by Map interface. All mutations are done by Get/Load then mutate the returned value directly.

As an example, you can use *int64 as Pointer, and do mutations via atomic int64 operations. But slices, even though they are pointers, should not be used here directly. You usually need to pair slice with a lock.

Example (Slice)
package main

import (
	"fmt"
	"sync"

	"go.yhsif.com/defaultdict"
)

type MySliceValue struct {
	lock  sync.Mutex
	slice []int
}

func (m *MySliceValue) Append(i int) {
	m.lock.Lock()
	defer m.lock.Unlock()
	m.slice = append(m.slice, i)
}

func (m *MySliceValue) Len() int {
	m.lock.Lock()
	defer m.lock.Unlock()
	return len(m.slice)
}

func MySliceValueGenerator() *MySliceValue {
	return new(MySliceValue)
}

func main() {
	const key = "foo"
	m := defaultdict.New[string](MySliceValueGenerator)
	m.Get(key).Append(1)
	m.Get(key).Append(2)
	fmt.Println(m.Get(key).Len())   // 2
	fmt.Println(m.Get("bar").Len()) // 0

}
Output:

2
0

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL