mapset

package module
v2.0.5 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Aug 10, 2020 License: MIT Imports: 8 Imported by: 3

README

pyraset

Build Status Codacy Badge GoDoc Go Report Card Dependabot Status GitHub License GitHub last commit

pyraset is an improved version of golang-set that features fast structure and subset support

How to get it

pyraset uses go modules and is currently released under the v2 path.

go get -u github.com/gofunky/pyraset/v2

What to regard when migrating from golang-set

Before migrating from golang-set to pyraset, there a few things to consider. Nevertheless, the necessary code changes make your code more awesome.

  1. pyraset uses simplified constructors, accepting only variadic arguments, check the docs for details.
  2. Equal has become deep.
  3. In tests, use cmp instead of reflect for comparisons.

Why another set library

Before pyraset, I had been using golang-set. It works fine for simple operations on small sets. However, I noticed that some Equal calls did not work as I expected them to work.

Take this example:

a := NewSet("test")
b := NewSet("test")
subSetA := NewSet("foo")
subSetB := NewSet("foo")

// They should be equal, since both contain "test"
fmt.Printf("are a and b equal: %v\n", a.Equal(b))
// are a and b equal: true -> pass

a.Add(subSetA)

// They should not be equal, only a contains subSetA
fmt.Printf("are a and b equal: %v\n", a.Equal(b))
//  are a and b equal: false -> pass

b.Add(subSetB)

// Shouldn't they know be equal? Both are Set { "test", Set { "foo" } }
fmt.Printf("are a and b equal: %v\n", a.Equal(b))
//  are a and b equal: false -> surprise

// Let's try to use subSetA instead
b.Remove(subSetB)
b.Add(subSetA)

fmt.Printf("are a and b equal: %v\n", a.Equal(b))
//  are a and b equal: true

What this example shows, apparently, is that the set doesn't use deep equality checks. For interface types, that is no surprise, for subsets, however, it is indeed unexpected because even Power creates subsets.

In pyraset, this is solved by using hashes. So basically, instead of a comparable map as used in golang-set, we use a hashmap. Hashing, however has a performance tradeoff. CRUD operations are more complex, set operations, in contrast, are performing better. Some operations perform multiple magnitudes better than golang-set.

Hashing gives the user another benefit. pyraset uses hashstructure and xxhash by default. Accordingly, one may know compare any struct or interface that hashstructure accepts as it was a deep equals.

With Contains, one may notice that a comparable map performs much better. Hashing every given element, has a cost. Other operations that rely on Contains also have this bottleneck. To solve this, pyraset uses a comparable cache map that maps the generated hashes. The overhead for the cache map is fair and, in turn, improves the performance of all operations that compare externally given elements.

How to use it

In the progress of refactoring golang-set, some new operations have been introduced, others have been dropped.

There are only three constructors anymore. All three accept variadic arguments, no slices. These are the constructors:

  • NewSet(...) for thread-safe and cached sets
  • NewUnsafeSet(...) for non-thread-safe but cached sets
  • SetOptions.New(...) for sets that are based on individual options (e.g., disabled caching)

Set.Add(...) and Set.Remove(...) also accept variadic arguments.

Furthermore, there are new "converters" for thread safety:

  • Set.CoreSet() to receive the non-thread-safe version from any set
  • Set.ThreadSafe() to receive the thread-safe version from any set

The calculated hash representation can be accessed via Set.Hash().

UpdateHash

pyraset calculates the hash only once for every element to improve the performance. However, if someone added a pointer to the set and modified the referred struct, pyraset will not notice the changes implicitly. If the elements aren't immutable, one has to call Set.UpdateHash() explicitly if the corresponding set operations should consider possible changes. The update call is optional. If the set operation results only depend on the original values, there is no update necessary.

Example
a := NewSet("test")
b := NewSet("test")
subSetA := NewSet("foo")
subSetB := NewSet("foo")
a.Add(subSetA)
b.Add(subSetB)

fmt.Printf("are a and b equal: %v\n", a.Equal(b))
// are a and b equal: true -> check

subSetA.Add("bar")
subSetB.Add("bar")
fmt.Printf("are a and b equal: %v\n", a.Equal(b))
// are a and b equal: false -> oh no, what's wrong?

updatedA := a.UpdateHash()
updatedB := b.UpdateHash()

fmt.Printf("are a and b equal: %v\n", a.Equal(b))
// are a and b equal: true -> check
fmt.Printf("number of updates:\na:%v\nb:%v\n", updatedA, updatedB)
// number of updates:
// a: 1
// b: 1
// -> check
Transformed examples as known from golang-set
requiredClasses := pyraset.NewSet()
requiredClasses.Add("Cooking")
requiredClasses.Add("English")
requiredClasses.Add("Math")
requiredClasses.Add("Biology")

scienceClasses := pyraset.NewSet("Biology", "Chemistry")

electiveClasses := pyraset.NewSet()
electiveClasses.Add("Welding")
electiveClasses.Add("Music")
electiveClasses.Add("Automotive")

bonusClasses := pyraset.NewSet()
bonusClasses.Add("Go Programming")
bonusClasses.Add("Python Programming")

// Show me all the available classes one can take
allClasses := requiredClasses.Union(scienceClasses).
    Union(electiveClasses).Union(bonusClasses)
fmt.Println(allClasses)
// Set{Cooking, English, Math, Chemistry, Welding, Biology, Music, Automotive, Go Programming, Python Programming}


// Is cooking considered a science class?
fmt.Println(scienceClasses.Contains("Cooking"))
// false

//Show me all classes that are not science classes.
fmt.Println(allClasses.Difference(scienceClasses))
// Set{Music, Automotive, Go Programming, Python Programming, Cooking, English, Math, Welding}

// Which science classes are also required classes?
fmt.Println(scienceClasses.Intersect(requiredClasses))
// Set{Biology}

// How many bonus classes do you offer?
fmt.Println(bonusClasses.Cardinality())
// 2

// Do you have the following classes? Welding, Automotive and English?
fmt.Println(allClasses.IsSuperset(pyraset.NewSet("Welding", "Automotive", "English")))
// true

Performance

The following charts show the ns/op ratios for the different set operations contrasted to golang-set. The smaller the values, the better the scores are, proportionally to the other ones.

safety

safety

More benchmark results can be accessed any analyzed here.

Documentation

Overview

Package mapset implements a simple and generic set collection. Items stored within it are unordered and unique. It supports typical set operations: membership testing, intersection, union, difference, symmetric difference and cloning.

mapset provides two implementations of the Set interface. The default implementation is safe for concurrent access, but a non-thread-safe implementation is also provided for programs that can benefit from the slight complexity improvement and that can enforce mutual exclusion through other means.

Index

Examples

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Iterator

type Iterator struct {
	C <-chan interface{}
	// contains filtered or unexported fields
}

Iterator defines an iterator over a Set, its C channel can be used to range over the Set's elements.

Example
package main

import (
	"fmt"
)

type YourType struct {
	Name string
}

func main() {
	set := NewSet(
		&YourType{Name: "Alise"},
		&YourType{Name: "Bob"},
		&YourType{Name: "John"},
		&YourType{Name: "Nick"},
	)

	var found *YourType
	it := set.Iterator()

	for elem := range it.C {
		if elem.(*YourType).Name == "John" {
			found = elem.(*YourType)
			it.Stop()
		}
	}

	fmt.Printf("Found %+v\n", found)

}
Output:

Found &{Name:John}

func (*Iterator) Stop

func (i *Iterator) Stop()

Stop stops the Iterator, no further elements will be received on C, C will be closed.

type OrderedPair

type OrderedPair struct {
	First  interface{}
	Second interface{}
}

OrderedPair represents a 2-tuple of values.

func (*OrderedPair) Equal

func (pair *OrderedPair) Equal(other OrderedPair) bool

Equal determines of this pair equals the given pair.

func (OrderedPair) String

func (pair OrderedPair) String() string

String outputs a 2-tuple in the form "(A, B)".

type Set

type Set interface {
	hashstructure.Hashable

	// UpdateHash updates the currently calculated hash of the set.
	// Use it if underlying elements are mutable and once they may have been changed.
	// It's not necessary to update the hashes but comparators will treat the element as if was not changed.
	// UpdateHash returns the number of updated hashes, and thus, the modified elements.
	UpdateHash() (updated int)

	// Add the given elements to this set.
	Add(i ...interface{})

	// Cardinality determines the number of elements in the set.
	Cardinality() int

	// Empty determines if the set is empty.
	Empty() bool

	// Clear removes all elements from the set, leaving the empty set.
	Clear()

	// Clone produces a clone of the set using the same implementation, duplicating all keys.
	Clone() Set

	// Contains determines whether the given items are all in the set.
	Contains(i ...interface{}) bool

	// Difference determines the difference between this set and the given set. The returned set will contain
	// all elements of this set that are not also elements of other.
	//
	// Note that the argument to Difference  must be of the same type as the receiver of the method.
	// Otherwise, Difference will panic.
	Difference(other Set) Set

	// Equal determines if two sets are equal to each other. If they have the same cardinality and contain the same
	// elements, they are considered equal. The order in which the elements were added is irrelevant.
	//
	// Note that the argument to Equal must be  of the same type as the receiver of the  method.
	// Otherwise, Equal will panic.
	Equal(other Set) bool

	// Returns a new set containing only the elements that exist only in both sets.
	//
	// Note that the argument to Intersect must be of the same type as the receiver of the method.
	// Otherwise, Intersect will panic.
	Intersect(other Set) Set

	// IsProperSubset determines if every element in this set is in the other set but the two sets are not equal.
	//
	// Note that the argument to IsProperSubset must be of the same type as the receiver of the method.
	// Otherwise, IsProperSubset will panic.
	IsProperSubset(other Set) bool

	// IsProperSuperset determines if every element in the other set is in this set but the two sets are notequal.
	//
	// Note that the argument to IsSuperset must be of the same type as the receiver of the method.
	// Otherwise, IsSuperset will panic.
	IsProperSuperset(other Set) bool

	// IsSubset determines if every element in this set is in the other set.
	//
	// Note that the argument to IsSubset  must be of the same type as the receiver of the method.
	// Otherwise, IsSubset will panic.
	IsSubset(other Set) bool

	// IsSuperset determines if every element in the other set is in this set.
	//
	// Note that the argument to IsSuperset  must be of the same type as the receiver of the method.
	// Otherwise, IsSuperset will panic.
	IsSuperset(other Set) bool

	// Each iterates over elements and executes the passed func against each element.
	// If passed func returns true, stop iteration eagerly.
	Each(func(interface{}) bool)

	// Iter returns a channel of elements that you can range over.
	Iter() <-chan interface{}

	// Iterator that you can use to range over the set.
	Iterator() *Iterator

	// Remove the given elements from this set.
	Remove(i ...interface{})

	// String provides a convenient string representation of the current state of the set.
	String() string

	// SymmetricDifference provides a new set with all elements which are  in either this set or the other set
	// but not in both.
	//
	// Note that the argument to SymmetricDifference must be of the same type as the receiver of the method.
	// Otherwise, SymmetricDifference will panic.
	SymmetricDifference(other Set) Set

	// Union provides a new set with all elements in this set and the given set.
	//
	// Note that the argument to Union must be of the same type as the receiver of the method.
	// Otherwise, IsSuperset will panic.
	Union(other Set) Set

	// Pop removes and returns an arbitrary item from the set.
	Pop() interface{}

	// PowerSet builds all subsets of a given set (Power Set).
	PowerSet() Set

	// CartesianProduct builds the Cartesian Product of this set and the given set.
	CartesianProduct(other Set) Set

	// ToSlice converts the members of the set as to a slice.
	ToSlice() []interface{}

	// CoreSet provides the non-thread-safe core set.
	CoreSet() threadUnsafeSet

	// ThreadSafe provides the thread-safe core set.
	ThreadSafe() *threadSafeSet

	// MarshalJSON creates a JSON array from the set, it marshals all elements
	MarshalJSON() ([]byte, error)

	// UnmarshalJSON recreates a set from a JSON array, it only decodes primitive types.
	// Numbers are decoded as json.Number.
	UnmarshalJSON(p []byte) error
}

Set is the primary interface provided by the mapset package. It represents an unordered set of data and a large number of operations that can be applied to that set.

func NewSet

func NewSet(elements ...interface{}) Set

NewSet creates a set that contains the given elements. Operations on the resulting set are thread-safe.

func NewUnsafeSet

func NewUnsafeSet(elements ...interface{}) Set

NewUnsafeSet creates a set that contains the given elements. Operations on the resulting set are not thread-safe.

type SetOptions

type SetOptions struct {
	// Cache enables the hashing cache so that previously added items can be quickly looked up.
	// The cache improves performance of comparable items and builtins.
	// The performance is only increased since golang hashing is slow due to the reflection used in the byte.Writer.
	// Unless generics or fast type checks are introduced in golang, this option trades some memory for performance.
	Cache bool
	// Unsafe makes the set non-thread-safe.
	Unsafe bool
	// Hasher overrides the default hash function.
	Hasher hash.Hash64
}

SetOptions contain options that affect the set construction.

func (SetOptions) New

func (o SetOptions) New(elements ...interface{}) (set Set)

New creates a new set with the given options.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL