safetyfast

package module
v0.0.0-...-635f495 Latest
Published: Jul 14, 2022 License: MIT Imports: 4 Imported by: 0

README


SafetyFast - Put thread-safety first, with the performance of safety last.

This is a Go library that implements synchronization primitives over Intel TSX (hardware transactional primitives).

go get github.com/linux4life798/safetyfast

Check out the SafetyFast Project Page.

Benchmarking

The following plot shows the number of milliseconds it took for 8 goroutines to each increment 480000 random elements of an array of ints. The x axis denotes how large (and therefore how sparsely accessed) the array was. Each series/line corresponds to the synchronization primitive used to guard the increment.

Performance Graph

Note that, as the array size increases, the likelihood of two goroutines touching the same element at the same instant decreases. This is why we see such a dramatic increase in speed when using either the HLE or RTM style synchronization primitive.
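
As a rough illustration of why contention falls off with array size, the sketch below computes the chance that at least two of g goroutines target the same element when each picks one of n elements uniformly at random. This is the standard birthday-problem formula applied as a simplified model of the benchmark; the function name collisionProbability is made up for this example and is not part of safetyfast.

```go
package main

import "fmt"

// collisionProbability returns the chance that at least two of g goroutines
// pick the same index when each picks one of n elements uniformly at random.
// It is a simplified model (one pick per goroutine at a single instant), not
// a measurement and not something taken from the library.
func collisionProbability(g, n int) float64 {
	if g > n {
		return 1 // more goroutines than elements: a collision is certain
	}
	distinct := 1.0
	for i := 0; i < g; i++ {
		// The i-th goroutine must avoid the i indices already taken.
		distinct *= 1 - float64(i)/float64(n)
	}
	return 1 - distinct
}

func main() {
	for _, n := range []int{8, 64, 1024, 1 << 20} {
		fmt.Printf("n=%8d  P(collision among 8) = %.6f\n", n, collisionProbability(8, n))
	}
}
```

Under this model the collision probability shrinks roughly in proportion to 1/n once n is much larger than the number of goroutines, which matches the trend described above.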

The SystemMutex is just sync.Mutex.

It is also worth observing that the performance started to degrade towards the very large array sizes. This is most likely due to a cache size limitation.
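
The benchmark described above can be approximated with the standard library alone. The following is a minimal sketch, not the project's actual benchmark code: incrementRandom, newSystemMutex, and sum are hypothetical helpers, and only the sync.Mutex baseline is shown. Any other sync.Locker could be passed in its place to compare primitives.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"
)

// newSystemMutex returns the baseline lock; the README notes that the
// SystemMutex series is just sync.Mutex.
func newSystemMutex() sync.Locker { return &sync.Mutex{} }

// incrementRandom starts `workers` goroutines. Each one increments
// `perWorker` randomly chosen elements of a shared slice of length `size`,
// holding lock around every increment. It returns the slice and elapsed time.
func incrementRandom(lock sync.Locker, workers, perWorker, size int) ([]int, time.Duration) {
	arr := make([]int, size)
	var wg sync.WaitGroup
	start := time.Now()
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(seed int64) {
			defer wg.Done()
			// A per-goroutine source avoids contending on the global one.
			rng := rand.New(rand.NewSource(seed))
			for i := 0; i < perWorker; i++ {
				idx := rng.Intn(size)
				lock.Lock()
				arr[idx]++
				lock.Unlock()
			}
		}(int64(w) + 1)
	}
	wg.Wait()
	return arr, time.Since(start)
}

// sum adds up all elements so the caller can check that no increment was lost.
func sum(arr []int) int {
	total := 0
	for _, v := range arr {
		total += v
	}
	return total
}

func main() {
	for _, size := range []int{16, 1024, 1 << 16} {
		arr, elapsed := incrementRandom(newSystemMutex(), 8, 480000, size)
		fmt.Printf("size=%6d  increments=%d  sync.Mutex: %v\n", size, sum(arr), elapsed)
	}
}
```

With a plain mutex every increment serializes regardless of array size, so this baseline is not expected to show the speedup that the HLE and RTM series show in the plot.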

Snippets

Using RTM

m := map[string]int{
    "word1": 0,
}

c := safetyfast.NewRTMContexDefault()
c.Atomic(func() {
    // Action to be done transactionally
    m["word1"] = m["word1"] + 1
})

Using HLE

m := map[string]int{
    "word1": 0,
}

var lock safetyfast.SpinHLEMutex
lock.Lock()
// Action to be done transactionally
m["word1"] = m["word1"] + 1
lock.Unlock()


Check if your CPU supports Intel TSX

Use the doihavetsx utility

go get github.com/linux4life798/safetyfast/doihavetsx
doihavetsx

The output should look something like:

CPU Brand:  Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
RTM:        Yes
HLE:        Yes

Common CPUs and Machines

| CPU Name | CPU Codename / Generation | TSX Supported | Machine Description |
| --- | --- | --- | --- |
| Intel(R) Xeon(R) CPU E3-1505M v5 @ 2.80GHz | Skylake/6th | Yes | Dell Precision 5510 |
| Intel(R) Xeon(R) Gold 6136 CPU @ 3.00GHz | Skylake/6th | Yes | Dell Precision 7920 |
| Intel(R) Core(TM) i7-3720QM CPU @ 2.60GHz | Ivy Bridge/3rd | No | MacBook Pro (Retina, Mid 2012) |
| Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz | Kaby Lake/7th | Yes | MacBook Pro "Core i7" 2.9 15" Touch/Mid-2017 |
| Intel(R) Core(TM) i7-4650U CPU @ 1.70GHz | Haswell/4th | No (Yes-before microcode install) | MacBook Air (13-inch, Early 2014) |
| Intel(R) Core(TM) i7-7Y75 CPU @ 1.30GHz | Kaby Lake/7th | Yes | MacBook (Retina, 12-inch, 2017) |
| Intel(R) Core(TM) i7-4980HQ CPU @ 2.80GHz | Haswell/4th | No | MacBook Pro (Retina, 15-inch, Mid 2015) |
| Intel(R) Core(TM) i7-7920HQ CPU @ 3.10GHz | Kaby Lake/7th | Yes | MacBook Pro (15-inch, 2017) |

Please add your machine to this table! Pull-request or issues welcome.



Code Examples

Checking for HLE and RTM support in code

You must check that the CPU you are using supports the Intel RTM and/or Intel HLE instruction sets, since safetyfast does not perform this check itself. This can be done with the Intel-provided cpuid package, as shown below.

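package main

import (
	"github.com/intel-go/cpuid"
)

func main() {
	// Refuse to continue if RTM (used by RTMContext) is unavailable.
	if !cpuid.HasExtendedFeature(cpuid.RTM) {
		panic("The CPU does not support Intel RTM")
	}

	// Refuse to continue if HLE (used by SpinHLEMutex) is unavailable.
	if !cpuid.HasExtendedFeature(cpuid.HLE) {
		panic("The CPU does not support Intel HLE")
	}
}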

Using RTM

package main

import (
    "fmt"
    "sync"
    "github.com/linux4life798/safetyfast"
)

func main() {
    m := map[string]int{
        "word1": 0,
        "word2": 0,
    }

    c := safetyfast.NewRTMContexDefault()
    var wg sync.WaitGroup

    wg.Add(2)
    go c.Atomic(func() {
        // Action to be done transactionally
        m["word1"] = m["word1"] + 1
        wg.Done()
    })
    go c.Atomic(func() {
        // Action to be done transactionally
        m["word1"] = m["word1"] + 1
        wg.Done()
    })
    wg.Wait()

    fmt.Println("word1 =", m["word1"])
}

Using HLE

package main

import (
    "fmt"
    "sync"
    "github.com/linux4life798/safetyfast"
)

func main() {
    m := map[string]int{
        "word1": 0,
        "word2": 0,
    }

    var lock safetyfast.SpinHLEMutex
    var wg sync.WaitGroup

    wg.Add(2)
    go func() {
        lock.Lock()
        // Action to be done transactionally
        m["word1"] = m["word1"] + 1
        lock.Unlock()
        wg.Done()
    }()
    go func() {
        lock.Lock()
        // Action to be done transactionally
        m["word1"] = m["word1"] + 1
        lock.Unlock()
        wg.Done()
    }()
    wg.Wait()

    fmt.Println("word1 =", m["word1"])
}

Documentation

Index

Constants

const LockAttempts = int32(200)

LockAttempts sets how many times the spin loop is willing to try to fetch the lock.

Variables

This section is empty.

Functions

func HLESpinCountLock

func HLESpinCountLock(val, attempts *int32)

HLESpinCountLock tries to set val to 1 at most attempts times using Intel HLE. It is implemented as a spin lock that decrements attempts for each attempt. The spin operation makes use of the PAUSE and XACQUIRE LOCK XCHG instructions. If attempts is 0 when the function returns, the lock was not acquired and the spin lock gave up. Please note that attempts must be greater than 0 when called.

func HLESpinLock

func HLESpinLock(val *int32)

HLESpinLock repeatedly tries to set val to 1 using Intel HLE and XCHG. It is implemented as a spin lock that makes use of the PAUSE and LOCK XCHG instructions. Please note that this function will never return unless the lock is acquired. This means a deadlock will occur if the holder of the lock is descheduled by the Go runtime. Please use HLESpinCountLock to limit the spins and manually invoke runtime.Gosched periodically instead.
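
The bounded-spin-then-yield pattern recommended above might look like the following sketch. So that it runs on any CPU, it uses a hypothetical stand-in, spinCountLock, built on sync/atomic with the same calling convention as HLESpinCountLock (attempts is 0 on return if the lock was not acquired). The helper names lockYielding, unlock, and countWithLock are also invented for this example. With the real library one would presumably call safetyfast.HLESpinCountLock and safetyfast.HLEUnlock in their place.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// spinCountLock is a portable stand-in for HLESpinCountLock. It gives up
// after *attempts failed tries, leaving *attempts at 0 in that case.
// It uses sync/atomic rather than HLE instructions.
func spinCountLock(val, attempts *int32) {
	for *attempts > 0 {
		// Read first, then swap, so waiters mostly spin on a plain load.
		if atomic.LoadInt32(val) == 0 && atomic.SwapInt32(val, 1) == 0 {
			return // acquired; *attempts is still greater than 0
		}
		*attempts--
	}
}

// lockYielding retries in bounded bursts and yields to the scheduler between
// bursts, so a descheduled lock holder gets a chance to run and unlock.
func lockYielding(val *int32) {
	for {
		attempts := int32(200) // same value as the package's LockAttempts
		spinCountLock(val, &attempts)
		if attempts > 0 {
			return
		}
		runtime.Gosched()
	}
}

// unlock releases the lock by writing 0 to val.
func unlock(val *int32) { atomic.StoreInt32(val, 0) }

// countWithLock has `workers` goroutines each add 1 to a shared counter
// `perWorker` times under the lock, and returns the final count.
func countWithLock(workers, perWorker int) int {
	var lock int32
	var wg sync.WaitGroup
	counter := 0
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < perWorker; i++ {
				lockYielding(&lock)
				counter++
				unlock(&lock)
			}
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println(countWithLock(4, 10000))
}
```

The burst size trades latency against scheduler overhead: a larger value spins longer before yielding, which helps when critical sections are short and hurts when the holder has been descheduled.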

func HLETryLock

func HLETryLock(val *int32) int32

HLETryLock attempts only once to acquire the lock by writing a 1 to val using HLE primitives. This function returns a 0 if the lock was acquired.

func HLEUnlock

func HLEUnlock(val *int32)

HLEUnlock writes a 0 to val, using HLE primitives, to indicate that the lock has been released.

func Lock1XCHG32

func Lock1XCHG32(val *int32) (old int32)

Lock1XCHG32 will atomically write 1 to val while returning the old value. The size of val must be 32 bits.

func Lock1XCHG64

func Lock1XCHG64(val *int64) (old int64)

Lock1XCHG64 will atomically write 1 to val while returning the old value. The size of val must be 64 bits.

func Lock1XCHG8

func Lock1XCHG8(val *int8) (old int8)

Lock1XCHG8 will atomically write 1 to val while returning the old value. The size of val must be 8 bits.
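
The exchange-and-return-old-value convention shared by the Lock1XCHG functions and HLETryLock can be mimicked with sync/atomic. The sketch below is a portable stand-in for illustration only: lock1Swap32, tryLock, and release are hypothetical names, and the library's own routines are implemented differently (per the descriptions above, with x86 XCHG instructions).

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// lock1Swap32 atomically writes 1 to val and returns the previous value,
// which is the behavior documented for Lock1XCHG32.
func lock1Swap32(val *int32) (old int32) {
	return atomic.SwapInt32(val, 1)
}

// tryLock follows the HLETryLock convention: a return value of 0 means the
// lock was free and is now held by the caller.
func tryLock(val *int32) int32 { return lock1Swap32(val) }

// release writes 0 to val to mark the lock as free.
func release(val *int32) { atomic.StoreInt32(val, 0) }

func main() {
	var lock int32
	fmt.Println(tryLock(&lock)) // 0: the lock was free and is now held
	fmt.Println(tryLock(&lock)) // 1: the lock was already held
	release(&lock)
	fmt.Println(tryLock(&lock)) // 0: free again after release
}
```

Returning the old value is what makes a single exchange sufficient: the caller learns whether it was the one that flipped the word from 0 to 1 without a separate read.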

func Mfence

func Mfence()

Mfence executes the MFENCE x86 instruction.

func Pause

func Pause()

Pause executes the PAUSE x86 instruction.

func SetAndFence32

func SetAndFence32(val *int32)

SetAndFence32 writes a 1 to val and asserts an MFENCE.

func SpinCountLock

func SpinCountLock(val, attempts *int32)

SpinCountLock implements a basic spin lock that tries to acquire the lock only attempts times. It assumes the lock has been acquired when it successfully writes a 1 to val, while the original value was 0. This implementation spins on a read only and then uses the XCHG instruction in order to claim the lock. The loop makes use of the PAUSE hint instruction.

func SpinLock

func SpinLock(val *int32)

SpinLock implements a basic spin lock that waits forever until the lock can be acquired. It assumes the lock has been acquired when it successfully writes a 1 to val, while the original value was 0. This implementation spins on a read only and then uses the XCHG instruction in order to claim the lock. The loop makes use of the PAUSE hint instruction.

func SpinLockAtomics

func SpinLockAtomics(val *int32)

SpinLockAtomics implements a very basic spin-forever style lock that uses the built-in atomic.SwapInt32 function (from sync/atomic) to claim the lock.

Types

type AtomicContext

type AtomicContext interface {
	// Atomic will execute commiter exactly once per call in a manner that
	// appears to be atomic with respect to other commiters launched from this
	// AtomicContext.
	Atomic(commiter func())
}

AtomicContext is the interface provided by a synchronization primitive that is capable of running a function in an atomic context.

type LockedContext

type LockedContext struct {
	// contains filtered or unexported fields
}

LockedContext provides an AtomicContext that utilizes any sync.Locker.

func NewLockedContext

func NewLockedContext(lock sync.Locker) *LockedContext

NewLockedContext creates a LockedContext that uses lock as the sync method.

func (*LockedContext) Atomic

func (c *LockedContext) Atomic(commiter func())

Atomic executes commiter atomically with respect to other commiters launched from this context.
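
To make the relationship between AtomicContext and a sync.Locker concrete, here is a minimal sketch of how a lock-backed context could be written. It is a hypothetical reimplementation for illustration, not the library's source: atomicContext, lockedContext, newLockedContext, newMutexContext, and incrementConcurrently are local names chosen so the example compiles without importing safetyfast.

```go
package main

import (
	"fmt"
	"sync"
)

// atomicContext repeats the shape of the package's AtomicContext interface.
type atomicContext interface {
	Atomic(commiter func())
}

// lockedContext sketches the idea behind LockedContext: hold a sync.Locker
// for the duration of commiter.
type lockedContext struct {
	lock sync.Locker
}

func newLockedContext(lock sync.Locker) *lockedContext {
	return &lockedContext{lock: lock}
}

// Atomic runs commiter while holding the lock.
func (c *lockedContext) Atomic(commiter func()) {
	c.lock.Lock()
	defer c.lock.Unlock() // released even if commiter panics
	commiter()
}

// newMutexContext builds a context backed by a plain sync.Mutex.
func newMutexContext() atomicContext { return newLockedContext(&sync.Mutex{}) }

// incrementConcurrently bumps a map entry from many goroutines through ctx
// and returns the final value.
func incrementConcurrently(ctx atomicContext, goroutines int) int {
	m := map[string]int{"word1": 0}
	var wg sync.WaitGroup
	for i := 0; i < goroutines; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			ctx.Atomic(func() { m["word1"]++ })
		}()
	}
	wg.Wait()
	return m["word1"]
}

func main() {
	fmt.Println("word1 =", incrementConcurrently(newMutexContext(), 100))
}
```

Because callers depend only on the interface, code written this way can switch between a lock-backed context and an RTM-backed one without changing the commiter functions.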

type RTMContext

type RTMContext struct {
	// contains filtered or unexported fields
}

RTMContext holds the shared state for the fallback path that is taken if the RTM transaction fails.

func NewRTMContex

func NewRTMContex(l sync.Locker) *RTMContext

NewRTMContex creates an AtomicContext that tries to use Intel RTM, but can fallback to using the provided sync.Locker.

func NewRTMContexDefault

func NewRTMContexDefault() *RTMContext

NewRTMContexDefault creates an AtomicContext that tries to use Intel RTM, but can fallback to using the native sync.Mutex.

func (*RTMContext) Atomic

func (r *RTMContext) Atomic(commiter func())

Atomic executes the commiter in an atomic fashion.

func (*RTMContext) CapacityAborts

func (r *RTMContext) CapacityAborts() uint64

CapacityAborts returns the number of aborts that were due to cache capacity. If you see lots of capacity aborts, the commiter function is touching too many memory locations and is unlikely to be reaping any gains from using an RTMContext.

type SpinHLEMutex

type SpinHLEMutex int32

SpinHLEMutex is a sync.Mutex replacement that uses HLE.

func (*SpinHLEMutex) Lock

func (m *SpinHLEMutex) Lock()

func (*SpinHLEMutex) Unlock

func (m *SpinHLEMutex) Unlock()

type SpinMutex

type SpinMutex int32

func (*SpinMutex) IsLocked

func (m *SpinMutex) IsLocked() bool

func (*SpinMutex) Lock

func (m *SpinMutex) Lock()

Fastest

func (*SpinMutex) Unlock

func (m *SpinMutex) Unlock()

type SpinMutexASM

type SpinMutexASM int32

func (*SpinMutexASM) Lock

func (m *SpinMutexASM) Lock()

func (*SpinMutexASM) Unlock

func (m *SpinMutexASM) Unlock()

type SpinMutexBasic

type SpinMutexBasic struct {
	// contains filtered or unexported fields
}

func (*SpinMutexBasic) Lock

func (m *SpinMutexBasic) Lock()

func (*SpinMutexBasic) Unlock

func (m *SpinMutexBasic) Unlock()

Directories

Path Synopsis
This is a contrived benchmark that caters to transactional memory primitives.
This simple program prints the name of the cpu and if it supports Intel RTM and HLE.
