import "gorgonia.org/gorgonia"
Package gorgonia is a library that helps facilitate machine learning in Go. Write and evaluate mathematical equations involving multidimensional arrays easily. Do differentiation with them just as easily.
Autodiff showcases automatic differentiation
Code:
g := NewGraph()

var x, y, z *Node
var err error

// define the expression
x = NewScalar(g, Float64, WithName("x"))
y = NewScalar(g, Float64, WithName("y"))
if z, err = Add(x, y); err != nil {
	log.Fatal(err)
}

// set initial values then run
Let(x, 2.0)
Let(y, 2.5)

// by default, LispMachine performs forward mode and backwards mode execution
m := NewLispMachine(g)
defer m.Close()
if err = m.RunAll(); err != nil {
	log.Fatal(err)
}

fmt.Printf("z: %v\n", z.Value())
if xgrad, err := x.Grad(); err == nil {
	fmt.Printf("dz/dx: %v\n", xgrad)
}

if ygrad, err := y.Grad(); err == nil {
	fmt.Printf("dz/dy: %v\n", ygrad)
}
Output:
z: 4.5
dz/dx: 1
dz/dy: 1
Basic example of representing mathematical equations as graphs.
In this example, we want to represent the following equation
z = x + y
Code:
g := NewGraph()

var x, y, z *Node
var err error

// define the expression
x = NewScalar(g, Float64, WithName("x"))
y = NewScalar(g, Float64, WithName("y"))
if z, err = Add(x, y); err != nil {
	log.Fatal(err)
}

// create a VM to run the program on
machine := NewTapeMachine(g)
defer machine.Close()

// set initial values then run
Let(x, 2.0)
Let(y, 2.5)
if err = machine.RunAll(); err != nil {
	log.Fatal(err)
}

fmt.Printf("%v", z.Value())
Output:
4.5
Code:
xV, yV, bs := prep()
concurrentTraining(xV, yV, bs, epochs)

fmt.Printf("x:\n%1.1v", xV)
fmt.Printf("y:\n%1.1v", yV)
Output:
x:
⎡-0.0003 0.01 0.04 0.09 0.2⎤
⎢-0.0003 0.01 0.04 0.09 0.2⎥
⎢-0.0003 0.01 0.04 0.09 0.2⎥
⎢-0.0003 0.01 0.04 0.09 0.2⎥
.
.
.
⎢-0.0003 0.01 0.04 0.09 0.2⎥
⎢-0.0003 0.01 0.04 0.09 0.2⎥
⎢-0.0003 0.01 0.04 0.09 0.2⎥
⎣-0.0003 0.01 0.04 0.09 0.2⎦
y:
[0.3 0.3 0.3 0.3 ... 0.3 0.3 0.3 0.3]
Gorgonia provides an API that is fairly idiomatic: most of the functions in the API return (T, error). This is useful in many cases, such as an interactive shell for deep learning. However, it also makes composing functions together a bit cumbersome.
To that end, Gorgonia provides two alternatives: first, the `Lift`-based functions; second, the `Must` function.
Code:
// Lift
g := NewGraph()
x := NewMatrix(g, Float32, WithShape(2, 3), WithInit(RangedFrom(0)), WithName("a"))
y := NewMatrix(g, Float32, WithShape(3, 2), WithInit(ValuesOf(float32(2))), WithName("b"))
z := NewMatrix(g, Float32, WithShape(2, 1), WithInit(Zeroes()), WithName("bias"))
wrong := NewMatrix(g, Float64, WithShape(2, 3), WithInit(RangedFrom(0)), WithName("wrong"))

// Different LiftXXX functions exist for different API signatures
// A good way to do this is to have some instantiated functions at the top level of the package
mul := Lift2(Mul)
add := Lift2(Add)
addB := Lift2Broadcast(BroadcastAdd)
sq := Lift1(Square)
sm := Lift1Axial(SoftMax)

nn := sm(sq(addB(mul(x, y), z, nil, []byte{1}))) // OK
nnPlusWrong := add(nn, wrong)                    // Wrong types. Will Error
fmt.Printf("nn: %v\nAn error occurs: %v\n", nn, nnPlusWrong.Err())

// Must()
h := NewGraph()
a := NewMatrix(h, Float32, WithShape(2, 3), WithInit(RangedFrom(0)), WithName("a"))
b := NewMatrix(h, Float32, WithShape(3, 2), WithInit(ValuesOf(float32(2))), WithName("b"))
c := NewMatrix(h, Float32, WithShape(2, 1), WithInit(RangedFrom(0)), WithName("c"))
wrong2 := NewMatrix(h, Float64, WithShape(2, 3), WithInit(RangedFrom(0)), WithName("wrong"))

// This is OK
nn2 := Must(SoftMax(
	Must(Square(
		Must(BroadcastAdd(
			Must(Mul(a, b)),
			c,
			nil, []byte{1},
		)),
	)),
))

fmt.Printf("nn2: %v\n", nn2)

defer func() {
	if r := recover(); r != nil {
		fmt.Printf("An error occurs (caught by recover()): %v\n", r)
	}
}()

nn2PlusWrong := Must(Add(nn2, wrong2))
_ = nn2PlusWrong
Output:
nn: Softmax{-1, false}()(%9) :: Matrix float32
An error occurs: Type inference error. Op: + false. Children: [Matrix float32, Matrix float64], OpType:Matrix a → Matrix a → Matrix a: Unable to unify while inferring type of + false: Unification Fail: float64 ~ float32 cannot be unified
nn2: Softmax{-1, false}()(%9) :: Matrix float32
An error occurs (caught by recover()): Type inference error. Op: + false. Children: [Matrix float32, Matrix float64], OpType:Matrix a → Matrix a → Matrix a: Unable to unify while inferring type of + false: Unification Fail: float64 ~ float32 cannot be unified
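The two composition styles can be illustrated without Gorgonia itself. The following is a minimal, standard-library-only sketch of the idea; `result`, `lift2`, `must`, and `safeDiv` are hypothetical names chosen for illustration and are not part of Gorgonia's API.

```go
package main

import (
	"errors"
	"fmt"
)

// result is a hypothetical value-or-error carrier (not Gorgonia's Result).
type result struct {
	val float64
	err error
}

// lift2 wraps a (T, error) function so it accepts and returns results.
// An error in either input short-circuits the call, so errors propagate
// through nested calls without explicit checks at every step.
func lift2(fn func(a, b float64) (float64, error)) func(a, b result) result {
	return func(a, b result) result {
		if a.err != nil {
			return a
		}
		if b.err != nil {
			return b
		}
		v, err := fn(a.val, b.val)
		return result{val: v, err: err}
	}
}

// must panics on error, which allows direct nesting of (T, error) calls.
func must(v float64, err error) float64 {
	if err != nil {
		panic(err)
	}
	return v
}

// safeDiv is an example function with the idiomatic (T, error) signature.
func safeDiv(a, b float64) (float64, error) {
	if b == 0 {
		return 0, errors.New("division by zero")
	}
	return a / b, nil
}

func main() {
	div := lift2(safeDiv)
	ok := div(div(result{val: 8}, result{val: 2}), result{val: 2})
	bad := div(div(result{val: 8}, result{val: 0}), result{val: 2})
	fmt.Println(ok.val, ok.err)
	fmt.Println(bad.err)
	fmt.Println(must(safeDiv(must(safeDiv(8, 2)), 2)))
}
```

The lifted style defers error handling to the end of the chain, while the `must` style trades that for a panic that has to be caught with recover().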
Code:
g := NewGraph()
a := NodeFromAny(g, tensor.New(tensor.WithShape(2, 3), tensor.WithBacking([]float64{1, 2, 3, 4, 5, 6})))
m1, _ := Mean(a, 1)
m2, _ := KeepDims(a, false, func(a *Node) (*Node, error) { return Mean(a, 1) })
m3, _ := Mean(a, 0)
m4, _ := KeepDims(a, true, func(a *Node) (*Node, error) { return Mean(a, 0) })
m5, _ := KeepDims(a, true, func(a *Node) (*Node, error) { return Mean(a) })
// these reads are necessary as the VM may feel free to clobber the underlying data.
// e.g. if m1.Value() is used in the print statement below, the answer will be wrong.
// This is because before the VM executes the operations, a check is done to see if unsafe
// operations may be done. Unsafe operations are useful in saving memory.
// In this example, Reshape can be unsafely done if no other node is "using" m1,
// so m1.Value() will have its shape clobbered. Thus if m1.Value() is read after the VM has run,
// there is no guarantee that the data is correct. The only way around this is to "use" m1, by the Read() function.
var m1v, m2v, m3v, m4v Value
Read(m1, &m1v)
Read(m2, &m2v)
Read(m3, &m3v)
Read(m4, &m4v)
vm := NewTapeMachine(g)
if err := vm.RunAll(); err != nil {
panic(err)
}
fmt.Printf("a:\n%v\n", a.Value())
fmt.Printf("m1 (shape: %v):\n%v\n", m1.Value().Shape(), m1v)
fmt.Printf("m2 (shape: %v):\n%v\n", m2.Value().Shape(), m2v)
fmt.Printf("m3 (shape: %v):\n%v\n", m3.Value().Shape(), m3v)
fmt.Printf("m4 (shape: %v):\n%v\n", m4.Value().Shape(), m4v)
fmt.Printf("m5 (shape: %v):\n%v\n", m5.Value().Shape(), m5.Value())
Output:
a:
⎡1 2 3⎤
⎣4 5 6⎦
m1 (shape: (2)):
[2 5]
m2 (shape: (2, 1)):
C[2 5]
m3 (shape: (3)):
[2.5 3.5 4.5]
m4 (shape: (1, 3)):
R[2.5 3.5 4.5]
m5 (shape: (1, 1)):
[[3.5]]
Linear Regression Example
The formula for a straight line is
y = mx + c
We want to find an `m` and a `c` that fits the equation well. We'll do it in both float32 and float64 to showcase the extensibility of Gorgonia
Code:
package main

import (
	"fmt"
	"log"
	"math/rand"
	"runtime"

	. "gorgonia.org/gorgonia"
	"gorgonia.org/tensor"
)

const (
	vecSize = 1000000
)

// manually generate a fake dataset which is y=2x+random
func xy(dt tensor.Dtype) (x tensor.Tensor, y tensor.Tensor) {
	var xBack, yBack interface{}
	switch dt {
	case Float32:
		xBack = tensor.Range(tensor.Float32, 1, vecSize+1).([]float32)
		yBackC := tensor.Range(tensor.Float32, 1, vecSize+1).([]float32)

		for i, v := range yBackC {
			yBackC[i] = v*2 + rand.Float32()
		}
		yBack = yBackC
	case Float64:
		xBack = tensor.Range(tensor.Float64, 1, vecSize+1).([]float64)
		yBackC := tensor.Range(tensor.Float64, 1, vecSize+1).([]float64)

		for i, v := range yBackC {
			yBackC[i] = v*2 + rand.Float64()
		}
		yBack = yBackC
	}

	x = tensor.New(tensor.WithBacking(xBack), tensor.WithShape(vecSize))
	y = tensor.New(tensor.WithBacking(yBack), tensor.WithShape(vecSize))
	return
}

func random(dt tensor.Dtype) interface{} {
	rand.Seed(13370)
	switch dt {
	case tensor.Float32:
		return rand.Float32()
	case tensor.Float64:
		return rand.Float64()
	default:
		panic("Unhandled dtype")
	}
}

func linregSetup(Float tensor.Dtype) (m, c *Node, machine VM) {
	var xT, yT Value
	xT, yT = xy(Float)

	g := NewGraph()
	x := NewVector(g, Float, WithShape(vecSize), WithName("x"), WithValue(xT))
	y := NewVector(g, Float, WithShape(vecSize), WithName("y"), WithValue(yT))
	m = NewScalar(g, Float, WithName("m"), WithValue(random(Float)))
	c = NewScalar(g, Float, WithName("c"), WithValue(random(Float)))

	pred := Must(Add(Must(Mul(x, m)), c))
	se := Must(Square(Must(Sub(pred, y))))
	cost := Must(Mean(se))

	if _, err := Grad(cost, m, c); err != nil {
		log.Fatalf("Failed to backpropagate: %v", err)
	}

	// machine := NewLispMachine(g) // you can use a LispMachine, but it'll be VERY slow.
	machine = NewTapeMachine(g, BindDualValues(m, c))
	return m, c, machine
}

func linregRun(m, c *Node, machine VM, iter int, autoCleanup bool) (retM, retC Value) {
	if autoCleanup {
		defer machine.Close()
	}
	model := []ValueGrad{m, c}
	solver := NewVanillaSolver(WithLearnRate(0.001), WithClip(5)) // good idea to clip

	if CUDA {
		runtime.LockOSThread()
		defer runtime.UnlockOSThread()
	}
	var err error
	for i := 0; i < iter; i++ {
		if err = machine.RunAll(); err != nil {
			fmt.Printf("Error during iteration: %v: %v\n", i, err)
			break
		}

		if err = solver.Step(model); err != nil {
			log.Fatal(err)
		}

		machine.Reset() // Reset is necessary in a loop like this
	}
	return m.Value(), c.Value()
}

func linearRegression(Float tensor.Dtype, iter int) (retM, retC Value) {
	defer runtime.GC()
	m, c, machine := linregSetup(Float)
	return linregRun(m, c, machine, iter, true)
}

// Linear Regression Example
//
// The formula for a straight line is
//		y = mx + c
// We want to find an `m` and a `c` that fits the equation well. We'll do it in both float32 and float64 to showcase the extensibility of Gorgonia
func main() {
	var m, c Value

	// Float32
	m, c = linearRegression(Float32, 500)
	fmt.Printf("float32: y = %3.3fx + %3.3f\n", m, c)

	// Float64
	m, c = linearRegression(Float64, 500)
	fmt.Printf("float64: y = %3.3fx + %3.3f\n", m, c)
}
Code:
xV, yV, _ := prep()
nonConcurrentTraining(xV, yV, epochs)

fmt.Printf("x:\n%1.1v", xV)
fmt.Printf("y:\n%1.1v", yV)
Output:
x:
⎡-0.0003 0.01 0.04 0.09 0.2⎤
⎢-0.0003 0.01 0.04 0.09 0.2⎥
⎢-0.0003 0.01 0.04 0.09 0.2⎥
⎢-0.0003 0.01 0.04 0.09 0.2⎥
.
.
.
⎢-0.0003 0.01 0.04 0.09 0.2⎥
⎢-0.0003 0.01 0.04 0.09 0.2⎥
⎢-0.0003 0.01 0.04 0.09 0.2⎥
⎣-0.0003 0.01 0.04 0.09 0.2⎦
y:
[0.3 0.3 0.3 0.3 ... 0.3 0.3 0.3 0.3]
SymbolicDiff showcases symbolic differentiation
Code:
g := NewGraph()

var x, y, z *Node
var err error

// define the expression
x = NewScalar(g, Float64, WithName("x"))
y = NewScalar(g, Float64, WithName("y"))
if z, err = Add(x, y); err != nil {
	log.Fatal(err)
}

// symbolically differentiate z with regards to x and y
// this adds the gradient nodes to the graph g
var grads Nodes
if grads, err = Grad(z, x, y); err != nil {
	log.Fatal(err)
}

// create a VM to run the program on
machine := NewTapeMachine(g)
defer machine.Close()

// set initial values then run
Let(x, 2.0)
Let(y, 2.5)
if err = machine.RunAll(); err != nil {
	log.Fatal(err)
}

fmt.Printf("z: %v\n", z.Value())
if xgrad, err := x.Grad(); err == nil {
	fmt.Printf("dz/dx: %v | %v\n", xgrad, grads[0].Value())
}

if ygrad, err := y.Grad(); err == nil {
	fmt.Printf("dz/dy: %v | %v\n", ygrad, grads[1].Value())
}
Output:
z: 4.5
dz/dx: 1 | 1
dz/dy: 1 | 1
analysis.go api_gen.go batch.go bitmap.go blas.go broadcast.go collections.go compile.go concurrency.go const.go device.go differentiation.go doc.go dual.go engine.go equalities.go ermagerdmonards.go errors.go execution.go formatter.go gorgonia.go graph.go interfaces.go math.go math_nooptim.go mathutils_amd64.go nn.go node.go node_set.go noextern.go op.go op_by_indices.go op_infidel.go op_math.go op_math_noextern.go op_nn.go op_nondiff.go op_reduction.go op_softmax.go op_sparsemax.go op_tensor.go op_types.go op_upsample.go op_yolo.go operations.go operations_nondiff.go operatorLinAlg.go operatorLinAlg_const.go operatorPointwise_binary.go operatorPointwise_binary_const.go operatorPointwise_unary.go operatorPointwise_unary_const.go operatorPointwise_unary_gen.go opt.go perf.go regalloc.go release.go shape.go slice.go solvers.go stabilization.go templates.go type.go typeSystem.go utils.go values.go values_primitives.go values_utils.go vm.go vm_genera.go vm_genera_nocuda.go vm_tape.go vm_tape_nocuda.go walker.go weights.go
CUDA indicates if this build is using CUDA
DEBUG indicates if this build is in debug mode. It is not.
var (
	// Float64 ...
	Float64 = tensor.Float64
	// Float32 ...
	Float32 = tensor.Float32
	// Int ...
	Int = tensor.Int
	// Int64 ...
	Int64 = tensor.Int64
	// Int32 ...
	Int32 = tensor.Int32
	// Byte ...
	Byte = tensor.Uint8
	// Bool ...
	Bool = tensor.Bool

	// Ptr is equivalent to interface{}. Ugh Ugh Ugh
	Ptr = tensor.UnsafePointer
)
func BatchNorm(x, scale, bias *Node, momentum, epsilon float64) (retVal, γ, β *Node, op *BatchNormOp, err error)
BatchNorm applies batch normalization. This operator can be used in the forward pass or for training. For evaluation only, the "op" output can be discarded. In the training phase, γ and β can be discarded and the op should be used.
func BatchNorm1d(x, scale, bias *Node, momentum, epsilon float64) (retVal, γ, β *Node, op *BatchNormOp, err error)
BatchNorm1d applies batch normalization to a matrix. This operator can be used in the forward pass or for training. For evaluation only, the "op" output can be discarded. In the training phase, γ and β can be discarded and the op should be used.
Binomial32 returns a []float32 drawn from a binomial distribution given the trial and probability parameters.
Binomial64 returns a []float64 drawn from a binomial distribution given the trial and probability parameters.
Broadcast applies the pattern to the input nodes and returns two nodes suitable for a binary operator. Broadcast works somewhat like Numpy's broadcast, except it's now exposed as a function.
Broadcasts with nils in both left and right patterns will yield the original inputs.
Code:
g := NewGraph()
x := NewMatrix(g, Float64, WithShape(2, 3), WithName("x"))
y := NewMatrix(g, Float64, WithShape(2, 3), WithName("y"))

a, b, err := Broadcast(x, y, NewBroadcastPattern(nil, nil))
if err != nil {
	fmt.Printf("Error: %v\n", err)
	return
}
fmt.Printf("a == x %t; b == y %t", a == x, b == y)
Output:
a == x true; b == y true
CheckOne checks whether an input is an error
Compile takes a graph and outputs a program suitable for *tapeMachine to run
func CompileFunction(g *ExprGraph, inputs, outputs Nodes) (prog *program, locMap map[*Node]register, err error)
CompileFunction takes a graph, subsets it based on the input and output nodes provided, and outputs a program suitable for *tapeMachine to run. It is analogous to theano.Function(). If some input nodes are unused or unreachable, this function returns an error.
func DebugDerives()
DebugDerives turns on the derivation debug option when printing a graph
DimSizersToShapes is a convenience function to convert a slice of DimSizer to a slice of tensor.Shape. It will return an error if any of them isn't a tensor.Shape
func DontDebugDerives()
DontDebugDerives turns off derivation debug option when printing a graph. It is off by default
Err is a function that returns a gErr. It wraps errors with stack information. A gErr implements Result, as well as error. This way, the Err() method acts as an unwrapper.
func FmtNodeMap(m interface{}) mapFmt
FmtNodeMap is a convenience function to print map[*Node]<T>
The fmt flag that makes it all nicely formatted is "-". Because a map consists of two types (key's type and val's type), and the Go fmt verb doesn't quite allow us to do something like "%ds", a hack is introduced to enable nicer printing of map[*Node]<T>
Here's the hack: The "#" flag is used to indicate if the map will use the Node's ID or Name when formatting the map.
%-v     nodeName:%v
%-#v    nodeID:%v
%-d     nodeName:%x
%-#d    nodeID:%x
%-p     nodeName:%p
%-#p    nodeID:%p
If the "-" flag is not found, then the formatter returns the default Go format for map[<T>]<T2>
Gaussian32 returns a []float32 drawn from a gaussian distribution as defined by the mean and stdev
Gaussian64 returns a []float64 drawn from a gaussian distribution as defined by the mean and stdev
GlorotEtAlN32 returns float32 weights sampled from a normal distribution using the methods specified in Glorot et. al (2010). See also: http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf
GlorotEtAlN64 returns float64 weights sampled from a normal distribution using the methods specified in Glorot et. al (2010). See also: http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf
GlorotEtAlU32 returns float32 weights sampled from a uniform distribution using the methods specified in Glorot et. al (2010). See also: http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf
For best results, use:
1.0 for gain for weights that will be used in linear and/or sigmoid units
math.Sqrt(2.0) for gain for weights that will be used in ReLU units
math.Sqrt(2.0 / (1+alpha*alpha)) for ReLU that are leaky with alpha
GlorotEtAlU64 returns float64 weights sampled from a uniform distribution using the methods specified in Glorot et. al (2010). See also: http://jmlr.org/proceedings/papers/v9/glorot10a/glorot10a.pdf
For best results, use:
1.0 for gain for weights that will be used in linear and/or sigmoid units
math.Sqrt(2.0) for gain for weights that will be used in ReLU units
math.Sqrt(2.0 / (1+alpha*alpha)) for ReLU that are leaky with alpha
GraphCollisionStats returns the collisions in the graph only when built with the debug tag, otherwise it's a noop that returns 0
HeEtAlN64 returns float64 weights sampled from a normal distro, using the methods described in He et al (2015). The formula is:
randn(n) * sqrt(2/n)
See also https://arxiv.org/abs/1502.01852
For best results, use:
1.0 for gain for weights that will be used in linear and/or sigmoid units
math.Sqrt(2.0) for gain for weights that will be used in ReLU units
math.Sqrt(2.0 / (1+alpha*alpha)) for ReLU that are leaky with alpha
HeEtAlU64 returns float64 weights sampled from a uniform distro, using the methods described in He et al (2015). The formula is:
randn(n) * sqrt(2/n)
See also https://arxiv.org/abs/1502.01852
For best results, use:
1.0 for gain for weights that will be used in linear and/or sigmoid units
math.Sqrt(2.0) for gain for weights that will be used in ReLU units
math.Sqrt(2.0 / (1+alpha*alpha)) for ReLU that are leaky with alpha
Let binds a Value to a node that is a variable. A variable is represented as a *Node with no Op. It is equivalent to :
x = 2
Lift1 decorates a function with a precheck and post function lifting
Lift1Axial decorates a function with a precheck and post function lifting
Lift2 decorates a function with a precheck and post function lifting
func Lift2Broadcast(fn func(a, b *Node, pat1, pat2 []byte) (*Node, error)) func(a, b Input, pat1, pat2 []byte) Result
Lift2Broadcast decorates a function with a precheck and post function lifting
NewLispMachine creates a VM that executes the graph as it is traversed. Depending on the VMOpts passed in this VM is also capable of performing automatic differentiation.
NewTapeMachine creates a VM that compiles a graph into a prog.
ReturnNode returns a node to the pool. It does not check that the *Node has been removed from the graph. USE WITH CAUTION.
ReturnType ...
S creates a tensor.Slice. end is optional. It should be passed in as the first param of the optionals. step is optional. It should be passed in as the second param of the optionals.
Default end is start+1. Default step is 1, unless end == start+1, in which case it defaults to 0
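The defaulting rules can be written out as a small standalone function. This is only a sketch: `sliceSpec` and `newSlice` are hypothetical stand-ins for tensor.Slice and S, and the condition for the zero step is assumed to compare end against start+1 (a single-element slice).

```go
package main

import "fmt"

// sliceSpec is a hypothetical stand-in for a start/end/step slice.
type sliceSpec struct {
	start, end, step int
}

// newSlice resolves the optional end and step parameters:
// end defaults to start+1; step defaults to 1, or to 0 when the
// slice covers a single element and no step was given.
func newSlice(start int, opt ...int) sliceSpec {
	end, step := start+1, 1
	if len(opt) > 0 {
		end = opt[0]
	}
	if len(opt) > 1 {
		step = opt[1]
	} else if end == start+1 {
		step = 0
	}
	return sliceSpec{start: start, end: end, step: step}
}

func main() {
	fmt.Println(newSlice(1))       // single element
	fmt.Println(newSlice(1, 5))    // explicit end, default step
	fmt.Println(newSlice(1, 5, 2)) // explicit end and step
}
```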
SetDerivOf is used to hack around the fundamental limitations of Gorgonia.
Specifically it is used to set a node as the derivative of another node, used in the cuDNN version of batch norm.
The cuDNN BatchNorm operation produces the derivatives for the scale and bias as a side effect of calculating the derivative of the input. Because Gorgonia's Ops are modelled as pure functions (and no tuples) this causes a bit of trouble. With the clever use of scratch space ops multireturn can be simulated. But this causes derivatives to not be set correctly.
SetOptimizationLevel sets the fast math optimization level. By default, fast math is turned off, and this function is a no-op.
Use the `fastmath` build tag to use fast math
TransformResult is like LiftResult, but allows for custom data types that fulfil Mker
TypeOf returns the Type of the value
Uniform32 returns a []float32 drawn from a uniform distribution between [low, high) that is provided
Uniform64 returns a []float64 drawn from a uniform distribution between [low, high) that is provided
UnsafeLet binds a Value to any node, not just a variable node. This means that you can use it to change any node's value at the runtime of the graph. UNSAFE!
Additional notes: if `be` is a tensor.Slice, and the node's op is a sliceOp or sliceIncrOp, the op's slice will be replaced with the new slice.
Use defines which BLAS implementation gorgonia should use. The default is Gonum's Native. These are the other options:
Use(blase.Implementation())
Use(cubone.Implementation())
Use(cgo.Implementation)
Note the differences in the brackets. The blase and cubone ones are functions.
func UseNonStable()
UseNonStable turns off the stabilization functions when building graphs.
func UseStabilization()
UseStabilization sets the global option to invoke stabilization functions when building the graph. Numerical stabilization is on by default
ValueClose checks whether two values are close to one another. It's predominantly used as an alternative equality test for floats
ValueEq is the equality function for values
WalkGraph walks a graph. It returns a channel of *Nodes, so be sure to consume the channel or there may be a deadlock
WithGraphName is an ExprGraph construction option that provides a name.
An ADOp is an Op that supports automatic differentiation.
type AdaGradSolver struct {
// contains filtered or unexported fields
}
AdaGradSolver is the solver that does adaptive gradient descent. Read the paper: http://jmlr.org/papers/v12/duchi11a.html
func NewAdaGradSolver(opts ...SolverOpt) *AdaGradSolver
NewAdaGradSolver creates a new AdaGradSolver with sane-ish default values
func (s *AdaGradSolver) Step(model []ValueGrad) (err error)
Step steps through each node in the model and applies the Adaptive Gradient gradient descent algorithm on the value.
This function will error out if the nodes do not have an associated Grad value.
type AdamSolver struct {
// contains filtered or unexported fields
}
AdamSolver is the Adaptive Moment Estimation solver (basically RMSProp on steroids). Paper: http://arxiv.org/abs/1412.6980
We overload the purpose of existing data structure of a *dualValue. However, instead of just holding a value and its derivative, the cache's *dualValues hold the Means of gradients (in .Value) and the variances of the gradients (in .d)
func NewAdamSolver(opts ...SolverOpt) *AdamSolver
NewAdamSolver creates an Adam solver with these default values:
eta (learn rate)       : 0.001
eps (smoothing factor) : 1e-8
beta1                  : 0.9
beta2                  : 0.999
batch                  : 1
func (s *AdamSolver) Step(model []ValueGrad) (err error)
Step steps through each node in the model and applies the Adaptive Moment Estimation gradient descent algorithm on the value.
This function will error out if the nodes do not have an associated Grad value.
type Arena interface {
	Get(dev Device, size int64) (tensor.Memory, error)       // Get returns a NoOpError when it cannot get a memory. Please allocate
	GetFromValue(dev Device, v Value) (tensor.Memory, error) // Gets a memory and copies the values into the memory and returns it.
	Put(dev Device, mem tensor.Memory, size int64)           // puts the memory back into the arena
	PutValue(dev Device, v Value)                            // puts the memory back into the arena

	// Transfers memory from device to device
	Transfer(toDev, fromDev Device, v Value, synchronous bool) (retVal Value, err error)
}
Arena is a representation of a pool of tensor.Memory
type AutoDiffError struct{}
AutoDiffError is an error which should be passed if the function is not differentiable. This is useful for Op implementations
func (err AutoDiffError) Error() string
B represents a bool value.
Data returns the original representation of the Value
Dtype returns the Dtype of the value
Format implements fmt.Formatter
MemSize satisfies the tensor.Memory interface
Pointer returns the pointer as an unsafe.Pointer. Satisfies the tensor.Memory interface
Shape returns a scalar shape for all scalar values
Size returns 0 for all scalar Values
Uintptr satisfies the tensor.Memory interface
BLAS represents all the possible implementations of BLAS. The default is Gonum's Native
WhichBLAS returns the BLAS that gorgonia uses.
type BarzilaiBorweinSolver struct {
// contains filtered or unexported fields
}
BarzilaiBorweinSolver / Barzilai-Borwein performs gradient descent in the steepest descent direction. It solves 0 = F(x) by
xᵢ₊₁ = xᵢ - eta * Grad(F)(xᵢ)
Where the learn rate eta is calculated by the Barzilai-Borwein method:
eta(xᵢ) = <(xᵢ - xᵢ₋₁), (Grad(F)(xᵢ) - Grad(F)(xᵢ₋₁))> / ∥(Grad(F)(xᵢ) - Grad(F)(xᵢ₋₁))∥²
The input learn rate is used for the first iteration.
TODO: Check out stochastic implementations, e.g. "Barzilai-Borwein Step Size for Stochastic Gradient Descent" https://arxiv.org/abs/1605.04131
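The step-size formula above can be tried on a small problem. The sketch below applies it to a two-dimensional convex quadratic whose gradient is known in closed form; `grad`, `dot`, and `bbMinimize` are hypothetical helpers written for this illustration, not the solver's code, and clipping is not modelled.

```go
package main

import "fmt"

// grad returns the gradient of f(x) = 0.5*(x0*x0 + 10*x1*x1).
func grad(x []float64) []float64 {
	return []float64{x[0], 10 * x[1]}
}

func dot(a, b []float64) float64 {
	var s float64
	for i := range a {
		s += a[i] * b[i]
	}
	return s
}

// bbMinimize runs gradient descent with the Barzilai-Borwein step size.
// The supplied eta is used for the first iteration only; afterwards
// eta = <dx, dg> / <dg, dg>, where dx and dg are the most recent changes
// in x and in the gradient.
func bbMinimize(x0 []float64, eta float64, iters int) []float64 {
	x := append([]float64(nil), x0...)
	g := grad(x)
	for i := 0; i < iters; i++ {
		next := make([]float64, len(x))
		for j := range x {
			next[j] = x[j] - eta*g[j]
		}
		gNext := grad(next)
		dx := make([]float64, len(x))
		dg := make([]float64, len(x))
		for j := range x {
			dx[j] = next[j] - x[j]
			dg[j] = gNext[j] - g[j]
		}
		x, g = next, gNext
		den := dot(dg, dg)
		if den == 0 { // gradient stopped changing: nothing left to adapt
			break
		}
		eta = dot(dx, dg) / den
	}
	return x
}

func main() {
	x := bbMinimize([]float64{1, 1}, 0.001, 100)
	fmt.Println(x)
}
```

Even with a very small initial learn rate, the adapted step size lets the iterates approach the minimum at the origin far faster than a fixed rate of 0.001 would.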
func NewBarzilaiBorweinSolver(opts ...SolverOpt) *BarzilaiBorweinSolver
NewBarzilaiBorweinSolver creates a new Barzilai-Borwein solver with some default values: the learn rate is set to 0.001 and the solver does not use clipping.
func (s *BarzilaiBorweinSolver) Step(model []ValueGrad) (err error)
Step steps through each node in the model and applies the Barzilai-Borwein gradient descent algorithm on the value.
This function will error out if the nodes do not have an associated Grad value.
type BatchNormOp struct {
// contains filtered or unexported fields
}
BatchNormOp is a batch normalization process as described by Ioffe and Szegedy (2015) - http://arxiv.org/abs/1502.03167
Normalization is done as:
γ(x - μ) / σ + β
γ is the scaling factor and β is the offset factor. These are created by BatchNorm()
func (op *BatchNormOp) Arity() int
Arity returns 1
func (op *BatchNormOp) CallsExtern() bool
CallsExtern is false
func (op *BatchNormOp) DiffWRT(inputs int) []bool
DiffWRT ...
func (op *BatchNormOp) Do(values ...Value) (retVal Value, err error)
Do performs the batchnorm computation on the values
func (op *BatchNormOp) DoDiff(ctx ExecutionContext, inputs Nodes, output *Node) error
DoDiff does the gradient computation
func (op *BatchNormOp) Hashcode() uint32
Hashcode ...
InferShape from the input values
func (op *BatchNormOp) OverwritesInput() int
OverwritesInput is -1 (operator doesn't overwrite any input value)
func (op *BatchNormOp) Reset() error
Reset resets the operator by zeroing the internal scratch spaces
func (op *BatchNormOp) ReturnsPtr() bool
ReturnsPtr is true
func (op *BatchNormOp) SetTesting()
SetTesting configures the op for testing mode
func (op *BatchNormOp) SetTraining()
SetTraining configures the op for training mode. A call to this function implicitly calls the Reset() method
func (op *BatchNormOp) String() string
SymDiff ...
func (op *BatchNormOp) Type() hm.Type
Type ...
UsePreallocDo ...
func (op *BatchNormOp) WriteHash(h hash.Hash)
WriteHash ...
type Batched interface {
	WorkAvailable() <-chan struct{}
	DoWork()
}
Batched interface describes any object that can process batch work
BatchedBLAS interface describes any object that can process BLAS work in batch
BatchedDevice is the superset of BatchedBLAS and the batched CUDA workflow.
A BinaryOp is an Op that takes only two inputs
BroadcastPattern is actually a bit array. It's split into 2 nibbles - the left nibble represents the left operand, the right nibble represents the right operand:
xxxx|xxxx
The least significant bit of each nibble is elem 0. Concrete examples:
00000010 (0x02) = broadcast axis 1 of the right operand
00000001 (0x01) = broadcast axis 0 of the right operand
00000101 (0x05) = broadcast axis 0 AND axis 2 of the right operand
00010000 (0x10) = broadcast axis 0 of the left operand
00110000 (0x30) = broadcast axis 0 and axis 1 of the left operand
You get the drill.
Do note that the current limitation of the BroadcastPattern allows only up to 4 dimensions per operand.
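The nibble layout can be reproduced with a few bit operations. The following sketch uses a hypothetical `pattern` type and `newPattern` helper to pack left and right axes into one byte as described: right-operand axes occupy bits 0 to 3 and left-operand axes occupy bits 4 to 7. Silently ignoring axes above 3 is a choice made for this example only.

```go
package main

import "fmt"

// pattern is a hypothetical bit array: high nibble for the left operand,
// low nibble for the right operand.
type pattern byte

// newPattern sets one bit per broadcast axis. Axes above 3 do not fit in
// a nibble and are ignored in this sketch.
func newPattern(leftAxes, rightAxes []byte) pattern {
	var p pattern
	for _, a := range leftAxes {
		if a < 4 {
			p |= 1 << (a + 4)
		}
	}
	for _, a := range rightAxes {
		if a < 4 {
			p |= 1 << a
		}
	}
	return p
}

// left and right report whether the given axis is marked for broadcasting.
func (p pattern) left(axis byte) bool  { return axis < 4 && p&(1<<(axis+4)) != 0 }
func (p pattern) right(axis byte) bool { return axis < 4 && p&(1<<axis) != 0 }

func main() {
	p := newPattern([]byte{0, 1}, []byte{0, 2})
	fmt.Printf("%08b (0x%02x)\n", byte(p), byte(p))
	fmt.Println(p.left(1), p.right(1))
}
```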
func NewBroadcastPattern(leftAxes, rightAxes []byte) BroadcastPattern
NewBroadcastPattern is a helper function to create broadcast patterns
CLDoer uses OpenCL to perform the Op. As of now, there are NO Ops that support OpenCL
type CUDAADOp interface {
	ADOp
	CUDADoDiff(extern External, dev Device, inputs Nodes, output *Node) error
}
A CUDAADOp operation has a specific method to run with CUDA
type CUDADoer interface {
	CUDADo(extern External, dev Device, prealloc Value, inputs ...Value) (retVal Value, err error)
}
CUDADoer uses CUDA to perform the Op.
CloneErrorer represents any type that can clone itself and return an error if necessary
type Cloner interface {
Clone() interface{}
}
Cloner represents any type that can clone itself.
CopierFrom represents any type that can copy data from the source provided.
CopierTo represents any type that can copy data to the destination.
Device represents the device where the code will be executed on. In this build, all code will run on the CPU
Alloc allocates memory on the device. This is currently a NO-OP in this build
Free frees the memory on the device. This is currently a NO-OP in this build
IsGPU will always return false in this build
String implements fmt.Stringer and runtime.Stringer
DimSizer is any type (typically a tensor.Shape) that allows querying for a dimension size given an input dimension.
ShapesToDimSizers is a convenience function to convert a slice of tensor.Shape to a slice of DimSizer
Dtyper represents any type (typically a Value) that knows its own Dtype
Errer is an interface that can return an error.
ExecutionContext informs how an op should be executed
type ExprGraph struct {
// contains filtered or unexported fields
}
ExprGraph is a data structure for a directed acyclic graph (of expressions). This structure is the main entry point for Gorgonia.
NewGraph creates a new graph. Duh
AddNode adds n to the graph. It panics if the added node ID matches an existing node ID.
AllNodes is like Nodes, but returns Nodes instead of []graph.Node. Nodes() has been reserved for the graph.Directed interface, so this one is named AllNodes instead
ByName returns nodes that have the name provided. Bear in mind that the name that is compared to is the internal name, not the result of calling node.Name(). The reason for doing this is for ease of finding only names that are user-supplied, instead of autogenerated names
Clone clones the graph. All nodes get cloned, and their values are cloned as well.
Constant returns a constant that may be found in the graph. If no constant is found, a new one is created instead
Edge returns the edge from u to v if such an edge exists and nil otherwise. The node v must be directly reachable from u as defined by the From method.
Edges returns all the edges in the graph.
ExactSubgraphRoots creates a subgraph from the roots provided. The difference between SubgraphRoots and ExactSubgraphRoots is that ExactSubGraphRoots will not attempt to discover if any nodes are missing.
Given a function like the following:
z = x + y
set(x, -x.Grad) // setting the value of x to the negative of the gradient
When SubgraphRoots is used on z, the `-x.Grad` will be included. When using ExactSubgraphRoots, only `x` and `y` are included in the subgraph
From returns all nodes in g that can be reached directly from n.
Has returns whether the node exists within the graph.
HasEdgeBetween returns whether an edge exists between nodes x and y without considering direction.
HasEdgeFromTo returns whether an edge exists in the graph from u to v.
Inputs returns a list of nodes which are inputs (that is to say, the user is required to set a value in it)
Node returns the node in the graph with the given ID.
Nodes returns all the nodes in the graph.
RemoveNode removes n from the graph, as well as any edges attached to it. If the node is not in the graph it is a no-op.
Roots returns a list of nodes that are not children of any other nodes
SetEdge adds e, an edge from one node to another. If the nodes do not exist, they are added. It will panic if the IDs of the e.From and e.To are equal.
Subgraph subsets a graph. This function has overloaded meanings - If only one node is passed in, it assumes that the one node is the root, otherwise, it treats ns as the subset of nodes to be included in the subgraph
SubgraphRoots creates a subgraph, assuming the provided nodes are roots to the new subgraph.
To returns all nodes in g that can reach directly to n.
ToDot generates the graph in Graphviz format. It is intended for rendering the entire graph, which may have multiple trees with different roots.
TODO: This is getting unwieldy. Perhaps refactor out into a ToDot(...Opt)?
UnbindAll unbinds all the values from the nodes
UnbindAllNonInputs unbinds all the values from nodes that aren't input nodes
ExternMetadata is used to hold metadata about external execution devices. In this build, it's an empty struct because the default build doesn't use external devices to execute the graph on
func (m *ExternMetadata) Cleanup()
Cleanup cleans up the ancillary allocations made during the calling of batched external device function.
The reason for this method is due to the fact that there is currently no way to free memory while the context is still running without causing some weirdness to the CUDA calls.
This is a No-op in this build
func (m *ExternMetadata) DoWork() error
DoWork flushes any batched cgo calls. In this build it only flushes the batched BLAS calls.
Get allocates memory of the given size. In this build it returns a NoOpError.
GetFromValue allocates memory of the size of v. In this build it returns a NoOpError, and v itself.
func (m ExternMetadata) HasFunc(name string) bool
HasFunc will always return false in this build
Put puts a previously allocated memory slab of the provided size back into the pool. Currently this is a No-op in this build.
func (m *ExternMetadata) PutValue(dev Device, v Value)
PutValue puts a previously allocated value into the pool. In this build, it is a noop.
func (m *ExternMetadata) Reset()
Reset is a noop function for compatibility with the Cuda build
func (m *ExternMetadata) Signal()
Signal sends a signal down the workavailable channel, telling the VM to call the DoWork method. Signal is a synchronous method.
func (m *ExternMetadata) Sync() chan struct{}
Sync returns the sync channel
func (m *ExternMetadata) Transfer(toDev, fromDev Device, v Value, synchronous bool) (retVal Value, err error)
Transfer transfers a value from device to device. In this build, it's a noop, returning the input value, and a nil error
func (m *ExternMetadata) WorkAvailable() <-chan bool
WorkAvailable returns a receive-only channel of bool, which is used to signal to the VM when there is work available. The VM will then call the DoWork method.
External is a representation of an external device (cuda/cgo/openCL), conceptually modelled as a machine.
type ExternalOp struct {
	Op
	ExecutionContext
	Prealloc Value
	Incr Value // is this a Incr? IncrDoers have higher precedence over PreallocDo
	UseUnsafe bool // Is this an unsafe op? Lowest of all "special" Dos
}
ExternalOp is an op that contains an external context. This allows for ops to be run without needing a VM
func NewAddOp(a, b *Node, ctx ExecutionContext) *ExternalOp
NewAddOp creates a new *ExternalOp that wraps an add op
func NewExternalOp(op Op, ctx ExecutionContext, prealloc Value) *ExternalOp
NewExternalOp creates a new *ExternalOp.
func NewHadamardProdOp(a, b *Node, ctx ExecutionContext) *ExternalOp
NewHadamardProdOp creates a new *ExternalOp that wraps a mul op
func NewSubOp(a, b *Node, ctx ExecutionContext) *ExternalOp
NewSubOp creates a new *ExternalOp that wraps a sub op
func (op *ExternalOp) DetermineDevice(inputs Nodes, output *Node) error
DetermineDevice ...
func (op *ExternalOp) Do(vals ...Value) (Value, error)
Do performs the op.
func (op *ExternalOp) String() string
F32 represents a float32 value.
Data returns the original representation of the Value
Dtype returns the Dtype of the value
Format implements fmt.Formatter
MemSize satisfies the tensor.Memory interface
Pointer returns the pointer as an unsafe.Pointer. Satisfies the tensor.Memory interface
Shape returns a scalar shape for all scalar values
Size returns 0 for all scalar Values
Uintptr satisfies the tensor.Memory interface
F64 represents a float64 value.
Data returns the original representation of the Value
Dtype returns the Dtype of the value
Format implements fmt.Formatter
MemSize satisfies the tensor.Memory interface
Pointer returns the pointer as an unsafe.Pointer. Satisfies the tensor.Memory interface
Shape returns a scalar shape for all scalar values
Size returns 0 for all scalar Values
Uintptr satisfies the tensor.Memory interface
I represents an int value.
Data returns the original representation of the Value
Dtype returns the Dtype of the value
Format implements fmt.Formatter
MemSize satisfies the tensor.Memory interface
Pointer returns the pointer as an unsafe.Pointer. Satisfies the tensor.Memory interface
Shape returns a scalar shape for all scalar values
Size returns 0 for all scalar Values
Uintptr satisfies the tensor.Memory interface
I32 represents an int32 value.
Data returns the original representation of the Value
Dtype returns the Dtype of the value
Format implements fmt.Formatter
MemSize satisfies the tensor.Memory interface
Pointer returns the pointer as an unsafe.Pointer. Satisfies the tensor.Memory interface
Shape returns a scalar shape for all scalar values
Size returns 0 for all scalar Values
Uintptr satisfies the tensor.Memory interface
I64 represents an int64 value.
Data returns the original representation of the Value
Dtype returns the Dtype of the value
Format implements fmt.Formatter
MemSize satisfies the tensor.Memory interface
Pointer returns the pointer as an unsafe.Pointer. Satisfies the tensor.Memory interface
Shape returns a scalar shape for all scalar values
Size returns 0 for all scalar Values
Uintptr satisfies the tensor.Memory interface
IncrDoer increments the toIncr with the result of doing
InitWFn is a type of helper function to help initialize weight vectors/matrices. It generates the backing required for the tensors.
It's typically used in closures
Gaussian creates an InitWFn with the specified parameters. Example usage:
w := NewMatrix(g, Float64, WithName("w"), WithShape(2,2), WithInit(Gaussian(0, 1)))
This will create a backing slice of []float64 with a length of 4, whose values are drawn from a Gaussian distribution.
GlorotN creates an InitWFn that populates a Value with weights normally sampled using Glorot et al.'s algorithm
GlorotU creates an InitWFn that populates a Value with weights uniformly sampled using Glorot et al.'s algorithm
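As a rough illustration of uniform Glorot initialisation, the sketch below samples weights uniformly from [-limit, limit), where limit = gain * sqrt(6 / (fanIn + fanOut)) is the commonly cited bound. It is plain Go, not Gorgonia's own code; the names glorotLimit and glorotUniform, their parameters, and the explicit seed are assumptions made for the example.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// glorotLimit returns the half-width of the uniform sampling interval.
func glorotLimit(gain float64, fanIn, fanOut int) float64 {
	return gain * math.Sqrt(6.0/float64(fanIn+fanOut))
}

// glorotUniform (hypothetical helper) returns fanIn*fanOut weights sampled
// uniformly from [-limit, limit), using a generator seeded with seed.
func glorotUniform(seed int64, gain float64, fanIn, fanOut int) []float64 {
	rng := rand.New(rand.NewSource(seed))
	limit := glorotLimit(gain, fanIn, fanOut)
	w := make([]float64, fanIn*fanOut)
	for i := range w {
		// rng.Float64() is in [0, 1); rescale it to [-limit, limit).
		w[i] = (rng.Float64()*2 - 1) * limit
	}
	return w
}

func main() {
	w := glorotUniform(1, 1.0, 2, 2)
	fmt.Println(len(w), glorotLimit(1.0, 2, 2))
}
```

The backing slice has one entry per element of a (fanIn, fanOut) matrix, which mirrors the way the Gaussian and Uniform examples describe a backing slice of length 4 for a (2, 2) shape.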
Ones creates an InitWFn that populates a Value with ones. See Zeroes() for more explanation.
RangedFrom creates an InitWFn that populates a Value starting with the provided start, incrementing the number for each element in the value by 1
Uniform creates an InitWFn with the specified parameters. Example usage:
w := NewMatrix(g, Float64, WithName("w"), WithShape(2,2), WithInit(Uniform(-1, 1)))
This will create a backing slice of []float64 with a length of 4, whose values are drawn from a uniform distribution.
ValuesOf creates an InitWFn that populates a value with val. This function will cause a panic if val's type is incompatible with the value's type.
Zeroes creates an InitWFn that populates a Value with... zeroes. I don't know what you expected.
Input is something that can produce both a *Node and Nodes. Returning nil is OK.
Mker is an interface of any Input that can make a new version of itself
type Momentum struct {
// contains filtered or unexported fields
}
Momentum is the stochastic gradient descent optimizer with a momentum term.
NewMomentum creates a new Momentum with sane-ish default values
Step steps through each node in the model and applies the Momentum stochastic gradient descent algorithm on the value.
This function will error out if the nodes do not have an associated Grad value.
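To make the idea of a momentum step concrete, here is a minimal sketch of the textbook update rule in plain Go. It is not Gorgonia's implementation; the function name momentumStep, the slice-based state, and the exact form of the update (velocity = momentum*velocity - lr*grad, then weight += velocity) are assumptions made for the example.

```go
package main

import "fmt"

// momentumStep applies one textbook momentum update in place:
//
//	velocity = momentum*velocity - lr*grad
//	weight   = weight + velocity
func momentumStep(weights, grads, velocity []float64, lr, momentum float64) {
	for i := range weights {
		velocity[i] = momentum*velocity[i] - lr*grads[i]
		weights[i] += velocity[i]
	}
}

func main() {
	w := []float64{1.0, -2.0}
	g := []float64{1.0, -1.0}
	v := make([]float64, len(w)) // velocity starts at zero
	momentumStep(w, g, v, 0.5, 0.5)
	momentumStep(w, g, v, 0.5, 0.5)
	fmt.Println(w, v)
}
```

Because the velocity carries over between calls, the second step moves each weight further than the first even though the gradient is unchanged.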
Namer is anything that has a name
NoOpError is an error returned when an operation does nothing.
A NoRetOp is an Op that reads a value, but does not return any value. It's a representation of an impure function.
type Node struct {
// contains filtered or unexported fields
}
A Node is a node in the computation graph
Abs performs a pointwise abs.
Add performs a pointwise add operation.
ApplyOp is the generic function application - for when no specialization is required
ApplyOpWithName applies the op, and then gives the node the given name
At is a symbolic operation for getting a value at the provided coordinates. If the input is a scalar, all the coordinates MUST be 0, or else an error will be returned.
func Auto(op func(a, b *Node, leftPattern, rightPattern []byte) (*Node, error), a, b *Node) (*Node, error)
Auto automatically calculates the padding for the given operations, for example:
gorgonia.Auto(gorgonia.BroadcastHadamardProd, a, b)
BatchedMatMul returns a node representing the batched mat mul operation.
An optional list of transpose options is allowed.
Code:
g := NewGraph()
a := NewTensor(g, Float64, 3, WithShape(2, 2, 3), WithInit(RangedFrom(1)), WithName("a"))
b := NewTensor(g, Float64, 3, WithShape(2, 3, 2), WithInit(RangedFrom(13)), WithName("b"))
c, err := BatchedMatMul(a, b)
if err != nil {
	log.Fatal(err)
}
x := NewTensor(g, Float64, 4, WithShape(3, 2, 2, 3), WithInit(RangedFrom(1)), WithName("x"))
y := NewTensor(g, Float64, 4, WithShape(3, 2, 3, 2), WithInit(RangedFrom(37)), WithName("y"))
z, err := BatchedMatMul(x, y)
if err != nil {
	log.Fatal(err)
}
m := NewTapeMachine(g)
if err := m.RunAll(); err != nil {
	log.Fatal(err)
}
fmt.Printf("a: %v\n%v\n", a.Value().Shape(), a.Value().Data())
fmt.Printf("b: %v\n%v\n", b.Value().Shape(), b.Value().Data())
fmt.Printf("c: %v\n%v\n", c.Value().Shape(), c.Value().Data())
fmt.Printf("x: %v\n%v\n", x.Value().Shape(), x.Value().Data())
fmt.Printf("y: %v\n%v\n", y.Value().Shape(), y.Value().Data())
fmt.Printf("z: %v\n%v\n", z.Value().Shape(), z.Value().Data())
Output:
a: (2, 2, 3)
[1 2 3 4 5 6 7 8 9 10 11 12]
b: (2, 3, 2)
[13 14 15 16 17 18 19 20 21 22 23 24]
c: (2, 2, 2)
[94 100 229 244 508 532 697 730]
x: (3, 2, 2, 3)
[1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36]
y: (3, 2, 3, 2)
[37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72]
z: (3, 2, 2, 2)
[238 244 589 604 1084 1108 1489 1522 2146 2188 2605 2656 3424 3484 3937 4006 4918 4996 5485 5572 6628 6724 7249 7354]
Code:
g := NewGraph()
a := NewTensor(g, Float64, 4, WithShape(2, 4, 3, 9), WithInit(RangedFrom(1)), WithName("a"))
b := NewTensor(g, Float64, 4, WithShape(2, 4, 3, 9), WithInit(RangedFrom(13)), WithName("b"))
c, err := BatchedMatMul(a, b, false, true)
if err != nil {
	log.Fatal(err)
}
s, err := Sum(c)
if err != nil {
	log.Fatal(err)
}
grads, err := Grad(s, a, b)
if err != nil {
	log.Fatal(err)
}
m := NewTapeMachine(g)
if err := m.RunAll(); err != nil {
	log.Fatal(err)
}
fmt.Printf("a: %v\n%v\n", a.Value().Shape(), a.Value().Data())
fmt.Printf("b: %v\n%v\n", b.Value().Shape(), b.Value().Data())
fmt.Printf("c: %v\n%v\n", c.Value().Shape(), c.Value().Data())
fmt.Printf("grads[0]:%v\n%v\n", grads[0].Shape(), grads[0].Value().Data())
Output:
a: (2, 4, 3, 9) [1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216] b: (2, 4, 3, 9) [13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228] c: (2, 4, 3, 3) [825 1230 1635 2202 3336 4470 3579 5442 7305 12732 15324 17916 16296 19617 22938 19860 23910 27960 37761 42540 47319 43512 49020 54528 49263 55500 61737 75912 82878 89844 83850 91545 99240 91788 100212 108636 127185 136338 145491 137310 147192 157074 147435 158046 168657 191580 202920 214260 203892 215961 228030 216204 229002 241800 269097 282624 296151 283596 297852 312108 298095 313080 328065 359736 375450 
391164 376422 392865 409308 393108 410280 427452] grads[0]:(2, 4, 3, 9) [66 69 72 75 78 81 84 87 90 66 69 72 75 78 81 84 87 90 66 69 72 75 78 81 84 87 90 147 150 153 156 159 162 165 168 171 147 150 153 156 159 162 165 168 171 147 150 153 156 159 162 165 168 171 228 231 234 237 240 243 246 249 252 228 231 234 237 240 243 246 249 252 228 231 234 237 240 243 246 249 252 309 312 315 318 321 324 327 330 333 309 312 315 318 321 324 327 330 333 309 312 315 318 321 324 327 330 333 390 393 396 399 402 405 408 411 414 390 393 396 399 402 405 408 411 414 390 393 396 399 402 405 408 411 414 471 474 477 480 483 486 489 492 495 471 474 477 480 483 486 489 492 495 471 474 477 480 483 486 489 492 495 552 555 558 561 564 567 570 573 576 552 555 558 561 564 567 570 573 576 552 555 558 561 564 567 570 573 576 633 636 639 642 645 648 651 654 657 633 636 639 642 645 648 651 654 657 633 636 639 642 645 648 651 654 657]
BinaryXent is a convenience function for doing binary crossentropy stuff. The formula is as below:
-(y * logprob) + (1-y)(1-logprob)
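For reference, the textbook binary cross-entropy for a target y and a predicted probability p is -(y*log(p) + (1-y)*log(1-p)). The sketch below evaluates that textbook form for a single pair of values in plain Go. It is not Gorgonia's implementation, and the name binaryCrossEntropy is made up for the example.

```go
package main

import (
	"fmt"
	"math"
)

// binaryCrossEntropy returns the textbook loss -(y*log(p) + (1-y)*log(1-p)).
// p must lie strictly between 0 and 1.
func binaryCrossEntropy(y, p float64) float64 {
	return -(y*math.Log(p) + (1-y)*math.Log(1-p))
}

func main() {
	// A confident, correct prediction gives a small loss ...
	fmt.Printf("%.4f\n", binaryCrossEntropy(1, 0.9))
	// ... while a confident, wrong prediction gives a large one.
	fmt.Printf("%.4f\n", binaryCrossEntropy(0, 0.9))
}
```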
BinomialRandomNode creates an input node that has a random op so that every time the node is passed, random values will be plucked from a binomial distribution with the mean and stdev provided. The type of the node depends on the shape passed in. To get a scalar value at run time, don't pass in any shapes.
Whilst technically the number of trials of a binomial distribution should be a discrete value (you can't have half a trial), to keep with API uniformity, trials is passed in as a float64, but will be truncated to an int at runtime.
BroadcastAdd performs an add. The operation is precomposed with a broadcast such that the shapes match before operations commence.
By default, Gorgonia operations do not perform broadcasting. To broadcast, you need to specify the operation manually.
Code:
g := NewGraph()
a := NewVector(g, tensor.Float64, WithShape(2), WithName("a"), WithValue(tensor.New(tensor.WithBacking([]float64{100, 100}))))
b := NewMatrix(g, tensor.Float64, WithShape(2, 2), WithName("b"), WithValue(tensor.New(tensor.WithShape(2, 2), tensor.WithBacking([]float64{1, 1, 2, 2}))))
fmt.Printf("a = %v\nb =\n%v\n", a.Value(), b.Value())
_, err := Add(a, b)
fmt.Printf("a + b yields an error: %v\n\n", err)
// Note here the broadcasting of a is on the first axis, not the zeroth axis. Simply put, assume that it's already a (2,1) matrix.
ab, err := BroadcastAdd(a, b, []byte{1}, nil)
if err != nil {
	fmt.Printf("uh oh, something went wrong: %v\n", err)
}
ba, err := BroadcastAdd(b, a, nil, []byte{1})
if err != nil {
	fmt.Printf("uh oh, something went wrong: %v\n", err)
}
// Now, let's run the program
machine := NewTapeMachine(g)
defer machine.Close()
if err = machine.RunAll(); err != nil {
	log.Fatal(err)
}
fmt.Printf("a +⃗ b =\n%v\n", ab.Value())
fmt.Printf("b +⃗ a =\n%v", ba.Value())
Output:
a = [100 100]
b =
⎡1 1⎤
⎣2 2⎦
a + b yields an error: Failed to infer shape. Op: + false: Shape mismatch: (2) and (2, 2)
a +⃗ b =
⎡101 101⎤
⎣102 102⎦
b +⃗ a =
⎡101 101⎤
⎣102 102⎦
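The arithmetic behind broadcasting a vector along axis 1 can be sketched without the library. The plain-Go function below treats a as a column of shape (len(a), 1), repeats it across the columns of b, and adds element-wise, so out[i][j] = a[i] + b[i][j]. The name broadcastAddAxis1 and the use of nested slices are assumptions for the example; this is not how Gorgonia represents tensors.

```go
package main

import "fmt"

// broadcastAddAxis1 treats a as a column of shape (len(a), 1), repeats it
// along axis 1 to match b, and adds the two element-wise.
func broadcastAddAxis1(a []float64, b [][]float64) ([][]float64, error) {
	if len(a) != len(b) {
		return nil, fmt.Errorf("shape mismatch: (%d) and (%d, ...)", len(a), len(b))
	}
	out := make([][]float64, len(b))
	for i, row := range b {
		out[i] = make([]float64, len(row))
		for j, v := range row {
			out[i][j] = a[i] + v
		}
	}
	return out, nil
}

func main() {
	a := []float64{100, 100}
	b := [][]float64{{1, 1}, {2, 2}}
	ab, err := broadcastAddAxis1(a, b)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(ab) // [[101 101] [102 102]]
}
```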
BroadcastEq performs an eq. The operation is precomposed with a broadcast such that the shapes match before operations commence.
BroadcastGt performs a gt. The operation is precomposed with a broadcast such that the shapes match before operations commence.
BroadcastGte performs a gte. The operation is precomposed with a broadcast such that the shapes match before operations commence.
Code:
// Broadcasting is useful. We can create triangular dense matrices simply
g := NewGraph()
a := NewMatrix(g, tensor.Float64, WithShape(3, 1), WithName("a"), WithInit(RangedFrom(0)))
b := NewMatrix(g, tensor.Float64, WithShape(1, 4), WithName("b"), WithInit(RangedFrom(0)))
tl, err := BroadcastGte(a, b, true, []byte{1}, []byte{0})
if err != nil {
	log.Fatalf("uh oh. Something went wrong %v", err)
}
tu, err := BroadcastLt(a, b, true, []byte{1}, []byte{0})
if err != nil {
	log.Fatalf("uh oh. Something went wrong %v", err)
}
m := NewTapeMachine(g)
// PEDAGOGICAL:
// Uncomment the following code if you want to see what happens behind the scenes
// m.Close()
// logger := log.New(os.Stderr, "",0)
// m = NewTapeMachine(g, WithLogger(logger), WithWatchlist())
defer m.Close()
if err = m.RunAll(); err != nil {
	log.Fatal(err)
}
fmt.Printf("triangular, lower:\n%v\n", tl.Value())
fmt.Printf("triangular, upper:\n%v\n", tu.Value())
Output:
triangular, lower:
⎡1 0 0 0⎤
⎢1 1 0 0⎥
⎣1 1 1 0⎦
triangular, upper:
⎡0 1 1 1⎤
⎢0 0 1 1⎥
⎣0 0 0 1⎦
BroadcastHadamardDiv performs a hadamarddiv. The operation is precomposed with a broadcast such that the shapes match before operations commence.
BroadcastHadamardProd performs a hadamardprod. The operation is precomposed with a broadcast such that the shapes match before operations commence.
BroadcastLt performs a lt. The operation is precomposed with a broadcast such that the shapes match before operations commence.
BroadcastLte performs a lte. The operation is precomposed with a broadcast such that the shapes match before operations commence.
BroadcastNe performs a ne. The operation is precomposed with a broadcast such that the shapes match before operations commence.
BroadcastPow performs a pow. The operation is precomposed with a broadcast such that the shapes match before operations commence.
BroadcastSub performs a sub. The operation is precomposed with a broadcast such that the shapes match before operations commence.
ByIndices is an operation that takes the indices as input and returns the selected values from those indices. The default axis is 0.
Ceil performs a pointwise ceil.
Concat performs a concatenate on the provided axis and inputs.
Code:
g := NewGraph()
x := NewTensor(g, Float64, 4, WithShape(2, 3, 4, 5), WithInit(RangedFrom(0)), WithName("x"))
y := NewTensor(g, Float64, 4, WithShape(2, 3, 4, 5), WithInit(RangedFrom(120)), WithName("y"))
z, err := Concat(2, x, y)
if err != nil {
	panic(err)
}
m := NewTapeMachine(g)
if err := m.RunAll(); err != nil {
	panic(err)
}
tmp := fmt.Sprintf("z %v\n%v", z.Value().Shape(), z.Value())
fmt.Println(strings.Replace(tmp, "\n\n", "\n", -1)) // this is because
Output:
z (2, 3, 8, 5)
⎡ 0 1 2 3 4⎤
⎢ 5 6 7 8 9⎥
⎢ 10 11 12 13 14⎥
⎢ 15 16 17 18 19⎥
⎢120 121 122 123 124⎥
⎢125 126 127 128 129⎥
⎢130 131 132 133 134⎥
⎣135 136 137 138 139⎦
⎡ 20 21 22 23 24⎤
⎢ 25 26 27 28 29⎥
⎢ 30 31 32 33 34⎥
⎢ 35 36 37 38 39⎥
⎢140 141 142 143 144⎥
⎢145 146 147 148 149⎥
⎢150 151 152 153 154⎥
⎣155 156 157 158 159⎦
⎡ 40 41 42 43 44⎤
⎢ 45 46 47 48 49⎥
⎢ 50 51 52 53 54⎥
⎢ 55 56 57 58 59⎥
⎢160 161 162 163 164⎥
⎢165 166 167 168 169⎥
⎢170 171 172 173 174⎥
⎣175 176 177 178 179⎦
⎡ 60 61 62 63 64⎤
⎢ 65 66 67 68 69⎥
⎢ 70 71 72 73 74⎥
⎢ 75 76 77 78 79⎥
⎢180 181 182 183 184⎥
⎢185 186 187 188 189⎥
⎢190 191 192 193 194⎥
⎣195 196 197 198 199⎦
⎡ 80 81 82 83 84⎤
⎢ 85 86 87 88 89⎥
⎢ 90 91 92 93 94⎥
⎢ 95 96 97 98 99⎥
⎢200 201 202 203 204⎥
⎢205 206 207 208 209⎥
⎢210 211 212 213 214⎥
⎣215 216 217 218 219⎦
⎡100 101 102 103 104⎤
⎢105 106 107 108 109⎥
⎢110 111 112 113 114⎥
⎢115 116 117 118 119⎥
⎢220 221 222 223 224⎥
⎢225 226 227 228 229⎥
⎢230 231 232 233 234⎥
⎣235 236 237 238 239⎦
Conv1d is a 1D convolution. It relies on Conv2d.
func Conv2d(im, filter *Node, kernelShape tensor.Shape, pad, stride, dilation []int) (retVal *Node, err error)
Conv2d is a simple 2D convolution, to be used for CPU computation only. If CuDNN is used, use the CUDAConv2D function. These are the properties the inputs must fulfil:
- im: must have 4D shape. Expected format is BCHW (batch, channels, height, width)
- filter: must have 4D shape: (batch, kernel, height, width)
- kernelShape: shape of the filter kernel
- pad: len(pad) == 2, defaults to []int{0, 0} if nil is passed
- stride: len(stride) == 2, example: []int{1, 1}
- dilation: len(dilation) == 2, defaults to []int{1, 1} if nil is passed
ConvType converts the type of the x Node from one type to another.
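When choosing pad, stride and dilation it helps to know the resulting spatial size. The sketch below applies the usual convolution arithmetic, out = (in + 2*pad - dilation*(kernel-1) - 1)/stride + 1, to one axis at a time. That Conv2d follows exactly this formula is an assumption, and convOutputSize is a name invented for the example.

```go
package main

import "fmt"

// convOutputSize returns the output length along one spatial axis using the
// usual formula (in + 2*pad - dilation*(kernel-1) - 1)/stride + 1.
func convOutputSize(in, kernel, pad, stride, dilation int) int {
	return (in+2*pad-dilation*(kernel-1)-1)/stride + 1
}

func main() {
	// A 28x28 image with a 3x3 kernel, pad 1, stride 1 and dilation 1
	// keeps its height and width.
	h := convOutputSize(28, 3, 1, 1, 1)
	w := convOutputSize(28, 3, 1, 1, 1)
	fmt.Println(h, w) // 28 28
}
```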
Cos performs a pointwise cos.
Cube performs a pointwise cube.
DiagFlat takes the flattened value and creates a diagonal matrix from it.
It is non-differentiable.
Code:
g := NewGraph()
// 2 dimensional
aV := tensor.New(tensor.WithShape(2, 2), tensor.WithBacking([]float64{1, 2, 3, 4}))
a := NodeFromAny(g, aV)
b, err := DiagFlat(a)
if err != nil {
	fmt.Println(err)
	return
}
m := NewTapeMachine(g)
if err := m.RunAll(); err != nil {
	fmt.Println(err)
	return
}
fmt.Printf("a:\n%v\n", a.Value())
fmt.Printf("b:\n%v\n", b.Value())
// 3 dimensional
aV = tensor.New(tensor.WithShape(2, 3, 2), tensor.WithBacking([]float64{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}))
a = NodeFromAny(g, aV, WithName("a'"))
b2, err := DiagFlat(a)
if err != nil {
	fmt.Println(err)
	return
}
m = NewTapeMachine(g)
if err := m.RunAll(); err != nil {
	fmt.Println(err)
}
fmt.Printf("a:\n%v", a.Value())
fmt.Printf("b:\n%v\n", b2.Value())
// 1 dimensional
aV = tensor.New(tensor.WithShape(2), tensor.WithBacking([]float64{1, 2}))
a = NodeFromAny(g, aV, WithName("a''"))
b3, err := DiagFlat(a)
if err != nil {
	fmt.Println(err)
	return
}
m = NewTapeMachine(g)
if err := m.RunAll(); err != nil {
	fmt.Println(err)
}
fmt.Printf("a:\n%v\n", a.Value())
fmt.Printf("b:\n%v\n", b3.Value())
// Scalars
a = NodeFromAny(g, 100.0, WithName("aScalar"))
_, err = DiagFlat(a)
fmt.Println(err)
Output:
a:
⎡1 2⎤
⎣3 4⎦
b:
⎡1 0 0 0⎤
⎢0 2 0 0⎥
⎢0 0 3 0⎥
⎣0 0 0 4⎦
a:
⎡ 1 2⎤
⎢ 3 4⎥
⎣ 5 6⎦
⎡ 7 8⎤
⎢ 9 10⎥
⎣11 12⎦
b:
⎡ 1 0 0 0 ... 0 0 0 0⎤
⎢ 0 2 0 0 ... 0 0 0 0⎥
⎢ 0 0 3 0 ... 0 0 0 0⎥
⎢ 0 0 0 4 ... 0 0 0 0⎥
.
.
.
⎢ 0 0 0 0 ... 9 0 0 0⎥
⎢ 0 0 0 0 ... 0 10 0 0⎥
⎢ 0 0 0 0 ... 0 0 11 0⎥
⎣ 0 0 0 0 ... 0 0 0 12⎦
a:
[1 2]
b:
⎡1 0⎤
⎣0 2⎦
Cannot perform DiagFlat on a scalar equivalent node
Div is a shortcut function for HadamardDiv for scalar values. For matrix/tensor values, the matrix division operation is not yet handled, and will panic.
Dropout is a convenience function to implement dropout. It randomly zeroes out elements of a *Tensor with a probability drawn from a uniform distribution.
Eq performs a pointwise eq operation. retSame indicates if the data type of the return value should be the same as the input data type. It defaults to Bool otherwise.
Exp performs a pointwise exp.
Expm1 performs a pointwise expm1.
Floor performs a pointwise floor.
GaussianRandomNode creates an input node that has a random op so that every time the node is passed, random values will be plucked from a Gaussian distribution with the mean and stdev provided. The type of the node depends on the shape passed in. To get a scalar value at run time, don't pass in any shapes.
GlobalAveragePool2D consumes an input tensor X and applies average pooling across the values in the same channel. The expected input shape is BCHW where B is the batch size, C is the number of channels, and H and W are the height and the width of the data.
Gt performs a pointwise gt operation. retSame indicates if the data type of the return value should be the same as the input data type. It defaults to Bool otherwise.
Gte performs a pointwise gte operation. retSame indicates if the data type of the return value should be the same as the input data type. It defaults to Bool otherwise.
HadamardDiv performs a pointwise hadamarddiv operation.
HadamardProd performs a pointwise hadamardprod operation.
Im2Col converts a BCHW image block to columns. The kernel, pad and stride parameters must each be a shape of size 2, no more, no less. This poor naming scheme clearly comes from MATLAB.
Inverse performs a pointwise inverse.
InverseSqrt performs a pointwise inversesqrt.
KeepDims is a function that ensures that input and output dimensions are the same though the shape may change.
The expandLeft flag in the function indicates if any shape expansion should be done leftwards or rightwards. For example, if fn() returns a tensor with a shape (3) and the desired dimension is 2, then if `expandLeft` is true the result will be `(1, 3)`. Otherwise the result will be `(3, 1)`.
At the moment, results that turn into scalars cannot have their dimensions kept - the semantics isn't well established yet and is a work in progress.
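The effect of the expandLeft flag on a shape can be sketched in a few lines of plain Go. The helper below pads a shape with 1s on the left or the right until it has the desired number of dimensions; expandShape is a hypothetical name and works on a plain []int, not on Gorgonia's shape type.

```go
package main

import "fmt"

// expandShape pads shape with 1s until it has dims dimensions. The 1s go on
// the left when expandLeft is true and on the right otherwise. A shape that
// already has at least dims dimensions is returned as a copy.
func expandShape(shape []int, dims int, expandLeft bool) []int {
	if len(shape) >= dims {
		return append([]int(nil), shape...)
	}
	ones := make([]int, dims-len(shape))
	for i := range ones {
		ones[i] = 1
	}
	if expandLeft {
		return append(ones, shape...)
	}
	return append(append([]int(nil), shape...), ones...)
}

func main() {
	fmt.Println(expandShape([]int{3}, 2, true))  // [1 3]
	fmt.Println(expandShape([]int{3}, 2, false)) // [3 1]
}
```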
LeakyRelu returns a node whose underlying value is:
f(x) = alpha * x if x < 0
f(x) = x for x ⩾ 0
applied elementwise.
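Written out for a slice of float64 values, the definition above amounts to the following plain-Go sketch. The names leakyRelu and leakyReluAll are assumptions for the example; the real LeakyRelu operates on graph nodes.

```go
package main

import "fmt"

// leakyRelu returns alpha*x for negative x and x otherwise.
func leakyRelu(x, alpha float64) float64 {
	if x < 0 {
		return alpha * x
	}
	return x
}

// leakyReluAll applies leakyRelu to every element and returns a new slice.
func leakyReluAll(xs []float64, alpha float64) []float64 {
	out := make([]float64, len(xs))
	for i, x := range xs {
		out[i] = leakyRelu(x, alpha)
	}
	return out
}

func main() {
	fmt.Println(leakyReluAll([]float64{-2, 0, 3}, 0.5))
}
```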
Log performs a pointwise log.
Log1p performs a pointwise log1p.
Log2 performs a pointwise log2.
LogSumExp performs addition in the log domain
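Addition in the log domain means computing log(exp(x1) + ... + exp(xn)). A naive evaluation overflows for large inputs, so the usual trick is to subtract the maximum first. The sketch below shows that trick in plain Go; the name logSumExp and the slice-based signature are assumptions, and nothing here is claimed about how Gorgonia's LogSumExp is implemented.

```go
package main

import (
	"fmt"
	"math"
)

// logSumExp returns log(sum(exp(x))) for the given values, shifting by the
// maximum so that the exponentials cannot overflow.
func logSumExp(xs []float64) float64 {
	if len(xs) == 0 {
		return math.Inf(-1) // log of an empty sum
	}
	m := xs[0]
	for _, x := range xs[1:] {
		if x > m {
			m = x
		}
	}
	sum := 0.0
	for _, x := range xs {
		sum += math.Exp(x - m)
	}
	return m + math.Log(sum)
}

func main() {
	fmt.Println(logSumExp([]float64{0, 0}))
	// exp(1000) overflows float64, but the shifted form stays finite.
	fmt.Println(logSumExp([]float64{1000, 1000}))
}
```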
Lt performs a pointwise lt operation. retSame indicates if the data type of the return value should be the same as the input data type. It defaults to Bool otherwise.
Lte performs a pointwise lte operation. retSame indicates if the data type of the return value should be the same as the input data type. It defaults to Bool otherwise.
Max performs a max() on the input and the provided axes.
MaxPool1D applies a maxpool on the node x.
MaxPool2D applies the kernel filter to the input node. The pad slice can have two different lengths.
- if len(pad) == 2, padding is assumed to be symmetric, and padding is added both up *and* down to each dimension
paddedOutputH = pad[0] + inputH + pad[0]
paddedOutputW = pad[1] + inputW + pad[1]
- if len(pad) == 4, padding is explicit and can be asymmetric.
paddedOutputH = pad[0] + inputH + pad[1]
paddedOutputW = pad[2] + inputW + pad[3]
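The two padding conventions can be captured in one small helper. The sketch below is plain Go and follows the formulas given above for len(pad) == 2 and len(pad) == 4; the name paddedSize and the error returned for other lengths are assumptions for the example.

```go
package main

import "fmt"

// paddedSize returns the padded height and width for the two pad layouts:
// {padH, padW} (symmetric) or {top, bottom, left, right} (explicit).
func paddedSize(inputH, inputW int, pad []int) (h, w int, err error) {
	switch len(pad) {
	case 2:
		return pad[0] + inputH + pad[0], pad[1] + inputW + pad[1], nil
	case 4:
		return pad[0] + inputH + pad[1], pad[2] + inputW + pad[3], nil
	default:
		return 0, 0, fmt.Errorf("pad must have length 2 or 4, got %d", len(pad))
	}
}

func main() {
	h, w, _ := paddedSize(5, 7, []int{1, 2})
	fmt.Println(h, w) // 7 11
	h, w, _ = paddedSize(5, 7, []int{1, 0, 2, 3})
	fmt.Println(h, w) // 6 12
}
```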
Mean performs a mean() on the input and the provided axes.
Mish is a novel activation function that is self regularizing.
https://arxiv.org/abs/1908.08681
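The linked paper defines Mish as f(x) = x * tanh(softplus(x)), with softplus(x) = ln(1 + e^x). A scalar version in plain Go might look like the sketch below; the function name mish is an assumption, and the real Mish op works on graph nodes.

```go
package main

import (
	"fmt"
	"math"
)

// mish returns x * tanh(softplus(x)), where softplus(x) = ln(1 + e^x).
func mish(x float64) float64 {
	softplus := math.Log1p(math.Exp(x))
	return x * math.Tanh(softplus)
}

func main() {
	for _, x := range []float64{-2, 0, 1, 10} {
		fmt.Printf("mish(%v) = %.4f\n", x, mish(x))
	}
}
```

For large positive x the function behaves almost like the identity, and for large negative x it approaches zero from below.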
Mul is the general handler for multiplication of nodes. It is extremely overloaded. Only use if you know what you're doing
If any of the nodes are ScalarType, then it'll be redirected to HadamardProd() instead.
If the nodes are both vectors (that is, have a shape of (x, 1) or (1, x)), then the operator used will be a vectorDot.
If only one of the nodes is a vector, then the operator used will be a matrix-vector multiplication, and most importantly, a transpose will be used (when necessary).
If both nodes are matrices, then well, matrix multiplication will be done.
func Must(n *Node, err error, opts ...NodeConsOpt) *Node
Must indicates a node must be created. If there isn't a node created, or there was an error, it subsumes the error, and immediately panics
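The Must pattern trades error returns for panics so that calls can be nested. The sketch below shows the general shape of the pattern with a stand-in type. The node type, newNode and must are hypothetical and only mimic the (T, error) convention; they are not Gorgonia's types or functions.

```go
package main

import (
	"errors"
	"fmt"
)

// node is a hypothetical stand-in for a graph node.
type node struct{ name string }

// newNode is a hypothetical constructor using the (T, error) convention.
func newNode(name string) (*node, error) {
	if name == "" {
		return nil, errors.New("node needs a name")
	}
	return &node{name: name}, nil
}

// must returns n, and panics if err is non-nil or no node was created.
func must(n *node, err error) *node {
	if err != nil {
		panic(err)
	}
	if n == nil {
		panic("no node was created")
	}
	return n
}

// didPanic reports whether f panicked.
func didPanic(f func()) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	f()
	return false
}

func main() {
	n := must(newNode("x")) // the two results feed straight into must
	fmt.Println(n.name)
	fmt.Println(didPanic(func() { must(newNode("")) }))
}
```

The SoftMax example in this documentation uses the real Must in the same way, as in Must(SoftMax(a)).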
Ne performs a pointwise ne operation. retSame indicates if the data type of the return value should be the same as the input data type. It defaults to Bool otherwise.
Neg performs a pointwise neg.
func NewConstant(v interface{}, opts ...NodeConsOpt) *Node
NewConstant takes in any reasonable value and makes it a constant node.
NewMatrix creates a Node representing a variable that holds a matrix (nxm)
NewScalar creates a Node representing a variable that holds a scalar value
NewTensor creates a Node representing a variable that holds a tensor (any n-dimensional array with dimensions greater than 2)
func NewUniqueNode(opts ...NodeConsOpt) *Node
NewUniqueNode creates a new unique node in a graph. If no graph was specified in the construction options then it will just return a graphless node.
NewVector creates a Node representing a variable that holds a vector (nx1 matrix)
func NodeFromAny(g *ExprGraph, any interface{}, opts ...NodeConsOpt) *Node
NodeFromAny creates a Node from a tensor.Tensor, automatically filling in shape and type info
Norm returns the p-norm of a Value. Use p=2 if you want to use unordered norms.
This is a simpler version of the norms found in the Tensor package, which specializes and optimizes even more (well, given it's adapted from Numpy, it is clearly way more optimized)
OneHotVector creates a node representing a one hot vector
OuterProd returns a Node representing the outer product of two vectors. This function will return an error if both input nodes are not vectors
Pow performs a pointwise pow operation.
Ravel flattens the given node and returns the new node
Read allows for extraction of the value of the *Node at runtime into a Value. To achieve this, a pointer to a Value (*Value) is passed into this function, not a Value. The 'into' value remains nil until the execution of the graph (via a call to the Run() methods of the VM)
Rectify is a convenience function for creating rectified linear units activation functions. This function uses ⩾, which is the canonical version. If you want to use >, you can create your own by just following this.
func ReduceAdd(nodes Nodes, opts ...NodeConsOpt) (retVal *Node, err error)
ReduceAdd takes a slice of *Nodes, and folds them into one by adding
func ReduceMul(nodes Nodes, opts ...NodeConsOpt) (retVal *Node, err error)
ReduceMul is like foldl(*, nodes)
Reshape reshapes a node and returns a new node with the new shape
Set is the equivalent of doing this:
a = b
where a and b are both variables
Sigmoid performs a pointwise sigmoid.
Sign performs a pointwise sign.
Sin performs a pointwise sin.
SizeOf returns the size of a value along an axis
Slice slices a *Node. For T[:] slices, pass in nil. Will error out if node's type is not a Tensor
SoftMax implements the softmax operation. The softmax operation is a stable operation.
Code:
g := NewGraph()
t := tensor.New(tensor.WithShape(2, 3), tensor.WithBacking([]float64{1, 3, 2, 3, 2, 1}))
u := t.Clone().(*tensor.Dense)
v := tensor.New(tensor.WithShape(2, 2, 3), tensor.WithBacking([]float64{
1, 3, 2,
4, 2, 1,
3, 5, 3,
2, 1, 5,
}))
a := NodeFromAny(g, t, WithName("a"))
b := NodeFromAny(g, u, WithName("b"))
c := NodeFromAny(g, v, WithName("c"))
sm1 := Must(SoftMax(a))
sm0 := Must(SoftMax(b, 0))
sm := Must(SoftMax(c))
m := NewTapeMachine(g)
if err := m.RunAll(); err != nil {
	panic(err)
}
fmt.Printf("a:\n%v\nsoftmax(a) - along last axis (default behaviour):\n%1.2f", a.Value(), sm1.Value())
fmt.Printf("b:\n%v\nsoftmax(b) - along axis 0:\n%1.2f", b.Value(), sm0.Value())
tmp := fmt.Sprintf("c %v:\n%v\nsoftmax(c) - along last axis (default behaviour) %v:\n%1.2f", c.Value().Shape(), c.Value(), sm.Value().Shape(), sm.Value())
fmt.Println(strings.Replace(tmp, "\n\n\n", "\n\n", -1))
// the requirement to use tmp and strings.Replace is because when Go runs example tests, it strips excess newlines.
Output:
a:
⎡1 3 2⎤
⎣3 2 1⎦
softmax(a) - along last axis (default behaviour):
⎡0.09 0.67 0.24⎤
⎣0.67 0.24 0.09⎦
b:
⎡1 3 2⎤
⎣3 2 1⎦
softmax(b) - along axis 0:
⎡0.12 0.73 0.73⎤
⎣0.88 0.27 0.27⎦
c (2, 2, 3):
⎡1 3 2⎤
⎣4 2 1⎦
⎡3 5 3⎤
⎣2 1 5⎦
softmax(c) - along last axis (default behaviour) (2, 2, 3):
⎡0.09 0.67 0.24⎤
⎣0.84 0.11 0.04⎦
⎡0.11 0.79 0.11⎤
⎣0.05 0.02 0.94⎦
Softplus performs a pointwise softplus.
Sparsemax - implements the sparsemax operation described here: http://proceedings.mlr.press/v48/martins16.pdf
Sqrt performs a pointwise sqrt.
Square performs a pointwise square.
Sub performs a pointwise sub operation.
Sum performs a sum() on the input and the provided axes.
Tanh performs a pointwise tanh.
Tensordot performs a tensor contraction of a and b along specified axes.
func ExampleTensordot_scalar() {
	// Scalars
	g := NewGraph()
	a := NewScalar(g, Float64, WithValue(2.0), WithName("a"))
	b := NewScalar(g, Float64, WithValue(21.0), WithName("b"))
	c, err := Tensordot([]int{0}, []int{0}, a, b)
	if err != nil {
		fmt.Printf("Cannot call Tensordot. Error: %v\n", err)
		return
	}
	vm := NewTapeMachine(g)
	if err := vm.RunAll(); err != nil {
		fmt.Printf("Cannot perform scalars. Error %v\n", err)
	}
	fmt.Printf("c: %v (%v) of %v", c.Value(), c.Value().Dtype(), c.Value().Shape())
	// Output:
	//...
}
Code:
g := NewGraph()
a := NewVector(g, Float64, WithName("a"), WithShape(2), WithInit(RangedFrom(2)))
b := NewVector(g, Float64, WithName("b"), WithShape(2), WithInit(RangedFrom(21)))
c, err := Tensordot([]int{0}, []int{0}, a, b)
if err != nil {
	fmt.Printf("Cannot call Tensordot. Error: %v\n", err)
	return
}
vm := NewTapeMachine(g)
if err := vm.RunAll(); err != nil {
	fmt.Printf("Cannot perform tensordot on vectors. Error %v\n", err)
}
fmt.Printf("a %v b %v ", a.Value(), b.Value())
fmt.Printf("c: %v (%v) of %v", c.Value(), c.Type(), c.Value().Shape())
Output:
a [2 3] b [21 22] c: [108] (float64) of (1)
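For two vectors contracted along their only axis, the tensor contraction reduces to an ordinary dot product. The sketch below checks the arithmetic of the example above in plain Go; `dot` is a hypothetical helper and not part of Gorgonia.

```go
package main

import "fmt"

// dot contracts two equal-length vectors along their only axis.
// Hypothetical helper for illustration; not part of Gorgonia.
func dot(a, b []float64) (float64, error) {
	if len(a) != len(b) {
		return 0, fmt.Errorf("length mismatch: %d vs %d", len(a), len(b))
	}
	sum := 0.0
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum, nil
}

func main() {
	c, err := dot([]float64{2, 3}, []float64{21, 22})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(c) // 2*21 + 3*22 = 108
}
```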
Transpose performs a transpose on the input and provided permutation axes.
UniformRandomNode creates an input node that has a random op, so every time the node is passed, random values will be plucked from a uniform distribution. The type of the node depends on the shape passed in. To get a scalar value at run time, don't pass in any shapes.
Upsample2D upscales a Tensor by the given scale factor. For example,
1, 2
3, 4
converts to
1, 1, 2, 2
1, 1, 2, 2
3, 3, 4, 4
3, 3, 4, 4
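The conversion shown above is nearest-neighbour upscaling. A minimal plain-Go sketch follows; `upsample2D` is a hypothetical helper, and treating `scale` as the repetition count per axis (so 2 doubles each dimension) is an assumption that may not match how the Gorgonia op interprets its scale argument.

```go
package main

import "fmt"

// upsample2D repeats every element of m scale times along both axes
// (nearest-neighbour upscaling). Hypothetical helper, not the Gorgonia op.
func upsample2D(m [][]int, scale int) [][]int {
	out := make([][]int, 0, len(m)*scale)
	for _, row := range m {
		// Widen the row by repeating each element scale times.
		wide := make([]int, 0, len(row)*scale)
		for _, v := range row {
			for k := 0; k < scale; k++ {
				wide = append(wide, v)
			}
		}
		// Repeat the widened row scale times, copying so rows do not share storage.
		for k := 0; k < scale; k++ {
			out = append(out, append([]int(nil), wide...))
		}
	}
	return out
}

func main() {
	for _, row := range upsample2D([][]int{{1, 2}, {3, 4}}, 2) {
		fmt.Println(row)
	}
}
```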
func YOLOv3(input *Node, anchors []float32, masks []int, netSize, numClasses int, ignoreTresh float32, targets ...*Node) (*Node, error)
YOLOv3 https://arxiv.org/abs/1804.02767
Clone clones the node. There are some caveats:
- the graph is not copied over
- the node essentially does not belong to a collection
- there is no ID
- the children are not cloned
CloneTo clones the node into a new graph. If CloneTo() is called on the same graph as n, it will return n. This is done because, at any given time, every node should be unique in the *ExprGraph.
TODO: clone children as well (this means that CloneTo() is currently only suitable for input nodes)
Device returns the device the data will be on
Dims indicates how many dimensions the node's result has
Dtype returns the dtype of the node
Err always returns nil. However, this method is implemented to enable nicer composition of functions
Grad returns the gradient if there is one.
func (n *Node) GradOnDevice(dev Device, extern External) (retVal Value, allocOnExtern bool, err error)
GradOnDevice gets the gradient value of the node as a Value but on the desired device. In this build the device is always CPU, so it's equivalent to calling .Grad()
Graph returns the graph of the node
Groups to fulfil the encoding Grouper interface
Hashcode provides the hash for the tree, assuming that the node is the root of the tree. Original implementation was here by Vatine (who's apparently 80 years old and using SO!?!):
http://stackoverflow.com/questions/1988665/hashing-a-tree-structure
ID returns the ID of the node. This satisfies the gonum/graph.Node interface
IsColVec indicates if a node represents a Column Vector. This is based on the type of the node, not the actual value associated with the node
IsMatrix indicates if a node represents a matrix. This is based on the type of the node, not the actual value associated with the node
IsRowVec indicates if a node represents a Row Vector. This is based on the type of the node, not the actual value associated with the node
IsScalar indicates if a node represents a scalar value. This is based on the type of the node, not the actual value associated with the node
IsVar returns true if the node represents a differentiable variable (i.e. it's an argument to the function that is not a statement)
IsVec returns whether this node is a vector
IsVector indicates if a node represents a vector value. This is based on the type of the node, not the actual value associated with the node
Name returns the name of the node. If a name was specified and it is too long, the short name will be used instead (except in inputs)
The short name is typically of the form: OpName(%1, %2 ...), making it read more like a function call
Node returns itself. These sorts of monoidal patterns are useful for compositions via interfaces.
Nodes returns n as a slice of *Node. Again, this is mostly useful for interfaces
Op returns the Op of the node
RestrictedToDot prints the graphviz compatible string but does not print the entire tree. The up and down arguments indicate how many levels to look up and how many levels to look down.
Shape returns the shape of the node
Strides returns the strides of the value of the node
String() implements the fmt.Stringer interface
ToDot returns the graph as a graphviz compatible string. DEPRECATED: This function will be removed in the next release, please use the encoding/dot package
Type returns the type of the node
Value returns the value bound to the node. May return nil
func (n *Node) ValueOnDevice(dev Device, extern External) (retVal Value, allocOnExtern bool, err error)
ValueOnDevice gets the value of the node as a Value but on the desired device. In this build the device is always CPU, so it's equivalent to calling .Value()
WriteHash writes the hash to the provided Hash32.
NodeConsOpt is a function that provides construction options for any Node.
func In(g *ExprGraph) NodeConsOpt
In is a node construction option to set a node's graph. A `*Node`'s graph is immutable. If the graph has already been set, a check will be made that the specified *ExprGraph and the *ExprGraph set in the *Node are the same. If they are not, the function will panic.
func WithChildren(children Nodes) NodeConsOpt
WithChildren sets the children of a node to the specified children. This construction option does NOT check whether children already exist, and will overwrite any existing children.
func WithGrad(any interface{}) NodeConsOpt
WithGrad is a node construction option that binds the gradient value to the *Node. This function may panic if:
- There isn't already a value associated with the node (.boundTo == nil)
- The type of the Value does not match the value of the node.
func WithGroupName(name string) NodeConsOpt
WithGroupName is a node construction option to group a *Node within a particular group. This option is useful for debugging with graphs. This function is deprecated and will probably be removed in the next version.
func WithInit(fn InitWFn) NodeConsOpt
WithInit is a node construction option to initialize a *Node with the InitWFn provided.
func WithName(name string) NodeConsOpt
WithName is a node construction option that gives the *Node the provided name. This is especially useful in debugging graphs.
func WithOp(op Op) NodeConsOpt
WithOp is a node construction option to set a node's Op to the specified Op. `Op`s in `*Node`s are immutable once set and cannot be changed. If the node already has an Op specified, a check will be made to see if the provided Op and the one already specified in the `*Node` are the same. Note that comparison of Ops is done using the `Hashcode()` method of Ops, and hash collisions MAY occur. If the two Ops are different, this function will panic.
func WithShape(shp ...int) NodeConsOpt
WithShape is a node construction option to initialize a *Node with a particular shape. This function panics if the shape's dimensions do not match the specified dimensions of the *Node.
func WithType(t hm.Type) NodeConsOpt
WithType is a node construction option to set a node to the specified type. Types in *Node are immutable once set. If the type has already been specified in the node, a check will be made to see if both types are the same. If they are not, it will panic.
func WithValue(any interface{}) NodeConsOpt
WithValue is a node construction option that binds the value to the *Node. This function may panic if:
- Gorgonia was unable to convert interface{} into a Value.
- The type of the Value does not match the type of the node.
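The NodeConsOpt functions above follow the functional options pattern: each constructor returns a closure that mutates the node being built. The sketch below shows the pattern in isolation. The `node`, `nodeOpt`, `withName`, `withShape` and `newNode` names are hypothetical stand-ins, and the real options also perform the checks (and panics) described above, which this sketch omits.

```go
package main

import "fmt"

// node and nodeOpt are hypothetical stand-ins for *Node and NodeConsOpt.
type node struct {
	name  string
	shape []int
}

type nodeOpt func(*node)

// withName returns an option that sets the node's name.
func withName(name string) nodeOpt {
	return func(n *node) { n.name = name }
}

// withShape returns an option that sets the node's shape (copied defensively).
func withShape(shp ...int) nodeOpt {
	return func(n *node) { n.shape = append([]int(nil), shp...) }
}

// newNode applies each option, in order, to a zero-valued node.
func newNode(opts ...nodeOpt) *node {
	n := &node{}
	for _, opt := range opts {
		opt(n)
	}
	return n
}

func main() {
	n := newNode(withName("x"), withShape(2, 3))
	fmt.Println(n.name, n.shape)
}
```

The appeal of this design is that constructors keep a stable signature while new options can be added without breaking existing callers.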
NodeSet is the primary type that represents a set
NewNodeSet creates and returns a reference to an empty set.
Add adds an item to the current set if it doesn't already exist in the set.
Cardinality returns how many items are currently in the set.
Clear clears the entire set to be the empty set.
Clone returns a clone of the set. Does NOT clone the underlying elements.
Contains determines if a given item is already in the set.
ContainsAll determines if the given items are all in the set
Difference returns a new set with items in the current set but not in the other set
Equal determines if two sets are equal to each other. If they both are the same size and have the same items they are considered equal. Order of items is not relevant for sets to be equal.
Intersect returns a new set with items that exist only in both sets.
IsSubset determines if every item in the other set is in this set.
IsSuperset determines if every item of this set is in the other set.
Iter returns a channel of type *Node that you can range over.
Remove allows the removal of a single item in the set.
SymmetricDifference returns a new set with items in the current set or the other set but not in both.
ToSlice returns the elements of the current set as a slice
Union returns a new set with all items in both sets.
Nodes is a slice of nodes, but it also acts as a set of nodes by implementing the Sort interface
Backpropagate backpropagates errors by performing reverse-mode symbolic differentiation, starting from the outputs and working its way towards the inputs.
This is the rough algorithm:
1. Filter out nodes that are unreachable
2. Forwards analysis, where a list of nodes affecting the output is added to consideration
3. Backwards analysis, where a list of nodes affected by differentiating the output are added to the consideration
4. If there is a difference in both sets, it will cause an error (both sets should be the same)
5. Traverse the graph from output towards input. On each visit, perform the symbolic differentiation
For most cases, Grad() should be used instead of Backpropagate(), as Grad() performs several checks which would be the general use case, before calling Backpropagate()
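The final step of the algorithm above, traversing from the output towards the inputs, can be sketched in plain Go. This is only an illustration of the reverse traversal and the chain rule: it accumulates numeric gradients rather than building symbolic gradient nodes as Backpropagate does, it skips the reachability and consistency checks of steps 1 to 4, and the `node` type and its "add"/"mul" ops are hypothetical.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a graph node: an input (empty op),
// or an "add" or "mul" of exactly two children.
type node struct {
	op       string
	children []*node
	value    float64
	grad     float64
}

// forward evaluates nodes, which must be listed in topological order
// (children before parents).
func forward(nodes []*node) {
	for _, n := range nodes {
		switch n.op {
		case "add":
			n.value = n.children[0].value + n.children[1].value
		case "mul":
			n.value = n.children[0].value * n.children[1].value
		}
	}
}

// backward walks from the output (the last node) towards the inputs,
// accumulating d(output)/d(node) in grad via the chain rule.
func backward(nodes []*node) {
	nodes[len(nodes)-1].grad = 1
	for i := len(nodes) - 1; i >= 0; i-- {
		n := nodes[i]
		switch n.op {
		case "add":
			n.children[0].grad += n.grad
			n.children[1].grad += n.grad
		case "mul":
			n.children[0].grad += n.grad * n.children[1].value
			n.children[1].grad += n.grad * n.children[0].value
		}
	}
}

func main() {
	x := &node{value: 2}
	y := &node{value: 2.5}
	xy := &node{op: "mul", children: []*node{x, y}}
	z := &node{op: "add", children: []*node{xy, x}} // z = x*y + x
	nodes := []*node{x, y, xy, z}
	forward(nodes)
	backward(nodes)
	fmt.Println(z.value, x.grad, y.grad) // z = 7, dz/dx = y + 1 = 3.5, dz/dy = x = 2
}
```

Because x feeds into z along two paths, its gradient is the sum of both contributions, which is why the sketch accumulates with `+=` instead of assigning.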
Grad takes a scalar cost node and a list of with-regards-to, and returns the gradient
NodesFromInputs creates a Nodes from a list of Input.
Sort topologically sorts an ExprGraph: the root of the graph will be first. Nodes are sorted using gonum's SortStabilized function.
see https://godoc.org/gonum.org/v1/gonum/graph/topo#SortStabilized for more info
Unconcat is the opposite of the built-in concat function. TODO: port this back to Gorgonia and use Gorgonia's sli instead
Code:
g := NewGraph()
x := NewTensor(g, Float64, 4, WithShape(2, 3, 4, 5), WithInit(RangedFrom(0)), WithName("x"))
y := NewTensor(g, Float64, 4, WithShape(2, 3, 4, 5), WithInit(RangedFrom(120)), WithName("y"))
z, err := Concat(2, x, y)
if err != nil {
	panic(err)
}
unconcats, err := Unconcat(z, 2, 2)
if err != nil {
	panic(err)
}
a, b := unconcats[0], unconcats[1]
m := NewTapeMachine(g)
if err := m.RunAll(); err != nil {
	panic(err)
}
tmp := fmt.Sprintf("a %v\n%v\nb %v\n%v", a.Value().Shape(), a.Value(), b.Value().Shape(), b.Value())
fmt.Println(strings.Replace(tmp, "\n\n", "\n", -1))
Output:
a (2, 3, 4, 5)
⎡ 0 1 2 3 4⎤
⎢ 5 6 7 8 9⎥
⎢ 10 11 12 13 14⎥
⎣ 15 16 17 18 19⎦
⎡ 20 21 22 23 24⎤
⎢ 25 26 27 28 29⎥
⎢ 30 31 32 33 34⎥
⎣ 35 36 37 38 39⎦
⎡ 40 41 42 43 44⎤
⎢ 45 46 47 48 49⎥
⎢ 50 51 52 53 54⎥
⎣ 55 56 57 58 59⎦
⎡ 60 61 62 63 64⎤
⎢ 65 66 67 68 69⎥
⎢ 70 71 72 73 74⎥
⎣ 75 76 77 78 79⎦
⎡ 80 81 82 83 84⎤
⎢ 85 86 87 88 89⎥
⎢ 90 91 92 93 94⎥
⎣ 95 96 97 98 99⎦
⎡100 101 102 103 104⎤
⎢105 106 107 108 109⎥
⎢110 111 112 113 114⎥
⎣115 116 117 118 119⎦
b (2, 3, 4, 5)
⎡120 121 122 123 124⎤
⎢125 126 127 128 129⎥
⎢130 131 132 133 134⎥
⎣135 136 137 138 139⎦
⎡140 141 142 143 144⎤
⎢145 146 147 148 149⎥
⎢150 151 152 153 154⎥
⎣155 156 157 158 159⎦
⎡160 161 162 163 164⎤
⎢165 166 167 168 169⎥
⎢170 171 172 173 174⎥
⎣175 176 177 178 179⎦
⎡180 181 182 183 184⎤
⎢185 186 187 188 189⎥
⎢190 191 192 193 194⎥
⎣195 196 197 198 199⎦
⎡200 201 202 203 204⎤
⎢205 206 207 208 209⎥
⎢210 211 212 213 214⎥
⎣215 216 217 218 219⎦
⎡220 221 222 223 224⎤
⎢225 226 227 228 229⎥
⎢230 231 232 233 234⎥
⎣235 236 237 238 239⎦
UnstableSort performs a topological sort of the directed graph g returning the 'from' to 'to' sort order. If a topological ordering is not possible, an Unorderable error is returned listing cyclic components in g with each cyclic component's members sorted by ID. When an Unorderable error is returned, each cyclic component's topological position within the sorted nodes is marked with a nil graph.Node.
Add adds to set
AllSameGraph returns true if all the nodes in the slice belong to the same graph. Note that constants do not have to belong to the same graph.
Contains checks if the wanted node is in the set
Difference is ns - other. Bear in mind it is NOT commutative
Equals returns true if two Nodes are the same
Err returns nil always
Format implements fmt.Formatter, which allows Nodes to be differently formatted depending on the verbs
Intersect performs an intersection with other Nodes
Node returns nil. Always. This is bound to cause a panic somewhere if a program is not using it correctly. The reason for implementing this is so that it may fulfil common interfaces.
Nodes returns itself. This is useful for interfaces
Set returns a uniquified slice. It mutates the slice.
type Op interface {
	// Arity returns the number of inputs the Op expects. -1 indicates that it's n-ary and will be determined at runtime
	Arity() int

	// Informs the type of the Op (not the node). This will be used by the type system to infer the final type of the node
	Type() hm.Type

	// returns the output shape as a function of the inputs
	InferShape(...DimSizer) (tensor.Shape, error)

	// executes the op
	Do(...Value) (Value, error)

	// indicates if the Op will return a pointer (allowing possible inplace edits) or by value
	// if it's false, the return value of the Op will be a copy of its input
	ReturnsPtr() bool

	// Does this op potentially call external (cgo or cuda) functions (thereby requiring extra overhead for Go's trampolining thing)
	CallsExtern() bool

	// overwriteInput() is a method which states which input the output will be overwriting.
	// This allows for some efficiency gains as the underlying arrays wouldn't have to be re-allocated.
	// The method returns an int instead of a bool because potentially different operations may be allowed
	// to overwrite certain inputs. For example, consider an operation to increment a value:
	// the IncrementOp would be a unary operator, and assuming we would like to overwrite the input,
	// the retVal of overwriteInput() will be 0 (inputs[0]).
	// -1 is returned if overwriting of input is disallowed
	OverwritesInput() int

	/* Other methods */
	WriteHash(h hash.Hash)
	Hashcode() uint32
	fmt.Stringer
}
An Op is a symbolic representation of an operation. Think of them as functions, taking an input (or multiple), and outputting something
All Ops have type signatures that look like this:
OpName :: (Floats a) ⇒ Tensor a → Tensor a → Tensor a
type RMSPropSolver struct {
// contains filtered or unexported fields
}
RMSPropSolver is a solver that implements Geoffrey Hinton's RMSProp gradient descent optimization algorithm. http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf
func NewRMSPropSolver(opts ...SolverOpt) *RMSPropSolver
NewRMSPropSolver creates an RMSProp solver with these default values:
eta (learn rate)      : 0.001
eps (smoothing factor): 1e-8
rho (decay factor)    : 0.999
func (s *RMSPropSolver) Step(model []ValueGrad) (err error)
Step steps through each node in the model and applies the RMSProp gradient descent algorithm on the value.
This function will error out if the nodes do not have an associated Grad value.
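As a rough illustration of what one RMSProp step does with the default values listed above, here is a plain-Go sketch of the textbook update rule. The `rmsProp` type is hypothetical, and the exact placement of eps (added to the square root here) is an assumption; Gorgonia's solver may differ in such details and additionally supports clipping and regularization, which are not shown.

```go
package main

import (
	"fmt"
	"math"
)

// rmsProp holds the hyperparameters plus a per-weight cache of squared
// gradients. Hypothetical type, not Gorgonia's RMSPropSolver.
type rmsProp struct {
	eta, eps, rho float64
	cache         []float64
}

// step updates the weights w in place from the gradients g.
func (s *rmsProp) step(w, g []float64) error {
	if len(w) != len(g) {
		return fmt.Errorf("got %d weights but %d gradients", len(w), len(g))
	}
	if s.cache == nil {
		s.cache = make([]float64, len(w))
	}
	for i := range w {
		// Exponentially decayed running average of the squared gradient.
		s.cache[i] = s.rho*s.cache[i] + (1-s.rho)*g[i]*g[i]
		// Scale the step by the root of that average; eps avoids division by zero.
		w[i] -= s.eta * g[i] / (math.Sqrt(s.cache[i]) + s.eps)
	}
	return nil
}

func main() {
	s := &rmsProp{eta: 0.001, eps: 1e-8, rho: 0.999}
	w := []float64{1.0, -1.0}
	if err := s.step(w, []float64{0.5, -0.5}); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(w)
}
```

Dividing by the running root-mean-square means that weights with consistently large gradients take proportionally smaller steps.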
ReductionOp changes the shape of the node
Result is either a Node or Nodes or error. It's a poor man's sum type and it's not sealed for good reason
LiftResult creates a Result from an Input and error pair. If the error is not nil, the Input is discarded.
The usual use case is in a function that returns a `(*Node, error)`, e.g. LiftResult(Add(a, b))
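The idea behind lifting a `(value, error)` pair into a single result is that functions can then be nested without an `if err != nil` check between each call. The sketch below shows the pattern on plain floats; `result`, `lift`, `div` and `addR` are hypothetical names, not Gorgonia's Result or LiftResult.

```go
package main

import (
	"errors"
	"fmt"
)

// result is a hypothetical stand-in for Result: a value that may carry an error.
type result struct {
	val float64
	err error
}

// lift wraps a (value, error) pair; the value is discarded when err is non-nil.
func lift(v float64, err error) result {
	if err != nil {
		return result{err: err}
	}
	return result{val: v}
}

// div is a fallible operation in the usual (T, error) style.
func div(a, b float64) (float64, error) {
	if b == 0 {
		return 0, errors.New("division by zero")
	}
	return a / b, nil
}

// addR composes two results, propagating the first error it sees.
func addR(a, b result) result {
	if a.err != nil {
		return a
	}
	if b.err != nil {
		return b
	}
	return result{val: a.val + b.val}
}

func main() {
	ok := addR(lift(div(6, 3)), lift(div(1, 4)))
	bad := addR(lift(div(6, 0)), lift(div(1, 4)))
	fmt.Println(ok.val, ok.err)
	fmt.Println(bad.val, bad.err)
}
```

Go allows `lift(div(6, 3))` because a multi-value return can be passed directly to a function whose parameters match, which is what makes the `LiftResult(Add(a, b))` form above possible.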
type SDOp interface {
	Op

	// DiffWRT indicates if the op is differentiable with regards to the given number of inputs
	// returns []bool to indicate which input it is differentiable to
	DiffWRT(inputs int) []bool

	// SymDiff symbolically differentiates the op
	SymDiff(inputs Nodes, output, grad *Node) (retVal Nodes, err error)
}
An SDOp is an Op that supports symbolic differentiation
Scalar represents a scalar (non-array-based) value. Do note that it's the pointers of the scalar types (F64, F32, etc) that implement the Scalar interface. This is primarily due to optimizations with regards to memory allocation and copying for device interoperability.
Solver is anything that does gradient updates. The name "solver" is stolen from Caffe; it is a much shorter name than GradientUpdaters.
SolverOpt is a function that provides construction options for a Solver
WithBatchSize sets the batch size for the solver. Currently only Adam and Vanilla (basic SGD) have batch size support
WithBeta1 sets the beta1 param of the solver. Only works with Adam
WithBeta2 sets the beta2 param of the solver. Only works with Adam
WithClip clips the gradient if it gets too crazy. By default all solvers do not have any clips attached
WithEps sets the smoothing factor for the solver.
WithL1Reg adds a L1 regularization parameter to the solver. By default, the solvers do not use any regularization param
WithL2Reg adds a L2 regularization parameter to the solver. By default, the solvers do not use any regularization param
WithLearnRate sets the learn rate or step size for the solver.
WithMomentum sets the momentum of the solver. It is a no-op if the solver's type is not Momentum
WithRho sets the decay parameter of the RMSProp solver
StandardEngine is the default CPU engine for gorgonia
Transpose tensor a according to expStrides
type SymDiffError struct {
// contains filtered or unexported fields
}
SymDiffError provides the context at which an error occurred
func (err SymDiffError) Error() string
func (err SymDiffError) Grad() *Node
Grad returns a specific grad involved in the error
func (err SymDiffError) Grads() map[*Node]Nodes
Grads returns the grads involved in the error
func (err SymDiffError) Node() *Node
Node returns a specific node involved in the error
func (err SymDiffError) Nodes() Nodes
Nodes returns the nodes involved in the error
type Tensor interface {
	// info about the ndarray
	Shape() tensor.Shape
	Strides() []int
	Dtype() tensor.Dtype
	Dims() int
	Size() int
	DataSize() int

	// type overloading methods
	IsScalar() bool
	ScalarValue() interface{}

	// engine/memory related stuff
	// all Tensors should be able to be expressed as a slab of memory
	// Note: the size of each element can be acquired by T.Dtype().Size()
	Engine() tensor.Engine      // Engine can be nil
	MemSize() uintptr           // the size in memory
	Uintptr() uintptr           // the pointer to the first element, as a uintptr
	Pointer() unsafe.Pointer    // the pointer to the first element, as an unsafe.Pointer
	IsNativelyAccessible() bool // Can Go access the memory
	IsManuallyManaged() bool    // Must Go manage the memory
}
Tensor is an interface that describes an ndarray
TensorType is a type constructor for tensors.
Think of it as something like this:
data Tensor a = Tensor d a
The shape of the Tensor is not part of TensorType. Shape checking is relegated to the dynamic part of the program run
func (t TensorType) Apply(sub hm.Subs) hm.Substitutable
Apply applies the substitutions on the types. Satisfies the hm.Type interface.
func (t TensorType) Eq(other hm.Type) bool
Eq is the equality function of this type. The type of Tensor has to be the same, and for now, only the dimensions are compared. Shape may be compared in the future for tighter type inference. Satisfies the hm.Type interface.
func (t TensorType) Format(state fmt.State, c rune)
Format implements fmt.Formatter. It is also required to satisfy the hm.Type interface.
func (t TensorType) FreeTypeVar() hm.TypeVarSet
FreeTypeVar returns any free (unbound) type variables in this type. Satisfies the hm.Type interface.
func (t TensorType) Name() string
Name returns the name of the type, which will always be "Tensor". Satisfies the hm.Type interface.
func (t TensorType) Normalize(k, v hm.TypeVarSet) (hm.Type, error)
Normalize normalizes the type variable names (if any) in the TensorType. Satisfies the hm.Type interface.
func (t TensorType) String() string
String implements fmt.Stringer and runtime.Stringer. Satisfies the hm.Type interface.
func (t TensorType) Types() hm.Types
Types returns a list of types that TensorType contains - in this case, the type of Tensor (float64, float32, etc). Satisfies the hm.Type interface.
Typer represents any type (typically an Op) that knows its own Type
U8 represents a byte value.
Data returns the original representation of the Value
Dtype returns the Dtype of the value
Format implements fmt.Formatter
MemSize satisfies the tensor.Memory interface
Pointer returns the pointer as an unsafe.Pointer. Satisfies the tensor.Memory interface
Shape returns a scalar shape for all scalar values
Size returns 0 for all scalar Values
Uintptr satisfies the tensor.Memory interface
A UnaryOp is an Op that takes only one input
UnsafeDoer is an op that will overwrite the underlying value.
UsePreallocDoer is an op that works when a preallocated value is provided
type VM interface {
	RunAll() error
	Reset()

	// Close closes all the machine resources (CUDA, if any, loggers if any)
	Close() error
}
VM represents a structure that can execute a graph or program. There are two VMs (both unexported):
- *tapeMachine
- *lispMachine
The *tapeMachine pre-compiles a graph into a list of instructions, then executes the instructions linearly and sequentially. The main tradeoff is dynamism. Graphs cannot be dynamically created on the fly as a re-compilation process is required (and compilation is relatively expensive). However, graphs executed with the *tapeMachine run much faster as plenty of optimizations have been done in the code generation stage.
The *lispMachine allows for graphs to be dynamically built and executed upon. The tradeoff is that executing a graph on *lispMachine is generally slower than on *tapeMachine, given the same static "image" of a graph.
VMOpt is a VM creation option
BindDualValues is an option for *tapeMachine only. This is useful to set when using a Solver
ExecuteBwdOnly creates a VM that will execute a graph by doing back propagation only. The assumption is of course, that the forward graph has already been executed, and there are already values associated with the nodes. This option is only for *lispMachine. Try it on any other VMs and it will panic.
ExecuteFwdOnly creates a VM that will execute a graph forwards only - it will not do back propagation. This option is only for *lispMachine. Try it on any other VMs and it will panic.
LogBothDir logs both directions of the execution of the graph. This option is only available for *lispMachine.
LogBwd logs the backwards execution of a graph. This option is only for *lispMachine. Try it on any other VMs and it will panic.
LogFwd logs the forward execution of a graph. This option is only for *lispMachine. Try it on any other VMs and it will panic.
TraceExec is an option for *tapeMachine only. It stores an immutable copy of the executed value into the node, instead of a mutable value, which may be clobbered
UseCudaFor is an option for *tapeMachine. This function is NO-OP unless the program is built with the `cuda` tag.
WithEngine sets the tensor engine for computation inside the VM.
WithInfWatch creates a VM that will watch for Infs when executing. It watches for +Inf, -Inf and Inf. No choice there. This slows the execution down.
WithLogger creates a VM with the supplied logger. If the logger is nil, a default logger, writing to os.Stderr, will be created.
WithManualGradient allows the user to set the gradient of the root, before backprop. The root gradients should be set using the SetDeriv method
WithNaNWatch creates a VM that will watch for NaNs when executing. This slows the execution down.
WithPrecompiled is an option to pass in compiled programs. This is useful for users who use the CompileFunction function
WithValueFmt defines how the logger will output the values. It defaults to "%3.3f"
WithWatchlist creates a VM with a watchlist. When the execution touches the things in the watchlist, the VM's logger will log it. This allows for watching and fine-tuning of the algorithm. When nothing is passed in, the VM will default to watching and logging every single execution object.
The watchlist allows for different things to be watched, depending on VM type:
- *lispMachine will ONLY take *Node
- *tapeMachine will take int (for register IDs) or *Node.
type Value interface {
	Shape() tensor.Shape // Shape returns the shape of the Value. Scalar values return ScalarShape()
	Size() int           // Size represents the number of elements in the Value. Note that in cases such as a *tensor.Dense, the underlying slice MAY have more elements than the Size() reports. This is correct.
	Data() interface{}   // Data returns the original representation of the Value
	Dtype() tensor.Dtype // Dtype returns the Dtype of the value

	tensor.Memory
	fmt.Formatter
}
Value represents a value that Gorgonia accepts. At this point it is implemented by:
- all scalar value types (F64, F32... etc)
- *tensor.Dense
- *dualValue
A Value is essentially anything that knows its own type and shape. Most importantly though, a Value is a pointer - and can be converted into a tensor.Memory. This is done for the sake of interoperability with external devices like cgo or CUDA or OpenCL. This also means that, for the most part, Values will be allocated on the heap. There are some performance tradeoffs made in this decision, but ultimately this is better than having to manually manage blocks of memory.
CloneValue clones a value. For scalars, since Go copies scalars, it returns itself
Copy copies the src values into dest values. For scalars, it just returns itself
ScalarAsTensor returns the tensor representation of a scalar. It is particularly useful as a "reshape" of tensors of sorts
The Value passed in must be either a Scalar, a tensor.Tensor, or a *dualValue. Anything else will panic.
ZeroValue returns the zero value of a type
ValueCloser represents any type that can perform a close-value check
ValueEqualer represents any type that can perform an equal-value check
ValueGrad is any type that has a value and a grad. This is used for Solvers
NodesToValueGrads is a utility function that converts a Nodes to a slice of ValueGrad for the solvers
Valuer is any type that can return a Value
type VanillaSolver struct {
// contains filtered or unexported fields
}
VanillaSolver is your bog standard stochastic gradient descent optimizer. There are no fancy features to this
func NewVanillaSolver(opts ...SolverOpt) *VanillaSolver
NewVanillaSolver creates a new VanillaSolver with sane-ish default values
func (s *VanillaSolver) Step(model []ValueGrad) (err error)
Step steps through each node in the model and applies the most basic gradient descent algorithm on the value.
This function will error out if the nodes do not have an associated Grad value.
ZeroValuer is a Value that can provide the zero-value of its type
Zeroer is a Value that can zero itself
Path | Synopsis |
---|---|
blase | Package blase is a thin wrapper over Gonum's BLAS interface that provides a queue so that cgo calls are batched. |
cuda | |
encoding/dot | Package dot creates a graphviz compatible version of the ExprGraph |
internal/encoding | |
ops/nn | Package nnops implements some operators that have both a pure Go implementation and a CUDA implementation. To use the CUDA version, assuming that you have the prerequisites, simply compile or run the code with the `cuda` tag: go run -tags='cuda' |
x/vm | |
Package gorgonia imports 37 packages and is imported by 67 packages. Updated 2021-01-24.