import "github.com/grailbio/bigmachine"
Package bigmachine implements a vertically integrated stack for distributed computing in Go. Go programs written with bigmachine are transparently distributed across a number of machines as instantiated by the backend used. (Currently supported: EC2, local machines, unit tests.) Bigmachine clusters comprise a driver node and a number of bigmachine nodes (called "machines"). The driver node can create new machines and communicate with them; machines can call each other.
On startup, a bigmachine program calls driver.Start. Driver.Start configures a bigmachine instance based on a set of standard flags and then starts it. (Users desiring a lower-level API can use bigmachine.Start directly.)
import ( "github.com/grailbio/bigmachine" "github.com/grailbio/bigmachine/driver" ... ) func main() { flag.Parse() // Additional configuration and setup. b, shutdown := driver.Start() defer shutdown() // Driver code... }
When the program is run, driver.Start returns immediately: the program can then interact with the returned bigmachine B to create new machines, define services on those machines, and invoke methods on those services. Bigmachine bootstraps machines by running the same binary, but in these runs, driver.Start never returns; instead it launches a server to handle calls from the driver program and other machines.
A machine is started by (*B).Start. Machines must be configured with at least one service:
m, err := b.Start(ctx, bigmachine.Services{ "MyService": &MyService{Param1: value1, ...}, })
Users may then invoke methods on the services provided by the returned machine. A services's methods can be invoked so long as they are of the form:
Func(ctx context.Context, arg argType, reply *replyType) error
See package github.com/grailbio/bigmachine/rpc for more details.
Methods are named by the sevice and method name, separated by a dot ('.'), e.g.: "MyService.MyMethod":
if err := m.Call(ctx, "MyService.MyMethod", arg, &reply); err != nil { log.Print(err) } else { // Examine reply }
Since service instances must be serialized so that they can be transmitted to the remote machine, and because we do not know the service types a priori, any type that can appear as a service must be registered with gob. This is usually done in an init function in the package that declares type:
type MyService struct { ... } func init() { // MyService has method receivers gob.Register(new(MyService)) }
A bigmachine program attempts to appear and act like a single program:
- Each machine's standard output and error are copied to the driver; - bigmachine provides aggregating profile handlers at /debug/bigmachine/pprof so that aggregate profiles may be taken over the full cluster; - command line flags are propagated from the driver to the machine, so that a binary run can be configured in the usual way.
The driver program maintains keepalives to all of its machines. Once this is no longer maintained (e.g., because the driver finished, or crashed, or lost connectivity), the machines become idle and shut down.
A service is any Go value that implements methods of the form given above. Services are instantiated by the user and registered with bigmachine. When a service is registered, bigmachine will also invoke an initialization method on the service if it exists. Per-machine initialization can be performed by this method.The form of the method is:
Init(*Service) error
If a non-nil error is returned, the machine is considered failed.
bigmachine.go doc.go expvar.go local.go machine.go profile.go status.go supervisor.go system.go
const RpcPrefix = "/bigrpc/"
RpcPrefix is the path prefix used to serve RPC requests.
func Init()
Init initializes bigmachine. It should be called after flag parsing and global setup in bigmachine-based processes. Init is a no-op if the binary is not running as a bigmachine worker; if it is, Init never returns.
RegisterSystem is used by systems implementation to register a system implementation. RegisterSystem registers the implementation with gob, so that instances can be transmitted over the wire. It also registers the provided System instance as a default to use for the name to support bigmachine.Init.
type B struct {
// contains filtered or unexported fields
}
B is a bigmachine instance. Bs are created by Start and, outside of testing situations, there is exactly one per process.
Start is the main entry point of bigmachine. Start starts a new B using the provided system, returning the instance. B's shutdown method should be called to tear down the session, usually in a defer statement from the program's main:
func main() { // Parse flags, configure system. b := bigmachine.Start() defer b.Shutdown() // bigmachine driver code }
Dial connects to the machine named by the provided address.
The returned machine is not owned: it is not kept alive as Start does.
HandleDebug registers diagnostic http endpoints on the provided ServeMux.
HandleDebugPrefix registers diagnostic http endpoints on the provided ServeMux under the provided prefix.
IsDriver is true if this is a driver instance (rather than a spawned machine).
Machines returns a snapshot of the current set machines known to this B.
Shutdown tears down resources associated with this B. It should be called by the driver to discard a session, usually in a defer:
b := bigmachine.Start() defer b.Shutdown() // driver code
Start launches up to n new machines and returns them. The machines are configured according to the provided parameters. Each machine must have at least one service exported, or else Start returns an error. The new machines may be in Starting state when they are returned. Start maintains a keepalive to the returned machines, thus tying the machines' lifetime with the caller process.
Start returns at least one machine, or else an error.
System returns this B's System implementation.
A DiskInfo describes system disk usage.
Environ is a machine parameter that amends the process environment of the machine. It is a slice of strings in the form "key=value"; later definitions override earlies ones.
An Expvar is a snapshot of an expvar.
Expvars is a collection of snapshotted expvars.
type Info struct { // Goos and Goarch are the operating system and architectures // as reported by the Go runtime. Goos, Goarch string // Digest is the fingerprint of the currently running binary on the machine. Digest digest.Digest }
Info contains system information about a machine.
LocalInfo returns system information for this process.
A LoadInfo describes system load.
type Machine struct { // Addr is the address of the machine. It may be used to create // machine instances through Dial. Addr string // Maxprocs is the number of processors available on the machine. Maxprocs int // NoExec should be set to true if the machine should not exec a // new binary. This is meant for testing purposes. NoExec bool // contains filtered or unexported fields }
A Machine is a single machine managed by bigmachine. Each machine is a "one-shot" execution of a bigmachine binary. Machines embody a failure detection mechanism, but does not provide fault tolerance. Each machine comprises instances of each registered bigmachine service. A Machine is created by the bigmachine driver binary, but its address can be passed to other Machines which can in turn connect to each other (through Dial).
Machines are created with (*B).Start.
Call invokes a method named by a service on this machine. The argument and reply must be provided in accordance to bigmachine's RPC mechanism (see package docs or the docs of the rpc package). Call waits to invoke the method until the machine is in running state, and fails fast when it is stopped.
If a machine fails its keepalive, pending calls are canceled.
Cancel cancels all pending operations on machine m. The machine is stopped with an error of context.Canceled.
DiskInfo returns the machine's disk usage information.
Err returns a machine's error. Err is only well-defined when the machine is in Stopped state.
Hostname returns the hostname portion of the machine's address.
KeepaliveReplyTimes returns a buffer up to the last numKeepaliveReplyTimes keepalive reply latencies, most recent first.
LoadInfo returns the machine's current load.
MemInfo returns the machine's memory usage information. Go runtime memory stats are read if readMemStats is true.
NextKeepalive returns the time at which the next keepalive request is due.
Owned tells whether this machine was created and is managed by this bigmachine instance.
func (m *Machine) RetryCall(ctx context.Context, serviceMethod string, arg, reply interface{}) error
RetryCall invokes Call, and retries on a temporary error.
State returns the machine's current state.
Wait returns a channel that is closed once the machine reaches the provided state or greater.
type MemInfo struct { System mem.VirtualMemoryStat Runtime runtime.MemStats }
A MemInfo describes system and Go runtime memory usage.
Option is an option that can be provided when starting a new B. It is a function that can modify the b that will be returned by Start.
Name is an option that will name the B. See B.name.
type Param interface {
// contains filtered or unexported methods
}
A Param is a machine parameter. Parameters customize machines before the are started.
Services is a machine parameter that specifies the set of services that should be served by the machine. Each machine should have at least one service. Multiple Services parameters may be passed.
State enumerates the possible states of a machine. Machine states proceed monotonically: they can only increase in value.
const ( // Unstarted indicates the machine has yet to be started. Unstarted State = iota // Starting indicates that the machine is currently bootstrapping. Starting // Running indicates that the machine is running and ready to // receive calls. Running // Stopped indicates that the machine was stopped, eitehr because of // a failure, or because the driver stopped it. Stopped )
String returns a State's string.
type Supervisor struct {
// contains filtered or unexported fields
}
Supervisor is the system service installed on every machine.
StartSupervisor starts a new supervisor based on the provided arguments.
func (s *Supervisor) CPUProfile(ctx context.Context, dur time.Duration, prof *io.ReadCloser) error
CPUProfile takes a pprof CPU profile of this process for the provided duration. If a duration is not provided (is 0) a 30-second profile is taken. The profile is returned in the pprof serialized form (which uses protocol buffers underneath the hood).
DiskInfo returns disk usage information on the disk where the temporary directory resides.
func (s *Supervisor) Exec(ctx context.Context, _ struct{}, _ *struct{}) error
Exec reads a new image from its argument and replaces the current process with it. As a consequence, the currently running machine will die. It is up to the caller to manage this interaction.
Expvars returns a snapshot of this machine's expvars.
func (s *Supervisor) GetBinary(ctx context.Context, _ struct{}, rc *io.ReadCloser) error
GetBinary retrieves the last binary uploaded via Setbinary.
Getpid returns the PID of the supervisor process.
Info returns the info struct for this machine.
func (s *Supervisor) Keepalive(ctx context.Context, next time.Duration, reply *keepaliveReply) error
Keepalive maintains the machine keepalive. The next argument indicates the callers desired keepalive interval (i.e., the amount of time until the keepalive expires from the time of the call); the accepted time is returned. In order to maintain the keepalive, the driver should call Keepalive again before replynext expires.
LoadInfo returns system load information.
MemInfo returns system and Go runtime memory usage information. Go runtime stats are read if readMemStats is true.
Ping replies immediately with the sequence number provided.
func (s *Supervisor) Profile(ctx context.Context, req profileRequest, prof *io.ReadCloser) error
Profile returns the named pprof profile for the current process. The profile is returned in protocol buffer format.
func (s *Supervisor) Profiles(ctx context.Context, _ struct{}, profiles *[]profileStat) error
Profiles returns the set of available profiles and their counts.
func (s *Supervisor) Register(ctx context.Context, svc service, _ *struct{}) error
Register registers a new service with the machine (server) associated with this supervisor. After registration, the service is also initialized if it implements the method
Init(*B) error
Setargs sets the process' arguments. It should be used before Exec in order to invoke the new image with the appropriate arguments.
Setbinary uploads a new binary to replace the current binary when Supervisor.Exec is called. The two calls are separated so that different timeouts can be applied to upload and exec.
Setenv sets the processes' environment. It is applied to newly exec'd images, and should be called before Exec. The provided environment is appended to the default process environment: keys provided here override those that already exist in the environment.
func (s *Supervisor) Shutdown(ctx context.Context, req shutdownRequest, _ *struct{}) error
Shutdown will cause the process to exit asynchronously at a point in the future no sooner than the specified delay.
type System interface { // Name is the name of this system. It is used to multiplex multiple // system implementations, and thus should be unique among // systems. Name() string // Init is called when the bigmachine starts up in order to // initialize the system implementation. If an error is returned, // the Bigmachine fails. Init(*B) error // Main is called to start a machine. The system is expected to // take over the process; the bigmachine fails if main returns (and // if it does, it should always return with an error). Main() error // HTTPClient returns an HTTP client that can be used to communicate // from drivers to machines as well as between machines. HTTPClient() *http.Client // ListenAndServe serves the provided handler on an HTTP server that // is reachable from other instances in the bigmachine cluster. If addr // is the empty string, the default cluster address is used. ListenAndServe(addr string, handle http.Handler) error // Start launches up to n new machines. The returned machines can // be in Unstarted state, but should eventually become available. Start(ctx context.Context, n int) ([]*Machine, error) // Exit is called to terminate a machine with the provided exit code. Exit(int) // Shutdown is called on graceful driver exit. It's should be used to // perform system tear down. It is not guaranteed to be called. Shutdown() // Maxprocs returns the maximum number of processors per machine, // as configured. Returns 0 if is a dynamic value. Maxprocs() int // KeepaliveConfig returns the various keepalive timeouts that should // be used to maintain keepalives for machines started by this system. KeepaliveConfig() (period, timeout, rpcTimeout time.Duration) // Tail returns a reader that follows the bigmachine process logs. Tail(ctx context.Context, m *Machine) (io.Reader, error) // Read returns a reader that reads the contents of the provided filename // on the host. This is done outside of the supervisor to support external // monitoring of the host. Read(ctx context.Context, m *Machine, filename string) (io.Reader, error) }
A System implements a set of methods to set up a bigmachine and start new machines. Systems are also responsible for providing an HTTP client that can be used to communicate between machines and drivers.
Local is a System that insantiates machines by creating new processes on the local machine.
Path | Synopsis |
---|---|
driver | Package driver provides a convenient API for bigmachine drivers, which includes configuration by flags. |
ec2system | Package ec2system implements a bigmachine System that launches machines on dedicated EC2 spot instances. |
ec2system/instances | |
internal/authority | Package authority provides an in-process TLS certificate authority, useful for creating and distributing TLS certificates for mutually authenticated HTTPS networking within Bigmachine. |
internal/ioutil | Package ioutil contains utilities for performing I/O in bigmachine. |
internal/tee | Package tee implements utilities for I/O multiplexing. |
rpc | Package rpc implements a simple RPC system for Go methods. |
testsystem | Package testsystem implements a bigmachine system that's useful for testing. |
Package bigmachine imports 50 packages (graph) and is imported by 8 packages. Updated 2019-11-27. Refresh now. Tools for package owners.