geto

command module
v0.0.0-...-31fc794
Published: Apr 26, 2014 License: MPL-2.0 Imports: 4 Imported by: 0

README

geto

(G)ood (e)nough (t)ask (o)ffloader is a framework for offloading work to hosts with minimal setup and dependencies.

Basically, geto can be used to offload an arbitrary task to any target host and retrieve results.

You might want to use geto if you have a machine (or more) that needs to offload work to other machines. Geto code is only required on a machine from which the work is offloaded; geto is not required (or in any way useful) on the target host machines.

Offloading a task and gathering its results will likely take on the order of seconds, so geto may not be a good fit if that latency is a concern.

Here's a trivial example that runs a sleep command on a target host:

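package main

import (
	"fmt"
	"log"

	"github.com/bgmerrell/geto/lib/config"
	"github.com/bgmerrell/geto/lib/remote/ssh"
	"github.com/bgmerrell/geto/lib/task"
)

func main() {
	// Parse the geto configuration, which lists the target hosts.
	conf, err := config.ParseConfig("/etc/geto.ini")
	if err != nil {
		log.Fatalf("parsing config: %v", err)
	}
	if len(conf.Hosts) == 0 {
		log.Fatal("no hosts configured")
	}

	// Compose a geto script from a two-line bash script. The nil argument
	// means there is no per-host concurrency limit.
	var script task.Script = task.NewScriptWithCommands(
		"sleep", []string{"#!/bin/bash", "sleep 15"}, nil)

	// Create a task with no file dependencies and a timeout argument of 0.
	var depFiles []string
	t, err := task.New(depFiles, script, 0)
	if err != nil {
		log.Fatalf("creating task: %v", err)
	}

	// Run the task on the first configured host and wait for its output.
	ch := make(chan task.RunOutput)
	go task.RunOnHost(ssh.New(), t, conf.Hosts[0], ch)
	taskOutput := <-ch
	fmt.Printf("stdout: %s\n", taskOutput.Stdout)
	fmt.Printf("stderr: %s\n", taskOutput.Stderr)
	if taskOutput.Err != nil {
		fmt.Printf("err: %s\n", taskOutput.Err.Error())
	}
}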

Prerequisites

Any host to which the user wishes to offload must have the following:

  • A Unix-like environment (only tested on Linux)
  • An SSH server that allows public-key authenticated logins from any machine doing the offloading. Password authentication is being worked on but is not yet supported; see issue #1 under TODO.
  • The timeout command in the PATH. This command is usually installed by default as part of the coreutils package on Linux.
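As a rough illustration of the last prerequisite, the sketch below checks whether timeout is in the PATH of the machine it runs on and builds a coreutils-style invocation (timeout DURATION COMMAND). The helper name timeoutArgs and the ./script path are hypothetical, and how geto itself invokes timeout on the target host is an assumption, not something this README specifies.

```go
package main

import (
	"fmt"
	"os/exec"
)

// timeoutArgs builds an argument list in the coreutils form
// "timeout DURATION COMMAND". Hypothetical helper, not part of geto.
func timeoutArgs(seconds uint32, scriptPath string) []string {
	return []string{"timeout", fmt.Sprint(seconds), scriptPath}
}

func main() {
	// exec.LookPath searches the directories in PATH for the executable.
	if _, err := exec.LookPath("timeout"); err != nil {
		fmt.Println("timeout not found in PATH:", err)
		return
	}
	fmt.Println(timeoutArgs(30, "./script"))
}
```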

The machine originating the offloading must have the following:

Terms

  • Host: Any machine that receives a task, i.e., any machine set up with the first set of prerequisites above.
  • Task: A unit of work to run on a host. Task IDs are uniquely generated.
  • Script: A geto object that contains a single script of any language. Script objects can be given any name, and multiple script objects can share a common name. The script name is used to limit and load balance the scripts on target hosts.

A task contains a single script, and multiple tasks can contain scripts of the same name.

For example, in the code example above, a simple bash script is used to compose a geto script (using the task.NewScriptWithCommands() function). That geto script is then used to create a new geto task (using the task.New() function). That task is then executed (using the task.RunOnHost() function) on the first host found in the parsed config file (i.e., conf.Hosts[0]).
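Because results come back on a channel, the same pattern can in principle be extended to several hosts at once. The following is a minimal sketch using local stand-ins rather than the geto packages: runOutput mimics task.RunOutput, and runOnHost is a hypothetical stub that, like task.RunOnHost is assumed to do, sends exactly one result on the channel.

```go
package main

import "fmt"

// runOutput is a local stand-in for task.RunOutput.
type runOutput struct {
	Stdout string
	Stderr string
	Err    error
}

// runOnHost is a stub standing in for task.RunOnHost. It sends exactly one
// result on ch; the real function would run the task on the host.
func runOnHost(host string, ch chan<- runOutput) {
	ch <- runOutput{Stdout: "ran on " + host}
}

// runOnAll starts one goroutine per host and then receives one result per
// host. Results arrive in completion order, not in host order.
func runOnAll(hosts []string) []runOutput {
	ch := make(chan runOutput)
	for _, h := range hosts {
		go runOnHost(h, ch)
	}
	results := make([]runOutput, 0, len(hosts))
	for range hosts {
		results = append(results, <-ch)
	}
	return results
}

func main() {
	for _, out := range runOnAll([]string{"host-a", "host-b"}) {
		fmt.Printf("stdout: %s\n", out.Stdout)
	}
}
```

Receiving exactly as many values as goroutines were started avoids both blocking forever and leaving a sender stuck on the unbuffered channel.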

Script details

The Script object consists of a name, commands, and the maximum number of same-named scripts that can run concurrently on a given host. Otherwise stated:

// A script that runs on a target host
type Script struct {
	// Name is the name of a script.  It need not be unique.
	name string
	// The commands that make up a shell-style script.
	// Each index represents a line in the script.
	commands []string
	// The number of scripts of the same name that will run on a target host
	// concurrently.  A nil value means there is no limit.
	maxConcurrent *uint32
}
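Since maxConcurrent is a pointer, a caller passes either nil (no limit) or the address of a uint32 variable. The sketch below illustrates that convention with a hypothetical canStart helper; it is not geto's actual scheduling code, only one way the "nil means no limit" rule could be applied.

```go
package main

import "fmt"

// canStart reports whether another instance of a script may start on a
// host, given how many instances are already running there.
// Hypothetical helper, not part of geto.
func canStart(running uint32, maxConcurrent *uint32) bool {
	if maxConcurrent == nil {
		return true // nil means there is no limit
	}
	return running < *maxConcurrent
}

func main() {
	limit := uint32(2) // a variable is needed so its address can be taken

	fmt.Println(canStart(5, nil))    // true
	fmt.Println(canStart(1, &limit)) // true
	fmt.Println(canStart(2, &limit)) // false
}
```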

There are multiple ways of creating a script object:

func NewScript(name string, maxConcurrent *uint32) Script

In the above case the user is responsible for adding the commands to the object. Alternatively, the commands can be provided when instantiating the script object (which is the strategy used in the first example of this document) like so:

func NewScriptWithCommands(name string, commands []string, maxConcurrent *uint32) Script

Yet another approach is to provide a path to an existing script file to use to instantiate the geto script:

func NewScriptFromPath(name string, path string, maxConcurrent *uint32) (Script, error)

Scripts are simply executed on the target host; it is up to the script to indicate how it should be executed (e.g., by using a shebang interpreter directive).
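To make the "each index represents a line" convention concrete, here is a small sketch of how a commands slice could be turned into script file contents and checked for a shebang. The helpers renderScript and hasShebang are hypothetical and exist only for illustration; geto's own handling may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// renderScript joins the commands into file contents, one command per line.
// Hypothetical helper, not part of geto.
func renderScript(commands []string) string {
	return strings.Join(commands, "\n") + "\n"
}

// hasShebang reports whether the first line is an interpreter directive.
func hasShebang(commands []string) bool {
	return len(commands) > 0 && strings.HasPrefix(commands[0], "#!")
}

func main() {
	commands := []string{"#!/bin/bash", "sleep 15"}
	fmt.Print(renderScript(commands))
	fmt.Println("has shebang:", hasShebang(commands))
}
```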

Task details

A task object looks like this:

// A task that runs on a target host
type Task struct {
	// A unique ID for the task, automatically generated
	Id string
	// A list of files and/or directories that the task requires
	DepFiles []string
	// A script for the task to run
	Script Script
	// The number of seconds before giving up on a task after it has been
	// started
	Timeout uint32
}

Any file dependencies (specified by DepFiles) are copied to the target host and placed in a special "DEPS" directory. The script is also copied to the target host and placed in the same parent directory as the "DEPS" directory, so the script can reference its file dependencies by relative path. For example, a foo.bin file dependency could be referenced in the script as "DEPS/foo.bin". (NOTE: This may or may not be tested at this point.)
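The mapping from a local dependency to the path a script would use can be sketched as follows. This assumes each dependency lands directly under DEPS under its base name, which matches the foo.bin example but is otherwise an assumption; depsPath is a hypothetical helper, not a geto function.

```go
package main

import (
	"fmt"
	"path"
)

// depsPath returns the relative path a script could use to reference a
// dependency, assuming it is copied into DEPS under its base name.
// Hypothetical helper, not part of geto.
func depsPath(localPath string) string {
	return path.Join("DEPS", path.Base(localPath))
}

func main() {
	for _, f := range []string{"/home/user/foo.bin", "data/input.csv"} {
		fmt.Printf("%s -> %s\n", f, depsPath(f))
	}
}
```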

There is currently one way to instantiate a task object:

func New(depFiles []string, script Script, timeout uint32) (Task, error)

Once a task has been created, however, there are several fun ways to run it. The user can specify exactly which host the task should run on, like this:

func RunOnHost(conn remote.Remote, task Task, host host.Host, resultChan chan<- RunOutput)

Or, the user might wish to just have a random host picked, like this:

func RunOnRandomHost(conn remote.Remote, task Task, ch chan<- RunOutput)

The user can also perform basic load balancing by having geto choose the host that is running the fewest instances of a task's script, like this:

func RunOnHostBalancedByScriptName(conn remote.Remote, task Task, ch chan<- RunOutput)
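The balancing idea (pick the host running the fewest instances of the script) can be sketched independently of geto. In the example below, the running map, the host names, and the leastLoaded helper are all hypothetical; breaking ties by host name is a choice made here for determinism, not something the README describes.

```go
package main

import (
	"fmt"
	"sort"
)

// leastLoaded returns the host with the fewest running instances of a
// script. Ties are broken by host name so the result is deterministic.
// The boolean is false when no hosts are given.
func leastLoaded(running map[string]int) (string, bool) {
	hosts := make([]string, 0, len(running))
	for h := range running {
		hosts = append(hosts, h)
	}
	if len(hosts) == 0 {
		return "", false
	}
	sort.Strings(hosts)
	best := hosts[0]
	for _, h := range hosts[1:] {
		if running[h] < running[best] {
			best = h
		}
	}
	return best, true
}

func main() {
	running := map[string]int{"host-a": 3, "host-b": 1, "host-c": 1}
	host, ok := leastLoaded(running)
	fmt.Println(host, ok) // host-b true
}
```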

TODO

  • Allow the remote copy operations to be done using password authentication (see issue #1)
  • Implement a Python bridge allowing geto to be wielded from Python. Proof-of-concept code is already checked into the geto repo; it consists of a Go JSON-RPC server and a Python RPC client that calls it.
  • Various TODO-marked code.

Documentation

Overview

Geto's main package

Parse command line arguments and let the fun begin!

Directories

Path            Synopsis
lib
  config        Configuration file management
  host          Provide the host structure
  remote/dummy  Dummy remote for unit testing purposes
  remote/ssh    All of the calls to the external SSH library (code.google.com/p/go.crypto/ssh) will go through this package.
  task          Run tasks on the hosts and get results. Provide the script structure and functions.
                Start a raw JSON RPC server. A client may call (via RPC) any of the GetoRPC functions exported here.
test
  ssh
