torus

package module
v0.1.2 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Sep 15, 2016 License: Apache-2.0 Imports: 17 Imported by: 96

README

Torus

Build Status Go Report Card GoDoc

Torus is an open source project for distributed storage coordinated through etcd.

Torus provides a resource pool and basic file primitives from a set of daemons running atop multiple nodes. These primitives are made consistent by being append-only and coordinated by etcd. From these primitives, a Torus server can support multiple types of volumes, the semantics of which can be broken into subprojects. It ships with a simple block-device volume plugin, but is extensible to more.

Quick-glance overview

Sharding is done via a consistent hash function, controlled in the simple case by a hash ring algorithm, but fully extensible to arbitrary maps, rack-awareness, and other nice features. The project name comes from this: a hash 'ring' plus a 'volume' is a torus.

Project Status

Torus is at an early stage and under active development. We do not recommend its use in production, but we encourage you to try out Torus and provide feedback via issues and pull requests. Consequently, speed, while nice to have, is a secondary concern to stability at this time.

Trying out Torus

To get started quicky using Torus for the first time, start with the guide to running your first Torus cluster, learn more about setting up Torus on Kubernetes using FlexVolumes in contrib, or create a Torus cluster on bare metal.

Contributing to Torus

Torus is an open source project and contributors are welcome! Join us on IRC at #coreos on freenode.net, file an issue here on Github, check out bigger plans on the kind/design tag, contribute on bugs that are low hanging fruit for issue ideas and check the project layout for a guide to the sections that might interest you.

Licensing

Unless otherwise noted, all code in the Torus repository is licensed under the Apache 2.0 license. Some portions of the codebase are derived from other projects under different licenses; the appropriate information can be found in the header of those source files, as applicable.

Documentation

Overview

Torus is a distributed storage system, allowing for the creation and coordination of sharded, replicated files. For more details, see the README at https://github.com/coreos/torus

Index

Constants

View Source
const (
	CtxWriteLevel int = iota
	CtxReadLevel
)
View Source
const BlockRefByteSize = 8 * 3
View Source
const INodeRefByteSize = 8 * 2
View Source
const VolumeIDByteSize = 5
View Source
const (
	VolumeMax = 0x000000FFFFFFFFFF
)

Variables

View Source
var (
	// ErrBlockUnavailable is returned when a function fails to retrieve a known
	// block.
	ErrBlockUnavailable = errors.New("torus: block cannot be retrieved")

	// ErrINodeUnavailable is returned when a function fails to retrieve a known
	// INode.
	ErrINodeUnavailable = errors.New("torus: inode cannot be retrieved")

	// ErrBlockNotExist is returned when a function attempts to manipulate a
	// non-existent block.
	ErrBlockNotExist = errors.New("torus: block doesn't exist")

	// ErrClosed is returned when a function attempts to manipulate a Store
	// that is not currently open.
	ErrClosed = errors.New("torus: store is closed")

	// ErrInvalid is a locally invalid operation (such as Close()ing a nil file pointer)
	ErrInvalid = errors.New("torus: invalid operation")

	// ErrOutOfSpace is returned when the block storage is out of space.
	ErrOutOfSpace = errors.New("torus: out of space on block store")

	// ErrExists is returned if the entity already exists
	ErrExists = errors.New("torus: already exists")

	// ErrNotExist is returned if the entity doesn't already exist
	ErrNotExist = errors.New("torus: doesn't exist")

	// ErrAgain is returned if the operation was interrupted. The call was valid, and
	// may be tried again.
	ErrAgain = errors.New("torus: interrupted, try again")

	// ErrNoGlobalMetadata is returned if the metadata service hasn't been formatted.
	ErrNoGlobalMetadata = errors.New("torus: no global metadata available at mds")

	// ErrNonSequentialRing is returned if the ring's internal version number appears to jump.
	ErrNonSequentialRing = errors.New("torus: non-sequential ring")

	// ErrNoPeer is returned if the peer can't be found.
	ErrNoPeer = errors.New("torus: no such peer")

	// ErrCompareFailed is returned if the CAS operation failed to compare.
	ErrCompareFailed = errors.New("torus: compare failed")

	// ErrIsSymlink is returned if we're trying to modify a symlink incorrectly.
	ErrIsSymlink = errors.New("torus: is symlink")

	// ErrNotDir is returned if we're trying a directory operation on a non-directory path.
	ErrNotDir = errors.New("torus: not a directory")

	// ErrWrongVolumeType is returned if the operation cannot be performed on this type of volume.
	ErrWrongVolumeType = errors.New("torus: wrong volume type")

	// ErrNotSupported is returned if the interface doesn't implement the
	// requested subfunctionality.
	ErrNotSupported = errors.New("torus: not supported")

	// ErrLocked is returned if the resource is locked.
	ErrLocked = errors.New("torus: locked")

	// ErrLeaseNotFound is returned if the lease cannot be found.
	ErrLeaseNotFound = errors.New("torus: lease not found")
)
View Source
var BlockLog = capnslog.NewPackageLogger("github.com/coreos/torus", "blocklog")
View Source
var Version string

Version is set by build scripts, do not touch.

Functions

func InitMDS added in v0.1.0

func InitMDS(name string, cfg Config, gmd GlobalMetadata, ringType RingType) error

InitMDS calls the specific init function provided by a metadata package.

func MarshalBlocksetToProto

func MarshalBlocksetToProto(bs Blockset) ([]*models.BlockLayer, error)

func MkdirsFor added in v0.1.0

func MkdirsFor(dir string) error

func RegisterBlockStore

func RegisterBlockStore(name string, newFunc NewBlockStoreFunc)

func RegisterMetadataInit added in v0.1.0

func RegisterMetadataInit(name string, newFunc InitMDSFunc)

RegisterMetadataInit is the hook used for implementations of MetadataServices to register their ways of creating base metadata to the system.

func RegisterMetadataService

func RegisterMetadataService(name string, newFunc CreateMetadataServiceFunc)

RegisterMetadataService is the hook used for implementations of MetadataServices to register themselves to the system. This is usually called in the init() of the package that implements the MetadataService. A similar pattern is used in database/sql of the standard library.

func RegisterMetadataWipe added in v0.1.0

func RegisterMetadataWipe(name string, newFunc WipeMDSFunc)

RegisterMetadataWipe is the hook used for implementations of MetadataServices to register their ways of deleting their metadata from the consistent store

func RegisterSetRing

func RegisterSetRing(name string, newFunc SetRingFunc)

RegisterSetRing is the hook used for implementations of MetadataServices to register their ways of creating base metadata to the system.

func SetRing

func SetRing(name string, cfg Config, r Ring) error

SetRing calls the specific SetRing function provided by a metadata package.

func WipeMDS added in v0.1.0

func WipeMDS(name string, cfg Config) error

Types

type BlockIterator

type BlockIterator interface {
	Err() error
	Next() bool
	BlockRef() BlockRef
	Close() error
}

type BlockLayer

type BlockLayer struct {
	Kind    BlockLayerKind
	Options string
}

type BlockLayerKind

type BlockLayerKind int

type BlockLayerSpec

type BlockLayerSpec []BlockLayer

type BlockRef

type BlockRef struct {
	INodeRef
	Index IndexID
}

BlockRef is the identifier for a unique block in the cluster.

func BlockFromProto

func BlockFromProto(p *models.BlockRef) BlockRef

func BlockRefFromBytes

func BlockRefFromBytes(b []byte) BlockRef

func ZeroBlock

func ZeroBlock() BlockRef

func (BlockRef) BlockType

func (b BlockRef) BlockType() BlockType

func (BlockRef) HasINode

func (b BlockRef) HasINode(i INodeRef, t BlockType) bool

func (BlockRef) IsZero

func (b BlockRef) IsZero() bool

func (*BlockRef) SetBlockType

func (b *BlockRef) SetBlockType(t BlockType)

func (BlockRef) String

func (b BlockRef) String() string

func (BlockRef) ToBytes

func (b BlockRef) ToBytes() []byte

func (BlockRef) ToBytesBuf added in v0.1.0

func (b BlockRef) ToBytesBuf(buf []byte)

func (BlockRef) ToProto

func (b BlockRef) ToProto() *models.BlockRef

type BlockStore

type BlockStore interface {
	Store
	HasBlock(ctx context.Context, b BlockRef) (bool, error)
	GetBlock(ctx context.Context, b BlockRef) ([]byte, error)
	WriteBlock(ctx context.Context, b BlockRef, data []byte) error
	WriteBuf(ctx context.Context, b BlockRef) ([]byte, error)
	DeleteBlock(ctx context.Context, b BlockRef) error
	NumBlocks() uint64
	UsedBlocks() uint64
	BlockIterator() BlockIterator
	BlockSize() uint64
}

BlockStore is the interface representing the standardized methods to interact with something storing blocks.

func CreateBlockStore

func CreateBlockStore(kind string, name string, cfg Config, gmd GlobalMetadata) (BlockStore, error)

type BlockType

type BlockType uint16
const (
	TypeBlock BlockType = iota
	TypeINode
)

type Blockset

type Blockset interface {
	// Length returns the number of blocks in the Blockset.
	Length() int
	// Kind returns the kind of the Blockset.
	Kind() uint32
	// GetBlock returns the ith block in the Blockset.
	GetBlock(ctx context.Context, i int) ([]byte, error)
	// PutBlock puts a block with data `b` into the Blockset as its ith block.
	// The block belongs to the given inode.
	PutBlock(ctx context.Context, inode INodeRef, i int, b []byte) error
	// GetLiveInodes returns the current INode representation of the Blockset.
	// The returned INode might not be synced.
	GetLiveINodes() *roaring.Bitmap
	// GetAllBlockRefs returns the BlockRef of the blocks in the Blockset.
	// The ith BlockRef in the returned slice is the Ref of the ith Block in the
	// Blockset.
	GetAllBlockRefs() []BlockRef

	// Marshal returns the bytes representation of the Blockset.
	Marshal() ([]byte, error)
	// Unmarshal parses the bytes representation of the Blockset and stores the result
	// in the Blockset.
	Unmarshal(data []byte) error
	// GetSubBlockset gets the sub-Blockset of the Blockset if exists.
	// If there is no sub-Blockset, nil will be returned.
	GetSubBlockset() Blockset
	// Truncate changes the length of the Blockset and the block. If the Blockset has less
	// blocks than the required size, truncate adds zero blocks. If the block has less bytes
	// than required size, truncate add bytes into block.
	Truncate(lastIndex int, blocksize uint64) error
	// Trim zeros the blocks in range [from, to).
	Trim(from, to int) error
	// String implements the fmt.Stringer interface.
	String() string
}

Blockset is the interface representing the standardized methods to interact with a set of blocks.

type Config

type Config struct {
	DataDir         string
	StorageSize     uint64
	MetadataAddress string
	ReadCacheSize   uint64
	ReadLevel       ReadLevel
	WriteLevel      WriteLevel

	TLS *tls.Config
}

type CreateMetadataServiceFunc

type CreateMetadataServiceFunc func(cfg Config) (MetadataService, error)

CreateMetadataServiceFunc is the signature of a constructor used to create a registered MetadataService.

type DebugMetadataService

type DebugMetadataService interface {
	DumpMetadata(io.Writer) error
}

type File

type File struct {
	ReadOnly bool
	// contains filtered or unexported fields
}

func (*File) Close

func (f *File) Close() error

func (*File) Read

func (f *File) Read(b []byte) (n int, err error)

func (*File) ReadAt

func (f *File) ReadAt(b []byte, off int64) (n int, ferr error)

func (*File) Replaces

func (f *File) Replaces() uint64

func (*File) Seek

func (f *File) Seek(offset int64, whence int) (int64, error)

func (*File) Size

func (f *File) Size() uint64

func (*File) SyncAllWrites

func (f *File) SyncAllWrites() (INodeRef, error)

func (*File) SyncBlocks added in v0.1.0

func (f *File) SyncBlocks() error

func (*File) SyncINode added in v0.1.0

func (f *File) SyncINode(ctx context.Context) (INodeRef, error)

func (*File) Trim

func (f *File) Trim(offset, length int64) error

Trim zeroes data in the middle of a file.

func (*File) Truncate

func (f *File) Truncate(size int64) error

func (*File) Write

func (f *File) Write(b []byte) (n int, err error)

func (*File) WriteAt

func (f *File) WriteAt(b []byte, off int64) (n int, err error)

func (*File) WriteOpen

func (f *File) WriteOpen() bool

type GlobalMetadata

type GlobalMetadata struct {
	BlockSize        uint64
	DefaultBlockSpec BlockLayerSpec
	INodeReplication int
}

type INodeID

type INodeID uint64

INodeID represents a unique identifier for an INode.

type INodeIterator

type INodeIterator struct {
	// contains filtered or unexported fields
}

func (*INodeIterator) Close

func (i *INodeIterator) Close() error

func (*INodeIterator) Err

func (i *INodeIterator) Err() error

func (*INodeIterator) INodeRef

func (i *INodeIterator) INodeRef() INodeRef

func (*INodeIterator) Next

func (i *INodeIterator) Next() bool

type INodeRef

type INodeRef struct {
	INode INodeID
	// contains filtered or unexported fields
}

INodeRef is a reference to a unique INode in the cluster.

func INodeFromProto

func INodeFromProto(p *models.INodeRef) INodeRef

func INodeRefFromBytes

func INodeRefFromBytes(b []byte) INodeRef

func NewINodeRef

func NewINodeRef(vol VolumeID, i INodeID) INodeRef

func ZeroINode

func ZeroINode() INodeRef

func (INodeRef) Equals added in v0.1.0

func (i INodeRef) Equals(x INodeRef) bool

func (INodeRef) String

func (i INodeRef) String() string

func (INodeRef) ToBytes

func (i INodeRef) ToBytes() []byte

func (INodeRef) ToProto

func (i INodeRef) ToProto() *models.INodeRef

func (INodeRef) Volume

func (i INodeRef) Volume() VolumeID

type INodeStore

type INodeStore struct {
	// contains filtered or unexported fields
}

func NewINodeStore

func NewINodeStore(bs BlockStore) *INodeStore

func (*INodeStore) Close

func (b *INodeStore) Close() error

func (*INodeStore) DeleteINode

func (b *INodeStore) DeleteINode(ctx context.Context, i INodeRef) error

func (*INodeStore) Flush

func (b *INodeStore) Flush() error

func (*INodeStore) GetINode

func (b *INodeStore) GetINode(ctx context.Context, i INodeRef) (*models.INode, error)

func (*INodeStore) INodeIterator

func (b *INodeStore) INodeIterator() *INodeIterator

func (*INodeStore) WriteINode

func (b *INodeStore) WriteINode(ctx context.Context, i INodeRef, inode *models.INode) error

type IndexID

type IndexID uint64

IndexID represents a unique identifier for an Index.

type InitMDSFunc added in v0.1.0

type InitMDSFunc func(cfg Config, gmd GlobalMetadata, ringType RingType) error

InitMDSFunc is the signature of a function which preformats a metadata service.

type MetadataKind

type MetadataKind int
const (
	EtcdMetadata MetadataKind = iota
	TempMetadata
)

type MetadataService

type MetadataService interface {
	GetVolumes() ([]*models.Volume, VolumeID, error)
	GetVolume(volume string) (*models.Volume, error)
	NewVolumeID() (VolumeID, error)
	Kind() MetadataKind

	GlobalMetadata() (GlobalMetadata, error)

	// Returns a UUID based on the underlying datadir. Should be
	// unique for every created datadir.
	UUID() string

	GetRing() (Ring, error)
	SubscribeNewRings(chan Ring)
	UnsubscribeNewRings(chan Ring)
	SetRing(ring Ring) error

	WithContext(ctx context.Context) MetadataService

	GetLease() (int64, error)
	RenewLease(int64) error
	RegisterPeer(lease int64, pi *models.PeerInfo) error
	GetPeers() (PeerInfoList, error)

	Close() error

	CommitINodeIndex(VolumeID) (INodeID, error)
	GetINodeIndex(VolumeID) (INodeID, error)
}

MetadataService is the interface representing the basic ways to manipulate consistently stored fileystem metadata.

func CreateMetadataService

func CreateMetadataService(name string, cfg Config) (MetadataService, error)

CreateMetadataService calls the constructor of the specified MetadataService with the provided address.

type ModifyableRing

type ModifyableRing interface {
	ChangeReplication(r int) (Ring, error)
}

type NewBlockStoreFunc

type NewBlockStoreFunc func(string, Config, GlobalMetadata) (BlockStore, error)

type PeerInfoList

type PeerInfoList []*models.PeerInfo

func (PeerInfoList) AndNot

func (pi PeerInfoList) AndNot(b PeerList) PeerInfoList

func (PeerInfoList) GetWeights

func (pi PeerInfoList) GetWeights() map[string]int

func (PeerInfoList) HasUUID

func (pi PeerInfoList) HasUUID(uuid string) bool

func (PeerInfoList) Intersect

func (pi PeerInfoList) Intersect(b PeerInfoList) PeerInfoList

func (PeerInfoList) PeerList

func (pi PeerInfoList) PeerList() PeerList

func (PeerInfoList) UUIDAt

func (pi PeerInfoList) UUIDAt(uuid string) int

func (PeerInfoList) Union

type PeerList

type PeerList []string

func (PeerList) AndNot

func (pl PeerList) AndNot(b PeerList) PeerList

func (PeerList) Has

func (pl PeerList) Has(uuid string) bool

func (PeerList) IndexAt

func (pl PeerList) IndexAt(uuid string) int

func (PeerList) Intersect

func (pl PeerList) Intersect(b PeerList) PeerList

func (PeerList) Union

func (pl PeerList) Union(b PeerList) PeerList

type PeerPermutation

type PeerPermutation struct {
	Replication int
	Peers       PeerList
}

type ReadLevel

type ReadLevel int
const (
	ReadBlock ReadLevel = iota
	ReadSequential
	ReadSpread
)

type Ring

type Ring interface {
	GetPeers(key BlockRef) (PeerPermutation, error)
	Members() PeerList

	Describe() string
	Type() RingType
	Version() int

	Marshal() ([]byte, error)
}

type RingAdder

type RingAdder interface {
	ModifyableRing
	AddPeers(PeerInfoList) (Ring, error)
}

type RingRemover

type RingRemover interface {
	ModifyableRing
	RemovePeers(PeerList) (Ring, error)
}

type RingType

type RingType int

type Server

type Server struct {
	Blocks BlockStore
	MDS    MetadataService
	INodes *INodeStore

	Cfg Config

	ReplicationOpen bool
	// contains filtered or unexported fields
}

Server is the type representing the generic distributed block store.

func NewMemoryServer

func NewMemoryServer() *Server

func NewServer

func NewServer(cfg Config, metadataServiceKind, blockStoreKind string) (*Server, error)

func NewServerByImpl

func NewServerByImpl(cfg Config, mds MetadataService, blocks BlockStore) (*Server, error)

func (*Server) AddTimeoutCallback

func (s *Server) AddTimeoutCallback(f func(uuid string))

func (*Server) BeginHeartbeat

func (s *Server) BeginHeartbeat(addr *url.URL) error

BeginHeartbeat spawns a goroutine for heartbeats. Non-blocking.

func (*Server) Close

func (s *Server) Close() error

func (*Server) CreateFile

func (s *Server) CreateFile(volume *models.Volume, inode *models.INode, blocks Blockset) (*File, error)

func (*Server) Debug

func (s *Server) Debug(w io.Writer) error

Debug writes a bunch of debug output to the io.Writer.

func (*Server) ExtendContext

func (s *Server) ExtendContext(ctx context.Context) context.Context

func (*Server) GetPeerMap

func (s *Server) GetPeerMap() map[string]*models.PeerInfo

func (*Server) Lease

func (s *Server) Lease() int64

func (*Server) UpdatePeerMap

func (s *Server) UpdatePeerMap() map[string]*models.PeerInfo

func (*Server) UpdateRebalanceInfo

func (s *Server) UpdateRebalanceInfo(ri *models.RebalanceInfo)

type SetRingFunc

type SetRingFunc func(cfg Config, r Ring) error

type Store

type Store interface {
	Kind() string
	Flush() error
	Close() error
}

Store is the interface that represents methods that should be common across all types of storage providers.

type VolumeID

type VolumeID uint64

VolumeID represents a unique identifier for a Volume.

func (VolumeID) ToBytes

func (v VolumeID) ToBytes() []byte

type WipeMDSFunc added in v0.1.0

type WipeMDSFunc func(cfg Config) error

type WriteLevel

type WriteLevel int
const (
	WriteAll WriteLevel = iota
	WriteOne
	WriteLocal
)

func ParseWriteLevel added in v0.1.0

func ParseWriteLevel(s string) (wl WriteLevel, err error)

Directories

Path Synopsis
block provides the implementation of the "block" volume type, using a Torus file as a block device.
block provides the implementation of the "block" volume type, using a Torus file as a block device.
aoe
aoe provides the implementation of an ATA over Ethernet server, backed by a Torus block volume.
aoe provides the implementation of an ATA over Ethernet server, backed by a Torus block volume.
blockset provides a registry of BlockLayers, that can be (Un)Marshaled and retrieve blocks from a Torus storage interface.
blockset provides a registry of BlockLayers, that can be (Un)Marshaled and retrieve blocks from a Torus storage interface.
cmd
distributor is a complex implementation of a Torus storage interface, that understands rebalancing it's underlying storage and fetching data from peers, as necessary.
distributor is a complex implementation of a Torus storage interface, that understands rebalancing it's underlying storage and fetching data from peers, as necessary.
protocols
protocols is the metapackage for the RPC protocols for how Torus' storage layer communicates with other storage servers.
protocols is the metapackage for the RPC protocols for how Torus' storage layer communicates with other storage servers.
rebalance
rebalance provides the implementation of the rebalancer, which continually checks the data stored on a host, knows where data should live, and moves it to the appropriate servers.
rebalance provides the implementation of the rebalancer, which continually checks the data stored on a host, knows where data should live, and moves it to the appropriate servers.
gc provides the Torus interface for how garbage collection is implemented.
gc provides the Torus interface for how garbage collection is implemented.
internal
flagconfig
flagconfig is a generic set of flags dedicated to configuring a Torus client.
flagconfig is a generic set of flags dedicated to configuring a Torus client.
nbd
Package nbd uses the Linux NBD layer to emulate a block device in user space
Package nbd uses the Linux NBD layer to emulate a block device in user space
metadata is the metapackage for the implementations of the metadata interface, for each potential backend.
metadata is the metapackage for the implementations of the metadata interface, for each potential backend.
models is the package containing all the protos used for serializing data for Torus.
models is the package containing all the protos used for serializing data for Torus.
ring is the package containing implementations of the consistent hash ring, a pure function which provides a permutation of peers where a block can live, known by all members of the cluster.
ring is the package containing implementations of the consistent hash ring, a pure function which provides a permutation of peers where a block can live, known by all members of the cluster.
storage is the package which implements the underlying, on-disk storage API for Torus servers.
storage is the package which implements the underlying, on-disk storage API for Torus servers.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL