synchronicity

package module
v0.0.0-...-b763669 Latest
Warning

This package is not in the latest version of its module.
Published: Dec 4, 2014 License: MIT Imports: 19 Imported by: 0

README

synchronicity

Refactor in progress; do not use.

A flexible sync package that concurrently inventories and syncs two directories, allowing idempotent pushes from a source to a destination, among other things.

Synchronicity is usable right away, with no additional configuration. It can be used through its internal Synchro struct, or it can return a synchronicity.Synchro struct for you to use directly. Either can be configured to better suit the environment in which it runs, but that is optional.

Synchronicity supports both byte-based and checksum (digest) comparison. By default it compares bytes, which is much faster and uses less CPU and memory.

Any error causes the operation to fail, which may leave the destination in a partially updated state. If this is an issue, implement a way to back up and roll back the destination directory. Synchronicity is intended to support this in the future.

At this point, synchronicity only supports the push operation.

Comparisons

Synchronicity supports several methods of checking files for changes:

  • File size: if the file size is different, it has changed.
  • Byte comparison: the files are compared in chunks of bytes until a change is detected or EOF.
  • Digest comparison: the file hashes are compared. Synchronicity supports SHA256.
  • Chunked digest comparison: chunks of bytes are read from each file and a digest of the chunk is generated and compared.

If the files' contents are found to be equal, their properties are checked for changes to see if that information should be updated.

Source files always take precedence over destination files; this makes push operations idempotent.

Tasks

File comparisons can result in the following tasks:

  • New: the source file doesn't exist in the destination.
  • Copy: the source and destination files are different at the byte level; update destination with source.
  • Update: the source and destination file information are different; update destination with the source's.
  • No action: the source and destination files are exactly the same; they are flagged as duplicates.

Execution overview

The destination directory is first indexed; the properties of each file encountered, and, optionally, its checksum or checksums of a portion of the file, are indexed.

The source directory is then walked. For each file encountered, the destination index is checked to see whether the file already exists there. If it does not, a new task is generated for that file. If it does, its checksum is compared to the destination file's checksum; a copy task is generated for each comparison that finds a difference. If the checksums match, the file's properties, mtime and mode, are checked. If any of those properties differ, an update task is generated. An update task does not copy the source file's data; only its header information is copied to the destination file.

If delete is enabled, any orphaned files in the destination are deleted. An orphaned file is a file that exists in the destination but does not exist in the source. This means that any destination file that was not compared to a source file is deleted.
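
Orphan detection as described above can be sketched as marking each destination entry that a source file was compared against and collecting whatever is left unmarked. The function name and the map of path to processed flag are hypothetical simplifications; the package's own index stores FileData values.

```go
package main

import (
	"fmt"
	"sort"
)

// orphans (hypothetical helper) marks every destination entry that has a
// matching source path as processed, then returns the unprocessed entries in
// sorted order. dstIndex maps a relative path to its processed flag and is
// modified in place.
func orphans(srcPaths []string, dstIndex map[string]bool) []string {
	for _, p := range srcPaths {
		if _, ok := dstIndex[p]; ok {
			dstIndex[p] = true
		}
	}
	var out []string
	for p, processed := range dstIndex {
		if !processed {
			out = append(out, p)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	dst := map[string]bool{"a.txt": false, "b.txt": false, "old/c.txt": false}
	src := []string{"a.txt", "b.txt", "new.txt"}
	// Only destination files never matched by a source file are reported.
	fmt.Println(orphans(src, dst))
}
```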

Logging

Synchronicity uses the standard log package. By default it logs to ioutil.Discard. Call synchronicity.SetLogger with an io.Writer to set the log output destination. To enable verbose output, set the synchronicity.Verbose bool to true. Verbose output is written to the log; currently very little verbose information is generated.

Experimental filtering support

Synchronicity has experimental support for file filtering using include and exclude filters. Include filters restrict processing to files that match them. Exclude filters skip any files that match them. Filters can be applied to filename suffixes or filename prefixes.
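
One way to read the include and exclude rules above is sketched below. The filter type and keep function are hypothetical, and two behaviors are assumptions rather than documented facts: an empty include filter lets every file through, and an exclude match wins over an include match.

```go
package main

import (
	"fmt"
	"strings"
)

// filter (hypothetical) matches filenames by prefix or suffix.
type filter struct {
	prefixes []string
	suffixes []string
}

func (f filter) empty() bool {
	return len(f.prefixes) == 0 && len(f.suffixes) == 0
}

// match reports whether name has any configured prefix or suffix.
func (f filter) match(name string) bool {
	for _, p := range f.prefixes {
		if strings.HasPrefix(name, p) {
			return true
		}
	}
	for _, s := range f.suffixes {
		if strings.HasSuffix(name, s) {
			return true
		}
	}
	return false
}

// keep reports whether a file should be processed. Assumptions: an empty
// include filter accepts everything, and exclude takes precedence.
func keep(name string, include, exclude filter) bool {
	if !include.empty() && !include.match(name) {
		return false
	}
	return !exclude.match(name)
}

func main() {
	include := filter{suffixes: []string{".go", ".md"}}
	exclude := filter{prefixes: []string{"tmp_"}}
	for _, name := range []string{"main.go", "tmp_main.go", "notes.txt"} {
		fmt.Println(name, keep(name, include, exclude))
	}
}
```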

Future Functionality
  • Support for filtering on time.
  • Writing directory inventory and information, including checksums, to file or other persistent store.
  • Creation of compressed archive for:
    • each destination file replaced or deleted
    • each set of destination files replaced or deleted
    • each source file pushed to a destination
    • each set of source files pushed to a destination
  • Encrypted archives
  • Rollback
  • Rollforward

License

Modified BSD Style license. Please view LICENSE file for details.

Documentation

Overview

Copyright 2014 Joel Scoble (github.com/mohae) All rights reserved. Use of this source code is governed by a BSD-style license that can be found in the LICENSE file.

Based on Richard Clayton's blog post on pipelines:

https://rclayton.silvrback.com/pipelines-in-golang

Which is based on Sameer Ajmani's pipeline post:

https://blog.golang.org/pipelines

This file generates the test directories and files used for testing. It may be refactored into its own package at some point, as it will probably be needed in more than one place.

Constants

const (
	UnknownEquality equalityType = iota
	BasicEquality                // compare bytes for equality check
	DigestEquality               // compare digests for equality check: digest entire file at once
	ChunkedEquality              // compare digests for equality check: digest in chunks
)
const (
	SHA256 hashType
)

Variables

var FName, FData []string

Files

var MaxChunks = 4 // Modify directly to change buffered hashes
var ReadAll = true
var VLogger *log.Logger // Use separate logger for verbose output

Functions

func Delta

func Delta() float64

Delta returns the 𝛥, the change in time, between the start and end of an operation.

func DisableLog

func DisableLog()

func DstFileData

func DstFileData() map[string]*FileData

DstFileData returns the map of FileData accumulated during the walk of the destination.

func EqualityType

func EqualityType(s string) equalityType

func Log

func Log(v ...interface{})

Use for verbose output. Anything using Log() is for Verbosity.

func Logf

func Logf(format string, v ...interface{})

Use for verbose output. Anything using Logf() is for Verbosity.

func Message

func Message() string

Message returns stats about the last Synch.

func ParseHashType

func ParseHashType(s string) hashType

ParseHashType returns the hashType for a given string.

func Pull

func Pull(src, dst string) (string, error)

Pull is just a Push from dst to src

func Push

func Push(src, dst string) (string, error)

Push pushes the contents of src to dst.

  • Existing files that are the same are ignored.
  • Modified files are overwritten, even if dst is newer.
  • New files are created.
  • Files in destination not in source may be deleted.

func SetCPUMultiplier

func SetCPUMultiplier(i int)

SetCPUMultiplier sets both the multiplier and the maxProcs. If the multiplier is <= 0, 1 is used.

func SetChunkSize

func SetChunkSize(i int)

SetChunkSize sets the chunkSize to 1k * i, e.g. 4 == 4k chunkSize. If the multiplier, i, is < 0, the default, 4, is used.

func SetDelete

func SetDelete(b bool)

SetDelete is used to set the mainSynchro's delete flag. When working directly with a Synchro object, just set it, Synchro.Delete, instead of calling this function.

func SetEqualityType

func SetEqualityType(e equalityType)

func SetHashType

func SetHashType(s string)

SetHashType sets the hashtype to use based on the passed value.

func SetLogger

func SetLogger(l io.Writer)

func SetVerbose

func SetVerbose(b bool)

SetVerbose sets the verbosity: false also sets the output to ioutil.Discard; true sets the output to stdout.

func SetVerboseLogger

func SetVerboseLogger(l io.Writer)

SetVerboseLogger sets the output for the verbose logger. It also sets the verbose flag.

func WriteTestFiles

func WriteTestFiles() (dir string, err error)

Types

type ArchivedSynch

type ArchivedSynch struct {
	Synch *Synch

	ArchiveDst      bool
	ArchiveFilename string
	Pipeline
	// contains filtered or unexported fields
}

ArchivedSynch provides information about a sync operation. This trades memory for CPU.

func NewArchivedSynch

func NewArchivedSynch() *ArchivedSynch

NewArchivedSynch returns an initialized ArchivedSynch. Any overrides need to be done prior to a Synch operation. Before any modifications are made to the destination directory, an archived synch creates an archive of the destination files that will be updated, modified, or deleted.

If the archived synch process encounters an error, the process will be aborted. This may result in the destination directory being in an unknown state, but it should never result in the original destination files being lost or left in an unknown state.

TODO: Add destination state info to archive so that a rollback process can properly restore the destination directory to its pre-synch state using the created archive. The complete inventory is needed so that new files can be removed during the rollback.

type ByPath

type ByPath struct {
	FileDatas
}

ByPath sorts by RelPath

func (ByPath) Less

func (s ByPath) Less(i, j int) bool

type BySize

type BySize struct {
	FileDatas
}

BySize sorts by filesize

func (BySize) Less

func (s BySize) Less(i, j int) bool

type FileData

type FileData struct {
	Processed bool

	Digests []Hash256

	CurByte int64         // for when the whole file hasn't been hashed and
	Root    string        // the relative root of this file: allows for synch support
	Dir     string        // relative path to parent directory of Fi
	Buf     *bytes.Buffer // Cache read files; trade memory for io
	BufPos  int64         // position in buffer
	Fi      os.FileInfo
	// contains filtered or unexported fields
}

func NewFileData

func NewFileData(root, dir string, fi os.FileInfo, s *Synch) *FileData

NewFileData returns a FileData struct for the passed file using the defaults. Set any overrides before performing an operation.

func (*FileData) FullPath

func (fd *FileData) FullPath() string

func (*FileData) RelPath

func (fd *FileData) RelPath() string

RelPath returns the relative path of the file, which is the file's path less the root information. This allows for easy comparison between two directories.

func (*FileData) RootPath

func (fd *FileData) RootPath() string

RootPath returns the relative path of the file including its root. A root is the directory that Synchronicity considers a root, e.g. one of the directories being synched. This is not the FullPath of a file.

func (*FileData) String

func (fd *FileData) String() string

String is an alias to RelPath

type FileDatas

type FileDatas []*FileData

FileDatas is used for sorting FileData info

func (FileDatas) Len

func (s FileDatas) Len() int

func (FileDatas) Swap

func (s FileDatas) Swap(i, j int)

type Hash256

type Hash256 [32]byte

Hash256 is a SHA256-sized array for hashed blocks.

type Pipe

type Pipe interface {
	Process(in chan *FileData) chan *FileData
}

type Pipeline

type Pipeline struct {
	// contains filtered or unexported fields
}

func NewPipeline

func NewPipeline(pipes ...Pipe) Pipeline

func (*Pipeline) Close

func (p *Pipeline) Close()

func (*Pipeline) Dequeue

func (p *Pipeline) Dequeue(handler func(*FileData))

func (*Pipeline) Enqueue

func (p *Pipeline) Enqueue(item *FileData)
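
The Pipe and Pipeline signatures above follow the channel-chaining design from the pipeline posts cited in the Overview. The sketch below is a hypothetical reconstruction of that general pattern with lowercase names and a simplified item type; the package's unexported fields and actual behavior may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// item is a hypothetical stand-in for *FileData.
type item struct{ name string }

// pipe is one stage: it consumes a channel and returns the channel it emits on.
type pipe interface {
	process(in chan *item) chan *item
}

// pipeline chains stages; head receives input and tail yields results.
type pipeline struct {
	head chan *item
	tail chan *item
}

func newPipeline(pipes ...pipe) pipeline {
	head := make(chan *item)
	next := head
	for _, p := range pipes {
		next = p.process(next)
	}
	return pipeline{head: head, tail: next}
}

func (p *pipeline) enqueue(it *item) { p.head <- it }
func (p *pipeline) close()           { close(p.head) }

// dequeue calls handler for each result until the last stage closes its output.
func (p *pipeline) dequeue(handler func(*item)) {
	for it := range p.tail {
		handler(it)
	}
}

// mapStage is an example stage that rewrites each item's name with fn.
type mapStage struct{ fn func(string) string }

func (m mapStage) process(in chan *item) chan *item {
	out := make(chan *item)
	go func() {
		defer close(out) // closing propagates shutdown to the next stage
		for it := range in {
			it.name = m.fn(it.name)
			out <- it
		}
	}()
	return out
}

// run pushes names through two stages and collects the results in order.
func run(names []string) []string {
	p := newPipeline(
		mapStage{strings.ToUpper},
		mapStage{func(s string) string { return s + ".bak" }},
	)
	go func() {
		for _, n := range names {
			p.enqueue(&item{name: n})
		}
		p.close()
	}()
	var out []string
	p.dequeue(func(it *item) { out = append(out, it.name) })
	return out
}

func main() {
	fmt.Println(run([]string{"a", "b"}))
}
```

Enqueueing from a separate goroutine matters here: the channels are unbuffered, so a producer that also had to drain the tail would block on its first send.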

type StringFilter

type StringFilter struct {
	Prefix   string
	Ext      []string
	ExtCount int
	Anchored string
	// contains filtered or unexported fields
}

StringFilter defines the string filters used on files.

type Synch

type Synch struct {

	//	ArchiveDst         bool     // Archive Destination files that will be modified or deleted
	//	ArchiveFilename    string   // ArchiveFilename defaults to: archive-2006-01-02:15:04:05-MST.tgz
	Delete             bool // Delete orphaned dst files (files that don't exist in src)
	PreserveProperties bool // Preserve file properties(mode, mtime)
	ReadAll            bool // Read the entire file at once; false == chunked read

	//	processingSrc bool                 // if false, processing dst. Only used for walking
	Mod int64 // filemode

	𝛥t float64 // Change in time between start and end of operation
	// contains filtered or unexported fields
}

Synch provides information about a sync operation. This trades memory for CPU.

func NewSynch

func NewSynch() *Synch

NewSynch returns an initialized Synch. Any overrides need to be done prior to a Synch operation.

func (*Synch) Delta

func (s *Synch) Delta() float64

Delta returns the 𝛥, the change in time, between the start and end of an operation.

func (*Synch) DstFileData

func (s *Synch) DstFileData() map[string]*FileData

DstFileData returns the map of FileData accumulated during the walk of the destination.

func (*Synch) Message

func (s *Synch) Message() string

Message returns stats about the last Synch.

func (*Synch) Pull

func (s *Synch) Pull(src, dst string) (string, error)

Pull is just a Push from dst to src

func (*Synch) Push

func (s *Synch) Push(src, dst string) (string, error)

Push pushes the contents of src to dst.

  • Existing files that are the same are ignored.
  • Modified files are overwritten, even if dst is newer.
  • New files are created.
  • Files in destination not in source may be deleted.

func (*Synch) SetEqualityType

func (s *Synch) SetEqualityType(e equalityType)

func (*Synch) SetHashChunkSize

func (s *Synch) SetHashChunkSize(i int)

SetHashChunkSize sets the chunkSize when a value > 0 is received, using the received int as a multiple of 1024 bytes. If the received value is 0, it uses the current chunkSize * 4.

func (*Synch) SetMaxProcs

func (s *Synch) SetMaxProcs(i int)

SetMaxProcs sets the maxProcs to the passed value, or 1 for <= 0.

func (*Synch) Stop

func (s *Synch) Stop() error

type TimeFilter

type TimeFilter struct {
	Newer      string
	NewerMTime time.Time
	NewerFile  string
}

TimeFilter defines the time filters used on files.
