lightpatch

package module
v0.1.3 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Sep 26, 2020 License: Apache-2.0 Imports: 10 Imported by: 0

README

lightpatch Build Status

lightpatch is a simple library for creating and apply patch files, similar to the functions of diff and patch. It is available primarily as a Go library, though a simple CLI tool is bundled as well.

The main goals of the project are:

  • Efficient patching of files. The patch files themselves are not intended to be human-readable diffs.
  • Byte-level diffs. This means it will work well when line-oriented tools like diff won't. An example of this being important would be changes to a single-line minified Javascript file. Furthermore, there is no presumuption that the input is in UTF-8 or any other encoding. It's just a stream of bytes.
  • The patchfile format is simple and well-defined. A basic decoder can be written in a few dozen lines. (Contrast to VCDIFF and its RFC that is nearly 30 pages long.)
  • Patches may contain an integrity check (CRC-32) which provides strong validation that the output file matches the original input. Note: the CRC is to detect accidental corruption or a patch being applied to the wrong file. It offers little protection against malicious tampering.
  • Simple API for developers. The go-diff library has over 30 functions, whereas lightpatch has three.

The core diff algorithm is general purpose and is geared towards text. While it will handle binary files fine (e.g. you could efficiently patch some changed EXIF data in a JPG image), it does not have the advanced handling of binary diffs you'll find in specialized tools such as bsdiff, xdelta, etc. On the other hand it is much simpler than those tools, especially for decoding.

CLI use

You can easily build from source by installing Go and running:

go get -u github.com/kalafut/lightpatch/cmd/lightpatch

Making and applying patch files are the two supported commands:

lightpatch make file1 file2 > patch
lightpatch apply file1 patch > output        # should match file2
lightpatch make --t 30s file1 file2 > patch  # allow 30s to make the patch

lightpatch is very fast in the general case, but if you give it two very different files, it will try hard to find a diff even when there isn't one. By default it will "give up" after 5 seconds (usually plenty of time even for large files), but this is adjustable with the --t option.

Note: the command still succeeds even if the timeout is reached, but the output might be a naïve diff that is just the new file in its entirety.

Go library use

The API is described in the docs. The source for the CLI tool is also a good example.

File Format

The lightpatch file format is a simple TLV style. The patch file provide edit instruction to be applied to a source file. The command format is:

command [len] [data]

Commands
Command Code Notes
Copy C (0x43) Copy len bytes from source to dest. data is not used.
Insert I (0x49) Insert the next len bytes from data into dest.
Delete D (0x44) "Delete" the next len source bytes by advancing the source input and output nothing to dest. data is not used.
Checksum K (0x4B) (Optional) The next 4 bytes are the CRC-32 of dest. If present, this must be the final command of the patch file.

The len parameter is varint encoded. Libraries are readily available to handle this encoding (and even a hand-rolled decoder is only a few lines).

Checksum (CRC-32)

CRC handling is optional on both ends. An encoder doesn't have to include it, and decoder don't have to verify them. It's better if they do, but in very simple cases the complexity may not be desired. Regardless if a decoder is verifying it or not, it should still return an error if there is data following the CRC, as that is an invalid patch.

The CRC uses the common CRC-32-IEEE polynomial.

Example

Before:

The quick brown fox jumped over the lazy dog

After:

The quick brown fox leaped over the lazy dog.

Patch (hex):

43 14          copy 20 bytes ("The quick brown fox ")
44 03          delete/skip 3 bytes ("jum")
49 03 6c 65 61 insert 3 bytes ("lea")
43 15          copy 21 bytes ("ped over the lazy dog")
49 01 2e       insert 1 byte (".")
4b 96 f6 b7 6c CRC-32 is 0x96f6b76c
Credits

lightpatch borrows heavily from:

  • go-diff, which provided the inspiration for this project and is still the source of most of the diff generation code (albeit translated from character to byte handling).
  • Google's Diff-Map-Patch library provided the fundamental diff generation algorithms upon which lightpatch, go-diff, and many variants are built upon.

Documentation

Overview

Package lightpatch generates and applies patch files. A description of the patch file format is included in the README.

Index

Constants

View Source
const (
	OpCopy   byte = 'C'
	OpInsert byte = 'I'
	OpDelete byte = 'D'
	OpCRC    byte = 'K'

	DefaultTimeout = 5 * time.Second
)

Variables

View Source
var (
	ErrCRC       = errors.New("CRC mismatch")
	ErrExtraData = errors.New("unexpected data following CRC")
)

Functions

func ApplyPatch

func ApplyPatch(before, patch io.Reader, after io.Writer) error

ApplyPatch reads before, applies the edits from patch, and writes the output to after.

func MakePatch

func MakePatch(before, after io.Reader, output io.Writer) error

MatchPatch generates a diff to change before into after, writing the output to patch.

func MakePatchTimeout

func MakePatchTimeout(before, after io.Reader, patch io.Writer, timeout time.Duration) error

MatchPatchTimeout generates a diff to change before into after, writing the output to patch. timeout is the max time to try to make an efficient patch. The operation will still succeed even if timeout is reached, with perhaps a less compact patch. If timeout is 0 the function will take as long as it needs to complete.

Types

This section is empty.

Directories

Path Synopsis
cmd

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL