level: github.com/cdelorme/level

package level

import "github.com/cdelorme/level"

This package provides a utility that scans files and checks for duplicates.

Index

Package Files

level.go

type Logger

type Logger interface {
    Error(string, ...interface{})
    Info(string, ...interface{})
    Debug(string, ...interface{})
}

A minimal logger interface with three severities.

type Six

type Six struct {
    Input    string `json:"input,omitempty"`
    Excludes string `json:"excludes,omitempty"`
    Test     bool   `json:"test,omitempty"`
    L        Logger `json:"-"`
    S        Stats  `json:"-"`
    // contains filtered or unexported fields
}

An abstraction over deduplication logic, exposed through a minimal interface.

func (*Six) Delete

func (s *Six) Delete()

Iterates the filtered files, deleting each one and then attempting to recursively remove any parent folders left empty.

func (*Six) Filtered

func (s *Six) Filtered() []string

Returns the duplicates marked for deletion.

func (*Six) LastOrder

func (s *Six) LastOrder()

Initializes the metrics system, which sets the start time and clears data.

Ensures the input path is both absolute and clean, parses the supplied excludes, and initializes the private maps and slices, clearing any former data.

Uses a path/filepath.WalkFunc to iterate all files under the input path, discarding zero-size files, symbolic links, and files matching the list of case-sensitive excludes. The remaining files are grouped by size.

Any errors encountered while walking the file system will be logged and then discarded so the program may continue.

Iterates each set of files grouped by size; files are compared two at a time, first with os.SameFile to discard hard links, and then with a buffered byte-by-byte comparison.

The buffered comparison terminates early at the first differing chunk, making it faster than hashing every file in full. The code is also written to handle multiple distinct duplicate groups that happen to share the same size.

Files with matching data are placed into an unnamed group and appended to the slice of duplicates.

Finally, it sorts each group of duplicates using a weighted score, by path depth first and then by recurrence of the parent path. The lowest-scoring file in each group is kept; the rest are appended to a single flat slice, which can be requested via Filtered and is consumed by Delete.

type Stats

type Stats interface {
    Add(string, int) int
}

Package level imports 7 packages. Updated 2017-07-18.