godirwalk

Version: v1.17.0, Published: May 4, 2022, License: BSD-2-Clause, Imports: 9, Imported by: 341

README

godirwalk

godirwalk is a library for traversing a directory tree on a file system.

In short, why did I create this library?

  1. It's faster than filepath.Walk.
  2. It's more correct on Windows than filepath.Walk.
  3. It's easier to use than filepath.Walk.
  4. It's more flexible than filepath.Walk.

Depending on your specific circumstances, you might no longer need a library for file walking in Go.

Usage Example

Additional examples are provided in the examples/ subdirectory.

This library will normalize the provided top level directory name based on the os-specific path separator by calling filepath.Clean on its first argument. However, it always provides the pathname created by using the correct os-specific path separator when invoking the provided callback function.

    dirname := "some/directory/root"
    err := godirwalk.Walk(dirname, &godirwalk.Options{
        Callback: func(osPathname string, de *godirwalk.Dirent) error {
            // The following string operation is not the most performant
            // way of doing this, but it is common enough to warrant a
            // simple example here:
            if strings.Contains(osPathname, ".git") {
                return godirwalk.SkipThis
            }
            fmt.Printf("%s %s\n", de.ModeType(), osPathname)
            return nil
        },
        Unsorted: true, // (optional) set true for faster yet non-deterministic enumeration (see godoc)
    })

This library not only provides functions for traversing a file system directory tree, but also for obtaining a list of immediate descendants of a particular directory, typically much more quickly than using os.ReadDir or (*os.File).Readdirnames.

Description

Here's why I use godirwalk in preference to filepath.Walk, os.ReadDir, and (*os.File).Readdirnames.

It's faster than filepath.Walk

When compared against filepath.Walk in benchmarks, it has been observed to run between five and ten times the speed on darwin, at speeds comparable to that of the unix find utility; about twice the speed on linux; and about four times the speed on Windows.

How does it obtain this performance boost? It does less work to give you nearly the same output. This library calls the same syscall functions to do the work, but it makes fewer calls, does not throw away information that it might need, and creates less memory churn along the way by reusing the same scratch buffer for reading from a directory rather than reallocating a new buffer every time it reads file system entry data from the operating system.

While traversing a file system directory tree, filepath.Walk obtains the list of immediate descendants of a directory, and throws away the node type information that the operating system provides along with each node's name. Then, immediately prior to invoking the callback function, filepath.Walk invokes os.Stat for each node, and passes the returned os.FileInfo information to the callback.

While the os.FileInfo information provided by os.Stat is extremely helpful--and even includes the os.FileMode data--providing it requires an additional system call for each node.

Because most callbacks only care about what the node type is, this library does not throw the type information away, but rather provides that information to the callback function in the form of an os.FileMode value. Note that the os.FileMode value this library provides has only the node type information, and does not have the permission bits, sticky bits, or other information from the file's mode. If the callback does care about a particular node's entire os.FileInfo data structure, the callback can easily invoke os.Stat when needed, and only when needed.

Benchmarks

macOS

    $ go test -bench=. -benchmem
    goos: darwin
    goarch: amd64
    pkg: github.com/karrick/godirwalk
    BenchmarkReadDirnamesStandardLibrary-12   50000       26250  ns/op       10360  B/op       16  allocs/op
    BenchmarkReadDirnamesThisLibrary-12       50000       24372  ns/op        5064  B/op       20  allocs/op
    BenchmarkFilepathWalk-12                      1  1099524875  ns/op   228415912  B/op   416952  allocs/op
    BenchmarkGodirwalk-12                         2   526754589  ns/op   103110464  B/op   451442  allocs/op
    BenchmarkGodirwalkUnsorted-12                 3   509219296  ns/op   100751400  B/op   378800  allocs/op
    BenchmarkFlameGraphFilepathWalk-12            1  7478618820  ns/op  2284138176  B/op  4169453  allocs/op
    BenchmarkFlameGraphGodirwalk-12               1  4977264058  ns/op  1031105328  B/op  4514423  allocs/op
    PASS
    ok      github.com/karrick/godirwalk    21.219s

Linux

    $ go test -bench=. -benchmem
    goos: linux
    goarch: amd64
    pkg: github.com/karrick/godirwalk
    BenchmarkReadDirnamesStandardLibrary-12  100000       15458  ns/op       10360  B/op       16  allocs/op
    BenchmarkReadDirnamesThisLibrary-12      100000       14646  ns/op        5064  B/op       20  allocs/op
    BenchmarkFilepathWalk-12                      2   631034745  ns/op   228210216  B/op   416939  allocs/op
    BenchmarkGodirwalk-12                         3   358714883  ns/op   102988664  B/op   451437  allocs/op
    BenchmarkGodirwalkUnsorted-12                 3   355363915  ns/op   100629234  B/op   378796  allocs/op
    BenchmarkFlameGraphFilepathWalk-12            1  6086913991  ns/op  2282104720  B/op  4169417  allocs/op
    BenchmarkFlameGraphGodirwalk-12               1  3456398824  ns/op  1029886400  B/op  4514373  allocs/op
    PASS
    ok      github.com/karrick/godirwalk    19.179s

It's more correct on Windows than filepath.Walk

I did not previously care about this either, but humor me. We all love how we can write once and run everywhere. It is essential for the language's adoption, growth, and success, that the software we create can run unmodified on all architectures and operating systems supported by Go.

When the traversed file system has a logical loop caused by symbolic links to directories, on unix filepath.Walk ignores symbolic links and traverses the entire directory tree without error. On Windows however, filepath.Walk will continue following directory symbolic links, even though it is not supposed to, eventually causing filepath.Walk to terminate early and return an error when the pathname gets too long from concatenating endless loops of symbolic links onto the pathname. This error comes from Windows, passes through filepath.Walk, and to the upstream client running filepath.Walk.

The takeaway is that behavior is different based on which platform filepath.Walk is running. While this is clearly not intentional, until it is fixed in the standard library, it presents a compatibility problem.

This library fixes the above problem such that it will never follow logical file system loops on either unix or Windows. Furthermore, it will only follow symbolic links when FollowSymbolicLinks is set to true. Behavior on Windows and other operating systems is identical.

It's easier to use than filepath.Walk

While this library strives to mimic the behavior of the incredibly well-written filepath.Walk standard library function, there are places where it deviates a bit in order to provide an easier or more intuitive caller interface.

Callback interface does not send you an error to check

Since this library does not invoke os.Stat on every file system node it encounters, there is no possible error event for the callback function to filter on. The third argument in the filepath.WalkFunc function signature, which passes the error from os.Stat to the callback function, is no longer necessary, and is thus eliminated from the signature of this library's callback function.

Furthermore, this slight interface difference between filepath.WalkFunc and this library's WalkFunc eliminates the boilerplate code that callback handlers must write when they use filepath.Walk. Rather than every callback function needing to check the error value passed into it and branch accordingly, users of this library do not even have an error value to check immediately upon entry into the callback function. This is an improvement both in runtime performance and code clarity.

Callback function is invoked with OS specific file system path separator

On every OS platform filepath.Walk invokes the callback function with a solidus (/) delimited pathname. By contrast this library invokes the callback with the os-specific pathname separator, obviating a call to filepath.Clean in the callback function for each node prior to actually using the provided pathname.

In other words, even on Windows, filepath.Walk will invoke the callback with some/path/to/foo.txt, requiring well written clients to perform pathname normalization for every file prior to working with the specified file. This is a hidden boilerplate requirement to create truly os agnostic callback functions. In truth, many clients developed on unix and not tested on Windows neglect this subtlety, which results in software bugs when someone tries to run that software on Windows.

This library invokes the callback function with some\path\to\foo.txt for the same file when running on Windows, eliminating the need for the client to normalize the pathname, and lessening the likelihood that a client will work on unix but not on Windows.

This enhancement eliminates the need for some more boilerplate code in callback functions while improving the runtime performance of this library.

godirwalk.SkipThis is more intuitive to use than filepath.SkipDir

One arguably confusing aspect of the filepath.WalkFunc interface that this library must emulate is how a caller tells the Walk function to skip file system entries. With both filepath.Walk and this library's Walk, when a callback function wants to skip a directory and not descend into its children, it returns filepath.SkipDir. If the callback function returns filepath.SkipDir for a non-directory, filepath.Walk and this library will stop processing any more entries in the current directory. This is not necessarily what most developers want or expect. If you want to simply skip a particular non-directory entry but continue processing entries in the directory, the callback function must return nil.

The implication of this interface design is that when you want to walk a file system hierarchy and skip an entry, you have to return a different value based on what type of file system entry that node is. To skip an entry, if the entry is a directory, you must return filepath.SkipDir, and if the entry is not a directory, you must return nil. This is an unfortunate hurdle I have observed many developers struggling with, simply because it is not an intuitive interface.

Here is an example callback function that adheres to the filepath.WalkFunc interface and skips any file system entry whose full pathname includes a particular substring, optSkip. Note that this library still supports behavior identical to that of filepath.Walk when the callback function returns filepath.SkipDir.

    func callback1(osPathname string, de *godirwalk.Dirent) error {
        if optSkip != "" && strings.Contains(osPathname, optSkip) {
            if b, err := de.IsDirOrSymlinkToDir(); b && err == nil {
                return filepath.SkipDir
            }
            return nil
        }
        // Process file like normal...
        return nil
    }

This library attempts to eliminate some of that logic boilerplate required in callback functions by providing a new token error value, SkipThis, which a callback function may return to skip the current file system entry regardless of what type of entry it is. If the current entry is a directory, its children will not be enumerated, exactly as if the callback had returned filepath.SkipDir. If the current entry is a non-directory, the next file system entry in the current directory will be enumerated, exactly as if the callback returned nil. The following example callback function behaves identically to the previous one, but has less boilerplate and, admittedly, logic that I find simpler to follow.

    func callback2(osPathname string, de *godirwalk.Dirent) error {
        if optSkip != "" && strings.Contains(osPathname, optSkip) {
            return godirwalk.SkipThis
        }
        // Process file like normal...
        return nil
    }

It's more flexible than filepath.Walk

The default behavior of this library is to ignore symbolic links to directories when walking a directory tree, just like filepath.Walk does. However, it does invoke the callback function with each node it finds, including symbolic links. If a particular use case exists to follow symbolic links when traversing a directory tree, this library can be invoked in a manner to do so, by setting the FollowSymbolicLinks config parameter to true.

Configurable Sorting of Directory Children

The default behavior of this library is to always sort the immediate descendants of a directory prior to visiting each node, just like filepath.Walk does. This is usually the desired behavior. However, this does come at a slight performance and memory penalty required to sort the names when a directory node has many entries. Additionally, if the caller specifies Unsorted enumeration in the configuration parameter, reading directories is lazily performed as the caller consumes entries. If a particular use case exists that does not require sorting the directory's immediate descendants prior to visiting its nodes, this library will skip the sorting step when the Unsorted parameter is set to true.

Here's an interesting read on the potential hazards of traversing a file system hierarchy in a non-deterministic order. If you know the problem you are solving is not affected by the order files are visited, then I encourage you to use Unsorted. Otherwise skip setting this option.

Researchers find bug in Python script may have affected hundreds of studies

Configurable Post Children Callback

This library provides upstream code with the ability to specify a callback function to be invoked for each directory after its children are processed. This has been used to recursively delete empty directories after traversing the file system in a more efficient manner. See the examples/clean-empties directory for an example of this usage.

Configurable Error Callback

This library provides upstream code with the ability to specify a callback to be invoked for errors that the operating system returns, allowing the upstream code to determine the next course of action to take, whether to halt walking the hierarchy, as it would do were no error callback provided, or skip the node that caused the error. See the examples/walk-fast directory for an example of this usage.

Documentation

Overview

Package godirwalk provides functions to read and traverse directory trees.

In short, why do I use this library?

* It's faster than `filepath.Walk`.

* It's more correct on Windows than `filepath.Walk`.

* It's easier to use than `filepath.Walk`.

* It's more flexible than `filepath.Walk`.

USAGE

This library will normalize the provided top level directory name based on the os-specific path separator by calling `filepath.Clean` on its first argument. However, it always provides the pathname created by using the correct os-specific path separator when invoking the provided callback function.

dirname := "some/directory/root"
err := godirwalk.Walk(dirname, &godirwalk.Options{
    Callback: func(osPathname string, de *godirwalk.Dirent) error {
        fmt.Printf("%s %s\n", de.ModeType(), osPathname)
        return nil
    },
})

This library not only provides functions for traversing a file system directory tree, but also for obtaining a list of immediate descendants of a particular directory, typically much more quickly than using `os.ReadDir` or `(*os.File).Readdirnames`.

scratchBuffer := make([]byte, godirwalk.MinimumScratchBufferSize)

names, err := godirwalk.ReadDirnames("some/directory", scratchBuffer)
// ...

entries, err := godirwalk.ReadDirents("another/directory", scratchBuffer)
// ...

Index

Constants

This section is empty.

Variables

var MinimumScratchBufferSize = os.Getpagesize()

MinimumScratchBufferSize specifies the minimum size of the scratch buffer that ReadDirents, ReadDirnames, Scanner, and Walk will use when reading file entries from the operating system. During program startup it is initialized to the result of calling `os.Getpagesize()` for non-Windows environments, and to 0 for Windows.

var SkipThis = errors.New("skip this directory entry")

SkipThis is used as a return value from WalkFuncs to indicate that the file system entry named in the call is to be skipped. It is not returned as an error by any function.

Functions

func ReadDirnames added in v0.0.2

func ReadDirnames(osDirname string, scratchBuffer []byte) ([]string, error)

ReadDirnames returns a slice of strings, representing the immediate descendants of the specified directory. If the specified directory is a symbolic link, it will be resolved.

If an optional scratch buffer is provided that is at least one page of memory, it will be used when reading directory entries from the file system. If you plan on calling this function in a loop, you will have significantly better performance if you allocate a scratch buffer and use it each time you call this function.

Note that this function, depending on operating system, may or may not invoke the ReadDirents function, in order to prepare the list of immediate descendants. Therefore, if your program needs both the names and the file system mode types of descendants, it will always be faster to invoke ReadDirents directly, rather than calling this function, then looping over the results and calling os.Stat or os.Lstat for each entry.

children, err := godirwalk.ReadDirnames(osDirname, nil)
if err != nil {
    return nil, errors.Wrap(err, "cannot get list of directory children")
}
sort.Strings(children)
for _, child := range children {
    fmt.Printf("%s\n", child)
}

func Walk added in v0.1.0

func Walk(pathname string, options *Options) error

Walk walks the file tree rooted at the specified directory, calling the specified callback function for each file system node in the tree, including root, symbolic links, and other node types.

This function is often much faster than filepath.Walk because it does not invoke os.Stat for every node it encounters, but rather obtains the file system node type when it reads the parent directory.

If a runtime error occurs, either from the operating system or from the upstream Callback or PostChildrenCallback functions, processing typically halts. However, when an ErrorCallback function is provided in the Options structure, that function is invoked with the error along with the OS pathname of the file system node that caused the error. The ErrorCallback function's return value determines the action that Walk will then take.

func main() {
    dirname := "."
    if len(os.Args) > 1 {
        dirname = os.Args[1]
    }
    err := godirwalk.Walk(dirname, &godirwalk.Options{
        Callback: func(osPathname string, de *godirwalk.Dirent) error {
            fmt.Printf("%s %s\n", de.ModeType(), osPathname)
            return nil
        },
        ErrorCallback: func(osPathname string, err error) godirwalk.ErrorAction {
            // Your program may want to log the error somehow.
            fmt.Fprintf(os.Stderr, "ERROR: %s\n", err)

            // For the purposes of this example, a simple SkipNode will suffice,
            // although in reality perhaps additional logic might be called for.
            return godirwalk.SkipNode
        },
    })
    if err != nil {
        fmt.Fprintf(os.Stderr, "%s\n", err)
        os.Exit(1)
    }
}

Types

type Dirent

type Dirent struct {
	// contains filtered or unexported fields
}

Dirent stores the name and file system mode type of discovered file system entries.

func NewDirent added in v1.7.0

func NewDirent(osPathname string) (*Dirent, error)

NewDirent returns a newly initialized Dirent structure, or an error. This function does not follow symbolic links.

This function is rarely used, as Dirent structures are provided by other functions in this library that read and walk directories, but it is provided for the occasion when a program needs to create a Dirent.

func (Dirent) IsDevice added in v1.9.0

func (de Dirent) IsDevice() bool

IsDevice returns true if and only if the Dirent represents a device file.

func (Dirent) IsDir added in v1.0.0

func (de Dirent) IsDir() bool

IsDir returns true if and only if the Dirent represents a file system directory. Note that on some operating systems, more than one file mode bit may be set for a node. For instance, on Windows, a symbolic link that points to a directory will have both the directory and the symbolic link bits set.

func (Dirent) IsDirOrSymlinkToDir added in v1.15.0

func (de Dirent) IsDirOrSymlinkToDir() (bool, error)

IsDirOrSymlinkToDir returns true if and only if the Dirent represents a file system directory, or a symbolic link to a directory. Note that if the Dirent is not a directory but is a symbolic link, this method will resolve the link by asking the operating system to follow it.

func (Dirent) IsRegular added in v1.7.0

func (de Dirent) IsRegular() bool

IsRegular returns true if and only if the Dirent represents a regular file. That is, it ensures that no mode type bits are set.

func (Dirent) IsSymlink

func (de Dirent) IsSymlink() bool

IsSymlink returns true if and only if the Dirent represents a file system symbolic link. Note that on some operating systems, more than one file mode bit may be set for a node. For instance, on Windows, a symbolic link that points to a directory will have both the directory and the symbolic link bits set.

func (Dirent) ModeType

func (de Dirent) ModeType() os.FileMode

ModeType returns the mode bits that specify the file system node type. We could make our own enum-like data type for encoding the file type, but Go's runtime already gives us architecture independent file modes, as discussed in `os/types.go`:

Go's runtime FileMode type has same definition on all systems, so that
information about files can be moved from one system to another portably.

func (Dirent) Name

func (de Dirent) Name() string

Name returns the base name of the file system entry.

type Dirents

type Dirents []*Dirent

Dirents represents a slice of Dirent pointers, which are sortable by base name. This type satisfies the `sort.Interface` interface.

func ReadDirents added in v0.0.2

func ReadDirents(osDirname string, scratchBuffer []byte) (Dirents, error)

ReadDirents returns a sortable slice of pointers to Dirent structures, each representing the file system name and mode type for one of the immediate descendants of the specified directory. If the specified directory is a symbolic link, it will be resolved.

If an optional scratch buffer is provided that is at least one page of memory, it will be used when reading directory entries from the file system. If you plan on calling this function in a loop, you will have significantly better performance if you allocate a scratch buffer and use it each time you call this function.

children, err := godirwalk.ReadDirents(osDirname, nil)
if err != nil {
    return nil, errors.Wrap(err, "cannot get list of directory children")
}
sort.Sort(children)
for _, child := range children {
    fmt.Printf("%s %s\n", child.ModeType(), child.Name())
}

func (Dirents) Len

func (l Dirents) Len() int

Len returns the count of Dirent structures in the slice.

func (Dirents) Less

func (l Dirents) Less(i, j int) bool

Less returns true if and only if the base name of the element specified by the first index is lexicographically less than that of the second index.

func (Dirents) Swap

func (l Dirents) Swap(i, j int)

Swap exchanges the two Dirent entries specified by the two provided indexes.

type ErrorAction added in v1.4.0

type ErrorAction int

ErrorAction defines a set of actions the Walk function could take based on the occurrence of an error while walking the file system. See the documentation for the ErrorCallback field of the Options structure for more information.

const (
	// Halt is the ErrorAction return value when the upstream code wants to halt
	// the walk process when a runtime error takes place. It matches the default
	// action the Walk function would take were no ErrorCallback provided.
	Halt ErrorAction = iota

	// SkipNode is the ErrorAction return value when the upstream code wants to
	// ignore the runtime error for the current file system node, skip
	// processing of the node that caused the error, and continue walking the
	// file system hierarchy with the remaining nodes.
	SkipNode
)

type Options added in v1.0.0

type Options struct {
	// ErrorCallback specifies a function to be invoked in the case of an error
	// that could potentially be ignored while walking a file system
	// hierarchy. When set to nil or left as its zero-value, any error condition
	// causes Walk to immediately return the error describing what took
	// place. When non-nil, this user supplied function is invoked with the OS
	// pathname of the file system object that caused the error along with the
	// error that took place. The return value of the supplied ErrorCallback
	// function determines whether the error will cause Walk to halt immediately
	// as it would were no ErrorCallback value provided, or skip this file
	// system node yet continue on with the remaining nodes in the file system
	// hierarchy.
	//
	// ErrorCallback is invoked both for errors that are returned by the
	// runtime, and for errors returned by other user supplied callback
	// functions.
	ErrorCallback func(string, error) ErrorAction

	// FollowSymbolicLinks specifies whether Walk will follow symbolic links
	// that refer to directories. When set to false or left as its zero-value,
	// Walk will still invoke the callback function with symbolic link nodes,
	// but if the symbolic link refers to a directory, it will not recurse on
	// that directory. When set to true, Walk will recurse on symbolic links
	// that refer to a directory.
	FollowSymbolicLinks bool

	// Unsorted controls whether or not Walk will sort the immediate descendants
	// of a directory by their relative names prior to visiting each of those
	// entries.
	//
	// When set to false or left at its zero-value, Walk will get the list of
	// immediate descendants of a particular directory, sort that list by
	// lexical order of their names, and then visit each node in the list in
	// sorted order. This will cause Walk to always traverse the same directory
	// tree in the same order, however may be inefficient for directories with
	// many immediate descendants.
	//
	// When set to true, Walk skips sorting the list of immediate descendants
	// for a directory, and simply visits each node in the order the operating
	// system enumerated them. This will be faster, but with the side effect
	// that the traversal order may be different from one invocation to the
	// next.
	Unsorted bool

	// Callback is a required function that Walk will invoke for every file
	// system node it encounters.
	Callback WalkFunc

	// PostChildrenCallback is an optional function that Walk will invoke for
	// every file system directory it encounters after its children have been
	// processed.
	PostChildrenCallback WalkFunc

	// ScratchBuffer is an optional byte slice to use as a scratch buffer for
	// Walk to use when reading directory entries, to reduce amount of garbage
	// generation. Not all architectures take advantage of the scratch
	// buffer. If omitted or the provided buffer has fewer bytes than
	// MinimumScratchBufferSize, then a buffer with MinimumScratchBufferSize
	// bytes will be created and used once per Walk invocation.
	ScratchBuffer []byte

	// AllowNonDirectory causes Walk to bypass the check that ensures it is
	// being called on a directory node, or when FollowSymbolicLinks is true, a
	// symbolic link that points to a directory. Leave this value false to have
	// Walk return an error when called on a non-directory. Set this true to
	// have Walk run even when called on a non-directory node.
	AllowNonDirectory bool
}

Options provide parameters for how the Walk function operates.

type Scanner added in v1.11.0

type Scanner struct {
	// contains filtered or unexported fields
}

Scanner is an iterator to enumerate the contents of a directory.

func NewScanner added in v1.11.0

func NewScanner(osDirname string) (*Scanner, error)

NewScanner returns a new directory Scanner that lazily enumerates the contents of a single directory. To prevent resource leaks, caller must invoke either the Scanner's Close or Err method after it has completed scanning a directory.

scanner, err := godirwalk.NewScanner(dirname)
if err != nil {
    fatal("cannot scan directory: %s", err)
}

for scanner.Scan() {
    dirent, err := scanner.Dirent()
    if err != nil {
        warning("cannot get dirent: %s", err)
        continue
    }
    name := dirent.Name()
    if name == "break" {
        break
    }
    if name == "continue" {
        continue
    }
    fmt.Printf("%v %v\n", dirent.ModeType(), dirent.Name())
}
if err := scanner.Err(); err != nil {
    fatal("cannot scan directory: %s", err)
}

func NewScannerWithScratchBuffer added in v1.15.2

func NewScannerWithScratchBuffer(osDirname string, scratchBuffer []byte) (*Scanner, error)

NewScannerWithScratchBuffer returns a new directory Scanner that lazily enumerates the contents of a single directory. On platforms other than Windows it uses the provided scratch buffer to read from the file system. On Windows the scratch buffer is ignored. To prevent resource leaks, caller must invoke either the Scanner's Close or Err method after it has completed scanning a directory.

func (*Scanner) Close added in v1.17.0

func (s *Scanner) Close() error

Close releases resources associated with scanning a directory. Call either this or the Err method when the directory no longer needs to be scanned.

func (*Scanner) Dirent added in v1.11.0

func (s *Scanner) Dirent() (*Dirent, error)

Dirent returns the current directory entry while scanning a directory.

func (*Scanner) Err added in v1.11.0

func (s *Scanner) Err() error

Err returns any error associated with scanning a directory. It is normal to call Err after Scan returns false, even though they both ensure Scanner resources are released. Call either this or the Close method when the directory no longer needs to be scanned.

func (*Scanner) Name added in v1.11.0

func (s *Scanner) Name() string

Name returns the base name of the current directory entry while scanning a directory.

func (*Scanner) Scan added in v1.11.0

func (s *Scanner) Scan() bool

Scan potentially reads and then decodes the next directory entry from the file system.

When it returns false, this releases resources used by the Scanner then returns any error associated with closing the file system directory resource.

type WalkFunc added in v0.1.0

type WalkFunc func(osPathname string, directoryEntry *Dirent) error

WalkFunc is the type of the function called for each file system node visited by Walk. The pathname argument will contain the argument to Walk as a prefix; that is, if Walk is called with "dir", which is a directory containing the file "a", the provided WalkFunc will be invoked with the argument "dir/a", using the correct os.PathSeparator for the Go Operating System architecture, GOOS. The directory entry argument is a pointer to a Dirent for the node, providing access to both the basename and the mode type of the file system node.

If an error is returned by the Callback or PostChildrenCallback functions, and no ErrorCallback function is provided, processing stops. If an ErrorCallback function is provided, then it is invoked with the OS pathname of the node that caused the error along with the error. The return value of the ErrorCallback function determines whether to halt processing, or skip this node and continue processing remaining file system nodes.

The exception is when the function returns the special value filepath.SkipDir. If the function returns filepath.SkipDir when invoked on a directory, Walk skips the directory's contents entirely. If the function returns filepath.SkipDir when invoked on a non-directory file system node, Walk skips the remaining files in the containing directory. Note that any supplied ErrorCallback function is not invoked with filepath.SkipDir when the Callback or PostChildrenCallback functions return that special value.

One arguably confusing aspect of the filepath.WalkFunc API that this library must emulate is how a caller tells Walk to skip file system entries or directories. With both filepath.Walk and this Walk, when a callback function wants to skip a directory and not descend into its children, it returns filepath.SkipDir. If the callback function returns filepath.SkipDir for a non-directory, filepath.Walk and this library will stop processing any more entries in the current directory, which is often not what people want. To skip a particular non-directory entry but continue processing the remaining entries in the directory, the callback function must return nil. The implication of this API is that when you walk a file system hierarchy and want to skip an entry, you must return one value, namely filepath.SkipDir, when the entry is a directory, but a different value, namely nil, when the entry is a non-directory. In other words, to get identical behavior for two file system entry types you need to return different token values.
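The effect of returning filepath.SkipDir for a non-directory can be observed with the standard library alone. The sketch below uses filepath.Walk rather than this package, and visitedUntilSkip is a hypothetical helper written for this illustration. It builds a temporary directory containing a.txt, b.txt, and c.txt, returns filepath.SkipDir when the callback reaches the named file, and reports which entries the callback saw.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// visitedUntilSkip creates a temporary directory holding a.txt, b.txt and
// c.txt, walks it with filepath.Walk, and returns the base names of the
// entries the callback saw. The callback returns filepath.SkipDir when it
// reaches skipName.
func visitedUntilSkip(skipName string) ([]string, error) {
	root, err := os.MkdirTemp("", "skipdir-demo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)

	for _, name := range []string{"a.txt", "b.txt", "c.txt"} {
		if err := os.WriteFile(filepath.Join(root, name), nil, 0o644); err != nil {
			return nil, err
		}
	}

	var visited []string
	err = filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		if path == root {
			return nil // do not record the root directory itself
		}
		visited = append(visited, info.Name())
		if info.Name() == skipName {
			// Returned for a regular file: the rest of the directory is skipped.
			return filepath.SkipDir
		}
		return nil
	})
	return visited, err
}

func main() {
	visited, err := visitedUntilSkip("b.txt")
	fmt.Println(visited, err)
}
```

Because filepath.Walk visits entries in lexical order, a.txt and b.txt reach the callback and c.txt never does, even though the intent was only to skip b.txt.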

Here is an example callback function that adheres to the filepath.Walk API and skips any file system entry whose full pathname includes a particular substring, optSkip:

func callback1(osPathname string, de *godirwalk.Dirent) error {
    if optSkip != "" && strings.Contains(osPathname, optSkip) {
        if b, err := de.IsDirOrSymlinkToDir(); b && err == nil {
            return filepath.SkipDir
        }
        return nil
    }
    // Process file like normal...
    return nil
}

This library attempts to eliminate some of that logic boilerplate by providing a new token error value, SkipThis, which a callback function may return to skip the current file system entry regardless of what type of entry it is. If the current entry is a directory, its children will not be enumerated, exactly as if the callback returned filepath.SkipDir. If the current entry is a non-directory, the next file system entry in the current directory will be enumerated, exactly as if the callback returned nil. The following example callback function behaves identically to the previous one, but has less boilerplate and simpler logic.

func callback2(osPathname string, de *godirwalk.Dirent) error {
    if optSkip != "" && strings.Contains(osPathname, optSkip) {
        return godirwalk.SkipThis
    }
    // Process file like normal...
    return nil
}

Directories

Path Synopsis
examples
remove-empty-directories
Walks a file system hierarchy and removes all directories with no children.
sizes
Walks a file system hierarchy and prints sizes of file system objects, recursively printing sizes of directories.
walk-fast
Walks a file system hierarchy using this library.
walk-stdlib
Walks a file system hierarchy using the standard library.
find-fast Module
