gofindfiles

command module
v0.0.0-...-1245f13
Published: Jun 5, 2023 License: MIT Imports: 20 Imported by: 0

README

GoFindFiles

Uses Go Colly to crawl websites, attempting to find files with matching file types. Intended for use as an OSINT or recon intelligence-collection tool.
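Colly drives the crawl itself; the core matching step, deciding whether a discovered link ends in one of the target extensions, can be sketched in plain Go. The function name below is illustrative, not taken from the tool's source:

```go
package main

import (
	"fmt"
	"strings"
)

// hasTargetExt reports whether a discovered link ends in one of the
// target extensions, case-insensitively. Query strings and fragments
// are stripped before checking.
func hasTargetExt(link string, exts []string) bool {
	if i := strings.IndexAny(link, "?#"); i >= 0 {
		link = link[:i]
	}
	link = strings.ToLower(link)
	for _, ext := range exts {
		if strings.HasSuffix(link, strings.ToLower(ext)) {
			return true
		}
	}
	return false
}

func main() {
	exts := []string{".pdf", ".jpg"}
	fmt.Println(hasTargetExt("https://example.com/docs/Report.PDF?download=1", exts)) // true
	fmt.Println(hasTargetExt("https://example.com/index.html", exts))                 // false
}
```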

Installation

Build from source (Go required)

To install or build the gofindfiles binary from source, you need Go installed on your system (https://go.dev/doc/install). Once Go is installed, you can either clone and run from source or download and install with the following command:

go install github.com/bradsec/gofindfiles@latest

Basic Usage

# With a URL only, defaults to searching for document file types
gofindfiles --url https://thisurlexamplesite.com

# Specify individual file types
gofindfiles --url https://thisurlexamplesite.com --filetypes ".pdf,.jpg"
Predefined File Type Groups

Use with the --filetypes flag.
Example using more than one group: --filetypes "images,documents"

	var fileGroups = map[string][]string{
		"images":    {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".svg", ".tiff", ".ico", ".heif", ".indd"},
		"movies":    {".mov", ".avi", ".mp4", ".webp", ".mkv", ".flv", ".wmv", ".m4v", ".3gp", ".mpg", ".mpeg"},
		"audio":     {".wav", ".mp3", ".aac", ".flac", ".m4a", ".ogg", ".wma"},
		"archives":  {".zip", ".tar", ".tar.gz", ".tar.gz2", ".rar", ".7z", ".bz2", ".jar", ".iso"},
		"documents": {".doc", ".docx", ".pdf", ".xls", ".xlsx", ".txt", ".csv", ".ppt", ".pptx", ".odt", ".ods", ".odp", ".rtf", ".tex"},
		"configs":   {".json", ".yaml", ".yml", ".xml", ".ini", ".conf", ".cfg", ".toml"},
		"logs":      {".log", ".out", ".err", ".syslog", ".event"},
		"databases": {".db", ".sql", ".dbf", ".mdb", ".accdb", ".sqlite", ".sqlite3", ".csv", ".tsv", ".json"},
	}
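As a sketch of how a mixed --filetypes value such as "images,.pdf" might be expanded into a flat extension list, assuming group names and literal extensions can be combined and duplicates dropped. The abbreviated map and function name below are illustrative, not the tool's actual source:

```go
package main

import (
	"fmt"
	"strings"
)

// fileGroups mirrors the predefined groups above (abbreviated here).
var fileGroups = map[string][]string{
	"images":    {".jpg", ".jpeg", ".png", ".gif"},
	"documents": {".doc", ".docx", ".pdf", ".txt"},
}

// expandFileTypes turns a --filetypes value such as "images,.pdf" into a
// flat, de-duplicated list of extensions. Group names expand to their
// members; anything else is treated as a literal extension.
func expandFileTypes(value string) []string {
	seen := make(map[string]bool)
	var exts []string
	for _, item := range strings.Split(value, ",") {
		item = strings.ToLower(strings.TrimSpace(item))
		if item == "" {
			continue
		}
		members, ok := fileGroups[item]
		if !ok {
			members = []string{item}
		}
		for _, ext := range members {
			if !seen[ext] {
				seen[ext] = true
				exts = append(exts, ext)
			}
		}
	}
	return exts
}

func main() {
	fmt.Println(expandFileTypes("images,.pdf")) // [.jpg .jpeg .png .gif .pdf]
}
```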

Full Usage Options

  -depth int
    	The maximum depth to follow links (default 10)
  -external
    	Enable or disable downloading files from external domains (default true)
  -filetext string
    	The text to be present in the filename (optional)
  -filetypes string
    	Comma-separated list of file extensions to download (default "documents")
  -timeout int
    	The maximum time the crawling will run (default 10)
  -url string
    	The target URL to search including http:// or https://
  -useragent string
    	The User-Agent string to use (default "random")
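The options above map directly onto Go's standard flag package. Below is a minimal sketch declaring the same names and defaults on a FlagSet; the variable and function names are illustrative, not taken from the tool's source:

```go
package main

import (
	"flag"
	"fmt"
)

// newFlags declares the documented options so the defaults match the
// usage output above. Illustrative only, not the tool's actual source.
func newFlags() *flag.FlagSet {
	fs := flag.NewFlagSet("gofindfiles", flag.ContinueOnError)
	fs.Int("depth", 10, "The maximum depth to follow links")
	fs.Bool("external", true, "Enable or disable downloading files from external domains")
	fs.String("filetext", "", "The text to be present in the filename (optional)")
	fs.String("filetypes", "documents", "Comma-separated list of file extensions to download")
	fs.Int("timeout", 10, "The maximum time the crawling will run")
	fs.String("url", "", "The target URL to search including http:// or https://")
	fs.String("useragent", "random", "The User-Agent string to use")
	return fs
}

func main() {
	fs := newFlags()
	_ = fs.Parse([]string{"--url", "https://thisurlexamplesite.com", "--filetypes", ".pdf,.jpg"})
	// Prints: https://thisurlexamplesite.com .pdf,.jpg
	fmt.Println(fs.Lookup("url").Value.String(), fs.Lookup("filetypes").Value.String())
}
```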

Other Notes

Logs

A list of the crawled/visited URLs is stored in a text file, crawled.txt, in the logs sub-directory of the target URL's output directory.

File Naming

Because files are saved in the root directory created for the target URL, a SHA-256 hash of each file's full source URL is added to its filename. This prevents collisions when different files share the same name in different locations. You can match the hash against logs/crawled.txt to find where a file came from.

If files come from an external domain/URL, a sub-directory named for that external domain is created within the main target URL directory to hold the files from that site. You can disable downloading from external links with the --external false flag.
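The README does not show the exact naming format, but the idea of adding a hash of the full source URL to the filename can be sketched with the standard library. The underscore-separated layout and 12-character digest length below are assumptions for illustration:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path"
	"strings"
)

// hashedName builds a local filename from a source URL by inserting a
// short SHA-256 digest of the full URL before the extension, so files
// with the same base name from different locations do not collide.
func hashedName(sourceURL string) string {
	sum := sha256.Sum256([]byte(sourceURL))
	digest := hex.EncodeToString(sum[:])[:12] // first 12 hex chars, assumed length
	base := path.Base(sourceURL)
	ext := path.Ext(base)
	name := strings.TrimSuffix(base, ext)
	return fmt.Sprintf("%s_%s%s", name, digest, ext)
}

func main() {
	a := hashedName("https://example.com/a/report.pdf")
	b := hashedName("https://example.com/b/report.pdf")
	fmt.Println(a)
	fmt.Println(a != b) // different source URLs yield different local names
}
```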

Limitations

  • Will not work for some dynamic content sites.
  • Will not work for sites with reCAPTCHA/CAPTCHA type protection.

Documentation

There is no documentation for this package.
