gofindfiles

command module
v0.0.0-...-1245f13
Published: Jun 5, 2023 License: MIT Imports: 20 Imported by: 0

README

GoFindFiles

Uses Go Colly to crawl websites, attempting to find files with matching file types. Intended for use as an OSINT or recon intelligence-collection tool.
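Colly drives the crawl itself; the core matching step, deciding whether a discovered link ends in one of the target extensions, can be sketched in plain Go. The function name below is illustrative, not taken from the tool's source:

```go
package main

import (
	"fmt"
	"strings"
)

// hasTargetExt reports whether a discovered link ends in one of the
// target extensions, case-insensitively. Query strings and fragments
// are stripped before checking.
func hasTargetExt(link string, exts []string) bool {
	if i := strings.IndexAny(link, "?#"); i >= 0 {
		link = link[:i]
	}
	link = strings.ToLower(link)
	for _, ext := range exts {
		if strings.HasSuffix(link, strings.ToLower(ext)) {
			return true
		}
	}
	return false
}

func main() {
	exts := []string{".pdf", ".jpg"}
	fmt.Println(hasTargetExt("https://example.com/docs/Report.PDF?download=1", exts)) // true
	fmt.Println(hasTargetExt("https://example.com/index.html", exts))                 // false
}
```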

Installation

Build from source (Go required)

To install or build the gofindfiles binary from source, you need Go installed on your system (https://go.dev/doc/install). Once Go is installed, you can either clone and run from source or download and install with the following command:

go install github.com/bradsec/gofindfiles@latest

Basic Usage

# With a URL only, defaults to searching for document file types
gofindfiles --url https://thisurlexamplesite.com

# Specify individual file types
gofindfiles --url https://thisurlexamplesite.com --filetypes ".pdf,.jpg"
Predefined File Type Groups

Use with the --filetypes flag.
Example using more than one group: --filetypes "images,documents"

	var fileGroups = map[string][]string{
		"images":    {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".svg", ".tiff", ".ico", ".heif", ".indd"},
		"movies":    {".mov", ".avi", ".mp4", ".webp", ".mkv", ".flv", ".wmv", ".m4v", ".3gp", ".mpg", ".mpeg"},
		"audio":     {".wav", ".mp3", ".aac", ".flac", ".m4a", ".ogg", ".wma"},
		"archives":  {".zip", ".tar", ".tar.gz", ".tar.gz2", ".rar", ".7z", ".bz2", ".jar", ".iso"},
		"documents": {".doc", ".docx", ".pdf", ".xls", ".xlsx", ".txt", ".csv", ".ppt", ".pptx", ".odt", ".ods", ".odp", ".rtf", ".tex"},
		"configs":   {".json", ".yaml", ".yml", ".xml", ".ini", ".conf", ".cfg", ".toml"},
		"logs":      {".log", ".out", ".err", ".syslog", ".event"},
		"databases": {".db", ".sql", ".dbf", ".mdb", ".accdb", ".sqlite", ".sqlite3", ".csv", ".tsv", ".json"},
	}
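As a sketch of how a mixed --filetypes value such as "images,.pdf" might be expanded into a flat extension list, assuming group names and literal extensions can be combined and duplicates dropped. The abbreviated map and function name below are illustrative, not the tool's actual source:

```go
package main

import (
	"fmt"
	"strings"
)

// fileGroups mirrors the predefined groups above (abbreviated here).
var fileGroups = map[string][]string{
	"images":    {".jpg", ".jpeg", ".png", ".gif"},
	"documents": {".doc", ".docx", ".pdf", ".txt"},
}

// expandFileTypes turns a --filetypes value such as "images,.pdf" into a
// flat, de-duplicated list of extensions. Group names expand to their
// members; anything else is treated as a literal extension.
func expandFileTypes(value string) []string {
	seen := make(map[string]bool)
	var exts []string
	for _, item := range strings.Split(value, ",") {
		item = strings.ToLower(strings.TrimSpace(item))
		if item == "" {
			continue
		}
		members, ok := fileGroups[item]
		if !ok {
			members = []string{item}
		}
		for _, ext := range members {
			if !seen[ext] {
				seen[ext] = true
				exts = append(exts, ext)
			}
		}
	}
	return exts
}

func main() {
	fmt.Println(expandFileTypes("images,.pdf")) // [.jpg .jpeg .png .gif .pdf]
}
```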

Full Usage Options

  -depth int
    	The maximum depth to follow links (default 10)
  -external
    	Enable or disable downloading files from external domains (default true)
  -filetext string
    	The text to be present in the filename (optional)
  -filetypes string
    	Comma-separated list of file extensions to download (default "documents")
  -timeout int
    	The maximum time the crawling will run (default 10)
  -url string
    	The target URL to search including http:// or https://
  -useragent string
    	The User-Agent string to use (default "random")
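The options above map directly onto Go's standard flag package. Below is a minimal sketch declaring the same names and defaults on a FlagSet; the variable and function names are illustrative, not taken from the tool's source:

```go
package main

import (
	"flag"
	"fmt"
)

// newFlags declares the documented options so the defaults match the
// usage output above. Illustrative only, not the tool's actual source.
func newFlags() *flag.FlagSet {
	fs := flag.NewFlagSet("gofindfiles", flag.ContinueOnError)
	fs.Int("depth", 10, "The maximum depth to follow links")
	fs.Bool("external", true, "Enable or disable downloading files from external domains")
	fs.String("filetext", "", "The text to be present in the filename (optional)")
	fs.String("filetypes", "documents", "Comma-separated list of file extensions to download")
	fs.Int("timeout", 10, "The maximum time the crawling will run")
	fs.String("url", "", "The target URL to search including http:// or https://")
	fs.String("useragent", "random", "The User-Agent string to use")
	return fs
}

func main() {
	fs := newFlags()
	_ = fs.Parse([]string{"--url", "https://thisurlexamplesite.com", "--filetypes", ".pdf,.jpg"})
	// Prints: https://thisurlexamplesite.com .pdf,.jpg
	fmt.Println(fs.Lookup("url").Value.String(), fs.Lookup("filetypes").Value.String())
}
```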

Other Notes

Logs

A list of the crawled/visited URLs is stored in a text file, crawled.txt, in the logs sub-directory of the target URL's output directory.

File Naming

Because files are saved in the root directory created for the target URL, a SHA-256 hash of each file's full source URL is added to its filename. This prevents collisions when different files share the same name in different locations. You can match the hash against logs/crawled.txt to find where a file came from.

If files come from an external domain/URL, a sub-directory named for that external domain is created within the main target URL directory to hold the files from that site. You can disable downloading from external links with the --external false flag.
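The README does not show the exact naming format, but the idea of adding a hash of the full source URL to the filename can be sketched with the standard library. The underscore-separated layout and 12-character digest length below are assumptions for illustration:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"path"
	"strings"
)

// hashedName builds a local filename from a source URL by inserting a
// short SHA-256 digest of the full URL before the extension, so files
// with the same base name from different locations do not collide.
func hashedName(sourceURL string) string {
	sum := sha256.Sum256([]byte(sourceURL))
	digest := hex.EncodeToString(sum[:])[:12] // first 12 hex chars, assumed length
	base := path.Base(sourceURL)
	ext := path.Ext(base)
	name := strings.TrimSuffix(base, ext)
	return fmt.Sprintf("%s_%s%s", name, digest, ext)
}

func main() {
	a := hashedName("https://example.com/a/report.pdf")
	b := hashedName("https://example.com/b/report.pdf")
	fmt.Println(a)
	fmt.Println(a != b) // different source URLs yield different local names
}
```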

Limitations

  • Will not work for some dynamic content sites.
  • Will not work for sites with reCAPTCHA/CAPTCHA type protection.

Documentation

There is no documentation for this package.
