site-url-checker

Version v0.3.0, published Dec 15, 2023. License: MIT.

Repository

github.com/JekRock/site-url-checker

README

Site URL Checker

A simple tool to check the HTTP status of a list of URLs.

Getting started

Prerequisites and Main Dependencies

Go (1.20+)
Make

Installation

To compile the binary, run the following command:

make build

If you need to compile for Linux, run:

make build-linux

Usage

$ ./site-url-checker -h
Usage of ./site-url-checker:
  -numWorkers int
     number of parallel workers to make requests (default 1)
  -output string
     path to output CSV file. If file exists, the content will be overridden (default "output.csv")
  -randomUserAgent
     If set to 'true' every request will have random user agent and 'userAgentString' flag will be ignored
  -robotsTxt string
     path to robots.txt. Either URL or filesystem path. If set, the script will check if the URL is allowed to be crawled
  -robotsTxtUserAgent string
     user agent string used to validate robots.txt (default "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/109.0")
  -urls string
     path to file with URLs to check (default "urls.txt")
  -userAgent string
     user agent string sent with every request (default "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/109.0")

$ ./site-url-checker -urls=input.csv -output=output.csv -numWorkers=20 -userAgent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:109.0) Gecko/20100101 Firefox/109.0"
Starting at Thursday, 31-Aug-23 11:07:22 UTC
 100% |███████████████████████████████████████████████████████████████████████████████| (156/156, 19 it/s)
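The -numWorkers flag suggests a worker-pool design, where a fixed number of goroutines pull URLs from a shared queue. The following is a minimal sketch of that pattern, not the tool's actual implementation: the names Result, CheckAll, and httpStatus are hypothetical, a status of 0 is an assumed convention for a failed request, and a local test server stands in for real sites so the example runs offline.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// Result pairs a URL with the HTTP status code observed for it.
// A Status of 0 means the request itself failed (assumed convention).
type Result struct {
	URL    string
	Status int
}

// CheckAll fans urls out to numWorkers goroutines and returns one Result per
// URL, in input order. check is injected so the pool can be exercised offline.
func CheckAll(urls []string, numWorkers int, check func(string) int) []Result {
	if numWorkers < 1 {
		numWorkers = 1
	}
	results := make([]Result, len(urls))
	jobs := make(chan int) // indexes into urls

	var wg sync.WaitGroup
	for w := 0; w < numWorkers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is handled by exactly one worker, so writing
				// to results[i] needs no extra locking.
				results[i] = Result{URL: urls[i], Status: check(urls[i])}
			}
		}()
	}
	for i := range urls {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results
}

// httpStatus issues a GET request and returns the status code, or 0 on error.
func httpStatus(url string) int {
	resp, err := http.Get(url)
	if err != nil {
		return 0
	}
	defer resp.Body.Close()
	return resp.StatusCode
}

func main() {
	// A local test server stands in for real sites so the sketch runs offline.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == "/missing" {
			http.NotFound(w, r)
			return
		}
		fmt.Fprintln(w, "ok")
	}))
	defer srv.Close()

	urls := []string{srv.URL + "/", srv.URL + "/missing"}
	for _, r := range CheckAll(urls, 2, httpStatus) {
		fmt.Println(r.Status, r.URL)
	}
}
```

Passing the check function as a parameter keeps the pool logic separate from the HTTP call, so a custom user agent or a robots.txt lookup could be added in one place without touching the concurrency code.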

The file passed to -urls (input.csv in the example above) lists one URL per line:

https://www.google.com
https://www.facebook.com
https://www.twitter.com
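Reading an input file in this format could look roughly like the sketch below. It is an illustration only: ParseURLList is a hypothetical helper, and skipping blank lines and rejecting anything that is not an absolute http(s) URL are assumptions, not documented behavior of the tool.

```go
package main

import (
	"bufio"
	"fmt"
	"net/url"
	"strings"
)

// ParseURLList reads one URL per line, skipping blank lines, and reports
// the first line that is not an absolute http(s) URL.
func ParseURLList(text string) ([]string, error) {
	var urls []string
	sc := bufio.NewScanner(strings.NewReader(text))
	for line := 1; sc.Scan(); line++ {
		raw := strings.TrimSpace(sc.Text())
		if raw == "" {
			continue
		}
		u, err := url.Parse(raw)
		if err != nil {
			return nil, fmt.Errorf("line %d: %w", line, err)
		}
		if (u.Scheme != "http" && u.Scheme != "https") || u.Host == "" {
			return nil, fmt.Errorf("line %d: not an absolute http(s) URL: %q", line, raw)
		}
		urls = append(urls, raw)
	}
	return urls, sc.Err()
}

func main() {
	input := "https://www.google.com\nhttps://www.facebook.com\n\nhttps://www.twitter.com\n"
	urls, err := ParseURLList(input)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, u := range urls {
		fmt.Println(u)
	}
}
```

Validating the list up front means a malformed entry is reported with its line number before any requests are sent.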

License

Distributed under the MIT license. See LICENSE for more information.

Source Files

main.go

Directories

pkg
requester
serializer
