Web Crawler in Go

This is a simple web crawler implemented in Go. It takes a seed URL and a maximum crawl depth as input, then crawls the web starting from the seed URL, following links up to the specified depth.
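The project's source is not reproduced here, but the core of such a crawler usually boils down to a depth-limited fetch-parse-follow loop. The sketch below is an illustrative approximation built on the golang.org/x/net/html dependency listed further down; the function names (crawl, extractLinks) and the hardcoded seed values are made up for the example and are not taken from this project:

package main

import (
	"fmt"
	"net/http"

	"golang.org/x/net/html"
)

func main() {
	// In the real tool the seed URL and depth come from the -seed and
	// -depth flags described under Usage; they are hardcoded here only
	// to keep the sketch self-contained.
	visited := make(map[string]bool)
	crawl("https://www.wired.com", 3, visited)
}

// crawl prints a URL, fetches the page, and follows the links it finds,
// stopping when the depth budget is used up or a URL was seen before.
func crawl(url string, depth int, visited map[string]bool) {
	if depth <= 0 || visited[url] {
		return
	}
	visited[url] = true
	fmt.Println(url)

	resp, err := http.Get(url)
	if err != nil {
		return
	}
	defer resp.Body.Close()

	doc, err := html.Parse(resp.Body)
	if err != nil {
		return
	}
	for _, link := range extractLinks(doc) {
		// A real crawler would also resolve relative links against the
		// current page's URL before following them.
		crawl(link, depth-1, visited)
	}
}

// extractLinks walks the parsed HTML tree and collects href attributes
// from <a> elements.
func extractLinks(n *html.Node) []string {
	var links []string
	if n.Type == html.ElementNode && n.Data == "a" {
		for _, attr := range n.Attr {
			if attr.Key == "href" {
				links = append(links, attr.Val)
			}
		}
	}
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		links = append(links, extractLinks(c)...)
	}
	return links
}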

Usage

To use the web crawler, first install Go on your machine and set up your Go workspace.

Then, clone this repository and navigate to the root directory of the project:

$ git clone https://github.com/ericsomto/wbcrl
$ cd wbcrl

Next, build the program using the following command:

$ go build

This will create an executable file called wbcrl. You can then run the web crawler using the following command:

$ ./wbcrl -seed https://www.wired.com -depth 3

This will start the web crawler at the seed URL https://www.wired.com and crawl up to a depth of 3. You can adjust the seed URL and maximum depth to suit your needs.
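The README does not show the flag-handling code, but in Go a -seed string flag and a -depth integer flag are most naturally declared with the standard flag package. The snippet below is an assumption about how that might look; the default values are placeholders, not the tool's actual defaults:

package main

import "flag"

var (
	seed  = flag.String("seed", "https://www.wired.com", "URL to start crawling from")
	depth = flag.Int("depth", 3, "maximum link depth to follow")
)

func main() {
	flag.Parse()
	// The crawl would start here, e.g. crawl(*seed, *depth, ...).
}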

Output

The web crawler will output the URLs of the pages it has visited, one URL per line. You can redirect the output to a file if you want to save the results:

$ ./wbcrl -seed https://www.wired.com -depth 3 > results.txt

Dependencies

This web crawler uses the following third-party packages:

golang.org/x/net/html for parsing HTML pages
golang.org/x/net/publicsuffix for determining the public suffix of a host
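The README only says that publicsuffix is used to determine a host's public suffix; how it fits into the crawl is not shown. One plausible use, sketched below purely as an assumption, is keeping the crawl on the same registrable domain (eTLD+1) as the seed; the sameSite helper is invented for this example:

package main

import (
	"fmt"
	"log"
	"net/url"

	"golang.org/x/net/publicsuffix"
)

// sameSite reports whether two URLs share the same registrable domain
// (eTLD+1), e.g. "www.wired.com" and "media.wired.com" both map to "wired.com".
func sameSite(a, b string) (bool, error) {
	ua, err := url.Parse(a)
	if err != nil {
		return false, err
	}
	ub, err := url.Parse(b)
	if err != nil {
		return false, err
	}
	da, err := publicsuffix.EffectiveTLDPlusOne(ua.Hostname())
	if err != nil {
		return false, err
	}
	db, err := publicsuffix.EffectiveTLDPlusOne(ub.Hostname())
	if err != nil {
		return false, err
	}
	return da == db, nil
}

func main() {
	ok, err := sameSite("https://www.wired.com/story/x", "https://media.wired.com/img.jpg")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(ok) // true: both resolve to wired.com
}

Under this assumption, links that resolve to a different registrable domain than the seed would simply be skipped.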

Contribute

If you want to contribute to this project, please fork the repository and submit a pull request. Any contributions, whether they are bug fixes or new features, are welcome.
