save_to_web.archive.org

command module
v0.0.0-...-c7207ee Latest
Published: Mar 5, 2024 License: GPL-2.0 Imports: 11 Imported by: 0

README



Description

Scrapes the given website for internal links and saves the pages it finds to web.archive.org.

Installation

This guide assumes that Go is already installed. (Go installation manual)

Dependencies

Download the dependencies via go get.

Execute the following two commands:

go get -u github.com/simonfrey/proxyfy
go get -u github.com/PuerkitoBio/goquery

Download tool

Clone the Git repository:

git clone https://github.com/simonfrey/save_to_web.archive.org.git

Execution

Navigate into the directory of the cloned repository.

Run the tool with the command below, replacing http[s]://[yourwebsite.com] with the URL of the website you want to scrape and save:

go run main.go http[s]://[yourwebsite.com]

Additional command-line arguments:

-p to route the requests through a proxy

-i to also crawl internal URLs (e.g. /test/foo)

To crawl internal links as well and send the requests through a proxy, use the following command:

go run main.go -p -i http[s]://[yourwebsite.com] 
