tumblr-scraper

command module

v0.0.0-...-5085e7c Published: Mar 9, 2021 License: MIT Imports: 4 Imported by: 0

This project was created as a black-box, from-scratch reimplementation of Liru/tumblr-downloader that recreates and improves upon its features.

Documentation (until now this has been strictly a private project)
Crawling more than 5000 posts per day will lead to rate limiting
Continuing a previously failed crawl/scrape is not supported
Setting the before field in the config lets you scrape backwards, starting at a date in the past.
That way you can manually and iteratively scrape a huge blog in "sane" chunks (e.g. first everything before 2014, then 2015, 2016, ...).
Support for youtube-dl would be nice

Path	Synopsis
account
app
config
cookiejar	Package cookiejar implements an in-memory RFC 6265-compliant http.CookieJar.
database
scraper
semaphore