skyhub

module
v0.0.0-...-13d4fa3
Published: Feb 23, 2021 License: AGPL-3.0

Skyhub

Skyhub provides a way of accessing the libgen scimag / sci-hub torrent archive on a one-off basis. It stands up what looks like a local copy of sci-hub. When you visit a page, it downloads the relevant article from the torrents and then serves it up.

Library Documentation

Installation

Quick Start
# make sure go v1.14+ is installed and $GOPATH/bin is in your PATH
cd /tmp
GO111MODULE=on go get -v github.com/frrad/skyhub/cmd/skyhub@master
mkdir -p "$HOME/skyhub"
# printf interprets \n portably; plain echo does not in bash
printf 'DOI,ID\n10.7554/elife.32822,70494267\n' > "$HOME/skyhub/index.csv"
skyhub &
LINK="http://localhost:5000/by-doi/10.7554/elife.32822"
open "$LINK" || xdg-open "$LINK"


Slow Start

Skyhub expects to find a file called index.csv in your ~/skyhub directory, containing a list of DOI,scihub_id pairs ordered by DOI. The full uncompressed index weighs in at ~3 GB, but any subset of it should work. For instance, the one-line index we created in the Quick Start:

DOI,ID
10.7554/elife.32822,70494267

works just fine if you only want to be able to access this DOI.
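
Because the index is ordered by DOI, a lookup can be done with a binary search. The following is a minimal sketch of that idea, not skyhub's actual code: the names entry, parseIndex, and lookup are hypothetical, the sketch assumes the file is sorted in plain byte order, and it parses from an in-memory string, which would not be practical for the full ~3 GB index.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strings"
)

// entry is one row of index.csv: a DOI and its sci-hub ID.
// (Hypothetical type, introduced for this sketch only.)
type entry struct {
	DOI string
	ID  string
}

// parseIndex reads "DOI,ID" lines, skipping blank lines and the header.
// It splits on the last comma, so a DOI containing commas still parses.
func parseIndex(data string) ([]entry, error) {
	var entries []entry
	sc := bufio.NewScanner(strings.NewReader(data))
	for lineNo := 1; sc.Scan(); lineNo++ {
		line := strings.TrimSpace(sc.Text())
		if line == "" || (lineNo == 1 && line == "DOI,ID") {
			continue
		}
		i := strings.LastIndexByte(line, ',')
		if i < 0 {
			return nil, fmt.Errorf("line %d: expected DOI,ID, got %q", lineNo, line)
		}
		entries = append(entries, entry{DOI: line[:i], ID: line[i+1:]})
	}
	return entries, sc.Err()
}

// lookup binary-searches the sorted index for an exact DOI match.
// Assumption: entries are sorted by DOI in byte order.
func lookup(idx []entry, doi string) (string, bool) {
	i := sort.Search(len(idx), func(i int) bool { return idx[i].DOI >= doi })
	if i < len(idx) && idx[i].DOI == doi {
		return idx[i].ID, true
	}
	return "", false
}

func main() {
	idx, err := parseIndex("DOI,ID\n10.7554/elife.32822,70494267\n")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	id, ok := lookup(idx, "10.7554/elife.32822")
	fmt.Println(id, ok) // prints: 70494267 true
}
```

If the index is not sorted, a binary search like this one can miss entries that are present, which is presumably why the ordering requirement exists.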

To build a more complete index, see this repo, which contains a Makefile that builds a complete index from the libgen database dump.

By default, skyhub will attempt to use its bundled torrents.toml file to load torrents by their infohash. However, if you have the actual .torrent files, loading new torrents can be much faster. Skyhub will prefer any .torrent files it finds in your ~/skyhub/torrentfiles directory. You can download all available torrents by navigating to ~/skyhub/torrentfiles and running make.
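
The "prefer a local .torrent file, otherwise fall back to the infohash" rule can be sketched as follows. This is only an illustration under stated assumptions: the function torrentSource is hypothetical, the "<infohash>.torrent" file naming scheme is a guess and may not match what skyhub expects, and the infohash in main is a placeholder.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// torrentSource returns the path of a local .torrent file for the given
// infohash if one exists in dir; otherwise it returns a magnet URI built
// from the infohash. The second result reports whether a file was found.
// Assumption: files are named "<infohash>.torrent" (hypothetical scheme).
func torrentSource(dir, infohash string) (string, bool) {
	p := filepath.Join(dir, infohash+".torrent")
	if info, err := os.Stat(p); err == nil && !info.IsDir() {
		return p, true
	}
	return "magnet:?xt=urn:btih:" + infohash, false
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Println("cannot determine home directory:", err)
		return
	}
	dir := filepath.Join(home, "skyhub", "torrentfiles")
	// Placeholder infohash, not a real torrent.
	src, fromFile := torrentSource(dir, "0123456789abcdef0123456789abcdef01234567")
	fmt.Println(src, fromFile)
}
```

A .torrent file already contains the piece hashes and file layout, whereas a bare infohash requires fetching that metadata from peers first, which is the likely reason the file-based path is faster.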

You can find information about skyhub while it's running by visiting the /status endpoint.
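
As a small sketch, the status page can also be fetched programmatically. The base URL http://localhost:5000 is taken from the Quick Start; the response format is not documented here, so the body is printed as-is. The helper names statusURL and fetchStatus are hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
	"time"
)

// statusURL builds the /status URL from a base URL, tolerating a
// trailing slash on the base.
func statusURL(base string) string {
	return strings.TrimRight(base, "/") + "/status"
}

// fetchStatus requests the /status endpoint and returns the raw body.
func fetchStatus(base string) (string, error) {
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Get(statusURL(base))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected response: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	status, err := fetchStatus("http://localhost:5000")
	if err != nil {
		fmt.Println("skyhub not reachable:", err)
		return
	}
	fmt.Println(status)
}
```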

How it works

You can do random access inside zip files if you have some metadata. So, in order to retrieve a paper, you only need to download two 16 MB "chunks" from the torrent. The first contains the metadata for the zip file you want. The offset information in that first chunk identifies the second chunk, which contains the actual paper.
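
The random-access property of zip files can be demonstrated with Go's standard archive/zip package, which reads through an io.ReaderAt and therefore only touches the byte ranges it needs. The sketch below is not skyhub's code: it builds a throwaway archive in memory, wraps it in a byte-counting reader (the hypothetical rangeLogger type), and extracts a single member to show that only a small fraction of the archive is read. The file names and sizes are made up for the demonstration.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// rangeLogger wraps an io.ReaderAt and counts the bytes actually read.
type rangeLogger struct {
	r         io.ReaderAt
	bytesRead int64
}

func (l *rangeLogger) ReadAt(p []byte, off int64) (int, error) {
	n, err := l.r.ReadAt(p, off)
	l.bytesRead += int64(n)
	return n, err
}

// buildArchive creates an in-memory zip holding n uncompressed files of
// size bytes each; file i is filled with the byte value i.
func buildArchive(n, size int) ([]byte, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	for i := 0; i < n; i++ {
		hdr := &zip.FileHeader{Name: fmt.Sprintf("paper-%03d.pdf", i), Method: zip.Store}
		fw, err := w.CreateHeader(hdr)
		if err != nil {
			return nil, err
		}
		if _, err := fw.Write(bytes.Repeat([]byte{byte(i)}, size)); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// readOne extracts a single member and reports how many bytes of the
// archive were read to do so (directory metadata plus the member itself).
func readOne(archive []byte, name string) ([]byte, int64, error) {
	src := &rangeLogger{r: bytes.NewReader(archive)}
	zr, err := zip.NewReader(src, int64(len(archive)))
	if err != nil {
		return nil, 0, err
	}
	for _, f := range zr.File {
		if f.Name != name {
			continue
		}
		rc, err := f.Open()
		if err != nil {
			return nil, src.bytesRead, err
		}
		defer rc.Close()
		data, err := io.ReadAll(rc)
		return data, src.bytesRead, err
	}
	return nil, src.bytesRead, fmt.Errorf("%s not found", name)
}

func main() {
	archive, err := buildArchive(100, 64<<10)
	if err != nil {
		fmt.Println("build failed:", err)
		return
	}
	data, fetched, err := readOne(archive, "paper-042.pdf")
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Printf("archive: %d bytes, member: %d bytes, bytes read: %d\n",
		len(archive), len(data), fetched)
}
```

The two kinds of reads here correspond to the two chunks described above: one read near the end of the archive for the directory metadata, and one at the offset that metadata points to for the member's contents.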

Directories

Path Synopsis
cmd
lib
rat
