biorxivcmd

package
v0.0.0-...-6256cc4 Latest
Warning

This package is not in the latest version of its module.

Published: Mar 4, 2024 License: Apache-2.0 Imports: 22 Imported by: 1

Documentation

Overview

Package biorxivcmd provides support for building command-line tools that access api.biorxiv.org.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func OptionsForEndpoint

func OptionsForEndpoint(cfg apicrawlcmd.Crawl[Service]) ([]operations.Option, error)

OptionsForEndpoint returns the operations.Option values derived from the apicrawlcmd configuration.

Types

type Command

type Command struct {
	// contains filtered or unexported fields
}

Command implements the command-line operations available for api.biorxiv.org.

func NewCommand

func NewCommand(_ context.Context, crawl apicrawlcmd.Crawl[yaml.Node], cfs operations.FS, cacheRoot string, chkp checkpoint.Operation) (*Command, error)

NewCommand returns a new Command instance for the specified API crawl.

func (*Command) Crawl

func (c *Command) Crawl(ctx context.Context, flags CrawlFlags) error

Crawl implements the crawl command. The crawl is incremental: an internal state file tracks progress, so a subsequent crawl restarts from the point the previous one reached. This makes it possible to specify a start date that predates the creation of biorxiv and an end date of 'now', with each incremental crawl picking up where the previous one left off. This assumes that biorxiv does not add new preprints with dates earlier than the current one.
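The incremental behavior described for Crawl can be illustrated with a minimal sketch. It is not the package's implementation: the checkpoint type, nextWindow, and mustDay are hypothetical names, and the real state file format is not shown in this documentation. The sketch only shows how a saved date can narrow the next crawl window, and how a restart flag (as in CrawlFlags.Restart) can ignore it.

```go
package main

import (
	"fmt"
	"time"
)

const layout = "2006-01-02"

// checkpoint is a hypothetical stand-in for the internal state file.
// Last is the end date of the most recent completed crawl (zero if none).
type checkpoint struct {
	Last time.Time
}

// nextWindow returns the date range the next crawl should cover.
// Unless restart is set, the window begins at the saved checkpoint
// when that is later than the configured start date. ok is false
// when there is nothing left to crawl.
func nextWindow(cp checkpoint, start, end time.Time, restart bool) (from, to time.Time, ok bool) {
	from, to = start, end
	if !restart && cp.Last.After(start) {
		from = cp.Last
	}
	return from, to, from.Before(to)
}

// mustDay parses a YYYY-MM-DD date and panics on malformed input.
func mustDay(s string) time.Time {
	t, err := time.Parse(layout, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	start := mustDay("2010-01-01") // may predate the service itself
	cp := checkpoint{}

	// Two successive crawls; each "now" is later than the last.
	for _, now := range []string{"2024-03-04", "2024-04-01"} {
		from, to, ok := nextWindow(cp, start, mustDay(now), false)
		fmt.Println(from.Format(layout), to.Format(layout), ok)
		if ok {
			cp.Last = to // record progress for the next crawl
		}
	}

	// A restart ignores the saved checkpoint.
	from, _, _ := nextWindow(cp, start, mustDay("2024-04-01"), true)
	fmt.Println("restart from", from.Format(layout))
}
```

In this sketch the second crawl begins where the first ended, which is only safe under the same assumption stated for Crawl: no preprints appear later with dates earlier than the checkpoint.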

func (*Command) LookupDownloaded

func (c *Command) LookupDownloaded(ctx context.Context, fv *LookupFlags, dois ...string) error

LookupDownloaded looks up the specified preprints by their 'PreprintDOI' and prints their fields using the specified template.

func (*Command) ScanDownloaded

func (c *Command) ScanDownloaded(ctx context.Context, fv *ScanFlags) error

ScanDownloaded scans the downloaded preprints and prints their fields using the specified template.
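The template-driven printing used by LookupDownloaded and ScanDownloaded can be sketched as follows, assuming the template flag is evaluated with Go's text/template (the default value '{{.}}' suggests this, but the documentation does not confirm it). The preprint type, its Title field, the render helper, and the DOI value are hypothetical placeholders; only the PreprintDOI field name comes from the text above.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// preprint is a hypothetical, reduced stand-in for a downloaded
// Preprint object; the real type has more fields.
type preprint struct {
	PreprintDOI string
	Title       string
}

// render applies the template text to a single preprint and
// returns the formatted result.
func render(tplText string, p preprint) (string, error) {
	tpl, err := template.New("preprint").Parse(tplText)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tpl.Execute(&sb, p); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// Placeholder values for illustration only.
	p := preprint{PreprintDOI: "10.1101/000000", Title: "An example preprint"}

	// '{{.}}' prints the whole value; named fields select parts of it.
	for _, t := range []string{"{{.}}", "{{.PreprintDOI}}: {{.Title}}"} {
		out, err := render(t, p)
		if err != nil {
			fmt.Println("template error:", err)
			continue
		}
		fmt.Println(out)
	}
}
```

Under that assumption, a template that names a field missing from the object fails when the template is executed, so errors from both parsing and execution need to be reported to the user.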

type CrawlFlags

type CrawlFlags struct {
	Restart bool `subcmd:"restart,false,'restart the crawl, ignoring the saved checkpoint'"`
}

type GetFlags

type GetFlags struct{}

type IndexFlags

type IndexFlags struct{}

type LookupFlags

type LookupFlags struct {
	Template string `subcmd:"template,'{{.}}',template to use for printing fields in the downloaded Preprint objects"`
}

type ScanFlags

type ScanFlags struct {
	Template string `` /* 126-byte string literal not displayed */
}

type Service

type Service struct {
	ServiceURL string           `yaml:"service_url" cmd:"rxiv service URL, eg. https://api.biorxiv.org/pubs/biorxiv for biorxiv"`
	StartDate  cmdyaml.FlexTime `yaml:"start_date" cmd:"start date for crawl, eg. 2020-01-01"`
	EndDate    cmdyaml.FlexTime `yaml:"end_date" cmd:"end date for crawl, eg. 2020-12-01"`
}

Service represents biorxiv-specific configuration parameters.
