felix

package
v0.5.0 Latest
Published: Jul 31, 2018 License: BSD-3-Clause Imports: 20 Imported by: 0

Documentation

Constants

const (
	DefaultFeedFetchInterval = 65 * time.Minute
	DefaultUserAgent         = "felix"
	DefaultPort              = 6554
	DefaultFeedOutputMaxAge  = 6 * time.Hour
	DefaultCleanupInterval   = 1 * time.Hour
	DefaultCleanupMaxAge     = 24 * time.Hour
)

Default values for the configuration.

Functions

func FeedHandler

func FeedHandler(ds Datastore, maxAge time.Duration) http.Handler

FeedHandler serves the found links (up to maxAge) as an RSS feed.

func FilterItems

func FilterItems(in <-chan Item, out chan<- Item, filters ...ItemFilter)

Open question: should FilterItems simply filter until the in channel is closed, or is a quit channel needed?

func FilterLinks

func FilterLinks(in <-chan Link, out chan<- Link, filters ...LinkFilter)

Open question: should FilterLinks simply filter until the in channel is closed, or is a quit channel needed?
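As a rough illustration of the first option raised above, the following sketch stops when the in channel is closed. It is not the package's implementation: Link, LinkFilter and LinkFilterFunc are local copies of the documented types, and filterLinks, apply, collect and httpsOnly are hypothetical names introduced only for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// Local copies of the documented types so the sketch compiles on its own.
type Link struct {
	Title string
	URL   string
}

type LinkFilter interface {
	Filter(link Link, next func(Link))
}

type LinkFilterFunc func(Link, func(Link))

func (f LinkFilterFunc) Filter(l Link, next func(Link)) { f(l, next) }

// filterLinks (hypothetical) runs every received link through the filter
// chain and returns once in has been closed and drained.
func filterLinks(in <-chan Link, out chan<- Link, filters ...LinkFilter) {
	for link := range in {
		apply(link, func(l Link) { out <- l }, filters)
	}
}

// apply calls the filters in order; sink is reached only if every filter
// called its next callback.
func apply(link Link, sink func(Link), filters []LinkFilter) {
	if len(filters) == 0 {
		sink(link)
		return
	}
	filters[0].Filter(link, func(l Link) { apply(l, sink, filters[1:]) })
}

// httpsOnly is an example filter that drops every non-HTTPS link.
var httpsOnly = LinkFilterFunc(func(l Link, next func(Link)) {
	if strings.HasPrefix(l.URL, "https://") {
		next(l)
	}
})

// collect feeds urls through filterLinks and returns the surviving URLs.
func collect(urls []string, filters ...LinkFilter) []string {
	in, out := make(chan Link), make(chan Link)
	go func() {
		filterLinks(in, out, filters...)
		close(out) // in this sketch the caller closes out after the drain
	}()
	go func() {
		for _, u := range urls {
			in <- Link{URL: u}
		}
		close(in) // closing in is the only stop signal
	}()
	var got []string
	for l := range out {
		got = append(got, l.URL)
	}
	return got
}

func main() {
	fmt.Println(collect([]string{"https://a.example/x", "http://b.example/y"}, httpsOnly))
}
```

With this design the producer controls shutdown by closing in. A separate quit channel would only be needed if the consumer must be able to stop while the producer is still sending.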

func FilterString

func FilterString(itemFilters []ItemFilter, linkFilters []LinkFilter) string

FilterString returns the concatenated string output of all passed filters that implement the Stringer interface.

func StringHandler

func StringHandler(s string) http.Handler

StringHandler simply serves the given static string.
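A handler like this can be written in a few lines with the standard library. The sketch below is an assumption about the general shape, not the package's code; stringHandler and serve are hypothetical names, and the request path is arbitrary.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// stringHandler is a hypothetical stand-in for StringHandler: it writes the
// same static string in response to every request.
func stringHandler(s string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, s)
	})
}

// serve sends a single GET request to h in memory and returns the body.
func serve(h http.Handler) string {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Body.String()
}

func main() {
	fmt.Println(serve(stringHandler("static text")))
}
```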

Types

type Attempter

type Attempter interface {
	// Next returns if and when the next attempt is scheduled
	Next(key string) (bool, time.Duration, error)
	// Inc increments the number of attempts by 1
	Inc(key string) error
}

Attempter is used by Fetcher to determine if and when the next fetch attempt should be made.

func NewAttempter added in v0.3.0

func NewAttempter(ds Datastore, next NextAttemptFunc) Attempter

NewAttempter creates a new Attempter with the given NextAttemptFunc.
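One plausible way to combine a Datastore with a NextAttemptFunc is sketched below. It is an assumption about the wiring, not the package's implementation: memStore is a hypothetical in-memory stand-in for the two Datastore methods involved, and attempter and attemptsAllowed are names made up for this example.

```go
package main

import (
	"fmt"
	"time"
)

// NextAttemptFunc mirrors the documented type.
type NextAttemptFunc func(last time.Time, attempts int) (bool, time.Duration)

// memStore is a hypothetical in-memory stand-in for the LastAttempt and
// IncAttempt methods of Datastore.
type memStore struct {
	last     map[string]time.Time
	attempts map[string]int
}

func newMemStore() *memStore {
	return &memStore{last: map[string]time.Time{}, attempts: map[string]int{}}
}

func (m *memStore) LastAttempt(key string) (time.Time, int, error) {
	return m.last[key], m.attempts[key], nil
}

func (m *memStore) IncAttempt(key string) error {
	m.last[key] = time.Now()
	m.attempts[key]++
	return nil
}

// attempter delegates bookkeeping to the store and scheduling to next.
type attempter struct {
	ds   *memStore
	next NextAttemptFunc
}

// Next looks up the stored state for key and asks next for a decision.
func (a *attempter) Next(key string) (bool, time.Duration, error) {
	last, n, err := a.ds.LastAttempt(key)
	if err != nil {
		return false, 0, err
	}
	ok, wait := a.next(last, n)
	return ok, wait, nil
}

// Inc records one more attempt for key.
func (a *attempter) Inc(key string) error { return a.ds.IncAttempt(key) }

// attemptsAllowed counts how often Next agrees when at most max attempts
// are permitted and every attempt is recorded with Inc.
func attemptsAllowed(max int) int {
	a := &attempter{ds: newMemStore(), next: func(_ time.Time, n int) (bool, time.Duration) {
		return n < max, 0
	}}
	count := 0
	for {
		ok, _, err := a.Next("feed")
		if err != nil || !ok {
			return count
		}
		count++
		if err := a.Inc("feed"); err != nil {
			return count
		}
	}
}

func main() {
	fmt.Println(attemptsAllowed(3))
}
```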

type Config

type Config struct {
	FetchInterval    time.Duration  `yaml:"fetchInterval"`
	UserAgent        string         `yaml:"userAgent"`
	Port             int            `yaml:"port"`
	FeedOutputMaxAge time.Duration  `yaml:"feedOutputMaxAge"`
	CleanupInterval  time.Duration  `yaml:"cleanupInterval"`
	CleanupMaxAge    time.Duration  `yaml:"cleanupMaxAge"`
	Feeds            []FeedConfig   `yaml:"feeds"`
	ItemFilters      []FilterConfig `yaml:"itemFilters"`
	LinkFilters      []FilterConfig `yaml:"linkFilters"`
}

Config contains the configuration.

func ConfigFromFile

func ConfigFromFile(filename string) (Config, error)

ConfigFromFile returns a configuration parsed from the given file.

func NewConfig

func NewConfig() Config

NewConfig returns a new configuration with default values.

type Datastore

type Datastore interface {
	LastAttempt(key string) (time.Time, int, error)
	IncAttempt(key string) error
	StoreItem(item Item) (bool, error)
	StoreLink(link Link) (bool, error)
	GetItems(maxAge time.Duration) ([]Item, error)
	GetLinks(maxAge time.Duration) ([]Link, error)
	Cleanup(maxAge time.Duration) error
	Close() error
}

Datastore is used to store and retrieve items, links, etc.

type DefaultLogger

type DefaultLogger struct {
	// contains filtered or unexported fields
}

DefaultLogger is the default logger implementation and logs to stdout.

func NewLogger

func NewLogger() *DefaultLogger

NewLogger creates a new DefaultLogger that outputs to os.Stdout.

func (*DefaultLogger) Debug

func (l *DefaultLogger) Debug(msg string, keyvals ...interface{})

Debug logs with debug level.

func (*DefaultLogger) Error

func (l *DefaultLogger) Error(msg string, keyvals ...interface{})

Error logs with error level.

func (*DefaultLogger) Fatal

func (l *DefaultLogger) Fatal(msg string, keyvals ...interface{})

Fatal logs with fatal level and exits with unclean status code 1.

func (*DefaultLogger) Info

func (l *DefaultLogger) Info(msg string, keyvals ...interface{})

Info logs with info level.

func (*DefaultLogger) SetOutput

func (l *DefaultLogger) SetOutput(w io.Writer)

SetOutput sets the output destination for the logger.

func (*DefaultLogger) Warn

func (l *DefaultLogger) Warn(msg string, keyvals ...interface{})

Warn logs with warning level.

type Emitter

type Emitter interface {
	EmitItem(item Item)
	EmitLink(link Link)
	EmitFollow(follow string)
}

Emitter is used by a Scanner to emit Items, Links and Follow URLs that should be processed.

type FeedConfig

type FeedConfig struct {
	Type          string
	URL           string
	FetchInterval time.Duration
}

FeedConfig contains the configuration of a single feed.

type Fetcher

type Fetcher struct {
	// contains filtered or unexported fields
}

Fetcher is the default fetcher.

func NewFetcher

func NewFetcher(url string, source Source, scanner Scanner, attempt Attempter, items chan<- Item, links chan<- Link) *Fetcher

NewFetcher creates a new Fetcher.

func (*Fetcher) SetLogger

func (f *Fetcher) SetLogger(log Logger)

SetLogger sets the logger that is used by the fetcher. Fetcher does not log when no Logger is set.

func (*Fetcher) Start

func (f *Fetcher) Start(quit <-chan struct{})

Start starts the fetching.

type FilterConfig

type FilterConfig struct {
	Type string
	// contains filtered or unexported fields
}

FilterConfig is the common configuration of all filter types. See FilterConfig.Unmarshal for unmarshaling of the raw config value for more specific types.

func (*FilterConfig) Unmarshal

func (fc *FilterConfig) Unmarshal(v interface{}) error

Unmarshal decodes the raw config values for a more specific config type.

func (*FilterConfig) UnmarshalYAML

func (fc *FilterConfig) UnmarshalYAML(unmarshal func(interface{}) error) error

UnmarshalYAML is a custom YAML unmarshal handler to handle the common filter config elements. See https://godoc.org/gopkg.in/yaml.v2#Unmarshaler.

type Item

type Item struct {
	Title   string
	URL     string
	PubDate time.Time
}

Item is a feed item that should be scraped for links.

type ItemFilter

type ItemFilter interface {
	Filter(item Item, next func(Item))
}

ItemFilter wraps the Filter method for items.

Filter evaluates the given item, optionally modifies it, and passes it to the next filter in the filter chain, if it matches the filter criteria.

func ItemTitleFilter

func ItemTitleFilter(titles ...string) ItemFilter

ItemTitleFilter filters items based on the given title strings. Titles are compared after conversion to lower case and stripping of all non-alphanumeric characters.
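The normalization described above could look roughly like the following sketch. The helper names normalize and titleMatches are hypothetical, and the use of substring matching is an assumption; the documentation does not say whether the comparison is exact or partial.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// normalize lower-cases s and drops every rune that is not a letter or digit.
func normalize(s string) string {
	return strings.Map(func(r rune) rune {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			return unicode.ToLower(r)
		}
		return -1 // strings.Map drops runes that are mapped to a negative value
	}, s)
}

// titleMatches reports whether the normalized title contains any of the
// normalized patterns (substring matching is an assumption of this sketch).
func titleMatches(title string, patterns ...string) bool {
	t := normalize(title)
	for _, p := range patterns {
		if strings.Contains(t, normalize(p)) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(normalize("The.Big-Show S01E02"))
	fmt.Println(titleMatches("The.Big-Show S01E02", "big show"))
}
```

Normalizing both sides means that punctuation and case differences between a feed title and the configured title do not prevent a match.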

type ItemFilterFunc

type ItemFilterFunc func(Item, func(Item))

ItemFilterFunc is an adapter to allow the use of ordinary functions as filters. If f is a function with the appropriate signature, ItemFilterFunc(f) is an ItemFilter that calls f.

func (ItemFilterFunc) Filter

func (f ItemFilterFunc) Filter(item Item, next func(Item))

Filter calls the underlying ItemFilterFunc and implements ItemFilter.
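To show how the adapter and the next callback fit together, here is a small self-contained sketch. The types are local copies of the documented ones (Item is reduced to two fields), and chain and run are hypothetical helpers, not part of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// Local, simplified copies of the documented types.
type Item struct {
	Title string
	URL   string
}

type ItemFilter interface {
	Filter(item Item, next func(Item))
}

type ItemFilterFunc func(Item, func(Item))

func (f ItemFilterFunc) Filter(item Item, next func(Item)) { f(item, next) }

// chain (hypothetical) runs item through the filters in order and calls sink
// only if every filter called its next callback.
func chain(item Item, sink func(Item), filters ...ItemFilter) {
	if len(filters) == 0 {
		sink(item)
		return
	}
	filters[0].Filter(item, func(it Item) { chain(it, sink, filters[1:]...) })
}

// run returns the titles that survive a two-stage filter chain.
func run(items ...Item) []string {
	trim := ItemFilterFunc(func(it Item, next func(Item)) {
		it.Title = strings.TrimSpace(it.Title) // modify, then pass on
		next(it)
	})
	nonEmpty := ItemFilterFunc(func(it Item, next func(Item)) {
		if it.Title != "" { // an item is dropped by not calling next
			next(it)
		}
	})
	var out []string
	for _, it := range items {
		chain(it, func(it Item) { out = append(out, it.Title) }, trim, nonEmpty)
	}
	return out
}

func main() {
	fmt.Println(run(Item{Title: " a "}, Item{Title: "   "}, Item{Title: "b"}))
}
```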

type ItemTitleFilterConfig

type ItemTitleFilterConfig struct {
	Type   string
	Titles []string
}

ItemTitleFilterConfig contains the configuration of an ItemTitleFilter.

type Link

type Link struct {
	Title string
	URL   string
}

Link is a link that was found in a feed or scraped from a page (Item).

type LinkDomainFilterConfig

type LinkDomainFilterConfig struct {
	Domains []string
}

LinkDomainFilterConfig contains the configuration of a LinkDomainFilter.

type LinkDuplicatesFilterConfig added in v0.5.0

type LinkDuplicatesFilterConfig struct {
	Size int
}

type LinkFilenameAsTitleFilterConfig added in v0.2.0

type LinkFilenameAsTitleFilterConfig struct {
	TrimExt bool `yaml:"trimExt"`
}

LinkFilenameAsTitleFilterConfig contains the configuration of a LinkFilenameAsTitleFilter.

type LinkFilter

type LinkFilter interface {
	Filter(link Link, next func(Link))
}

LinkFilter wraps the Filter method for links.

Filter evaluates the given link, optionally modifies it, and passes it to the next filter in the filter chain, if it matches the filter criteria.

func LinkDomainFilter

func LinkDomainFilter(domains ...string) LinkFilter

LinkDomainFilter filters links based on the given domains.
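The documentation does not spell out how domains are compared. The sketch below shows one possible rule, matching the exact host or any of its subdomains; that rule and the helper name hostMatches are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// hostMatches reports whether the host of rawURL equals one of the domains
// or is a subdomain of one. Unparseable URLs never match.
func hostMatches(rawURL string, domains ...string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	host := strings.ToLower(u.Hostname())
	for _, d := range domains {
		d = strings.ToLower(d)
		if host == d || strings.HasSuffix(host, "."+d) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hostMatches("https://www.example.com/file", "example.com"))
	fmt.Println(hostMatches("https://notexample.com/file", "example.com"))
}
```

Checking for the "." prefix before the suffix avoids treating notexample.com as a match for example.com.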

func LinkDuplicatesFilter added in v0.5.0

func LinkDuplicatesFilter(size int) LinkFilter

LinkDuplicatesFilter filters duplicate links based on the link URL. The link URLs are compared over a sliding window of the given size.
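A sliding-window duplicate check can be sketched as follows. This is a minimal illustration under the assumption that only unseen URLs enter the window and that the oldest entry is evicted when the window is full; window, seen and dedupe are hypothetical names.

```go
package main

import "fmt"

// window remembers up to size of the most recently accepted URLs.
type window struct {
	size int
	urls []string // oldest first
}

// seen reports whether url is inside the window. Unseen URLs are recorded,
// evicting the oldest entry once the window is full.
func (w *window) seen(url string) bool {
	for _, u := range w.urls {
		if u == url {
			return true
		}
	}
	if w.size <= 0 {
		return false // a window without capacity remembers nothing
	}
	if len(w.urls) == w.size {
		w.urls = w.urls[1:]
	}
	w.urls = append(w.urls, url)
	return false
}

// dedupe returns urls without the duplicates found inside the window.
func dedupe(size int, urls ...string) []string {
	w := &window{size: size}
	var out []string
	for _, u := range urls {
		if !w.seen(u) {
			out = append(out, u)
		}
	}
	return out
}

func main() {
	fmt.Println(dedupe(2, "a", "b", "a", "c", "d", "a"))
}
```

With a window of 2, the second "a" is dropped because "a" is still remembered, while the third "a" passes again because "c" and "d" have pushed it out of the window.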

func LinkFilenameAsTitleFilter added in v0.2.0

func LinkFilenameAsTitleFilter(trimExt bool) LinkFilter

LinkFilenameAsTitleFilter extracts the filename from the URL and sets it as the new link title. When trimExt is set, the filter tries to remove the file extension, if one is present.
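Extracting a filename from a URL can be done with the standard library alone. The following sketch shows the idea; filenameTitle is a hypothetical helper, and treating the last path segment as the filename is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// filenameTitle returns the last path segment of rawURL, optionally without
// its extension. It returns "" if the URL cannot be parsed.
func filenameTitle(rawURL string, trimExt bool) string {
	u, err := url.Parse(rawURL)
	if err != nil {
		return ""
	}
	name := path.Base(u.Path) // the query string is not part of u.Path
	if trimExt {
		name = strings.TrimSuffix(name, path.Ext(name))
	}
	return name
}

func main() {
	fmt.Println(filenameTitle("https://files.example/dl/report.final.pdf?x=1", false))
	fmt.Println(filenameTitle("https://files.example/dl/report.final.pdf?x=1", true))
}
```

Only the final extension is removed, so a name such as report.final.pdf becomes report.final.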

func LinkURLRegexFilter

func LinkURLRegexFilter(exprs ...string) (LinkFilter, error)

LinkURLRegexFilter filters links based on their URLs matching the given regular expressions.
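The error return value suggests that the expressions are compiled when the filter is created. The sketch below illustrates that pattern; compileAll and urlMatches are hypothetical helpers, and accepting a URL when any one expression matches is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileAll compiles every expression and fails on the first invalid one.
func compileAll(exprs ...string) ([]*regexp.Regexp, error) {
	res := make([]*regexp.Regexp, 0, len(exprs))
	for _, e := range exprs {
		re, err := regexp.Compile(e)
		if err != nil {
			return nil, fmt.Errorf("invalid expression %q: %w", e, err)
		}
		res = append(res, re)
	}
	return res, nil
}

// urlMatches reports whether url matches at least one of the expressions.
func urlMatches(url string, exprs ...string) (bool, error) {
	res, err := compileAll(exprs...)
	if err != nil {
		return false, err
	}
	for _, re := range res {
		if re.MatchString(url) {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	ok, err := urlMatches("https://files.example/a/video.mkv", `\.mkv$`, `\.mp4$`)
	fmt.Println(ok, err)
	_, err = urlMatches("https://files.example/a", `(`)
	fmt.Println(err != nil)
}
```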

func LinkUploadedExpandFilenameFilter added in v0.4.0

func LinkUploadedExpandFilenameFilter(source Source) LinkFilter

LinkUploadedExpandFilenameFilter expands the filename from an uploaded file URL and sets the appropriate new URL, e.g. uploaded.net/file/xxxxxxxx -> uploaded.net/file/xxxxxxxx/file.ext. This is sometimes needed for easier filtering down the filter chain.

type LinkFilterFunc

type LinkFilterFunc func(Link, func(Link))

LinkFilterFunc is an adapter to allow the use of ordinary functions as filters. If f is a function with the appropriate signature, LinkFilterFunc(f) is a LinkFilter that calls f.

func (LinkFilterFunc) Filter

func (f LinkFilterFunc) Filter(link Link, next func(Link))

Filter calls the underlying LinkFilterFunc and implements LinkFilter.

type LinkURLRegexFilterConfig

type LinkURLRegexFilterConfig struct {
	Exprs []string
}

LinkURLRegexFilterConfig contains the configuration of a LinkURLRegexFilter.

type Logger

type Logger interface {
	Debug(msg string, keyvals ...interface{})
	Info(msg string, keyvals ...interface{})
	Warn(msg string, keyvals ...interface{})
	Error(msg string, keyvals ...interface{})
	Fatal(msg string, keyvals ...interface{})
}

Logger is the standardized interface for all Loggers used in this project.

type NextAttemptFunc added in v0.3.0

type NextAttemptFunc func(last time.Time, attempts int) (bool, time.Duration)

A NextAttemptFunc returns if and when the next attempt is scheduled for the given key.

func FibNextAttemptFunc added in v0.3.0

func FibNextAttemptFunc(baseInterval time.Duration, maxAttempts int) NextAttemptFunc

FibNextAttemptFunc creates a new NextAttemptFunc for attempts with a Fibonacci-based backoff interval, up to maxAttempts. The interval length is defined by baseInterval * fib(attempt count).
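The interval formula can be tried out with a short sketch. Several details are assumptions, since the documentation does not state them: fib(0) is taken as 0 and fib(1) = fib(2) = 1, the returned duration is the plain interval (the time of the last attempt is ignored), and no further attempt is scheduled once the attempt count reaches maxAttempts. The names fib, fibNext and schedule are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// NextAttemptFunc mirrors the documented type.
type NextAttemptFunc func(last time.Time, attempts int) (bool, time.Duration)

// fib returns the n-th Fibonacci number with fib(0) = 0 and fib(1) = 1.
func fib(n int) int {
	a, b := 0, 1
	for i := 0; i < n; i++ {
		a, b = b, a+b
	}
	return a
}

// fibNext (hypothetical) returns base * fib(attempts) as the interval and
// refuses further attempts once maxAttempts has been reached.
func fibNext(base time.Duration, maxAttempts int) NextAttemptFunc {
	return func(_ time.Time, attempts int) (bool, time.Duration) {
		if attempts >= maxAttempts {
			return false, 0
		}
		return true, base * time.Duration(fib(attempts))
	}
}

// schedule lists the intervals for attempt counts 0, 1, 2, ... until the
// function declines another attempt.
func schedule(baseMinutes, maxAttempts int) []string {
	next := fibNext(time.Duration(baseMinutes)*time.Minute, maxAttempts)
	var out []string
	for attempts := 0; ; attempts++ {
		ok, d := next(time.Time{}, attempts)
		if !ok {
			return out
		}
		out = append(out, d.String())
	}
}

func main() {
	fmt.Println(schedule(5, 5))
}
```

Compared with exponential backoff, a Fibonacci sequence grows more slowly, which keeps retry intervals moderate over the first few failures.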

func PeriodicNextAttemptFunc added in v0.3.0

func PeriodicNextAttemptFunc(fetchInterval time.Duration) NextAttemptFunc

PeriodicNextAttemptFunc creates a new NextAttemptFunc for periodic attempts with a fixed interval.

type NopLogger

type NopLogger struct{}

NopLogger is a no-op implementation of the Logger interface.

func (NopLogger) Debug

func (NopLogger) Debug(msg string, keyvals ...interface{})

Debug is a no-op implementation of Logger.Debug.

func (NopLogger) Error

func (NopLogger) Error(msg string, keyvals ...interface{})

Error is a no-op implementation of Logger.Error.

func (NopLogger) Fatal

func (NopLogger) Fatal(msg string, keyvals ...interface{})

Fatal is a no-op implementation of Logger.Fatal.

func (NopLogger) Info

func (NopLogger) Info(msg string, keyvals ...interface{})

Info is a no-op implementation of Logger.Info.

func (NopLogger) Warn

func (NopLogger) Warn(msg string, keyvals ...interface{})

Warn is a no-op implementation of Logger.Warn.

type ScanFunc

type ScanFunc func(context.Context, io.Reader, Emitter) error

ScanFunc is an adapter to allow the use of ordinary functions as scanners. If f is a function with the appropriate signature, ScanFunc(f) is a Scanner that calls f.

func (ScanFunc) Scan

func (f ScanFunc) Scan(ctx context.Context, r io.Reader, e Emitter) error

Scan calls the underlying ScanFunc and implements Scanner.
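The following sketch shows how an ordinary function could be used as a Scanner through the ScanFunc adapter. The types are simplified local copies of the documented ones; collector, lineScanner and scanText are hypothetical, and the "one URL per line" input format is invented for the example.

```go
package main

import (
	"bufio"
	"context"
	"fmt"
	"io"
	"strings"
)

// Simplified local copies of the documented types.
type Item struct{ Title, URL string }
type Link struct{ Title, URL string }

type Emitter interface {
	EmitItem(item Item)
	EmitLink(link Link)
	EmitFollow(follow string)
}

type ScanFunc func(context.Context, io.Reader, Emitter) error

func (f ScanFunc) Scan(ctx context.Context, r io.Reader, e Emitter) error {
	return f(ctx, r, e)
}

// collector is a hypothetical Emitter that records the URLs of emitted links
// and ignores items and follow URLs.
type collector struct{ links []string }

func (c *collector) EmitItem(Item)     {}
func (c *collector) EmitLink(l Link)   { c.links = append(c.links, l.URL) }
func (c *collector) EmitFollow(string) {}

// lineScanner emits one Link per non-empty line and stops early when the
// context is cancelled.
var lineScanner = ScanFunc(func(ctx context.Context, r io.Reader, e Emitter) error {
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		if err := ctx.Err(); err != nil {
			return err
		}
		if line := strings.TrimSpace(sc.Text()); line != "" {
			e.EmitLink(Link{URL: line})
		}
	}
	return sc.Err()
})

// scanText runs lineScanner over text and returns the emitted URLs.
func scanText(text string) ([]string, error) {
	c := &collector{}
	err := lineScanner.Scan(context.Background(), strings.NewReader(text), c)
	return c.links, err
}

func main() {
	links, err := scanText("https://a.example/1\n\n  https://b.example/2  \n")
	fmt.Println(links, err)
}
```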

type Scanner

type Scanner interface {
	Scan(context.Context, io.Reader, Emitter) error
}

Scanner scans the contents of the reader and emits items, links or follow URLs.

type Source

type Source interface {
	Get(ctx context.Context, url string) (io.Reader, error)
}

Source retrieves the resource from the given URL and returns a reader for the content. If applicable, the reader shall only return UTF-8 encoded text. TODO: Find a better name?

func NewHTTPSource

func NewHTTPSource(client *http.Client) Source

NewHTTPSource returns a new Source for HTTP requests. A nil client will default to http.DefaultClient.

type Stringer

type Stringer interface {
	String() string
}

Stringer is an optional interface for all filters to provide a 'native' textual representation.
