crawler

package
v0.0.0-...-ada256f
Warning

This package is not in the latest version of its module.

Published: May 31, 2023 License: AGPL-3.0 Imports: 16 Imported by: 0

Documentation

Overview

Package crawler is built around the Crawler component, which crawls and indexes content from an AnnotatedResource.

Constants

This section is empty.

Variables

var (
	// ErrDirectoryTooLarge is returned by Ls() when a directory is larger than `Config.MaxDirSize`.
	ErrDirectoryTooLarge = t.WrappedError{Err: t.ErrInvalidResource, Msg: "directory too large"}
)
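
A caller would typically want to tell this error apart from other invalid-resource errors. The sketch below shows one way to do that with errors.Is. It uses a local stand-in for t.WrappedError: the Error and Unwrap methods, the ErrInvalidResource value, and the classify helper are assumptions made for illustration, not the package's actual definitions.

```go
package main

import (
	"errors"
	"fmt"
)

// WrappedError is a local stand-in for t.WrappedError.
// Only the Err and Msg fields appear in the documentation above;
// the methods below are assumptions.
type WrappedError struct {
	Err error
	Msg string
}

func (e WrappedError) Error() string { return e.Msg + ": " + e.Err.Error() }

// Unwrap lets errors.Is see the underlying error (assumed behaviour).
func (e WrappedError) Unwrap() error { return e.Err }

// Stand-ins for t.ErrInvalidResource and ErrDirectoryTooLarge.
var (
	ErrInvalidResource   = errors.New("invalid resource")
	ErrDirectoryTooLarge = WrappedError{Err: ErrInvalidResource, Msg: "directory too large"}
)

// classify is a hypothetical helper that checks the most specific error first.
func classify(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrDirectoryTooLarge):
		return "directory too large"
	case errors.Is(err, ErrInvalidResource):
		return "invalid resource"
	default:
		return "other"
	}
}

func main() {
	// Wrapping with %w keeps the sentinel reachable through the error chain.
	wrapped := fmt.Errorf("ls failed: %w", ErrDirectoryTooLarge)
	fmt.Println(classify(wrapped))
	fmt.Println(classify(ErrInvalidResource))
}
```

Checking the more specific value first matters here, because in this sketch a too-large directory also matches the underlying invalid-resource error.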

Functions

This section is empty.

Types

type Config

type Config struct {
	DirEntryBufferSize uint          // Size of buffer for processing directory entry channels.
	MinUpdateAge       time.Duration // The minimum age for items to be updated.
	StatTimeout        time.Duration // Timeout for Stat() calls.
	DirEntryTimeout    time.Duration // Timeout *between* directory entries.
	MaxDirSize         uint          // Maximum number of directory entries.
}

Config contains configuration for a Crawler.
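
To make the meaning of DirEntryTimeout (a timeout between entries, not for the whole listing) and MaxDirSize more concrete, here is a minimal sketch of how such limits could be enforced while reading entries from a channel. It is not the package's implementation: the Config type is a local copy of two documented fields, and collectEntries, feed, exampleConfig, the error values, and the chosen limits are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Config is a local copy of two documented fields, for illustration only.
type Config struct {
	DirEntryTimeout time.Duration // Timeout between directory entries.
	MaxDirSize      uint          // Maximum number of directory entries.
}

// Hypothetical error values for this sketch.
var (
	errDirectoryTooLarge = errors.New("directory too large")
	errEntryTimeout      = errors.New("timed out waiting for next directory entry")
)

// exampleConfig returns illustrative values; they are not the package defaults.
func exampleConfig() *Config {
	return &Config{DirEntryTimeout: time.Second, MaxDirSize: 2}
}

// collectEntries reads entries until the channel closes. It fails if no
// entry arrives within DirEntryTimeout of the previous one, or if the
// directory holds more than MaxDirSize entries.
func collectEntries(cfg *Config, entries <-chan string) ([]string, error) {
	var out []string
	timer := time.NewTimer(cfg.DirEntryTimeout)
	defer timer.Stop()

	for {
		select {
		case <-timer.C:
			return out, errEntryTimeout
		case e, ok := <-entries:
			if !ok {
				return out, nil // Channel closed: listing complete.
			}
			if uint(len(out)) >= cfg.MaxDirSize {
				return out, errDirectoryTooLarge
			}
			out = append(out, e)

			// Restart the timer so the limit applies between entries.
			if !timer.Stop() {
				select {
				case <-timer.C: // Drain a value that fired in the meantime.
				default:
				}
			}
			timer.Reset(cfg.DirEntryTimeout)
		}
	}
}

// feed returns a closed channel preloaded with the given names.
func feed(names ...string) <-chan string {
	ch := make(chan string, len(names))
	for _, n := range names {
		ch <- n
	}
	close(ch)
	return ch
}

func main() {
	names, err := collectEntries(exampleConfig(), feed("a", "b", "c"))
	fmt.Println(names, err)
}
```

Resetting the timer after every entry is what distinguishes a per-entry timeout from a single deadline for the whole directory: a large but steadily streaming directory is bounded by MaxDirSize rather than by elapsed time.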

func DefaultConfig

func DefaultConfig() *Config

DefaultConfig generates a default configuration for a Crawler.

type Crawler

type Crawler struct {
	*instr.Instrumentation
	// contains filtered or unexported fields
}

Crawler allows crawling of resources.

func New

func New(config *Config, indexes *Indexes, queues *Queues, protocol protocol.Protocol, extractors []extractor.Extractor, i *instr.Instrumentation) *Crawler

New instantiates a Crawler.

func (*Crawler) Crawl

func (c *Crawler) Crawl(ctx context.Context, r *t.AnnotatedResource) error

Crawl updates existing or crawls new resources, extracting metadata where applicable.
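
Because Crawl takes a context.Context, a caller can bound a single crawl with a deadline. The sketch below illustrates that pattern against a fake crawler. AnnotatedResource here is a placeholder with an invented ID field (the real t.AnnotatedResource is not shown in this documentation), and resourceCrawler, slowCrawler, and crawlWithDeadline are hypothetical names; it is assumed, not confirmed, that the real Crawl returns once its context is cancelled.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// AnnotatedResource is a placeholder for t.AnnotatedResource.
type AnnotatedResource struct {
	ID string // Invented field, for illustration only.
}

// resourceCrawler mirrors the shape of the documented Crawl method.
type resourceCrawler interface {
	Crawl(ctx context.Context, r *AnnotatedResource) error
}

// slowCrawler is a fake that finishes after delay unless ctx ends first.
type slowCrawler struct {
	delay time.Duration
}

func (s slowCrawler) Crawl(ctx context.Context, r *AnnotatedResource) error {
	select {
	case <-time.After(s.delay):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// crawlWithDeadline runs one crawl with an upper bound on its duration.
func crawlWithDeadline(c resourceCrawler, r *AnnotatedResource, limit time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), limit)
	defer cancel() // Release the timer even when Crawl returns early.
	return c.Crawl(ctx, r)
}

func main() {
	r := &AnnotatedResource{ID: "example"}

	err := crawlWithDeadline(slowCrawler{delay: 5 * time.Second}, r, 50*time.Millisecond)
	if errors.Is(err, context.DeadlineExceeded) {
		fmt.Println("crawl timed out:", err)
	}
}
```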

type Indexes

type Indexes struct {
	Files       index.Index
	Directories index.Index
	Invalids    index.Index
	Partials    index.Index
}

Indexes used for crawling.

type Queues

type Queues struct {
	Files       queue.Queue
	Directories queue.Queue
	Hashes      queue.Queue
}

Queues used for crawling.
