sprinter

package
v0.0.0-...-c1f947c
Warning

This package is not in the latest version of its module.

Published: May 13, 2015
License: GPL-3.0
Imports: 16
Imported by: 1

Documentation

Overview

Package sprinter implements our fast web crawler.

The sprinter package is a concurrent web query engine.

Constants

const (
	RobotsSize = 2000
)
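
RobotsSize is not documented on this page. One plausible reading, and it is only an assumption, is that it caps how many bytes of a robots.txt response are read. Under that assumption, a bounded read could be sketched as below. The helpers readCapped and sampleBody are hypothetical and not part of the package.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// RobotsSize mirrors the constant documented above.
// Treating it as a byte limit is an assumption, not documented behavior.
const RobotsSize = 2000

// readCapped is a hypothetical helper: it reads at most limit bytes from r
// and silently discards anything beyond that.
func readCapped(r io.Reader, limit int64) ([]byte, error) {
	return io.ReadAll(io.LimitReader(r, limit))
}

// sampleBody is a hypothetical stand-in for an HTTP response body of n bytes.
func sampleBody(n int) io.Reader {
	return strings.NewReader(strings.Repeat("x", n))
}

func main() {
	data, err := readCapped(sampleBody(5000), RobotsSize)
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println(len(data)) // prints 2000
}
```

A cap like this keeps an oversized or malicious robots.txt from consuming unbounded memory, at the cost of ignoring rules that appear past the limit.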

Variables

var (
	ErrInvalidParameters = errors.New("invalid parameters for crawler")
	IgnoredWords         = []string{"the"}
)
var (
	Info  *log.Logger
	Error *log.Logger
)
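
How IgnoredWords is used is not described here. Assuming it acts as a stop-word list applied to page text before further processing, the filtering step might look like the sketch below. The function filterWords is hypothetical, and the lowercasing and whitespace splitting are illustrative choices, not the package's documented behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// IgnoredWords mirrors the variable documented above.
var IgnoredWords = []string{"the"}

// filterWords is a hypothetical helper: it lowercases text, splits it on
// whitespace, and drops every word that appears in ignored.
func filterWords(text string, ignored []string) []string {
	skip := make(map[string]struct{}, len(ignored))
	for _, w := range ignored {
		skip[strings.ToLower(w)] = struct{}{}
	}
	var kept []string
	for _, w := range strings.Fields(strings.ToLower(text)) {
		if _, found := skip[w]; !found {
			kept = append(kept, w)
		}
	}
	return kept
}

func main() {
	words := filterWords("The quick fox jumps over the dog", IgnoredWords)
	fmt.Println(words) // prints [quick fox jumps over dog]
}
```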

Functions

This section is empty.

Types

type Crawler

type Crawler struct {
	MaxRequests           int // The max number of requests that can be handled in total.
	MaxConcurrentRequests int // The max number of requests that can be handled concurrently.

	Verbose bool
	// contains filtered or unexported fields
}
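
The two limits are independent: MaxRequests bounds the total amount of work, while MaxConcurrentRequests bounds how much of it runs at once. The sketch below shows one common way to enforce both, using a buffered channel as a semaphore. It is not the package's implementation (the unexported fields are not shown on this page); the limits type and the run function are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// limits is a hypothetical type that mirrors the two exported Crawler fields.
type limits struct {
	MaxRequests           int // total cap
	MaxConcurrentRequests int // in-flight cap
}

// run calls work for at most MaxRequests items and never runs more than
// MaxConcurrentRequests calls at the same time. It returns the number of
// items handled and the highest concurrency observed.
func run(l limits, items []string, work func(string)) (handled, peak int) {
	if l.MaxRequests <= 0 || l.MaxConcurrentRequests <= 0 {
		return 0, 0
	}
	if len(items) > l.MaxRequests {
		items = items[:l.MaxRequests]
	}
	sem := make(chan struct{}, l.MaxConcurrentRequests) // counting semaphore
	var (
		wg     sync.WaitGroup
		mu     sync.Mutex
		active int
	)
	for _, item := range items {
		sem <- struct{}{} // blocks while the in-flight cap is reached
		wg.Add(1)
		go func(it string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when done
			mu.Lock()
			active++
			if active > peak {
				peak = active
			}
			mu.Unlock()
			work(it)
			mu.Lock()
			active--
			mu.Unlock()
		}(item)
	}
	wg.Wait()
	return len(items), peak
}

func main() {
	uris := []string{"/1", "/2", "/3", "/4", "/5", "/6", "/7", "/8"}
	l := limits{MaxRequests: 5, MaxConcurrentRequests: 2}
	handled, peak := run(l, uris, func(string) { time.Sleep(time.Millisecond) })
	fmt.Println(handled, peak <= 2) // prints 5 true
}
```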

func NewCrawler

func NewCrawler(storage storage.Storage, buffer containers.Container) (c *Crawler, err error)

NewCrawler creates a new Crawler with the specified storage.Storage and link buffer.

func (*Crawler) Crawl

func (c *Crawler) Crawl(uri string) (err error)

Crawl starts at the given URI and crawls from there.

func (*Crawler) CrawlSequential

func (c *Crawler) CrawlSequential(uri string) (err error)

CrawlSequential is a simpler variant that crawls sequentially, starting at the given URI. See also Crawl.
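
To illustrate what a sequential crawl driven by a link buffer generally looks like, here is a minimal sketch. It does not use the sprinter API: crawlSequential, the links callback, and the in-memory site map are all hypothetical stand-ins, and a plain slice plays the role of the link buffer.

```go
package main

import "fmt"

// crawlSequential visits pages one at a time, starting at start, and stops
// after maxRequests pages. links is a hypothetical stand-in for fetching a
// page and extracting its outgoing links.
func crawlSequential(start string, maxRequests int, links func(string) []string) []string {
	visited := map[string]bool{start: true}
	queue := []string{start} // FIFO link buffer
	var order []string
	for len(queue) > 0 && len(order) < maxRequests {
		uri := queue[0]
		queue = queue[1:]
		order = append(order, uri)
		for _, next := range links(uri) {
			if !visited[next] {
				visited[next] = true // enqueue each link only once
				queue = append(queue, next)
			}
		}
	}
	return order
}

func main() {
	// A tiny in-memory "site" used instead of real HTTP requests.
	site := map[string][]string{
		"/":  {"/a", "/b"},
		"/a": {"/b", "/c"},
		"/b": {"/"},
		"/c": nil,
	}
	order := crawlSequential("/", 10, func(u string) []string { return site[u] })
	fmt.Println(order) // prints [/ /a /b /c]
}
```

With a FIFO buffer the visit order is breadth-first; a different container would change the order without changing the loop.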
