smeago

package
v0.0.0-...-da67eab Latest
Warning

This package is not in the latest version of its module.

Published: Sep 30, 2017 License: MIT Imports: 10 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Crawler

type Crawler struct {
	Domain  string
	Results chan Job
	Retries chan Job
}

func NewCrawler

func NewCrawler(d string) *Crawler

NewCrawler creates a crawler for the given domain.

func (*Crawler) Crawl

func (c *Crawler) Crawl(j Job)

Crawl crawls the job's path and retries the job if the attempt fails.

type CrawlerSupervisor

type CrawlerSupervisor struct {
	// contains filtered or unexported fields
}

CrawlerSupervisor controls the execution of the crawler.

func NewCrawlerSupervisor

func NewCrawlerSupervisor(c *Crawler) *CrawlerSupervisor

NewCrawlerSupervisor returns a new CrawlerSupervisor for the given crawler.

func (*CrawlerSupervisor) AddJobToBuffer

func (cs *CrawlerSupervisor) AddJobToBuffer(path string)

AddJobToBuffer creates a new job for the given path and adds it to the buffer.

func (*CrawlerSupervisor) BuffSize

func (cs *CrawlerSupervisor) BuffSize() int

BuffSize returns the number of jobs in the buffer.

func (*CrawlerSupervisor) CompleteJob

func (cs *CrawlerSupervisor) CompleteJob(j Job)

CompleteJob removes the job from the pending list.

func (*CrawlerSupervisor) CrawlJobs

func (cs *CrawlerSupervisor) CrawlJobs()

CrawlJobs crawls all jobs in the buffer concurrently.

func (*CrawlerSupervisor) GetVisitedLinks

func (cs *CrawlerSupervisor) GetVisitedLinks() []string

GetVisitedLinks returns a set of all visited links

func (*CrawlerSupervisor) HasPending

func (cs *CrawlerSupervisor) HasPending() bool

HasPending returns true if there are jobs in the pending list

func (*CrawlerSupervisor) Start

func (cs *CrawlerSupervisor) Start(done chan bool)

Start crawls buffered jobs until the pending list is empty.
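The buffer-then-crawl loop that these methods describe can be modelled roughly as below. Because the real `CrawlerSupervisor` fields are unexported, everything here is a hypothetical stand-in: the `supervisor` type, its lowercase methods, and the `fetch` callback are assumptions made for illustration. The sketch also simplifies the pending list into a `sync.WaitGroup`, so it stops when the buffer is empty after a batch finishes.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// supervisor is a hypothetical model of a crawl supervisor.
type supervisor struct {
	mu      sync.Mutex
	buffer  []string        // paths waiting to be crawled
	visited map[string]bool // every path that was ever buffered
	fetch   func(path string) []string
}

func newSupervisor(fetch func(string) []string) *supervisor {
	return &supervisor{visited: make(map[string]bool), fetch: fetch}
}

// addJobToBuffer queues a path once; paths already seen are ignored.
func (s *supervisor) addJobToBuffer(path string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.visited[path] {
		return
	}
	s.visited[path] = true
	s.buffer = append(s.buffer, path)
}

// crawlJobs drains the buffer and crawls each path in its own goroutine,
// buffering any newly discovered links for the next batch.
func (s *supervisor) crawlJobs() {
	s.mu.Lock()
	batch := s.buffer
	s.buffer = nil
	s.mu.Unlock()

	var wg sync.WaitGroup
	for _, p := range batch {
		wg.Add(1)
		go func(path string) {
			defer wg.Done()
			for _, link := range s.fetch(path) {
				s.addJobToBuffer(link)
			}
		}(p)
	}
	wg.Wait()
}

// start crawls batches until no work is left, then signals done.
func (s *supervisor) start(done chan<- bool) {
	for {
		s.mu.Lock()
		n := len(s.buffer)
		s.mu.Unlock()
		if n == 0 {
			break
		}
		s.crawlJobs()
	}
	done <- true
}

// visitedLinks returns the visited paths in sorted order.
func (s *supervisor) visitedLinks() []string {
	s.mu.Lock()
	defer s.mu.Unlock()
	links := make([]string, 0, len(s.visited))
	for l := range s.visited {
		links = append(links, l)
	}
	sort.Strings(links)
	return links
}

func main() {
	// A tiny in-memory site standing in for real HTTP fetches.
	site := map[string][]string{
		"/":  {"/a", "/b"},
		"/a": {"/b", "/"},
		"/b": {"/c"},
	}
	s := newSupervisor(func(p string) []string { return site[p] })
	s.addJobToBuffer("/")

	done := make(chan bool)
	go s.start(done)
	<-done
	fmt.Println(s.visitedLinks())
}
```

The `done` channel mirrors the documented `Start(done chan bool)` signature: the caller launches the loop in a goroutine and blocks on the channel until crawling finishes.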

type Job

type Job struct {
	ID         int
	Path       string
	Links      []string
	Completed  bool
	RetryCount int
}

func NewJob

func NewJob(id int, path string) *Job

NewJob creates a new Job

type Result

type Result struct {
	Links []string
}

func ReadString

func ReadString(rd io.Reader) (*Result, error)

func ReadStringSize

func ReadStringSize(rd io.Reader, n int) (*Result, error)

type Sitemap

type Sitemap struct {
	Filename string
	Links    []string
	Path     string
}

func (*Sitemap) Write

func (s *Sitemap) Write(w io.Writer) error

func (*Sitemap) WriteToFile

func (s *Sitemap) WriteToFile() error

WriteToFile writes the sitemap to a file.
