gospider

package
v0.0.0-...-01a1fb0 Latest
Warning

This package is not in the latest version of its module.

Published: Jun 21, 2021 License: MIT Imports: 10 Imported by: 0

Documentation

Index

Constants

const (
	StatusPending = iota // 0
	StatusProcessing
	StatusSuspend
	StatusExiting
	StatusInvalid
	StatusStoped
)
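
The status constants are plain integers produced by iota, so the value returned by Status() is not self-describing in logs. A minimal sketch of a name lookup follows. It assumes nothing beyond the constant block above, which is copied locally so the example compiles without importing the package; statusName and its label strings are hypothetical and not part of gospider.

```go
package main

import "fmt"

// Local copy of the constant block listed above, so that this sketch
// compiles on its own. Identifier spelling follows the package.
const (
	StatusPending = iota // 0
	StatusProcessing
	StatusSuspend
	StatusExiting
	StatusInvalid
	StatusStoped
)

// statusName is a hypothetical helper (not part of gospider) that maps a
// status value to a label derived from the constant's identifier.
func statusName(s int) string {
	switch s {
	case StatusPending:
		return "pending"
	case StatusProcessing:
		return "processing"
	case StatusSuspend:
		return "suspend"
	case StatusExiting:
		return "exiting"
	case StatusInvalid:
		return "invalid"
	case StatusStoped:
		return "stoped"
	default:
		return fmt.Sprintf("unknown(%d)", s)
	}
}

func main() {
	// Print every status value next to its label.
	for s := StatusPending; s <= StatusStoped; s++ {
		fmt.Printf("%d = %s\n", s, statusName(s))
	}
}
```

With the real package, the same helper could presumably be applied to the result of Status(), which the listing declares as returning int.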

Variables

This section is empty.

Functions

This section is empty.

Types

type GoSpider

type GoSpider struct {
	// contains filtered or unexported fields
}

func New

func New(name, url string) *GoSpider

func (*GoSpider) AddHeader

func (this *GoSpider) AddHeader(name, val string) *GoSpider

func (*GoSpider) AddURLRule

func (this *GoSpider) AddURLRule(rule string) *GoSpider

func (*GoSpider) Charset

func (this *GoSpider) Charset(charset string) *GoSpider

func (*GoSpider) Close

func (this *GoSpider) Close() error

func (*GoSpider) DataPath

func (this *GoSpider) DataPath(queueDataPath string) *GoSpider

func (*GoSpider) Depth

func (this *GoSpider) Depth(depth int) *GoSpider

func (*GoSpider) OnVisit

func (this *GoSpider) OnVisit(rule string, f VisitCallback)

func (*GoSpider) OnVisited

func (this *GoSpider) OnVisited(f VisitedCallback)

func (*GoSpider) Proxy

func (this *GoSpider) Proxy(proxy string) *GoSpider

func (*GoSpider) Run

func (this *GoSpider) Run()

func (*GoSpider) RunCount

func (this *GoSpider) RunCount() int64

func (*GoSpider) Size

func (this *GoSpider) Size() int

func (*GoSpider) Sleep

func (this *GoSpider) Sleep(sleep time.Duration) *GoSpider

func (*GoSpider) Status

func (this *GoSpider) Status() int

func (*GoSpider) Stop

func (this *GoSpider) Stop()

func (*GoSpider) URLRules

func (this *GoSpider) URLRules(rules []string) *GoSpider

func (*GoSpider) Wait

func (this *GoSpider) Wait()

type VisitCallback

type VisitCallback func(url, html string)
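
A VisitCallback receives a page URL and its HTML as strings. The sketch below shows one possible callback that records each page title. The type is redeclared locally so the example runs without the package; extractTitle, titleRe, and the sample URL are hypothetical, and the regular expression is a rough illustration, not a full HTML parser.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// VisitCallback mirrors the type declared in the listing above.
type VisitCallback func(url, html string)

// titleRe matches the contents of the first <title> element
// (case-insensitive, and "." also matches newlines).
var titleRe = regexp.MustCompile(`(?is)<title[^>]*>(.*?)</title>`)

// extractTitle is a hypothetical helper that returns the trimmed page
// title, or "" when the HTML has no <title> element.
func extractTitle(html string) string {
	m := titleRe.FindStringSubmatch(html)
	if m == nil {
		return ""
	}
	return strings.TrimSpace(m[1])
}

func main() {
	titles := map[string]string{}

	// A callback with the VisitCallback signature that stores page titles.
	var onPage VisitCallback = func(url, html string) {
		titles[url] = extractTitle(html)
	}

	// Simulate one visited page; the crawler would normally make this call.
	onPage("https://example.com/", "<html><head><title> Example </title></head></html>")
	fmt.Println(titles["https://example.com/"]) // prints "Example"
}
```

According to the listing, such a function would be registered through OnVisit(rule, f). The syntax of the rule string is not documented here, so it is left out of the sketch.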

type VisitedCallback

type VisitedCallback func(url string) bool

URLs that have already been crawled will not be put into the queue.
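
One way to back a VisitedCallback is an in-memory set of seen URLs. The sketch below assumes that returning true means "already crawled, do not enqueue", which the comment above suggests but the listing does not state outright. The type is redeclared locally so the example runs on its own; seenSet, newSeenSet, and check are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// VisitedCallback mirrors the type declared in the listing above.
type VisitedCallback func(url string) bool

// seenSet is a hypothetical, concurrency-safe set of URLs.
type seenSet struct {
	mu   sync.Mutex
	urls map[string]struct{}
}

func newSeenSet() *seenSet {
	return &seenSet{urls: make(map[string]struct{})}
}

// check reports whether url was recorded by an earlier call, and records
// it on first sight. Assumption: true means "skip this URL".
func (s *seenSet) check(url string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.urls[url]; ok {
		return true
	}
	s.urls[url] = struct{}{}
	return false
}

func main() {
	// The method value has the VisitedCallback signature.
	var cb VisitedCallback = newSeenSet().check

	fmt.Println(cb("https://example.com/a")) // false: first time seen
	fmt.Println(cb("https://example.com/a")) // true: already recorded
	fmt.Println(cb("https://example.com/b")) // false: a different URL
}
```

An in-memory set is lost when the process exits. A persistent store could be swapped in behind the same signature if crawl state needs to survive restarts.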
