Documentation ¶
Overview ¶
Package engine implements the engines of the crawler.
Index ¶
Constants ¶
This section is empty.
Variables ¶
var Store = &CrawlerStore{
	list: []*fetcher.Task{},
	Hash: map[string]*fetcher.Task{},
}
Store is the global CrawlerStore instance
Functions ¶
Types ¶
type Crawler ¶ added in v0.0.9
type Crawler struct {
	// store the visited fetcher.Request
	Visited     map[string]bool
	VisitedLock sync.Mutex
	// contains filtered or unexported fields
}
Crawler represents the global crawl instance
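The Visited map and VisitedLock mutex in the struct above suggest a mutex-guarded "seen" set. The sketch below shows one way such a set could work. The names visitedSet, newVisitedSet, and markVisited are hypothetical, and the key format is an assumption; this is not the package's actual StoreVisited implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// visitedSet is a hypothetical stand-in for the Visited/VisitedLock pair
// on Crawler. It is an illustration, not the package's real code.
type visitedSet struct {
	mu      sync.Mutex
	visited map[string]bool
}

func newVisitedSet() *visitedSet {
	return &visitedSet{visited: make(map[string]bool)}
}

// markVisited records key and reports whether it was seen for the first time.
// The check and the write happen under one lock, so two goroutines can never
// both be told that the same key is new.
func (v *visitedSet) markVisited(key string) bool {
	v.mu.Lock()
	defer v.mu.Unlock()
	if v.visited[key] {
		return false
	}
	v.visited[key] = true
	return true
}

func main() {
	set := newVisitedSet()

	var (
		wg      sync.WaitGroup
		countMu sync.Mutex
		fresh   int
	)
	// Eight goroutines race to mark the same (assumed) request key.
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if set.markVisited("GET https://example.com/") {
				countMu.Lock()
				fresh++
				countMu.Unlock()
			}
		}()
	}
	wg.Wait()
	fmt.Println(fresh) // prints 1: only one goroutine sees the key as new
}
```

Doing the lookup and the insert inside a single critical section is the point of the pattern: checking first and locking only for the write would let duplicate requests slip through.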
func NewCrawler ¶ added in v0.0.9
func (*Crawler) CreateWork ¶ added in v0.0.9
func (c *Crawler) CreateWork()
func (*Crawler) HandleResult ¶ added in v0.0.9
func (c *Crawler) HandleResult()
func (*Crawler) SetFailure ¶ added in v0.0.9
func (*Crawler) StoreVisited ¶ added in v0.0.9
type CrawlerStore ¶ added in v0.1.0
type CrawlerStore struct {
	Hash map[string]*fetcher.Task
	// contains filtered or unexported fields
}
CrawlerStore stores the crawler tasks
func (*CrawlerStore) Add ¶ added in v0.1.0
func (cs *CrawlerStore) Add(task *fetcher.Task)
Add adds a task to the global crawler instance
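As a rough illustration of the list-plus-Hash layout visible in the Store variable, the sketch below keeps tasks in insertion order and indexes them by name. The task type, its Name and URL fields, the store type, and the skip-duplicates rule are all assumptions made for this example; the real fetcher.Task fields and the real Add behaviour are not shown in this documentation.

```go
package main

import "fmt"

// task is a hypothetical stand-in for fetcher.Task.
type task struct {
	Name string
	URL  string
}

// store mirrors the shape of CrawlerStore: an ordered list plus a lookup map.
// Like the literal shown for Store, it has no lock, so it is not safe for
// concurrent use as written.
type store struct {
	list []*task
	Hash map[string]*task
}

func newStore() *store {
	return &store{list: []*task{}, Hash: map[string]*task{}}
}

// add appends t and indexes it by name. It reports false and leaves the
// store unchanged if a task with the same name already exists (an assumed
// policy, chosen only to keep list and Hash consistent).
func (s *store) add(t *task) bool {
	if _, exists := s.Hash[t.Name]; exists {
		return false
	}
	s.list = append(s.list, t)
	s.Hash[t.Name] = t
	return true
}

func main() {
	s := newStore()
	s.add(&task{Name: "news", URL: "https://example.com/news"})
	s.add(&task{Name: "news", URL: "https://example.com/other"}) // ignored
	fmt.Println(len(s.list), s.Hash["news"].URL)
	// prints: 1 https://example.com/news
}
```

Keeping both structures gives ordered iteration through the list and constant-time lookup by name through the map, at the cost of having to update them together.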
func (*CrawlerStore) AddJSTask ¶ added in v0.1.0
func (cs *CrawlerStore) AddJSTask(m *fetcher.TaskModel)
AddJSTask adds a JS dynamic crawling task
type Option ¶
type Option func(opts *options)
func WithChannelBuffer ¶ added in v0.1.3
func WithFetcher ¶
func WithLogger ¶
func WithScheduler ¶ added in v0.0.9
func WithWorkCount ¶
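The Option type above follows the functional options pattern. The sketch below shows how that pattern typically fits together. Only the Option signature is taken from this page; the options fields, the int parameters of WithWorkCount and WithChannelBuffer, the default values, and the buildOptions helper are assumptions, since the signatures of the With* functions are not listed here.

```go
package main

import "fmt"

// options holds the configurable settings. These fields are assumed
// for illustration; the package's real options struct is unexported.
type options struct {
	workCount     int
	channelBuffer int
}

// Option has the signature documented for this package.
type Option func(opts *options)

// WithWorkCount is sketched with an assumed int parameter.
func WithWorkCount(n int) Option {
	return func(opts *options) { opts.workCount = n }
}

// WithChannelBuffer is sketched with an assumed int parameter.
func WithChannelBuffer(n int) Option {
	return func(opts *options) { opts.channelBuffer = n }
}

// buildOptions is a hypothetical helper: it starts from assumed defaults
// and applies each Option in order, so later options override earlier ones.
func buildOptions(opts ...Option) options {
	o := options{workCount: 1, channelBuffer: 0}
	for _, apply := range opts {
		apply(&o)
	}
	return o
}

func main() {
	o := buildOptions(WithWorkCount(4), WithChannelBuffer(16))
	fmt.Println(o.workCount, o.channelBuffer) // prints: 4 16
}
```

A constructor written this way can gain new settings without changing its signature, and callers only mention the settings they want to override.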
type Schedule ¶ added in v0.0.9
func NewSchedule ¶ added in v0.0.9
func NewSchedule() *Schedule