crawler

package
v0.0.0-...-aebc479
Published: Dec 19, 2023 License: MIT Imports: 8 Imported by: 0

Documentation

Index

Constants

View Source
const Accept = "application/rss+xml, application/rdf+xml;q=0.8, application/atom+xml;q=0.6, application/xml;q=0.4, text/xml;q=0.4"
View Source
const UserAgent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36"
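These constants are the headers the crawler sends with every fetch: Accept prefers RSS, then RDF, Atom, and generic XML in descending q-order, and UserAgent mimics a desktop Chrome browser. A minimal sketch of how they might be applied to a request (newFeedRequest is an illustrative helper, not part of this package's API):

```go
package main

import (
	"fmt"
	"net/http"
)

// Mirrors of the package constants shown above.
const Accept = "application/rss+xml, application/rdf+xml;q=0.8, application/atom+xml;q=0.6, application/xml;q=0.4, text/xml;q=0.4"
const UserAgent = "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36"

// newFeedRequest builds a GET request carrying the crawler's headers.
// (Hypothetical helper for illustration only.)
func newFeedRequest(url string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Accept", Accept)
	req.Header.Set("User-Agent", UserAgent)
	return req, nil
}

func main() {
	req, err := newFeedRequest("http://example.com/feed.xml")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Header.Get("Accept"))
}
```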

Variables

This section is empty.

Functions

func FeedCrawler

func FeedCrawler(crawlRequests chan *feedwatcher.FeedCrawlRequest)

FeedCrawler pulls FeedCrawlRequests from the crawlRequests channel, fetches the given URL, and returns a response.

func GetFeed

func GetFeed(url string, client *http.Client) (*http.Response, error)

GetFeed gets a URL and returns an http.Response. It sets a reasonable timeout on the connection and the read from the server. Callers must Close() the response.Body or risk leaking connections.

func GetFeedAndMakeResponse

func GetFeedAndMakeResponse(url string, client *http.Client) *feedwatcher.FeedCrawlResponse

GetFeedAndMakeResponse gets a URL and returns a FeedCrawlResponse. It sets FeedCrawlResponse.Error if there was a problem retrieving the URL.

func StartCrawlerPool

func StartCrawlerPool(num int, crawlChannel chan *feedwatcher.FeedCrawlRequest)

StartCrawlerPool creates a pool of num HTTP crawlers listening on the crawlChannel.
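The shape of such a pool is num goroutines all draining a single request channel. A self-contained sketch with a local stand-in for feedwatcher.FeedCrawlRequest (here just a URI plus a reply channel; each worker echoes the URI instead of performing an HTTP fetch):

```go
package main

import (
	"fmt"
	"sync"
)

// feedCrawlRequest is a local stand-in for feedwatcher.FeedCrawlRequest.
type feedCrawlRequest struct {
	URI          string
	ResponseChan chan string
}

// startCrawlerPool mirrors the StartCrawlerPool shape: num workers,
// one shared request channel. Workers exit when the channel is closed.
func startCrawlerPool(num int, crawlChannel chan *feedCrawlRequest) *sync.WaitGroup {
	var wg sync.WaitGroup
	for i := 0; i < num; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for req := range crawlChannel {
				req.ResponseChan <- "crawled " + req.URI
			}
		}()
	}
	return &wg
}

func main() {
	reqs := make(chan *feedCrawlRequest)
	wg := startCrawlerPool(3, reqs)

	resp := make(chan string, 1)
	reqs <- &feedCrawlRequest{URI: "http://example.com/feed.xml", ResponseChan: resp}
	fmt.Println(<-resp)

	close(reqs) // lets every worker's range loop end
	wg.Wait()
}
```

Because all workers receive from the same unbuffered channel, requests are load-balanced automatically: whichever goroutine is idle picks up the next request.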

Types

This section is empty.
