crawler

package
v0.0.0-...-028e29d Latest
This package is not in the latest version of its module.
Published: Feb 15, 2020 License: MIT Imports: 8 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Count

func Count(file *os.File)

Count is a func

func CountWordsAndImages

func CountWordsAndImages(url string) (words, images int, err error)

CountWordsAndImages does an HTTP GET request for the HTML document url and returns the number of words and images in it.
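
As a rough illustration of the word-counting half of this task, the sketch below counts whitespace-separated words in a string with bufio.ScanWords. The helper name countWords is hypothetical and not part of this package, and the sketch works on plain text only; counting images would additionally require walking the parsed HTML tree for img elements.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// countWords is a hypothetical helper (not part of this package) that
// returns the number of whitespace-separated words in text.
func countWords(text string) int {
	scanner := bufio.NewScanner(strings.NewReader(text))
	scanner.Split(bufio.ScanWords) // split on whitespace instead of lines
	n := 0
	for scanner.Scan() {
		n++
	}
	return n
}

func main() {
	fmt.Println(countWords("The quick  brown\nfox")) // prints 4
}
```

Runs of spaces and newlines are treated as a single separator, so the double space in the example does not produce an empty word.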

func Crawl

func Crawl(url string) []string

Crawl crawls the given url.

func ElementByID

func ElementByID(doc *html.Node, id string) *html.Node

ElementByID is a func

func ElementByTagName

func ElementByTagName(n *html.Node, tags ...string) []*html.Node

ElementByTagName returns the nodes under n whose tag name matches any of tags.
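
To make the variadic tags parameter concrete, here is a minimal sketch of tag-name matching over a tree. It uses a simplified stand-in type instead of *html.Node, and both node and elementsByTagName are hypothetical names. Returning matches in document order and including n itself are assumptions, not documented behavior of this package.

```go
package main

import "fmt"

// node is a simplified, hypothetical stand-in for *html.Node.
type node struct {
	Tag      string
	Children []*node
}

// elementsByTagName returns, in document order, every node in the tree
// rooted at n whose Tag equals one of tags. Hypothetical sketch only.
func elementsByTagName(n *node, tags ...string) []*node {
	var matches []*node
	for _, tag := range tags {
		if n.Tag == tag {
			matches = append(matches, n)
			break // avoid adding n twice if tags contains duplicates
		}
	}
	for _, c := range n.Children {
		matches = append(matches, elementsByTagName(c, tags...)...)
	}
	return matches
}

func main() {
	doc := &node{Tag: "body", Children: []*node{
		{Tag: "h1"},
		{Tag: "p", Children: []*node{{Tag: "img"}}},
		{Tag: "h2"},
	}}
	for _, m := range elementsByTagName(doc, "h1", "h2", "img") {
		fmt.Println(m.Tag) // prints h1, img, h2 on separate lines
	}
}
```

Calling the function with no tags at all yields an empty result, since no node can match an empty tag list.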

func EndElement

func EndElement(n *html.Node) bool

EndElement is a func

func Extract

func Extract(url string) ([]string, error)

Extract makes an HTTP GET request to the specified URL, parses the response as HTML, and returns the links in the HTML document.

func FindLinks

func FindLinks(url string) ([]string, error)

FindLinks performs an HTTP GET request for url, parses the response as HTML, and extracts and returns the links.

func FindLinks2

func FindLinks2(url string) (*html.Node, error)

FindLinks2 returns the HTML node for url.

func ForEachNode

func ForEachNode(n *html.Node, pre, post func(*html.Node) bool)

ForEachNode traverses the HTML document tree rooted at n, with pre and post as per-node callbacks.
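
One plausible reading of the pre and post parameters is a depth-first traversal that calls pre before a node's children and post after them. The sketch below illustrates that pattern on a simplified stand-in type rather than *html.Node. The names node, forEachNode, and outline are hypothetical, and the meaning given to the bool results (false stops the walk) is an assumption, since the documentation does not say how they are used.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a simplified, hypothetical stand-in for *html.Node.
type node struct {
	Tag      string
	Children []*node
}

// forEachNode calls pre(n) before visiting n's children and post(n) after.
// Either callback may be nil. ASSUMPTION: a false result stops the
// traversal; the package documentation does not define the bool.
func forEachNode(n *node, pre, post func(*node) bool) bool {
	if pre != nil && !pre(n) {
		return false
	}
	for _, c := range n.Children {
		if !forEachNode(c, pre, post) {
			return false
		}
	}
	if post != nil && !post(n) {
		return false
	}
	return true
}

// outline renders the tree as indented start and end tags.
func outline(root *node) string {
	var b strings.Builder
	depth := 0
	pre := func(n *node) bool {
		fmt.Fprintf(&b, "%*s<%s>\n", depth*2, "", n.Tag)
		depth++
		return true
	}
	post := func(n *node) bool {
		depth--
		fmt.Fprintf(&b, "%*s</%s>\n", depth*2, "", n.Tag)
		return true
	}
	forEachNode(root, pre, post)
	return b.String()
}

func main() {
	doc := &node{Tag: "html", Children: []*node{
		{Tag: "head"},
		{Tag: "body", Children: []*node{{Tag: "p"}}},
	}}
	fmt.Print(outline(doc))
}
```

Keeping the depth counter in the closures lets pre and post share state without adding parameters to the traversal itself.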

func StartElement

func StartElement(n *html.Node) bool

StartElement is a func

func Title

func Title(url string) error

Title gets the content of all title element nodes in the document at url.

func Visit

func Visit(file *os.File, option string)

Visit is a func

func WaitForServer

func WaitForServer(url string) error

WaitForServer attempts to contact the server of a URL. It tries for one minute using exponential back-off. It reports an error if all attempts fail.
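
The retry policy described above can be sketched roughly as follows. This is not the package's implementation: retry and simulate are hypothetical names, the HTTP request is replaced by a callback so the example runs without a network, and the timeout and base delay are shortened from one minute to keep the demonstration fast.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// retry calls attempt until it succeeds or the next wait would pass the
// deadline. It sleeps base, 2*base, 4*base, ... between failed attempts.
// Hypothetical helper, not part of this package.
func retry(timeout, base time.Duration, attempt func() error) error {
	deadline := time.Now().Add(timeout)
	for tries := 0; ; tries++ {
		err := attempt()
		if err == nil {
			return nil // success
		}
		delay := base << uint(tries) // exponential back-off
		if time.Now().Add(delay).After(deadline) {
			return fmt.Errorf("giving up after %s: %w", timeout, err)
		}
		time.Sleep(delay)
	}
}

// simulate runs retry against a fake server that fails the given number
// of times before succeeding. It returns the number of attempts made.
func simulate(failures int) (int, error) {
	calls := 0
	err := retry(100*time.Millisecond, time.Millisecond, func() error {
		calls++
		if calls <= failures {
			return errors.New("connection refused")
		}
		return nil
	})
	return calls, err
}

func main() {
	calls, err := simulate(2)
	fmt.Println(calls, err) // prints: 3 <nil>
}
```

Doubling the delay after each failure keeps early retries quick while avoiding a tight loop against a server that stays down. Wrapping the last error with %w preserves the underlying cause for callers.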

Types

This section is empty.
