htmls

package
v1.0.0
Warning

This package is not in the latest version of its module.

Published: Jan 11, 2018 License: MIT Imports: 3 Imported by: 0

Documentation

Overview

Package htmls contains helper functions that simplify working with the golang.org/x/net/html package.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Attr

func Attr(node *html.Node, key string) string

Attr returns the value of an HTML attribute.

func Find

func Find(node *html.Node, mf MatchFunc) (n *html.Node, ok bool)

Find returns the first node which matches the MatchFunc using depth-first search. If no node is found, ok will be false.

root, err := html.Parse(resp.Body)
if err != nil {
    // handle error
}
mf := func(n *html.Node) bool {
    return n.DataAtom == atom.Body
}
body, ok := htmls.Find(root, mf)

func FindAll

func FindAll(node *html.Node, mf MatchFunc) []*html.Node

FindAll returns all nodes which match the provided MatchFunc. After discovering a matching node, it will _not_ discover matching subnodes of that node.

func FindAllNested

func FindAllNested(node *html.Node, mf MatchFunc) []*html.Node

FindAllNested returns all nodes which match the provided MatchFunc and _will_ discover matching subnodes of matching nodes.
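
The difference between FindAll and FindAllNested is easiest to see on a tree where one match contains another. The following is a minimal sketch under stated assumptions, not the package's implementation: `node`, `matchFunc`, `findAll`, `sampleTree`, and `isDiv` are hypothetical names, and `node` is a pared-down stand-in for html.Node so that the sketch runs with the standard library only. Whether the package also tests the starting node itself is an assumption here.

```go
package main

import "fmt"

// node is a hypothetical, pared-down stand-in for html.Node.
type node struct {
	Tag                     string
	FirstChild, NextSibling *node
}

// matchFunc mirrors the shape of MatchFunc for the stand-in type.
type matchFunc func(*node) bool

// findAll collects matching nodes depth-first. When nested is false it
// does not descend into a node that has already matched.
func findAll(n *node, mf matchFunc, nested bool) []*node {
	var out []*node
	if mf(n) {
		out = append(out, n)
		if !nested {
			return out
		}
	}
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		out = append(out, findAll(c, mf, nested)...)
	}
	return out
}

// sampleTree builds <body><div><div></div></div><div></div></body>.
func sampleTree() *node {
	inner := &node{Tag: "div"}
	second := &node{Tag: "div"}
	outer := &node{Tag: "div", FirstChild: inner, NextSibling: second}
	return &node{Tag: "body", FirstChild: outer}
}

// isDiv matches div elements in the stand-in tree.
func isDiv(n *node) bool { return n.Tag == "div" }

func main() {
	root := sampleTree()
	fmt.Println(len(findAll(root, isDiv, false))) // 2: the inner div is skipped
	fmt.Println(len(findAll(root, isDiv, true)))  // 3: the inner div is included
}
```

With nested set to false the sketch behaves like the FindAll description (the div inside a matching div is not reported); with nested set to true it behaves like the FindAllNested description.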

func FindNextSibling

func FindNextSibling(node *html.Node, mf MatchFunc) (n *html.Node, ok bool)

FindNextSibling returns the first node which matches the MatchFunc using next sibling search. If no node is found, ok will be false.

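root, err := html.Parse(resp.Body)
if err != nil {
    // handle error
}
// Start from <head>; the document root itself has no siblings to search.
head, ok := htmls.Find(root, htmls.ByTag(atom.Head))
if !ok {
    // handle missing <head>
}
mf := func(n *html.Node) bool {
    return n.DataAtom == atom.Body
}
body, ok := htmls.FindNextSibling(head, mf)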

func FindParent

func FindParent(node *html.Node, mf MatchFunc) (n *html.Node, ok bool)

FindParent searches up the HTML tree from the current node until either a match is found or the top of the tree is reached.
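
An upward search of this kind can be sketched as a loop over parent links. This is a minimal illustration, not the package's code: `node`, `matchFunc`, and `findParent` are hypothetical names, `node` stands in for html.Node so the sketch needs only the standard library, and starting at the parent rather than at the node itself is an assumption.

```go
package main

import "fmt"

// node is a hypothetical, pared-down stand-in for html.Node.
type node struct {
	Tag    string
	Parent *node
}

// matchFunc mirrors the shape of MatchFunc for the stand-in type.
type matchFunc func(*node) bool

// findParent walks the Parent links upward, beginning with the parent of n
// (an assumption), and stops at the first match or at the top of the tree.
func findParent(n *node, mf matchFunc) (*node, bool) {
	for p := n.Parent; p != nil; p = p.Parent {
		if mf(p) {
			return p, true
		}
	}
	return nil, false
}

func main() {
	// table > tr > td
	table := &node{Tag: "table"}
	tr := &node{Tag: "tr", Parent: table}
	td := &node{Tag: "td", Parent: tr}

	isTable := func(n *node) bool { return n.Tag == "table" }
	if p, ok := findParent(td, isTable); ok {
		fmt.Println("enclosing element:", p.Tag)
	}
}
```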

func FindPrevSibling

func FindPrevSibling(node *html.Node, mf MatchFunc) (n *html.Node, ok bool)

FindPrevSibling returns the first node which matches the MatchFunc using previous sibling search. If no node is found, ok will be false.

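root, err := html.Parse(resp.Body)
if err != nil {
    // handle error
}
// Start from <body>; the document root itself has no siblings to search.
body, ok := htmls.Find(root, htmls.ByTag(atom.Body))
if !ok {
    // handle missing <body>
}
mf := func(n *html.Node) bool {
    return n.DataAtom == atom.Head
}
head, ok := htmls.FindPrevSibling(body, mf)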

func Text

func Text(node *html.Node) string

Text returns text from all descendant text nodes joined. For control over the join function, see TextJoin.

func TextJoin

func TextJoin(node *html.Node, join func([]string) string) string

TextJoin returns a string from all descendant text nodes joined by a caller provided join function.
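
As an illustration of a caller-provided join function, the sketch below defines one with the `func([]string) string` signature shown above. The name `joinLines` and the sample fragments are hypothetical; the fragments stand in for whatever text nodes TextJoin would collect.

```go
package main

import (
	"fmt"
	"strings"
)

// joinLines has the signature TextJoin takes for its join parameter.
// It trims each fragment, drops fragments that are empty after trimming,
// and joins the rest with newlines.
func joinLines(parts []string) string {
	kept := make([]string, 0, len(parts))
	for _, p := range parts {
		if t := strings.TrimSpace(p); t != "" {
			kept = append(kept, t)
		}
	}
	return strings.Join(kept, "\n")
}

func main() {
	// Hypothetical fragments standing in for collected text nodes.
	parts := []string{"  Title ", "\n", "First paragraph.", "   "}
	fmt.Println(joinLines(parts))
}
```

With the package imported, a function like this would be passed as the second argument, for example `htmls.TextJoin(node, joinLines)`.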

Types

type MatchFunc

type MatchFunc func(*html.Node) bool

MatchFunc matches HTML nodes. It should return true when a desired node is found.

func ByClass

func ByClass(class string) MatchFunc

ByClass returns a MatchFunc which matches all nodes with the provided class.
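
A class attribute can hold several space-separated names, so a class matcher typically compares against each name rather than against the whole attribute value. The sketch below shows that comparison with the standard library only. `hasClass` is a hypothetical helper, and that ByClass works this way is an assumption, not something the documentation above states.

```go
package main

import (
	"fmt"
	"strings"
)

// hasClass reports whether the value of a class attribute contains the
// given class name as one of its whitespace-separated entries.
func hasClass(attrValue, class string) bool {
	for _, c := range strings.Fields(attrValue) {
		if c == class {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasClass("btn btn-primary", "btn")) // true: exact entry present
	fmt.Println(hasClass("btn-primary", "btn"))     // false: substring only
}
```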

func ByID

func ByID(id string) MatchFunc

ByID returns a MatchFunc which matches all nodes with the provided id.

func ByTag

func ByTag(a atom.Atom) MatchFunc

ByTag returns a MatchFunc which matches all nodes of the provided tag type.

root, err := html.Parse(resp.Body)
if err != nil {
    // handle error
}
title, ok := htmls.Find(root, htmls.ByTag(atom.Title))

func MatchAtom

func MatchAtom(a atom.Atom) MatchFunc

MatchAtom returns a MatchFunc that matches a Node with the specified Atom.

Directories

Path Synopsis
example
