htree

v1.3.1
Published: Dec 31, 2023 | License: MIT | Imports: 5 | Imported by: 1

README

Htree - Go package for working with html.Node trees

This is htree, a Go package that helps traverse, navigate, filter, and otherwise process trees of html.Node objects.

Usage

root, err := html.Parse(input)
if err != nil { ... }

body := htree.FindEl(root, func(n *html.Node) bool {
  return n.DataAtom == atom.Body
})

content := htree.FindEl(body, func(n *html.Node) bool {
  return n.DataAtom == atom.Div && htree.ElClassContains(n, "content")
})

...etc...

Documentation

Overview

Package htree is a collection of tools for working with trees of html.Nodes.

Functions

func ElAttr

func ElAttr(node *html.Node, key string) string

ElAttr returns `node`'s value for the attribute `key`.

func ElClassContains

func ElClassContains(node *html.Node, probe string) bool

ElClassContains tells whether `node` has a `class` attribute containing the class name `probe`.
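
The two attribute helpers above can be illustrated with a small standard-library-only sketch. The `attr` and `node` types are hypothetical stand-ins for `html.Attribute` and `html.Node`, and `elAttr` and `classContains` are illustrative names, not htree's code. Two behaviors here are assumptions rather than documented facts: a missing attribute yields the empty string, and the `class` value is treated as a whitespace-separated list of names.

```go
package main

import (
	"fmt"
	"strings"
)

// attr and node are hypothetical stand-ins for html.Attribute and html.Node,
// reduced to the fields this sketch needs.
type attr struct{ Key, Val string }

type node struct {
	Data string
	Attr []attr
}

// elAttr returns the value of the first attribute named key.
// Returning "" for a missing attribute is an assumption of this sketch.
func elAttr(n *node, key string) string {
	for _, a := range n.Attr {
		if a.Key == key {
			return a.Val
		}
	}
	return ""
}

// classContains reports whether probe is one of the whitespace-separated
// names in the node's class attribute (assumed semantics).
func classContains(n *node, probe string) bool {
	for _, name := range strings.Fields(elAttr(n, "class")) {
		if name == probe {
			return true
		}
	}
	return false
}

func main() {
	div := &node{Data: "div", Attr: []attr{{"id", "main"}, {"class", "content wide"}}}
	fmt.Println(elAttr(div, "id"))             // main
	fmt.Println(classContains(div, "content")) // true
	fmt.Println(classContains(div, "cont"))    // false: whole names only
}
```

Matching whole names rather than substrings keeps a probe such as `cont` from matching `content`.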

func Find

func Find(node *html.Node, pred func(*html.Node) bool) *html.Node

Find finds the first node, in a depth-first search of the tree rooted at `node`, satisfying the given predicate.
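
As a rough illustration of the search order, the sketch below runs a first-match search over a hypothetical stand-in for `html.Node` (only `Data`, `FirstChild`, and `NextSibling`). It assumes the search is preorder and returns nil when nothing matches; the text above does not state either, and `find` and `tree` are illustrative names rather than htree's implementation.

```go
package main

import "fmt"

// node is a hypothetical stand-in for html.Node with only the fields used here.
type node struct {
	Data                    string
	FirstChild, NextSibling *node
}

// tree returns a node named data whose children are kids, linked as siblings.
func tree(data string, kids ...*node) *node {
	n := &node{Data: data}
	for i := len(kids) - 1; i >= 0; i-- {
		kids[i].NextSibling = n.FirstChild
		n.FirstChild = kids[i]
	}
	return n
}

// find returns the first node satisfying pred in a preorder, depth-first
// walk of n, or nil if there is none (both are assumptions of this sketch).
func find(n *node, pred func(*node) bool) *node {
	if n == nil {
		return nil
	}
	if pred(n) {
		return n
	}
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		if found := find(c, pred); found != nil {
			return found
		}
	}
	return nil
}

func main() {
	root := tree("html",
		tree("head", tree("title")),
		tree("body", tree("div", tree("p")), tree("p")),
	)
	// Record every node the predicate sees to expose the search order.
	var visited []string
	p := find(root, func(n *node) bool {
		visited = append(visited, n.Data)
		return n.Data == "p"
	})
	fmt.Println(p.Data, visited) // p [html head title body div p]
}
```

The search stops at the `p` nested inside the `div`; the later sibling `p` is never tested.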

func FindAll

func FindAll(node *html.Node, pred func(*html.Node) bool, f func(*html.Node) error) error

FindAll walks the tree rooted at `node` in preorder, depth-first fashion. It tests each node in the tree with `pred`. Any node that passes the test causes FindAll to (a) call `f` on the node, and (b) skip walking the node's subtree.

If any call to `f` returns an error, FindAll aborts the walk and returns the error.

To continue walking the subtree of a node `n` that passes `pred`, call FindAllChildren(n, pred, f) in the body of `f`.
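
The skip-on-match behavior and the recursive continuation described above can be sketched as follows. This is a minimal model over a hypothetical stand-in for `html.Node`; `findAll` and `findAllChildren` are lowercase illustrations of the documented behavior, not htree's source.

```go
package main

import "fmt"

// node is a hypothetical stand-in for html.Node with only the fields used here.
type node struct {
	Data                    string
	FirstChild, NextSibling *node
}

// tree returns a node named data whose children are kids, linked as siblings.
func tree(data string, kids ...*node) *node {
	n := &node{Data: data}
	for i := len(kids) - 1; i >= 0; i-- {
		kids[i].NextSibling = n.FirstChild
		n.FirstChild = kids[i]
	}
	return n
}

// findAll walks n in preorder. A node that satisfies pred is passed to f and
// its subtree is skipped; any other node has its children walked in turn.
func findAll(n *node, pred func(*node) bool, f func(*node) error) error {
	if pred(n) {
		return f(n)
	}
	return findAllChildren(n, pred, f)
}

// findAllChildren applies findAll to each child of n, but not to n itself.
// The first error stops the walk and is returned.
func findAllChildren(n *node, pred func(*node) bool, f func(*node) error) error {
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		if err := findAll(c, pred, f); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	// The first list contains a nested list; the second does not.
	root := tree("body",
		tree("ul", tree("li", tree("ul", tree("li")))),
		tree("ul", tree("li")),
	)
	isList := func(n *node) bool { return n.Data == "ul" }

	// Plain callback: the nested list sits inside a matched subtree and is skipped.
	// Errors are ignored here because these callbacks never return one.
	top := 0
	_ = findAll(root, isList, func(*node) error { top++; return nil })

	// Recursive callback: calling findAllChildren from f descends into matches.
	all := 0
	var visit func(*node) error
	visit = func(n *node) error {
		all++
		return findAllChildren(n, isList, visit)
	}
	_ = findAll(root, isList, visit)

	fmt.Println(top, all) // 2 3
}
```

Leaving the descent to the callback lets the caller decide, per match, whether nested matches matter.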

func FindAllChildEls

func FindAllChildEls(node *html.Node, pred func(*html.Node) bool, f func(*html.Node) error) error

FindAllChildEls is the same as FindAllEls but operates only on the children of `node`, not `node` itself.

func FindAllChildren

func FindAllChildren(node *html.Node, pred func(*html.Node) bool, f func(*html.Node) error) error

FindAllChildren is the same as FindAll but operates only on the children of `node`, not `node` itself.

func FindAllEls

func FindAllEls(node *html.Node, pred func(*html.Node) bool, f func(*html.Node) error) error

FindAllEls is like FindAll but calls `pred` (and `f`, when `pred` passes) only for nodes of type `ElementNode`.

To continue walking the subtree of a node `n` that passes `pred`, call FindAllChildEls(n, pred, f) in the body of `f`.

func FindEl

func FindEl(node *html.Node, pred func(*html.Node) bool) *html.Node

FindEl finds the first `ElementNode`-typed node, in a depth-first search of the tree rooted at `node`, satisfying the given predicate.

func Prune

func Prune(node *html.Node, pred func(*html.Node) bool) *html.Node

Prune returns a copy of `node` and its descendants, omitting any node for which the supplied predicate returns true. If `node` itself is pruned, the return value is nil.
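
A pruning copy along these lines might look like the sketch below. It works on a hypothetical stand-in for `html.Node` that omits the parent, previous-sibling, and last-child links of the real type, and it assumes a pruned node takes its whole subtree with it. `prune` and `outline` are illustrative helpers, not htree's code.

```go
package main

import "fmt"

// node is a hypothetical stand-in for html.Node with only the fields used here.
type node struct {
	Data                    string
	FirstChild, NextSibling *node
}

// tree returns a node named data whose children are kids, linked as siblings.
func tree(data string, kids ...*node) *node {
	n := &node{Data: data}
	for i := len(kids) - 1; i >= 0; i-- {
		kids[i].NextSibling = n.FirstChild
		n.FirstChild = kids[i]
	}
	return n
}

// prune returns a deep copy of n without any node that satisfies pred
// (or that node's subtree). It returns nil if n itself satisfies pred.
func prune(n *node, pred func(*node) bool) *node {
	if pred(n) {
		return nil
	}
	cp := &node{Data: n.Data}
	var last *node // most recently kept child in the copy
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		kept := prune(c, pred)
		if kept == nil {
			continue
		}
		if last == nil {
			cp.FirstChild = kept
		} else {
			last.NextSibling = kept
		}
		last = kept
	}
	return cp
}

// outline renders a tree as nested names, e.g. "a(b c)".
func outline(n *node) string {
	s := n.Data
	if n.FirstChild == nil {
		return s
	}
	s += "("
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		if c != n.FirstChild {
			s += " "
		}
		s += outline(c)
	}
	return s + ")"
}

func main() {
	root := tree("body",
		tree("script"),
		tree("div", tree("p"), tree("script")),
		tree("p"),
	)
	pruned := prune(root, func(n *node) bool { return n.Data == "script" })
	fmt.Println(outline(pruned)) // body(div(p) p)
	fmt.Println(outline(root))   // body(script div(p script) p)
}
```

Because the result is a copy, the original tree is left unchanged and can still be walked afterwards.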

func Text

func Text(node *html.Node) (string, error)

Text returns the content of the tree rooted at `node` as plain text. HTML entities are decoded, and <br> nodes are turned into newlines.

func Walk

func Walk(node *html.Node, f func(*html.Node) error) error

Walk applies `f` to each node in a recursive, preorder, depth-first walk of the tree rooted at `node`. If any call to `f` produces an error, the walk is aborted and that error is returned.
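
The preorder order and the abort-on-error rule can be seen in a short sketch. As before, `node` is a hypothetical stand-in for `html.Node`, and `walk`, `tree`, and `errStop` are names invented for this illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// node is a hypothetical stand-in for html.Node with only the fields used here.
type node struct {
	Data                    string
	FirstChild, NextSibling *node
}

// tree returns a node named data whose children are kids, linked as siblings.
func tree(data string, kids ...*node) *node {
	n := &node{Data: data}
	for i := len(kids) - 1; i >= 0; i-- {
		kids[i].NextSibling = n.FirstChild
		n.FirstChild = kids[i]
	}
	return n
}

// errStop is a hypothetical sentinel a callback can return to end the walk early.
var errStop = errors.New("stop")

// walk calls f on n and then on each descendant in preorder.
// The first error returned by f aborts the walk and is returned.
func walk(n *node, f func(*node) error) error {
	if err := f(n); err != nil {
		return err
	}
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		if err := walk(c, f); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	root := tree("html",
		tree("head", tree("title")),
		tree("body", tree("p")),
	)
	var seen []string
	err := walk(root, func(n *node) error {
		seen = append(seen, n.Data)
		if n.Data == "body" {
			return errStop
		}
		return nil
	})
	fmt.Println(seen, errors.Is(err, errStop)) // [html head title body] true
}
```

The `p` under `body` is never visited because the callback's error ends the walk first.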

func WriteText

func WriteText(w io.Writer, node *html.Node) error

WriteText converts the content of the tree rooted at `node` into plain text and writes it to `w`. HTML entities are decoded, <script> and <style> nodes are pruned, and <br> nodes are turned into newlines.
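
One way to picture the text conversion is the sketch below, written against a hypothetical stand-in for `html.Node` with just a node type, a data string, and child links. It leaves out entity decoding on the assumption that text-node data is already decoded by the time the tree exists. The `textOf` wrapper that buffers the output into a string is a guess at how a string-returning variant could be layered on top, not a statement about htree's internals.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// nodeType and node are hypothetical stand-ins for html.NodeType and html.Node.
type nodeType int

const (
	textNode nodeType = iota
	elementNode
)

type node struct {
	Type                    nodeType
	Data                    string
	FirstChild, NextSibling *node
}

// txt returns a text node holding s.
func txt(s string) *node { return &node{Type: textNode, Data: s} }

// el returns an element node named name whose children are kids.
func el(name string, kids ...*node) *node {
	n := &node{Type: elementNode, Data: name}
	for i := len(kids) - 1; i >= 0; i-- {
		kids[i].NextSibling = n.FirstChild
		n.FirstChild = kids[i]
	}
	return n
}

// writeText writes the text content of the tree rooted at n to w.
// Script and style elements are skipped; br elements become newlines.
func writeText(w io.Writer, n *node) error {
	switch {
	case n.Type == textNode:
		_, err := io.WriteString(w, n.Data)
		return err
	case n.Data == "script" || n.Data == "style":
		return nil
	case n.Data == "br":
		_, err := io.WriteString(w, "\n")
		return err
	}
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		if err := writeText(w, c); err != nil {
			return err
		}
	}
	return nil
}

// textOf collects writeText's output in a string.
func textOf(n *node) (string, error) {
	var sb strings.Builder
	err := writeText(&sb, n)
	return sb.String(), err
}

func main() {
	p := el("p", txt("one"), el("br"), txt("two"), el("script", txt("x()")))
	s, err := textOf(p)
	fmt.Printf("%q %v\n", s, err) // "one\ntwo" <nil>
}
```

Writing to an `io.Writer` lets large documents stream to a file or network connection without first building the whole string in memory.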
