stew

v0.0.0-...-c07e8fe
Warning

This package is not in the latest version of its module.
Published: Jul 27, 2019 | License: MIT | Imports: 5 | Imported by: 0

README

stew

"So they poured it out for the men to eat. And as they were eating of the stew, they cried out and said, 'O man of God, there is XML in the pot.' And they were unable to eat." ~ Kings 4:40

Stew is a BeautifulSoup-like web scraper

Documentation

Overview

Package stew is a lightweight, extensible web scraping package.

Types

type DescMap

type DescMap map[string]map[*Stew]struct{}
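
The nested map in DescMap behaves like a set of nodes per tag name: the inner map's keys are node pointers and its values carry no data. A minimal sketch of that pattern follows. The `node` type and the `add` and `has` helpers are hypothetical names used only for illustration and are not part of the package.

```go
package main

import "fmt"

// node is a hypothetical stand-in for *Stew, used only in this sketch.
type node struct {
	Tag string
}

// descMap mirrors the shape of DescMap: tag name -> set of nodes.
type descMap map[string]map[*node]struct{}

// add records n under its tag, creating the inner set on first use.
func (d descMap) add(n *node) {
	set, ok := d[n.Tag]
	if !ok {
		set = make(map[*node]struct{})
		d[n.Tag] = set
	}
	set[n] = struct{}{}
}

// has reports whether n is recorded under the given tag.
// Indexing a missing tag yields a nil inner map, which is safe to read.
func (d descMap) has(tag string, n *node) bool {
	_, ok := d[tag][n]
	return ok
}

func main() {
	a, b := &node{Tag: "a"}, &node{Tag: "a"}
	d := descMap{}
	d.add(a)
	d.add(b)
	d.add(a) // inserting the same pointer twice leaves the set unchanged
	fmt.Println(len(d["a"]), d.has("a", a), d.has("p", a))
}
```

Because Go maps are unordered, a structure like this gives fast membership checks by tag but does not preserve document order.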

type ElemLookup

type ElemLookup func(*html.Node) []*html.Node

ElemLookup is a function type for breadth-first search (BFS) over a DOM tree.

func Find

func Find(attrKey, attrVal string) ElemLookup

Find returns an ElemLookup that looks for elements matching the given attribute key-value pair.

func FindAll

func FindAll(tags ...string) ElemLookup

FindAll returns an ElemLookup that looks for elements with any of the given tags.
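
The package's own implementation is not shown here, so the following is only a sketch of how lookups of this shape could be built as closures over a shared breadth-first traversal. It uses a simplified, hypothetical `node` type instead of `*html.Node`, and the names `lookup`, `bfs`, `find`, `findAll`, and `sampleTree` are illustrative. Whether the real lookups also test the root node is an assumption of this sketch.

```go
package main

import "fmt"

// node is a hypothetical, simplified stand-in for *html.Node.
type node struct {
	Tag      string
	Attrs    map[string]string
	Children []*node
}

// lookup mirrors the shape of ElemLookup for the simplified node type.
type lookup func(*node) []*node

// bfs visits root and its descendants level by level and collects matches.
func bfs(root *node, match func(*node) bool) []*node {
	var found []*node
	if root == nil {
		return found
	}
	queue := []*node{root}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		if match(cur) {
			found = append(found, cur)
		}
		queue = append(queue, cur.Children...)
	}
	return found
}

// findAll returns a lookup that matches any of the given tag names.
func findAll(tags ...string) lookup {
	want := make(map[string]struct{}, len(tags))
	for _, t := range tags {
		want[t] = struct{}{}
	}
	return func(root *node) []*node {
		return bfs(root, func(n *node) bool {
			_, ok := want[n.Tag]
			return ok
		})
	}
}

// find returns a lookup that matches one attribute key-value pair.
func find(attrKey, attrVal string) lookup {
	return func(root *node) []*node {
		return bfs(root, func(n *node) bool {
			v, ok := n.Attrs[attrKey]
			return ok && v == attrVal
		})
	}
}

// sampleTree builds a small tree with one shallow and one deep link.
func sampleTree() *node {
	return &node{Tag: "div", Children: []*node{
		{Tag: "ul", Children: []*node{
			{Tag: "a", Attrs: map[string]string{"class": "deep"}},
		}},
		{Tag: "a", Attrs: map[string]string{"class": "top"}},
	}}
}

func main() {
	root := sampleTree()
	// Breadth-first order reports the shallow link before the deep one.
	for _, n := range findAll("a")(root) {
		fmt.Println(n.Tag, n.Attrs["class"])
	}
	fmt.Println(len(find("class", "deep")(root)))
}
```

Returning a closure lets the search criteria be fixed once and the resulting lookup be applied to any number of roots.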

type Stew

type Stew struct {
	// Breadth-first position of element
	Pos uint
	// Tag name of current node
	Tag string
	// Pointer to parent node
	Parent *Stew
	// Pointers to child nodes
	Children []*Stew
	// Descs maps descendant tag name to Stew nodes
	Descs DescMap // discarding order information for searchability
	// Attrs maps attribute key to values;
	// the empty-string key holds the text content
	Attrs map[string][]string
}

Stew is a queryable alternative to html.Node.
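
To make the field comments concrete, the sketch below populates a tree with the same field layout, numbers it in breadth-first order, and reads text content from the empty-string key of Attrs. The `elem` type and the `number` and `text` helpers are hypothetical. Starting positions at 0 and joining text fragments with a space are assumptions of this sketch, not documented behavior of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// elem is a hypothetical stand-in carrying a subset of the Stew fields.
type elem struct {
	Pos      uint
	Tag      string
	Parent   *elem
	Children []*elem
	Attrs    map[string][]string
}

// text joins the values stored under the empty-string key.
// Joining with a space is an assumption made for this sketch.
func text(e *elem) string {
	return strings.Join(e.Attrs[""], " ")
}

// number assigns Pos in breadth-first order and sets Parent links.
// Starting at 0 is an assumption made for this sketch.
func number(root *elem) {
	if root == nil {
		return
	}
	var next uint
	queue := []*elem{root}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		cur.Pos = next
		next++
		for _, c := range cur.Children {
			c.Parent = cur
			queue = append(queue, c)
		}
	}
}

func main() {
	link := &elem{Tag: "a", Attrs: map[string][]string{
		"href": {"/about"},
		"":     {"About", "us"},
	}}
	para := &elem{Tag: "p", Children: []*elem{link}}
	root := &elem{Tag: "body", Children: []*elem{para, {Tag: "hr"}}}
	number(root)
	// body=0, p=1, hr=2, a=3 in breadth-first order.
	fmt.Println(link.Pos, link.Parent.Tag, text(link))
}
```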

func New

func New(link string) *Stew

New visits link and extracts the Stew tree representation of the static DOM.

func NewFromNode

func NewFromNode(root *html.Node) *Stew

NewFromNode traverses the input root node and returns the Stew tree root.

func NewFromReader

func NewFromReader(body io.ReadCloser) *Stew

NewFromReader parses HTML from the input reader and returns the Stew tree root.

func NewFromRes

func NewFromRes(res *http.Response) *Stew

NewFromRes parses the input response and returns the Stew tree root.

func (*Stew) Find

func (this *Stew) Find(attrKey, attrVal string) []*Stew

Find returns all Stew nodes matching the given attribute key-value pair.

func (*Stew) FindAll

func (this *Stew) FindAll(tags ...string) []*Stew

FindAll returns all Stew nodes matching any of the given tags.
