Documentation ¶
Overview ¶
Package htmltojson is an HTML parser based on the net/html package. It exists only to simplify HTML parsing. If you need more complex HTML processing, use net/html directly, as it offers more features. The package name does not really fit the package's purpose, but I use this package in many scraper engines, so I would rather not change the name.
Index ¶
- Constants
- func Save(node *Node) error
- func SaveNodes(nodes []Node) error
- func SaveNodesToPath(nodes []Node, path string) error
- func SaveToPath(node *Node, path string) error
- type Attr
- type Node
- func Parse(root *html.Node) *Node
- func ParseBytes(byts []byte) (*Node, error)
- func ParseFromFile(path string) (*Node, error)
- func ParseFromReader(reader io.Reader) (*Node, error)
- func ParseString(str string) (*Node, error)
- func SearchAllNode(ty, data, namespace, key, val string, node *Node) []Node
- func SearchNode(ty, data, namespace, key, val string, node *Node) *Node
Constants ¶
const (
	Text     = "text"
	Document = "document"
	Element  = "element"
	Comment  = "comment"
	Doctype  = "doctype"
)
Node Types
Variables ¶
This section is empty.
Functions ¶
func SaveNodesToPath ¶
SaveNodesToPath saves a slice of nodes to path.
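The sketch below illustrates one way a slice of nodes could be written to a path. It is not the package's implementation: it assumes the output is an indented JSON array (suggested by the json struct tags, but not confirmed here), and `writeNodesJSON` and `roundTrip` are hypothetical helpers. The `Attr` and `Node` types are local copies of the documented structs so the example runs with the standard library alone.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// Attr and Node mirror the struct definitions documented for this package.
type Attr struct {
	Namespace string `json:"namespace"`
	Key       string `json:"key"`
	Val       string `json:"val"`
}

type Node struct {
	Type      string `json:"type"`
	Data      string `json:"data"`
	Namespace string `json:"namespace"`
	Attr      []Attr `json:"attr"`
	Child     []Node `json:"child"`
}

// writeNodesJSON is a hypothetical helper, not the package's SaveNodesToPath.
// ASSUMPTION: nodes are stored as an indented JSON array.
func writeNodesJSON(nodes []Node, path string) error {
	byts, err := json.MarshalIndent(nodes, "", "  ")
	if err != nil {
		return err
	}
	return os.WriteFile(path, byts, 0o644)
}

// roundTrip writes nodes to a temporary file and reads them back,
// which makes it easy to check that nothing is lost on disk.
func roundTrip(nodes []Node) ([]Node, error) {
	dir, err := os.MkdirTemp("", "htmltojson-sketch")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "nodes.json")
	if err := writeNodesJSON(nodes, path); err != nil {
		return nil, err
	}
	byts, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var back []Node
	if err := json.Unmarshal(byts, &back); err != nil {
		return nil, err
	}
	return back, nil
}

func main() {
	nodes := []Node{
		{Type: "element", Data: "p", Child: []Node{{Type: "text", Data: "hello"}}},
		{Type: "comment", Data: "note"},
	}
	back, err := roundTrip(nodes)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(back), back[0].Child[0].Data) // prints "2 hello"
}
```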
Types ¶
type Attr ¶
type Attr struct {
	Namespace string `json:"namespace"`
	Key       string `json:"key"`
	Val       string `json:"val"`
}
Attr is an HTML attribute, such as class, style, or id.
type Node ¶
type Node struct {
	Type      string `json:"type"`
	Data      string `json:"data"`
	Namespace string `json:"namespace"`
	Attr      []Attr `json:"attr"`
	Child     []Node `json:"child"`
}
Node is a parsed HTML object.
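To make the JSON shape implied by the struct tags concrete, here is a minimal sketch that marshals a hand-built node with encoding/json. The types are local copies of the documented structs, and `toJSON` is a hypothetical helper, not part of this package. Note that nil slices encode as `null`; whether the package's parser produces nil or empty slices is not stated here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Attr and Node mirror the struct definitions documented above.
type Attr struct {
	Namespace string `json:"namespace"`
	Key       string `json:"key"`
	Val       string `json:"val"`
}

type Node struct {
	Type      string `json:"type"`
	Data      string `json:"data"`
	Namespace string `json:"namespace"`
	Attr      []Attr `json:"attr"`
	Child     []Node `json:"child"`
}

// toJSON is a hypothetical helper that returns the compact JSON encoding of n.
func toJSON(n Node) (string, error) {
	byts, err := json.Marshal(n)
	if err != nil {
		return "", err
	}
	return string(byts), nil
}

func main() {
	// A hand-built equivalent of <a href="/">home</a>.
	link := Node{
		Type:  "element",
		Data:  "a",
		Attr:  []Attr{{Key: "href", Val: "/"}},
		Child: []Node{{Type: "text", Data: "home"}},
	}
	s, err := toJSON(link)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```

Every field is always present in the output because none of the tags use `omitempty`.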
func ParseBytes ¶
ParseBytes parses HTML bytes into a marshalable node.
func ParseFromFile ¶
ParseFromFile parses HTML from the file at path.
func ParseFromReader ¶
ParseFromReader parses HTML from a reader into a marshalable node.
func ParseString ¶
ParseString parses an HTML string into a marshalable node.
func SearchAllNode ¶
SearchAllNode searches for all nodes that match the given options: ty is the HTML object type, data is the HTML tag name, key is the attribute key, and val is the attribute value for that key.
func SearchNode ¶
SearchNode searches for a single node that matches the given params: ty is the HTML object type, data is the HTML tag name, key is the attribute key, and val is the attribute value for that key.
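The following sketch shows how a search with these parameters could work as a depth-first walk over the node tree. It is an illustration under stated assumptions, not the package's code: `matches`, `findAll`, `findFirst`, and `sampleTree` are hypothetical names, and treating an empty string as "match anything" is an assumption that the real SearchNode and SearchAllNode may not share.

```go
package main

import "fmt"

// Attr and Node mirror the struct definitions documented for this package.
type Attr struct {
	Namespace string `json:"namespace"`
	Key       string `json:"key"`
	Val       string `json:"val"`
}

type Node struct {
	Type      string `json:"type"`
	Data      string `json:"data"`
	Namespace string `json:"namespace"`
	Attr      []Attr `json:"attr"`
	Child     []Node `json:"child"`
}

// matches reports whether n satisfies every filter.
// ASSUMPTION: an empty filter string matches anything.
func matches(ty, data, namespace, key, val string, n *Node) bool {
	if ty != "" && n.Type != ty {
		return false
	}
	if data != "" && n.Data != data {
		return false
	}
	if namespace != "" && n.Namespace != namespace {
		return false
	}
	if key == "" && val == "" {
		return true
	}
	for _, a := range n.Attr {
		if (key == "" || a.Key == key) && (val == "" || a.Val == val) {
			return true
		}
	}
	return false
}

// findAll collects every matching node in depth-first order.
func findAll(ty, data, namespace, key, val string, n *Node) []Node {
	var out []Node
	if matches(ty, data, namespace, key, val, n) {
		out = append(out, *n)
	}
	for i := range n.Child {
		out = append(out, findAll(ty, data, namespace, key, val, &n.Child[i])...)
	}
	return out
}

// findFirst returns the first match in depth-first order, or nil if none.
func findFirst(ty, data, namespace, key, val string, n *Node) *Node {
	if matches(ty, data, namespace, key, val, n) {
		return n
	}
	for i := range n.Child {
		if found := findFirst(ty, data, namespace, key, val, &n.Child[i]); found != nil {
			return found
		}
	}
	return nil
}

// sampleTree builds a small document with one div holding two links.
func sampleTree() *Node {
	return &Node{Type: "document", Child: []Node{
		{Type: "element", Data: "div", Attr: []Attr{{Key: "class", Val: "box"}}, Child: []Node{
			{Type: "element", Data: "a", Attr: []Attr{{Key: "href", Val: "/one"}}},
			{Type: "element", Data: "a", Attr: []Attr{{Key: "href", Val: "/two"}}},
		}},
	}}
}

func main() {
	root := sampleTree()
	links := findAll("element", "a", "", "", "", root)
	fmt.Println(len(links)) // prints 2
	first := findFirst("element", "a", "", "href", "", root)
	fmt.Println(first.Attr[0].Val) // prints /one
}
```

Returning a pointer from the single-node search lets callers distinguish "no match" (nil) from a match, which fits the `*Node` return type in the documented signature.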