Documentation ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type HTML ¶
type HTML struct{}
HTML represents an HTML parser implementation.
func (*HTML) FindAttrMap ¶
func (h *HTML) FindAttrMap(r io.Reader, q QueryAttrMap, res crawler.QueryResult) error
FindAttrMap parses HTML documents with multiple queries and retrieves the corresponding attributes of the elements found. All queries and their related attributes are stored in QueryAttrMap. All results are stored in QueryResult.
Example: p.FindAttrMap(body, QueryAttrMap{"div": "class"}, crawler.QueryResult{})
type Parser ¶
type Parser interface {
	// FindContent searches for only one element and returns its text content.
	FindContent(r io.Reader, query string) string

	// FindAttrMap searches by multiple queries and retrieves corresponding attributes
	// of found elements. All queries and related attributes are stored within QueryAttrMap.
	// All results will be stored in QueryResult.
	FindAttrMap(io.Reader, QueryAttrMap, crawler.QueryResult) error
}
Parser describes a generic set of parser functions.
type QueryAttrMap ¶
QueryAttrMap maps each parser query to the attribute whose value should be retrieved. Example: QueryAttrMap{"div": "class"}. Here all div elements are found and the class value of each one is collected.