Documentation ¶
Functions ¶
func GetImg ¶
GetImg is an `extractor.CheckFunc` used to retrieve image URLs from a web page. It uses `t` as the token to analyse and its `tokenType`. It returns the image URL or an empty `string` if `t` does not correspond to an image.
func GetLinkBasic ¶
GetLinkBasic is an `extractor.CheckFunc` used to retrieve link URLs from a web page. It uses `t` as the token to analyse and its `tokenType`. It returns the link value or an empty `string` if `t` does not correspond to a link.
NOTE: This function ignores the `nofollow` meta tag.
func GetLinkNoFollow ¶
GetLinkNoFollow is an `extractor.CheckFunc` used to retrieve link URLs from a web page. It uses `t` as the token to analyse and its `tokenType`. It returns the link value or an empty `string` if `t` does not correspond to a link.
NOTE: This function respects the `nofollow` meta tag.
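The difference between the two link checks can be sketched as follows. This is an illustration only, not the package's implementation: `linkHref` is a hypothetical helper, attributes are passed as a plain map, and it assumes `nofollow` is signalled through the link's `rel` attribute, which may not be how the package detects it.

```go
package main

import (
	"fmt"
	"strings"
)

// linkHref is a hypothetical helper, not part of the package.
// It returns the href of an <a> tag, or "" when respectNoFollow is set
// and the rel attribute contains the "nofollow" keyword (an assumption).
func linkHref(attrs map[string]string, respectNoFollow bool) string {
	if respectNoFollow {
		// rel holds a space-separated list of keywords.
		for _, v := range strings.Fields(attrs["rel"]) {
			if strings.EqualFold(v, "nofollow") {
				return ""
			}
		}
	}
	return attrs["href"]
}

func main() {
	attrs := map[string]string{
		"href": "https://example.com/a",
		"rel":  "external nofollow",
	}
	// Behaviour comparable to GetLinkBasic: nofollow is ignored.
	fmt.Println(linkHref(attrs, false))
	// Behaviour comparable to GetLinkNoFollow: the link is skipped.
	fmt.Println(linkHref(attrs, true) == "")
}
```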
Types ¶
type CheckFunc ¶
CheckFunc is a named type representing a function that checks if an `html.Token` has a link that can be crawled.
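As a rough sketch of what such a check might look like, the example below defines local stand-ins for `html.Token` and `html.TokenType` so that it needs no dependencies. The `CheckFunc` signature shown here is an assumption inferred from the descriptions above (a token and its type in, a URL or an empty `string` out), and `getImgSketch` is a hypothetical function, not the package's `GetImg`.

```go
package main

import "fmt"

// TokenType, Attribute and Token are simplified local stand-ins for the
// types in golang.org/x/net/html; they are not the real definitions.
type TokenType int

const (
	TextToken TokenType = iota
	StartTagToken
	EndTagToken
)

type Attribute struct{ Key, Val string }

type Token struct {
	Data string // tag name for tag tokens
	Attr []Attribute
}

// CheckFunc uses an assumed signature: it returns a URL, or "" when the
// token does not carry one.
type CheckFunc func(t Token, tokenType TokenType) string

// getImgSketch is a hypothetical check returning the src of an <img> tag.
func getImgSketch(t Token, tokenType TokenType) string {
	if tokenType != StartTagToken || t.Data != "img" {
		return ""
	}
	for _, a := range t.Attr {
		if a.Key == "src" {
			return a.Val
		}
	}
	return ""
}

func main() {
	var check CheckFunc = getImgSketch
	img := Token{Data: "img", Attr: []Attribute{{Key: "src", Val: "/logo.png"}}}
	fmt.Println(check(img, StartTagToken))
	fmt.Println(check(Token{Data: "p"}, StartTagToken) == "")
}
```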
type Extractor ¶
type Extractor struct {
// contains filtered or unexported fields
}
Extractor is a `struct` that extracts links found in a web page according to the results of its inner `CheckFunc` functions.
func NewExtractor ¶
NewExtractor returns a new `*extractor.Extractor`.
func (*Extractor) ExtractLinks ¶
ExtractLinks extracts, cleans and returns a `[]string` of the links found in `content` that match any of the extractor's `CheckFunc` functions (`e.cf`).
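The general idea of running several checks over a page and collecting the results could look like the sketch below. Everything in it is hypothetical: `tag`, `check`, `extract`, `anchorHref` and `imgSrc` are illustrative names, the input is a pre-parsed slice rather than raw `content`, and "cleaning" is assumed to mean trimming whitespace and dropping duplicates, which the package may define differently.

```go
package main

import (
	"fmt"
	"strings"
)

// tag is a simplified stand-in for a parsed HTML start tag.
type tag struct {
	name  string
	attrs map[string]string
}

// check mirrors the idea of a CheckFunc: it returns a URL or "".
type check func(t tag) string

// anchorHref returns the href of an <a> tag, or "".
func anchorHref(t tag) string {
	if t.name == "a" {
		return t.attrs["href"]
	}
	return ""
}

// imgSrc returns the src of an <img> tag, or "".
func imgSrc(t tag) string {
	if t.name == "img" {
		return t.attrs["src"]
	}
	return ""
}

// extract applies every check to every tag, keeps non-empty results,
// trims surrounding whitespace, and drops duplicates in first-seen order.
func extract(tags []tag, checks ...check) []string {
	seen := make(map[string]bool)
	var links []string
	for _, t := range tags {
		for _, c := range checks {
			link := strings.TrimSpace(c(t))
			if link == "" || seen[link] {
				continue
			}
			seen[link] = true
			links = append(links, link)
		}
	}
	return links
}

func main() {
	tags := []tag{
		{name: "a", attrs: map[string]string{"href": " /about "}},
		{name: "img", attrs: map[string]string{"src": "/logo.png"}},
		{name: "a", attrs: map[string]string{"href": "/about"}},
	}
	fmt.Println(extract(tags, anchorHref, imgSrc))
}
```

Passing the checks as a variadic list keeps the loop independent of what each check looks for, which matches the description of an extractor driven by its inner `CheckFunc` functions.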