Documentation ¶
Index ¶
- Variables
- func CollapseRepeatingSpaces(input string) string
- func CollapseRepeatingWhitespace(input string) string
- func GetNodeName(node *html.Node) string
- func HTMLToPlainText(htmlMarkup string) (string, error)
- func IsElement(node *html.Node) bool
- func IsElementNodeOfType(node *html.Node, types []string) bool
- func IsTextNode(node *html.Node) bool
- func LastIn(slice []string) string
- func TrimSpaces(input string) string
- func TrimWhitespaceLeft(input string) string
- func Walk(root *html.Node, enter func(*html.Node), exit func(*html.Node))
Constants ¶
This section is empty.
Variables ¶
var MetadataContent = []string{
"base",
"command",
"link",
"meta",
"noscript",
"script",
"style",
"title",
"html",
"head",
}
MetadataContent is the set of node names that belong to the HTML metadata content category.
The node names `html` and `head` are treated as special set members.
var PhrasingContent = []string{
"a",
"abbr",
"audio",
"b",
"bdo",
"br",
"button",
"canvas",
"cite",
"code",
"command",
"data",
"datalist",
"dfn",
"em",
"embed",
"i",
"iframe",
"img",
"input",
"kbd",
"keygen",
"label",
"mark",
"math",
"meter",
"noscript",
"object",
"output",
"progress",
"q",
"ruby",
"samp",
"script",
"select",
"small",
"span",
"strong",
"sub",
"sup",
"svg",
"textarea",
"time",
"var",
"video",
"wbr",
"map",
"area",
}
PhrasingContent is the set of node names that belong to the HTML phrasing content category.
The node names `map` and `area` are treated as special set members.
Functions ¶
func CollapseRepeatingSpaces ¶
CollapseRepeatingSpaces returns a slice of the string input with each run of repeating spaces reduced to a single space.
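The behavior described above can be illustrated with a minimal sketch that uses only the standard library. The name `collapseSpaces` is hypothetical and this is not the package's implementation; it assumes that only the ASCII space character counts as a space, so tabs and newlines pass through unchanged.

```go
package main

import (
	"fmt"
	"strings"
)

// collapseSpaces is a hypothetical stand-in, not the package's code.
// It reduces every run of two or more ASCII spaces to a single space
// and copies every other byte through unchanged.
func collapseSpaces(input string) string {
	var b strings.Builder
	b.Grow(len(input))
	prevSpace := false
	// Iterating over bytes is safe here: 0x20 never occurs inside a
	// multi-byte UTF-8 sequence.
	for i := 0; i < len(input); i++ {
		c := input[i]
		if c == ' ' && prevSpace {
			continue // skip the repeated space
		}
		prevSpace = c == ' '
		b.WriteByte(c)
	}
	return b.String()
}

func main() {
	fmt.Printf("%q\n", collapseSpaces("a   b \t  c")) // prints "a b \t c"
}
```

Note that the tab in the example separates two runs of spaces, so one space survives on each side of it.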
func CollapseRepeatingWhitespace ¶
CollapseRepeatingWhitespace returns a slice of the string input with repeating whitespace characters reduced down to one.
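For whitespace in general, one possible approach is a regular expression. The sketch below is an assumption-laden illustration, not the package's code: the name `collapseWhitespace` is hypothetical, and it assumes each run is replaced by a single space, which the description above does not specify. As a side effect, a lone tab or newline is also normalized to a space.

```go
package main

import (
	"fmt"
	"regexp"
)

// whitespaceRun matches one or more consecutive whitespace characters.
// In Go's regexp syntax, \s is the ASCII class [\t\n\f\r ].
var whitespaceRun = regexp.MustCompile(`\s+`)

// collapseWhitespace is a hypothetical stand-in, not the package's code.
// Assumption: every whitespace run is replaced by exactly one space.
func collapseWhitespace(input string) string {
	return whitespaceRun.ReplaceAllString(input, " ")
}

func main() {
	fmt.Printf("%q\n", collapseWhitespace("a \t\n b\tc")) // prints "a b c"
}
```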
func GetNodeName ¶
GetNodeName returns the node name of the given node.
func HTMLToPlainText ¶
HTMLToPlainText receives HTML markup text as input, and returns a transformed plain-text representation.
It implements an algorithm similar to an HTML5 DOM element node's `.innerText` property, but it does not take layout or styling into account.
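One way a transformation like this can be structured is a depth-first traversal with enter and exit callbacks, which is the shape of the `Walk` signature listed in the index. The sketch below is illustrative only: the `node` type is a hypothetical stand-in for `*html.Node`, and the rule of emitting a newline after each `p` element is an assumption chosen for the example, not the package's algorithm.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical, minimal stand-in for a parsed HTML node.
type node struct {
	Name        string // element name; empty for text nodes
	Text        string // text content; used by text nodes only
	FirstChild  *node
	NextSibling *node
}

// walk visits the tree depth-first, calling enter before a node's
// children are visited and exit after they have all been visited.
func walk(n *node, enter, exit func(*node)) {
	if n == nil {
		return
	}
	enter(n)
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		walk(c, enter, exit)
	}
	exit(n)
}

// plainText collects the text nodes and writes a newline when a <p>
// element is exited (an illustrative rule only).
func plainText(root *node) string {
	var b strings.Builder
	walk(root,
		func(n *node) {
			if n.Name == "" {
				b.WriteString(n.Text)
			}
		},
		func(n *node) {
			if n.Name == "p" {
				b.WriteString("\n")
			}
		})
	return b.String()
}

// sample builds the tree for <div><p>Hello <b>world</b></p><p>Bye</p></div>.
func sample() *node {
	bold := &node{Name: "b", FirstChild: &node{Text: "world"}}
	hello := &node{Text: "Hello ", NextSibling: bold}
	p2 := &node{Name: "p", FirstChild: &node{Text: "Bye"}}
	p1 := &node{Name: "p", FirstChild: hello, NextSibling: p2}
	return &node{Name: "div", FirstChild: p1}
}

func main() {
	fmt.Printf("%q\n", plainText(sample())) // prints "Hello world\nBye\n"
}
```

Separating the traversal from the per-node logic keeps block-level decisions, such as where line breaks go, in the exit callback, where the whole subtree has already been emitted.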
func IsElementNodeOfType ¶
IsElementNodeOfType returns true if the given node has a name that is a member of the types slice.
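The membership test at the core of this check can be sketched with plain strings. The helper `isOneOf` is hypothetical and operates on a node name rather than on an `*html.Node`; the slice in the example is an abbreviated subset of `MetadataContent`, used purely for illustration.

```go
package main

import "fmt"

// isOneOf is a hypothetical helper, not the package's code.
// It reports whether name appears in types.
func isOneOf(name string, types []string) bool {
	for _, t := range types {
		if t == name {
			return true
		}
	}
	return false
}

func main() {
	// Abbreviated subset of MetadataContent, for illustration only.
	metadata := []string{"base", "link", "meta", "script", "style", "title"}
	fmt.Println(isOneOf("script", metadata), isOneOf("p", metadata)) // prints "true false"
}
```

A linear scan is adequate for lists of a few dozen names; a `map[string]struct{}` would be the usual alternative if the sets were much larger.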
func IsTextNode ¶
IsTextNode returns true if the given node is a text node.
func LastIn ¶
LastIn returns the last item of the given slice.
If the slice is empty, the null character is returned.
func TrimSpaces ¶
TrimSpaces returns a slice of the string input with only spaces removed.
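The description leaves open where the spaces are removed. The sketch below assumes that it means leading and trailing ASCII spaces, with other whitespace left in place; the name `trimSpaces` is hypothetical and the package may behave differently.

```go
package main

import (
	"fmt"
	"strings"
)

// trimSpaces is a hypothetical stand-in, not the package's code.
// Assumption: only leading and trailing ASCII spaces are removed;
// tabs and newlines are kept.
func trimSpaces(input string) string {
	return strings.Trim(input, " ")
}

func main() {
	fmt.Printf("%q\n", trimSpaces("  a b\t ")) // prints "a b\t"
}
```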
func TrimWhitespaceLeft ¶
TrimWhitespaceLeft returns a slice of the string input with all leading whitespace characters removed.
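A comparable result can be obtained from the standard library. The sketch below is not the package's implementation: `trimLeft` is a hypothetical name, and it assumes that "whitespace" means whatever `unicode.IsSpace` reports, which may be broader than the package's definition.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// trimLeft is a hypothetical stand-in, not the package's code.
// It returns a subslice of input that starts at the first
// non-whitespace rune; trailing whitespace is left alone.
func trimLeft(input string) string {
	return strings.TrimLeftFunc(input, unicode.IsSpace)
}

func main() {
	fmt.Printf("%q\n", trimLeft(" \t\n a b ")) // prints "a b "
}
```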
Types ¶
This section is empty.