xmlquery: github.com/antchfx/xmlquery

package xmlquery

import "github.com/antchfx/xmlquery"

Package xmlquery extracts data from XML documents using XPath expressions.

Package Files

cache.go node.go parse.go query.go

Variables

var DisableSelectorCache = false

DisableSelectorCache disables caching for query selectors when set to true.

var SelectorCacheMaxEntries = 50

SelectorCacheMaxEntries sets the maximum number of selector objects that can be cached. The default is 50. Caching is disabled if SelectorCacheMaxEntries <= 0.

func AddAttr

func AddAttr(n *Node, key, val string)

AddAttr adds a new attribute specified by 'key' and 'val' to a node 'n'.

func AddChild

func AddChild(parent, n *Node)

AddChild adds a new node 'n' to a node 'parent' as its last child.

func AddSibling

func AddSibling(sibling, n *Node)

AddSibling adds a new node 'n' as a sibling of the given node 'sibling'. Note that 'n' is not necessarily added immediately after 'sibling': if 'sibling' is not the last child of its parent, 'n' is added at the end of the parent's sibling chain.
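The placement rule can be illustrated with a small stand-in tree. This is only a sketch: the `node` type and the `addChild`, `addSibling`, and `childNames` helpers below are hypothetical and are not the package's own implementation. They only mimic the documented behavior that the new node lands at the end of the sibling chain.

```go
package main

import "fmt"

// node is a stand-in for illustration only; it is not xmlquery.Node.
type node struct {
	name                     string
	parent                   *node
	firstChild, lastChild    *node
	prevSibling, nextSibling *node
}

// addChild appends n as the last child of parent.
func addChild(parent, n *node) {
	n.parent = parent
	n.nextSibling = nil
	n.prevSibling = parent.lastChild
	if parent.lastChild == nil {
		parent.firstChild = n
	} else {
		parent.lastChild.nextSibling = n
	}
	parent.lastChild = n
}

// addSibling follows the documented rule: n is placed at the end of the
// sibling chain, even when sibling is not the last child of its parent.
func addSibling(sibling, n *node) {
	last := sibling
	for last.nextSibling != nil {
		last = last.nextSibling
	}
	n.parent = sibling.parent
	n.prevSibling = last
	n.nextSibling = nil
	last.nextSibling = n
	if sibling.parent != nil {
		sibling.parent.lastChild = n
	}
}

// childNames lists the names of parent's children in document order.
func childNames(parent *node) []string {
	var names []string
	for c := parent.firstChild; c != nil; c = c.nextSibling {
		names = append(names, c.name)
	}
	return names
}

func main() {
	root := &node{name: "root"}
	a, b := &node{name: "a"}, &node{name: "b"}
	addChild(root, a)
	addChild(root, b)
	// "c" is added as a sibling of "a", yet it ends up after "b".
	addSibling(a, &node{name: "c"})
	fmt.Println(childNames(root)) // [a b c]
}
```

Code that needs the new node directly after a specific sibling therefore cannot rely on AddSibling alone.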

func FindEach

func FindEach(top *Node, expr string, cb func(int, *Node))

FindEach searches the Node tree rooted at top and calls the function cb for each match. Important: this function is deprecated; use `for ... := range Find(...) {}` instead.

func FindEachWithBreak

func FindEachWithBreak(top *Node, expr string, cb func(int, *Node) bool)

FindEachWithBreak works the same as FindEach but lets you break out of the loop by returning false from your callback function, cb. Important: this function is deprecated; use `for ... := range Find(...) {}` instead.

func RemoveFromTree

func RemoveFromTree(n *Node)

RemoveFromTree removes a node and its subtree from the document tree it is in. If the node is the root of the tree, this is a no-op.

type Node

type Node struct {
    Parent, FirstChild, LastChild, PrevSibling, NextSibling *Node

    Type         NodeType
    Data         string
    Prefix       string
    NamespaceURI string
    Attr         []xml.Attr
    // contains filtered or unexported fields
}

A Node consists of a NodeType and some Data (tag name for element nodes, content for text) and is part of a tree of Nodes.

func Find

func Find(top *Node, expr string) []*Node

Find is like QueryAll but panics if `expr` is not a valid XPath expression. See the `QueryAll()` function.

func FindOne

func FindOne(top *Node, expr string) *Node

FindOne is like Query but panics if `expr` is not a valid XPath expression. See the `Query()` function.

func LoadURL

func LoadURL(url string) (*Node, error)

LoadURL loads the XML document from the specified URL.

func Parse

func Parse(r io.Reader) (*Node, error)

Parse returns the parse tree for the XML from the given Reader.

func Query

func Query(top *Node, expr string) (*Node, error)

Query searches for the XML Node that matches the specified XPath expr and returns the first match.

func QueryAll

func QueryAll(top *Node, expr string) ([]*Node, error)

QueryAll searches for all XML Nodes that match the specified XPath expr. It returns an error if the expression `expr` cannot be parsed.

func QuerySelector

func QuerySelector(top *Node, selector *xpath.Expr) *Node

QuerySelector returns the first XML Node matched by the specified XPath selector.

func QuerySelectorAll

func QuerySelectorAll(top *Node, selector *xpath.Expr) []*Node

QuerySelectorAll returns all XML Nodes that match the specified XPath selector.

func (*Node) InnerText

func (n *Node) InnerText() string

InnerText returns the text between the start and end tags of the object.

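func (*Node) OutputXML

func (n *Node) OutputXML(self bool) string

OutputXML returns the node's content as XML text, including tag names. If self is true, the output includes the node's own start and end tags; otherwise only its children are output.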

func (*Node) SelectAttr

func (n *Node) SelectAttr(name string) string

SelectAttr returns the attribute value with the specified name.

func (*Node) SelectElement

func (n *Node) SelectElement(name string) *Node

SelectElement finds the first child element with the specified name.

func (*Node) SelectElements

func (n *Node) SelectElements(name string) []*Node

SelectElements finds child elements with the specified name.

type NodeNavigator

type NodeNavigator struct {
    // contains filtered or unexported fields
}

func CreateXPathNavigator

func CreateXPathNavigator(top *Node) *NodeNavigator

CreateXPathNavigator creates a new xpath.NodeNavigator for the specified XML Node.

func (*NodeNavigator) Copy

func (x *NodeNavigator) Copy() xpath.NodeNavigator

func (*NodeNavigator) Current

func (x *NodeNavigator) Current() *Node

func (*NodeNavigator) LocalName

func (x *NodeNavigator) LocalName() string

func (*NodeNavigator) MoveTo

func (x *NodeNavigator) MoveTo(other xpath.NodeNavigator) bool

func (*NodeNavigator) MoveToChild

func (x *NodeNavigator) MoveToChild() bool

func (*NodeNavigator) MoveToFirst

func (x *NodeNavigator) MoveToFirst() bool

func (*NodeNavigator) MoveToNext

func (x *NodeNavigator) MoveToNext() bool

func (*NodeNavigator) MoveToNextAttribute

func (x *NodeNavigator) MoveToNextAttribute() bool

func (*NodeNavigator) MoveToParent

func (x *NodeNavigator) MoveToParent() bool

func (*NodeNavigator) MoveToPrevious

func (x *NodeNavigator) MoveToPrevious() bool

func (*NodeNavigator) MoveToRoot

func (x *NodeNavigator) MoveToRoot()

func (*NodeNavigator) NamespaceURL

func (x *NodeNavigator) NamespaceURL() string

func (*NodeNavigator) NodeType

func (x *NodeNavigator) NodeType() xpath.NodeType

func (*NodeNavigator) Prefix

func (x *NodeNavigator) Prefix() string

func (*NodeNavigator) String

func (x *NodeNavigator) String() string

func (*NodeNavigator) Value

func (x *NodeNavigator) Value() string

type NodeType

type NodeType uint

A NodeType is the type of a Node.

const (
    // DocumentNode is a document object that, as the root of the document tree,
    // provides access to the entire XML document.
    DocumentNode NodeType = iota
    // DeclarationNode is the document type declaration, indicated by the following
    // tag (for example, <!DOCTYPE...> ).
    DeclarationNode
    // ElementNode is an element (for example, <item> ).
    ElementNode
    // TextNode is the text content of a node.
    TextNode
    // CharDataNode is a CDATA section (for example, <![CDATA[content]]> ).
    CharDataNode
    // CommentNode is a comment (for example, <!-- my comment --> ).
    CommentNode
    // AttributeNode is an attribute of element.
    AttributeNode
)
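Because the constants are declared with iota, their numeric values follow the declaration order, starting at 0 for DocumentNode. The sketch below re-declares the type locally so that it runs on its own, and adds `typeName`, a hypothetical debugging helper that is not part of the package, to print a readable name for each value.

```go
package main

import "fmt"

// NodeType and its constants are re-declared here only so the example is
// self-contained; the order mirrors the const block above.
type NodeType uint

const (
	DocumentNode NodeType = iota
	DeclarationNode
	ElementNode
	TextNode
	CharDataNode
	CommentNode
	AttributeNode
)

// typeName is a hypothetical helper (not provided by xmlquery) that maps a
// NodeType to a short label, falling back to the numeric value.
func typeName(t NodeType) string {
	names := [...]string{
		"Document", "Declaration", "Element", "Text",
		"CharData", "Comment", "Attribute",
	}
	if int(t) < len(names) {
		return names[t]
	}
	return fmt.Sprintf("NodeType(%d)", uint(t))
}

func main() {
	// Prints "0 Document" through "6 Attribute", one per line.
	for t := DocumentNode; t <= AttributeNode; t++ {
		fmt.Printf("%d %s\n", t, typeName(t))
	}
}
```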

type StreamParser

type StreamParser struct {
    // contains filtered or unexported fields
}

StreamParser enables loading and parsing an XML document in a streaming fashion.

func CreateStreamParser

func CreateStreamParser(r io.Reader, streamElementXPath string, streamElementFilter ...string) (*StreamParser, error)

CreateStreamParser creates a StreamParser. Argument streamElementXPath is required. Argument streamElementFilter is optional and should only be used in advanced scenarios.

Scenario 1: simple case:

xml := `<AAA><BBB>b1</BBB><BBB>b2</BBB></AAA>`
sp, err := CreateStreamParser(strings.NewReader(xml), "/AAA/BBB")
if err != nil {
    panic(err)
}
for {
    n, err := sp.Read()
    if err != nil {
        break // io.EOF at end of input, or an XML parsing error
    }
    fmt.Println(n.OutputXML(true))
}

Output will be:

<BBB>b1</BBB>
<BBB>b2</BBB>

Scenario 2: advanced case:

xml := `<AAA><BBB>b1</BBB><BBB>b2</BBB></AAA>`
sp, err := CreateStreamParser(strings.NewReader(xml), "/AAA/BBB", "/AAA/BBB[. != 'b1']")
if err != nil {
    panic(err)
}
for {
    n, err := sp.Read()
    if err != nil {
        break // io.EOF at end of input, or an XML parsing error
    }
    fmt.Println(n.OutputXML(true))
}

Output will be:

<BBB>b2</BBB>

As the argument names indicate, streamElementXPath should only provide an XPath query that points to the target element node, with no extra filtering on the element itself or its children. streamElementFilter, if needed, can provide additional filtering on the target element and its children.

CreateStreamParser returns an error if streamElementXPath or streamElementFilter (if provided) cannot be parsed and compiled into a valid XPath query.

func (*StreamParser) Read

func (sp *StreamParser) Read() (*Node, error)

Read returns a target node that satisfies the XPath specified by the caller at StreamParser creation time. If no more matching target nodes remain after reading the rest of the XML document, io.EOF is returned. If an XML parsing error is encountered at any time, the error is returned and stream parsing stops. Calling Read() after an error has been returned (including io.EOF) is not allowed; the behavior is undefined. Also note that, due to the streaming nature, calling Read() automatically removes any previous target node(s) from the document tree.
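Since Read reports both the normal end of input (io.EOF) and parsing failures through the same error value, a caller may want to tell the two apart and stop calling Read in either case. The following is a minimal sketch of that loop. `drain`, `sliceReader`, and `errMalformed` are hypothetical names introduced for illustration, and `sliceReader` merely stands in for a real parser so the example runs without the package.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// errMalformed is a made-up error used to simulate a parsing failure.
var errMalformed = errors.New("malformed XML")

// drain is a hypothetical helper, not part of xmlquery. It calls read until
// an error occurs, hands each value to handle, and never calls read again
// after an error. io.EOF is treated as a clean end and reported as nil.
func drain[T any](read func() (T, error), handle func(T)) error {
	for {
		v, err := read()
		if errors.Is(err, io.EOF) {
			return nil
		}
		if err != nil {
			return err
		}
		handle(v)
	}
}

// sliceReader stands in for a stream parser's Read method: it yields the
// given items in order, then returns fail if it is non-nil, else io.EOF.
func sliceReader(items []string, fail error) func() (string, error) {
	i := 0
	return func() (string, error) {
		if i < len(items) {
			i++
			return items[i-1], nil
		}
		if fail != nil {
			return "", fail
		}
		return "", io.EOF
	}
}

func main() {
	read := sliceReader([]string{"<BBB>b1</BBB>", "<BBB>b2</BBB>"}, nil)
	err := drain(read, func(s string) { fmt.Println(s) })
	fmt.Println("error:", err) // error: <nil>
}
```

With a real parser, the method value could be passed directly, for example `drain(sp.Read, func(n *xmlquery.Node) { ... })`. Because each Read removes the previous target node from the document tree, the handler should finish with a node before the next iteration.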

Package xmlquery imports 12 packages and is imported by 37 packages. Updated 2020-09-21.