Documentation ¶
Overview ¶
Package xml implements a simple XML 1.0 parser that understands XML name spaces.
Index ¶
- Constants
- Variables
- func Escape(w io.Writer, s []byte)
- func EscapeText(w io.Writer, s []byte) error
- type Attr
- type CharData
- type Comment
- type Decoder
- type Directive
- type Encoder
- type EndElement
- type Marshaler
- type MarshalerAttr
- type Name
- type ProcInst
- type StartElement
- type SyntaxError
- type TagPathError
- type Token
- type UnmarshalError
- type Unmarshaler
- type UnmarshalerAttr
- type UnsupportedTypeError
- Bugs
Constants ¶
const (
	// A generic XML header suitable for use with the output of Marshal.
	// This is not automatically added to any output of this package,
	// it is provided as a convenience.
	Header = `<?xml version="1.0" encoding="UTF-8"?>` + "\n"
)
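Because Header is not added automatically, callers that want a complete document prepend it themselves. The following is a minimal sketch, assuming the standard library's encoding/xml (the import path of this package is not stated here) and a hypothetical Note type and withHeader helper.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Note is a hypothetical type used only for this illustration.
type Note struct {
	XMLName xml.Name `xml:"note"`
	Body    string   `xml:"body"`
}

// withHeader marshals v and prepends the generic XML header,
// which Marshal does not emit on its own.
func withHeader(v interface{}) (string, error) {
	body, err := xml.Marshal(v)
	if err != nil {
		return "", err
	}
	return xml.Header + string(body), nil
}

func main() {
	out, err := withHeader(Note{Body: "hi"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```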
Variables ¶
var HTMLAutoClose = htmlAutoClose
HTMLAutoClose is the set of HTML elements that should be considered to close automatically.
var HTMLEntity = htmlEntity
HTMLEntity is an entity map containing translations for the standard HTML entity characters.
Functions ¶
Types ¶
type CharData ¶
type CharData []byte
A CharData represents XML character data (raw text), in which XML escape sequences have been replaced by the characters they represent.
type Comment ¶
type Comment []byte
A Comment represents an XML comment of the form <!--comment-->. The bytes do not include the <!-- and --> comment markers.
type Decoder ¶
type Decoder struct {
	// Strict defaults to true, enforcing the requirements
	// of the XML specification.
	// If set to false, the parser allows input containing common
	// mistakes:
	//	* If an element is missing an end tag, the parser invents
	//	  end tags as necessary to keep the return values from Token
	//	  properly balanced.
	//	* In attribute values and character data, unknown or malformed
	//	  character entities (sequences beginning with &) are left alone.
	//
	// Setting:
	//
	//	d.Strict = false;
	//	d.AutoClose = HTMLAutoClose;
	//	d.Entity = HTMLEntity
	//
	// creates a parser that can handle typical HTML.
	//
	// Strict mode does not enforce the requirements of the XML name spaces TR.
	// In particular it does not reject name space tags using undefined prefixes.
	// Such tags are recorded with the unknown prefix as the name space URL.
	Strict bool

	// When Strict == false, AutoClose indicates a set of elements to
	// consider closed immediately after they are opened, regardless
	// of whether an end element is present.
	AutoClose []string

	// Entity can be used to map non-standard entity names to string replacements.
	// The parser behaves as if these standard mappings are present in the map,
	// regardless of the actual map content:
	//
	//	"lt": "<",
	//	"gt": ">",
	//	"amp": "&",
	//	"apos": "'",
	//	"quot": `"`,
	Entity map[string]string

	// CharsetReader, if non-nil, defines a function to generate
	// charset-conversion readers, converting from the provided
	// non-UTF-8 charset into UTF-8. If CharsetReader is nil or
	// returns an error, parsing stops with an error. One of the
	// CharsetReader's result values must be non-nil.
	CharsetReader func(charset string, input io.Reader) (io.Reader, error)

	// DefaultSpace sets the default name space used for unadorned tags,
	// as if the entire XML stream were wrapped in an element containing
	// the attribute xmlns="DefaultSpace".
	DefaultSpace string
	// contains filtered or unexported fields
}
A Decoder represents an XML parser reading a particular input stream. The parser assumes that its input is encoded in UTF-8.
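The non-strict settings described in the Strict field comment can be tried out as follows. This is a sketch only, assuming the standard library's encoding/xml as the import; startNames is a hypothetical helper that lists start element names.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// startNames returns the local names of the start elements in doc.
// When lenient is true, the decoder is configured as suggested in the
// Strict field comment so that it tolerates typical HTML.
func startNames(doc string, lenient bool) ([]string, error) {
	d := xml.NewDecoder(strings.NewReader(doc))
	if lenient {
		d.Strict = false
		d.AutoClose = xml.HTMLAutoClose
		d.Entity = xml.HTMLEntity
	}
	var names []string
	for {
		tok, err := d.Token()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		if se, ok := tok.(xml.StartElement); ok {
			names = append(names, se.Name.Local)
		}
	}
}

func main() {
	// <br> has no end tag and &nbsp; is not a predefined XML entity.
	doc := `<p>Hello<br>world&nbsp;!</p>`

	_, err := startNames(doc, false)
	fmt.Println("strict error:", err != nil)

	names, err := startNames(doc, true)
	fmt.Println(names, err)
}
```

In strict mode the same input is rejected, so the lenient configuration is best reserved for input that is known to be HTML-like.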
func NewDecoder ¶
NewDecoder creates a new XML parser reading from r. If r does not implement io.ByteReader, NewDecoder will do its own buffering.
func (*Decoder) Decode ¶
Decode works like xml.Unmarshal, except it reads the decoder stream to find the start element.
func (*Decoder) DecodeElement ¶
func (d *Decoder) DecodeElement(v interface{}, start *StartElement) error
DecodeElement works like xml.Unmarshal except that it takes a pointer to the start XML element to decode into v. It is useful when a client reads some raw XML tokens itself but also wants to defer to Unmarshal for some elements.
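A common pattern is to scan tokens by hand and hand selected elements to DecodeElement. The sketch below assumes the standard library's encoding/xml as the import; the Item type and readItems helper are hypothetical.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// Item is a hypothetical record type for this illustration.
type Item struct {
	ID   string `xml:"id,attr"`
	Name string `xml:"name"`
}

// readItems walks the token stream itself and defers to DecodeElement
// only for <item> elements; every other token is ignored.
func readItems(doc string) ([]Item, error) {
	d := xml.NewDecoder(strings.NewReader(doc))
	var items []Item
	for {
		tok, err := d.Token()
		if err == io.EOF {
			return items, nil
		}
		if err != nil {
			return nil, err
		}
		se, ok := tok.(xml.StartElement)
		if !ok || se.Name.Local != "item" {
			continue
		}
		var it Item
		// DecodeElement consumes the rest of the element, including its end tag.
		if err := d.DecodeElement(&it, &se); err != nil {
			return nil, err
		}
		items = append(items, it)
	}
}

func main() {
	doc := `<list><item id="1"><name>first</name></item>` +
		`<note>ignored</note>` +
		`<item id="2"><name>second</name></item></list>`
	items, err := readItems(doc)
	fmt.Println(items, err)
}
```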
func (*Decoder) InputOffset ¶
InputOffset returns the input stream byte offset of the current decoder position. The offset gives the location of the end of the most recently returned token and the beginning of the next token.
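To see how the offset advances, one can record it after each token. This is an illustrative sketch, assuming the standard library's encoding/xml as the import; tokenEnds is a hypothetical helper.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// tokenEnds returns the value of InputOffset observed after each token,
// that is, the byte position at which each returned token ended.
func tokenEnds(doc string) ([]int64, error) {
	d := xml.NewDecoder(strings.NewReader(doc))
	var ends []int64
	for {
		_, err := d.Token()
		if err == io.EOF {
			return ends, nil
		}
		if err != nil {
			return nil, err
		}
		ends = append(ends, d.InputOffset())
	}
}

func main() {
	ends, err := tokenEnds(`<a><b/>text</a>`)
	fmt.Println(ends, err)
}
```

After the final token the offset equals the length of the input, which makes it usable for reporting how much of a stream was consumed.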
func (*Decoder) RawToken ¶
RawToken is like Token but does not verify that start and end elements match and does not translate name space prefixes to their corresponding URLs.
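The difference in name space handling can be observed by reading the same document both ways. The following sketch assumes the standard library's encoding/xml as the import; firstStartSpace is a hypothetical helper and the URL is a placeholder.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// firstStartSpace returns the Name.Space of the first start element in doc,
// read with RawToken when raw is true and with Token otherwise.
func firstStartSpace(doc string, raw bool) (string, error) {
	d := xml.NewDecoder(strings.NewReader(doc))
	for {
		var tok xml.Token
		var err error
		if raw {
			tok, err = d.RawToken()
		} else {
			tok, err = d.Token()
		}
		if err != nil {
			return "", err
		}
		if se, ok := tok.(xml.StartElement); ok {
			return se.Name.Space, nil
		}
	}
}

func main() {
	doc := `<h:table xmlns:h="http://example.com/html"><h:td/></h:table>`

	translated, _ := firstStartSpace(doc, false)
	untranslated, _ := firstStartSpace(doc, true)

	fmt.Println("Token:   ", translated)
	fmt.Println("RawToken:", untranslated)
}
```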
func (*Decoder) Skip ¶
Skip reads tokens until it has consumed the end element matching the most recent start element already consumed. It recurs if it encounters a start element, so it can be used to skip nested structures. It returns nil if it finds an end element matching the start element; otherwise it returns an error describing the problem.
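For instance, Skip can drop an entire subtree while the surrounding text is collected. This is a sketch under the assumption that the standard library's encoding/xml is the import; textOutside is a hypothetical helper.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// textOutside concatenates the character data of doc, omitting everything
// inside elements whose local name equals skip (nested content included).
func textOutside(doc, skip string) (string, error) {
	d := xml.NewDecoder(strings.NewReader(doc))
	var sb strings.Builder
	for {
		tok, err := d.Token()
		if err == io.EOF {
			return sb.String(), nil
		}
		if err != nil {
			return "", err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			if t.Name.Local == skip {
				// The start element has just been consumed, so Skip
				// reads through its matching end element.
				if err := d.Skip(); err != nil {
					return "", err
				}
			}
		case xml.CharData:
			sb.Write(t)
		}
	}
}

func main() {
	out, err := textOutside(`<p>keep <x>drop <y>nested</y></x>this</p>`, "x")
	fmt.Printf("%q %v\n", out, err)
}
```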
func (*Decoder) Token ¶
Token returns the next XML token in the input stream. At the end of the input stream, Token returns nil, io.EOF.
Slices of bytes in the returned token data refer to the parser's internal buffer and remain valid only until the next call to Token. To acquire a copy of the bytes, call CopyToken or the token's Copy method.
Token expands self-closing elements such as <br/> into separate start and end elements returned by successive calls.
Token guarantees that the StartElement and EndElement tokens it returns are properly nested and matched: if Token encounters an unexpected end element, it will return an error.
Token implements XML name spaces as described by http://www.w3.org/TR/REC-xml-names/. Each of the Name structures contained in the Token has the Space set to the URL identifying its name space when known. If Token encounters an unrecognized name space prefix, it uses the prefix as the Space rather than report an error.
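A typical read loop built on Token might look like the sketch below, which also shows a self-closing element being expanded and a mismatched end element being rejected. It assumes the standard library's encoding/xml as the import; tokenKinds is a hypothetical helper.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// tokenKinds lists the start and end elements of doc in the order that
// Token returns them, for example "start:a" or "end:a".
func tokenKinds(doc string) ([]string, error) {
	d := xml.NewDecoder(strings.NewReader(doc))
	var kinds []string
	for {
		tok, err := d.Token()
		if err == io.EOF {
			return kinds, nil // end of input
		}
		if err != nil {
			return nil, err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			kinds = append(kinds, "start:"+t.Name.Local)
		case xml.EndElement:
			kinds = append(kinds, "end:"+t.Name.Local)
		}
	}
}

func main() {
	kinds, err := tokenKinds(`<a><br/></a>`)
	fmt.Println(kinds, err)

	_, err = tokenKinds(`<a></b>`)
	fmt.Println("mismatch rejected:", err != nil)
}
```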
type Directive ¶
type Directive []byte
A Directive represents an XML directive of the form <!text>. The bytes do not include the <! and > markers.
type Encoder ¶
type Encoder struct {
// contains filtered or unexported fields
}
An Encoder writes XML data to an output stream.
func NewEncoder ¶
NewEncoder returns a new encoder that writes to w.
func (*Encoder) Encode ¶
Encode writes the XML encoding of v to the stream.
See the documentation for Marshal for details about the conversion of Go values to XML.
Encode calls Flush before returning.
func (*Encoder) EncodeElement ¶
func (enc *Encoder) EncodeElement(v interface{}, start StartElement) error
EncodeElement writes the XML encoding of v to the stream, using start as the outermost tag in the encoding.
See the documentation for Marshal for details about the conversion of Go values to XML.
EncodeElement calls Flush before returning.
func (*Encoder) EncodeToken ¶
EncodeToken writes the given XML token to the stream. It returns an error if StartElement and EndElement tokens are not properly matched.
EncodeToken does not call Flush, because usually it is part of a larger operation such as Encode or EncodeElement (or a custom Marshaler's MarshalXML invoked during those), and those will call Flush when finished. Callers that create an Encoder and then invoke EncodeToken directly, without using Encode or EncodeElement, need to call Flush when finished to ensure that the XML is written to the underlying writer.
EncodeToken allows writing a ProcInst with Target set to "xml" only as the first token in the stream.
When encoding a StartElement holding an XML namespace prefix declaration for a prefix that is not already declared, contained elements (including the StartElement itself) will use the declared prefix when encoding names with matching namespace URIs.
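When EncodeToken is used directly, the explicit Flush is easy to forget. The following is a minimal sketch of that calling pattern, assuming the standard library's encoding/xml as the import; greeting is a hypothetical helper and the element name is arbitrary.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
)

// greeting writes <greeting>text</greeting> one token at a time.
// Because only EncodeToken is used, Flush must be called explicitly
// before the buffer is read.
func greeting(text string) (string, error) {
	var buf bytes.Buffer
	enc := xml.NewEncoder(&buf)

	start := xml.StartElement{Name: xml.Name{Local: "greeting"}}
	tokens := []xml.Token{start, xml.CharData(text), start.End()}
	for _, t := range tokens {
		if err := enc.EncodeToken(t); err != nil {
			return "", err
		}
	}
	if err := enc.Flush(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := greeting("a < b")
	fmt.Println(out, err)
}
```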
type Marshaler ¶
type Marshaler interface {
MarshalXML(e *Encoder, start StartElement) error
}
Marshaler is the interface implemented by objects that can marshal themselves into valid XML elements.
MarshalXML encodes the receiver as zero or more XML elements. By convention, arrays or slices are typically encoded as a sequence of elements, one per entry. Using start as the element tag is not required, but doing so will enable Unmarshal to match the XML elements to the correct struct field. One common implementation strategy is to construct a separate value with a layout corresponding to the desired XML and then to encode it using e.EncodeElement. Another common strategy is to use repeated calls to e.EncodeToken to generate the XML output one token at a time. The sequence of encoded tokens must make up zero or more valid XML elements.
type MarshalerAttr ¶
MarshalerAttr is the interface implemented by objects that can marshal themselves into valid XML attributes.
MarshalXMLAttr returns an XML attribute with the encoded value of the receiver. Using name as the attribute name is not required, but doing so will enable Unmarshal to match the attribute to the correct struct field. If MarshalXMLAttr returns the zero attribute Attr{}, no attribute will be generated in the output. MarshalXMLAttr is used only for struct fields with the "attr" option in the field tag.
type Name ¶
type Name struct {
Space, Local string
}
A Name represents an XML name (Local) annotated with a name space identifier (Space). In tokens returned by Decoder.Token, the Space identifier is given as a canonical URL, not the short prefix used in the document being parsed.
As a special case, XML namespace declarations will use the literal string "xmlns" for the Space field instead of the fully resolved URL. See Encoder.EncodeToken for more information on namespace encoding behaviour.
type StartElement ¶
A StartElement represents an XML start element.
func (StartElement) Copy ¶
func (e StartElement) Copy() StartElement
func (StartElement) End ¶
func (e StartElement) End() EndElement
End returns the corresponding XML end element.
type SyntaxError ¶
A SyntaxError represents a syntax error in the XML input stream.
func (*SyntaxError) Error ¶
func (e *SyntaxError) Error() string
type TagPathError ¶
A TagPathError represents an error in the unmarshalling process caused by the use of field tags with conflicting paths.
func (*TagPathError) Error ¶
func (e *TagPathError) Error() string
type Token ¶
type Token interface{}
A Token is an interface holding one of the token types: StartElement, EndElement, CharData, Comment, ProcInst, or Directive.
type UnmarshalError ¶
type UnmarshalError string
An UnmarshalError represents an error in the unmarshalling process.
func (UnmarshalError) Error ¶
func (e UnmarshalError) Error() string
type Unmarshaler ¶
type Unmarshaler interface {
UnmarshalXML(d *Decoder, start StartElement) error
}
Unmarshaler is the interface implemented by objects that can unmarshal an XML element description of themselves.
UnmarshalXML decodes a single XML element beginning with the given start element. If it returns an error, the outer call to Unmarshal stops and returns that error. UnmarshalXML must consume exactly one XML element. One common implementation strategy is to unmarshal into a separate value with a layout matching the expected XML using d.DecodeElement, and then to copy the data from that value into the receiver. Another common strategy is to use d.Token to process the XML object one token at a time. UnmarshalXML may not use d.RawToken.
type UnmarshalerAttr ¶
UnmarshalerAttr is the interface implemented by objects that can unmarshal an XML attribute description of themselves.
UnmarshalXMLAttr decodes a single XML attribute. If it returns an error, the outer call to Unmarshal stops and returns that error. UnmarshalXMLAttr is used only for struct fields with the "attr" option in the field tag.
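The error propagation described above can be sketched with an attribute type that accepts only two values. The example assumes the standard library's encoding/xml as the import; the Flag and Option types are hypothetical.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Flag is a hypothetical boolean read from the attribute values "yes" and "no".
type Flag bool

// UnmarshalXMLAttr rejects any other value; the error it returns is
// reported by the outer call to Unmarshal.
func (f *Flag) UnmarshalXMLAttr(attr xml.Attr) error {
	switch attr.Value {
	case "yes":
		*f = true
	case "no":
		*f = false
	default:
		return fmt.Errorf("invalid flag value %q", attr.Value)
	}
	return nil
}

// Option is a hypothetical element with a Flag attribute.
type Option struct {
	Enabled Flag `xml:"enabled,attr"`
}

// parseOption unmarshals a single <option> element.
func parseOption(doc string) (Option, error) {
	var o Option
	err := xml.Unmarshal([]byte(doc), &o)
	return o, err
}

func main() {
	o, err := parseOption(`<option enabled="yes"/>`)
	fmt.Println(o.Enabled, err)

	_, err = parseOption(`<option enabled="maybe"/>`)
	fmt.Println("rejected:", err != nil)
}
```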
type UnsupportedTypeError ¶
An UnsupportedTypeError is returned when Marshal encounters a type that cannot be converted into XML.
func (*UnsupportedTypeError) Error ¶
func (e *UnsupportedTypeError) Error() string
Notes ¶
Bugs ¶
Mapping between XML elements and data structures is inherently flawed: an XML element is an order-dependent collection of anonymous values, while a data structure is an order-independent collection of named values. See package json for a textual representation more suitable to data structures.