xml

package

v0.2.1 Published: Aug 29, 2018 License: Apache-2.0 Imports: 14 Imported by: 2

Repository

github.com/katydid/katydid

Documentation ¶

Overview ¶

Package xml contains a parser for XML.

Package xml implements a simple XML 1.0 parser that understands XML name spaces.

Index ¶

Constants
Variables
func Compare(a, b []byte) int
func Equal(a, b []byte) bool
func IndexByte(s []byte, c byte) int
func Unmarshal(data []byte, v interface{}) error
func WithAttrPrefix(a string) func(x *xmlParser)
func WithElemPrefix(e string) func(x *xmlParser)
func WithTextPrefix(e string) func(x *xmlParser)
type Attr
type Buffer
- func NewBuffer(buf []byte) *Buffer
- func NewBufferString(s string) *Buffer
- func (b *Buffer) Bytes() []byte
- func (b *Buffer) Cap() int
- func (b *Buffer) Grow(n int)
- func (b *Buffer) Len() int
- func (b *Buffer) Next(n int) []byte
- func (b *Buffer) Read(p []byte) (n int, err error)
- func (b *Buffer) ReadByte() (c byte, err error)
- func (b *Buffer) ReadBytes(delim byte) (line []byte, err error)
- func (b *Buffer) ReadFrom(r io.Reader) (n int64, err error)
- func (b *Buffer) ReadRune() (r rune, size int, err error)
- func (b *Buffer) ReadString(delim byte) (line string, err error)
- func (b *Buffer) Reset()
- func (b *Buffer) String() string
- func (b *Buffer) Truncate(n int)
- func (b *Buffer) UnreadByte() error
- func (b *Buffer) UnreadRune() error
- func (b *Buffer) Write(p []byte) (n int, err error)
- func (b *Buffer) WriteByte(c byte) error
- func (b *Buffer) WriteRune(r rune) (n int, err error)
- func (b *Buffer) WriteString(s string) (n int, err error)
- func (b *Buffer) WriteTo(w io.Writer) (n int64, err error)
type CharData
- func (c CharData) Copy() CharData
type Comment
- func (c Comment) Copy() Comment
type Decoder
- func NewDecoder(r io.Reader) *Decoder
- func (d *Decoder) Decode(v interface{}) error
- func (d *Decoder) DecodeElement(v interface{}, start *StartElement) error
- func (d *Decoder) InputOffset() int64
- func (d *Decoder) RawToken() (Token, error)
- func (d *Decoder) Skip() error
- func (d *Decoder) Token() (t Token, err error)
type Directive
- func (d Directive) Copy() Directive
type EndElement
type Name
type Option
type ProcInst
- func (p ProcInst) Copy() ProcInst
type StartElement
- func (e StartElement) Copy() StartElement
- func (e StartElement) End() EndElement
type SyntaxError
- func (e *SyntaxError) Error() string
type TagPathError
- func (e *TagPathError) Error() string
type Token
- func CopyToken(t Token) Token
type UnmarshalError
- func (e UnmarshalError) Error() string
type Unmarshaler
type UnmarshalerAttr
type XMLParser
- func NewXMLParser(options ...Option) XMLParser
Bugs

Constants ¶

const MinRead = 512

MinRead is the minimum slice size passed to a Read call by Buffer.ReadFrom. As long as the Buffer has at least MinRead bytes beyond what is required to hold the contents of r, ReadFrom will not grow the underlying buffer.

Variables ¶

var ErrTooLarge = errors.New("bytes.Buffer: too large")

ErrTooLarge is passed to panic if memory cannot be allocated to store data in a buffer.

var HTMLAutoClose = htmlAutoClose

HTMLAutoClose is the set of HTML elements that should be considered to close automatically.

var HTMLEntity = htmlEntity

HTMLEntity is an entity map containing translations for the standard HTML entity characters.

Functions ¶

func Compare ¶

func Compare(a, b []byte) int

Compare returns an integer comparing two byte slices lexicographically. The result will be 0 if a==b, -1 if a < b, and +1 if a > b. A nil argument is equivalent to an empty slice.

func Equal ¶

func Equal(a, b []byte) bool

Equal returns a boolean reporting whether a and b are the same length and contain the same bytes. A nil argument is equivalent to an empty slice.

func IndexByte ¶

func IndexByte(s []byte, c byte) int

IndexByte returns the index of the first instance of c in s, or -1 if c is not present in s.

func Unmarshal ¶

func Unmarshal(data []byte, v interface{}) error

Unmarshal parses the XML-encoded data and stores the result in the value pointed to by v, which must be an arbitrary struct, slice, or string. Well-formed data that does not fit into v is discarded.

Because Unmarshal uses the reflect package, it can only assign to exported (upper case) fields. Unmarshal uses a case-sensitive comparison to match XML element names to tag values and struct field names.

Unmarshal maps an XML element to a struct using the following rules. In the rules, the tag of a field refers to the value associated with the key 'xml' in the struct field's tag.

If the struct has a field of type []byte or string with tag ",innerxml", Unmarshal accumulates the raw XML nested inside the element in that field. The rest of the rules still apply.
If the struct has a field named XMLName of type xml.Name, Unmarshal records the element name in that field.
If the XMLName field has an associated tag of the form "name" or "namespace-URL name", the XML element must have the given name (and, optionally, name space) or else Unmarshal returns an error.
If the XML element has an attribute whose name matches a struct field name with an associated tag containing ",attr" or the explicit name in a struct field tag of the form "name,attr", Unmarshal records the attribute value in that field.
If the XML element contains character data, that data is accumulated in the first struct field that has tag ",chardata". The struct field may have type []byte or string. If there is no such field, the character data is discarded.
If the XML element contains comments, they are accumulated in the first struct field that has tag ",comment". The struct field may have type []byte or string. If there is no such field, the comments are discarded.
If the XML element contains a sub-element whose name matches the prefix of a tag formatted as "a" or "a>b>c", unmarshal will descend into the XML structure looking for elements with the given names, and will map the innermost elements to that struct field. A tag starting with ">" is equivalent to one starting with the field name followed by ">".
If the XML element contains a sub-element whose name matches a struct field's XMLName tag and the struct field has no explicit name tag as per the previous rule, unmarshal maps the sub-element to that struct field.
If the XML element contains a sub-element whose name matches a field without any mode flags (",attr", ",chardata", etc), Unmarshal maps the sub-element to that struct field.
If the XML element contains a sub-element that hasn't matched any of the above rules and the struct has a field with tag ",any", unmarshal maps the sub-element to that struct field.
An anonymous struct field is handled as if the fields of its value were part of the outer struct.
A struct field with tag "-" is never unmarshalled into.

Unmarshal maps an XML element to a string or []byte by saving the concatenation of that element's character data in the string or []byte. The saved []byte is never nil.

Unmarshal maps an attribute value to a string or []byte by saving the value in the string or slice.

Unmarshal maps an XML element to a slice by extending the length of the slice and mapping the element to the newly created value.

Unmarshal maps an XML element or attribute value to a bool by setting it to the boolean value represented by the string.

Unmarshal maps an XML element or attribute value to an integer or floating-point field by setting the field to the result of interpreting the string value in decimal. There is no check for overflow.

Unmarshal maps an XML element to an xml.Name by recording the element name.

Unmarshal maps an XML element to a pointer by setting the pointer to a freshly allocated value and then mapping the element to that value.

func WithAttrPrefix ¶

func WithAttrPrefix(a string) func(x *xmlParser)

WithAttrPrefix specifies the prefix which will be added to attributes returned by the parser.

func WithElemPrefix ¶

func WithElemPrefix(e string) func(x *xmlParser)

WithElemPrefix specifies the prefix which will be added to elements returned by the parser.

func WithTextPrefix ¶

func WithTextPrefix(e string) func(x *xmlParser)

WithTextPrefix specifies the prefix which will be added to text returned by the parser.

Types ¶

type Attr ¶

type Attr struct {
	Name  Name
	Value string
}

An Attr represents an attribute in an XML element (Name=Value).

type Buffer ¶

type Buffer struct {
	// contains filtered or unexported fields
}

A Buffer is a variable-sized buffer of bytes with Read and Write methods. The zero value for Buffer is an empty buffer ready to use.

func NewBuffer ¶

func NewBuffer(buf []byte) *Buffer

NewBuffer creates and initializes a new Buffer using buf as its initial contents. It is intended to prepare a Buffer to read existing data. It can also be used to size the internal buffer for writing. To do that, buf should have the desired capacity but a length of zero.

In most cases, new(Buffer) (or just declaring a Buffer variable) is sufficient to initialize a Buffer.
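
The two initialization styles can be sketched as follows. The sketch uses bytes.Buffer from the standard library as a stand-in for this package's Buffer (an assumption based on the matching documentation); build and presized are hypothetical helpers.

```go
package main

import (
	"bytes"
	"fmt"
)

// build writes into a zero-value Buffer, which needs no constructor.
func build() string {
	var b bytes.Buffer
	b.WriteString("<a>")
	b.WriteByte('x')
	b.Write([]byte("</a>"))
	return b.String()
}

// presized returns an empty Buffer whose internal slice already has
// room for n bytes: the slice has length zero and capacity n.
func presized(n int) *bytes.Buffer {
	return bytes.NewBuffer(make([]byte, 0, n))
}

func main() {
	fmt.Println(build())
	b := presized(64)
	fmt.Println(b.Len(), b.Cap() >= 64)
}
```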

func NewBufferString ¶

func NewBufferString(s string) *Buffer

NewBufferString creates and initializes a new Buffer using string s as its initial contents. It is intended to prepare a buffer to read an existing string.

In most cases, new(Buffer) (or just declaring a Buffer variable) is sufficient to initialize a Buffer.

func (*Buffer) Bytes ¶

func (b *Buffer) Bytes() []byte

Bytes returns a slice of the contents of the unread portion of the buffer; len(b.Bytes()) == b.Len(). If the caller changes the contents of the returned slice, the contents of the buffer will change provided there are no intervening method calls on the Buffer.

func (*Buffer) Cap ¶

func (b *Buffer) Cap() int

Cap returns the capacity of the buffer's underlying byte slice, that is, the total space allocated for the buffer's data.

func (*Buffer) Grow ¶

func (b *Buffer) Grow(n int)

Grow grows the buffer's capacity, if necessary, to guarantee space for another n bytes. After Grow(n), at least n bytes can be written to the buffer without another allocation. If n is negative, Grow will panic. If the buffer can't grow it will panic with ErrTooLarge.

func (*Buffer) Len ¶

func (b *Buffer) Len() int

Len returns the number of bytes of the unread portion of the buffer; b.Len() == len(b.Bytes()).

func (*Buffer) Next ¶

func (b *Buffer) Next(n int) []byte

Next returns a slice containing the next n bytes from the buffer, advancing the buffer as if the bytes had been returned by Read. If there are fewer than n bytes in the buffer, Next returns the entire buffer. The slice is only valid until the next call to a read or write method.

func (*Buffer) Read ¶

func (b *Buffer) Read(p []byte) (n int, err error)

Read reads the next len(p) bytes from the buffer or until the buffer is drained. The return value n is the number of bytes read. If the buffer has no data to return, err is io.EOF (unless len(p) is zero); otherwise it is nil.

func (*Buffer) ReadByte ¶

func (b *Buffer) ReadByte() (c byte, err error)

ReadByte reads and returns the next byte from the buffer. If no byte is available, it returns error io.EOF.

func (*Buffer) ReadBytes ¶

func (b *Buffer) ReadBytes(delim byte) (line []byte, err error)

ReadBytes reads until the first occurrence of delim in the input, returning a slice containing the data up to and including the delimiter. If ReadBytes encounters an error before finding a delimiter, it returns the data read before the error and the error itself (often io.EOF). ReadBytes returns err != nil if and only if the returned data does not end in delim.

func (*Buffer) ReadFrom ¶

func (b *Buffer) ReadFrom(r io.Reader) (n int64, err error)

ReadFrom reads data from r until EOF and appends it to the buffer, growing the buffer as needed. The return value n is the number of bytes read. Any error except io.EOF encountered during the read is also returned. If the buffer becomes too large, ReadFrom will panic with ErrTooLarge.

func (*Buffer) ReadRune ¶

func (b *Buffer) ReadRune() (r rune, size int, err error)

ReadRune reads and returns the next UTF-8-encoded Unicode code point from the buffer. If no bytes are available, the error returned is io.EOF. If the bytes are an erroneous UTF-8 encoding, it consumes one byte and returns U+FFFD, 1.

func (*Buffer) ReadString ¶

func (b *Buffer) ReadString(delim byte) (line string, err error)

ReadString reads until the first occurrence of delim in the input, returning a string containing the data up to and including the delimiter. If ReadString encounters an error before finding a delimiter, it returns the data read before the error and the error itself (often io.EOF). ReadString returns err != nil if and only if the returned data does not end in delim.

func (*Buffer) Reset ¶

func (b *Buffer) Reset()

Reset resets the buffer so it has no content. b.Reset() is the same as b.Truncate(0).

func (*Buffer) String ¶

func (b *Buffer) String() string

String returns the contents of the unread portion of the buffer as a string. If the Buffer is a nil pointer, it returns "<nil>".

func (*Buffer) Truncate ¶

func (b *Buffer) Truncate(n int)

Truncate discards all but the first n unread bytes from the buffer. It panics if n is negative or greater than the length of the buffer.

func (*Buffer) UnreadByte ¶

func (b *Buffer) UnreadByte() error

UnreadByte unreads the last byte returned by the most recent read operation. If a write has happened since the last read, UnreadByte returns an error.

func (*Buffer) UnreadRune ¶

func (b *Buffer) UnreadRune() error

UnreadRune unreads the last rune returned by ReadRune. If the most recent read or write operation on the buffer was not a ReadRune, UnreadRune returns an error. (In this regard it is stricter than UnreadByte, which will unread the last byte from any read operation.)

func (*Buffer) Write ¶

func (b *Buffer) Write(p []byte) (n int, err error)

Write appends the contents of p to the buffer, growing the buffer as needed. The return value n is the length of p; err is always nil. If the buffer becomes too large, Write will panic with ErrTooLarge.

func (*Buffer) WriteByte ¶

func (b *Buffer) WriteByte(c byte) error

WriteByte appends the byte c to the buffer, growing the buffer as needed. The returned error is always nil, but is included to match bufio.Writer's WriteByte. If the buffer becomes too large, WriteByte will panic with ErrTooLarge.

func (*Buffer) WriteRune ¶

func (b *Buffer) WriteRune(r rune) (n int, err error)

WriteRune appends the UTF-8 encoding of Unicode code point r to the buffer, returning its length and an error, which is always nil but is included to match bufio.Writer's WriteRune. The buffer is grown as needed; if it becomes too large, WriteRune will panic with ErrTooLarge.

func (*Buffer) WriteString ¶

func (b *Buffer) WriteString(s string) (n int, err error)

WriteString appends the contents of s to the buffer, growing the buffer as needed. The return value n is the length of s; err is always nil. If the buffer becomes too large, WriteString will panic with ErrTooLarge.

func (*Buffer) WriteTo ¶

func (b *Buffer) WriteTo(w io.Writer) (n int64, err error)

WriteTo writes data to w until the buffer is drained or an error occurs. The return value n is the number of bytes written; it always fits into an int, but it is int64 to match the io.WriterTo interface. Any error encountered during the write is also returned.

type CharData ¶

type CharData []byte

A CharData represents XML character data (raw text), in which XML escape sequences have been replaced by the characters they represent.

func (CharData) Copy ¶

func (c CharData) Copy() CharData

type Comment ¶

type Comment []byte

A Comment represents an XML comment of the form <!--comment-->. The bytes do not include the <!-- and --> comment markers.

func (Comment) Copy ¶

func (c Comment) Copy() Comment

type Decoder ¶

type Decoder struct {
	// Strict defaults to true, enforcing the requirements
	// of the XML specification.
	// If set to false, the parser allows input containing common
	// mistakes:
	//	* If an element is missing an end tag, the parser invents
	//	  end tags as necessary to keep the return values from Token
	//	  properly balanced.
	//	* In attribute values and character data, unknown or malformed
	//	  character entities (sequences beginning with &) are left alone.
	//
	// Setting:
	//
	//	d.Strict = false;
	//	d.AutoClose = HTMLAutoClose;
	//	d.Entity = HTMLEntity
	//
	// creates a parser that can handle typical HTML.
	//
	// Strict mode does not enforce the requirements of the XML name spaces TR.
	// In particular it does not reject name space tags using undefined prefixes.
	// Such tags are recorded with the unknown prefix as the name space URL.
	Strict bool

	// When Strict == false, AutoClose indicates a set of elements to
	// consider closed immediately after they are opened, regardless
	// of whether an end element is present.
	AutoClose []string

	// Entity can be used to map non-standard entity names to string replacements.
	// The parser behaves as if these standard mappings are present in the map,
	// regardless of the actual map content:
	//
	//	"lt": "<",
	//	"gt": ">",
	//	"amp": "&",
	//	"apos": "'",
	//	"quot": `"`,
	Entity map[string]string

	// CharsetReader, if non-nil, defines a function to generate
	// charset-conversion readers, converting from the provided
	// non-UTF-8 charset into UTF-8. If CharsetReader is nil or
	// returns an error, parsing stops with an error. One of the
	// CharsetReader's result values must be non-nil.
	CharsetReader func(charset string, input io.Reader) (io.Reader, error)

	// DefaultSpace sets the default name space used for unadorned tags,
	// as if the entire XML stream were wrapped in an element containing
	// the attribute xmlns="DefaultSpace".
	DefaultSpace string
	// contains filtered or unexported fields
}

A Decoder represents an XML parser reading a particular input stream. The parser assumes that its input is encoded in UTF-8.

func NewDecoder ¶

func NewDecoder(r io.Reader) *Decoder

NewDecoder creates a new XML parser reading from r. If r does not implement io.ByteReader, NewDecoder will do its own buffering.

func (*Decoder) Decode ¶

func (d *Decoder) Decode(v interface{}) error

Decode works like xml.Unmarshal, except it reads the decoder stream to find the start element.

func (*Decoder) DecodeElement ¶

func (d *Decoder) DecodeElement(v interface{}, start *StartElement) error

DecodeElement works like xml.Unmarshal except that it takes a pointer to the start XML element to decode into v. It is useful when a client reads some raw XML tokens itself but also wants to defer to Unmarshal for some elements.

func (*Decoder) InputOffset ¶

func (d *Decoder) InputOffset() int64

InputOffset returns the input stream byte offset of the current decoder position. The offset gives the location of the end of the most recently returned token and the beginning of the next token.

func (*Decoder) RawToken ¶

func (d *Decoder) RawToken() (Token, error)

RawToken is like Token but does not verify that start and end elements match and does not translate name space prefixes to their corresponding URLs.

func (*Decoder) Skip ¶

func (d *Decoder) Skip() error

Skip reads tokens until it has consumed the end element matching the most recent start element already consumed. It recurs if it encounters a start element, so it can be used to skip nested structures. It returns nil if it finds an end element matching the start element; otherwise it returns an error describing the problem.

func (*Decoder) Token ¶

func (d *Decoder) Token() (t Token, err error)

Token returns the next XML token in the input stream. At the end of the input stream, Token returns nil, io.EOF.

Slices of bytes in the returned token data refer to the parser's internal buffer and remain valid only until the next call to Token. To acquire a copy of the bytes, call CopyToken or the token's Copy method.

Token expands self-closing elements such as <br/> into separate start and end elements returned by successive calls.

Token guarantees that the StartElement and EndElement tokens it returns are properly nested and matched: if Token encounters an unexpected end element, it will return an error.

Token implements XML name spaces as described by http://www.w3.org/TR/REC-xml-names/. Each of the Name structures contained in the Token has the Space set to the URL identifying its name space when known. If Token encounters an unrecognized name space prefix, it uses the prefix as the Space rather than report an error.

type Directive ¶

type Directive []byte

A Directive represents an XML directive of the form <!text>. The bytes do not include the <! and > markers.

func (Directive) Copy ¶

func (d Directive) Copy() Directive

type EndElement ¶

type EndElement struct {
	Name Name
}

An EndElement represents an XML end element.

type Name ¶

type Name struct {
	Space, Local string
}

A Name represents an XML name (Local) annotated with a name space identifier (Space). In tokens returned by Decoder.Token, the Space identifier is given as a canonical URL, not the short prefix used in the document being parsed.

type Option ¶

type Option func(x *xmlParser)

Option is used to set options when creating a new XMLParser.

type ProcInst ¶

type ProcInst struct {
	Target string
	Inst   []byte
}

A ProcInst represents an XML processing instruction of the form <?target inst?>.

func (ProcInst) Copy ¶

func (p ProcInst) Copy() ProcInst

type StartElement ¶

type StartElement struct {
	Name Name
	Attr []Attr
}

A StartElement represents an XML start element.

func (StartElement) Copy ¶

func (e StartElement) Copy() StartElement

func (StartElement) End ¶

func (e StartElement) End() EndElement

End returns the corresponding XML end element.

type SyntaxError ¶

type SyntaxError struct {
	Msg  string
	Line int
}

A SyntaxError represents a syntax error in the XML input stream.

func (*SyntaxError) Error ¶

func (e *SyntaxError) Error() string

type TagPathError ¶

type TagPathError struct {
	Struct       reflect.Type
	Field1, Tag1 string
	Field2, Tag2 string
}

A TagPathError represents an error in the unmarshalling process caused by the use of field tags with conflicting paths.

func (*TagPathError) Error ¶

func (e *TagPathError) Error() string

type Token ¶

type Token interface{}

A Token is an interface holding one of the token types: StartElement, EndElement, CharData, Comment, ProcInst, or Directive.

func CopyToken ¶

func CopyToken(t Token) Token

CopyToken returns a copy of a Token.

type UnmarshalError ¶

type UnmarshalError string

An UnmarshalError represents an error in the unmarshalling process.

func (UnmarshalError) Error ¶

func (e UnmarshalError) Error() string

type Unmarshaler ¶

type Unmarshaler interface {
	UnmarshalXML(d *Decoder, start StartElement) error
}

Unmarshaler is the interface implemented by objects that can unmarshal an XML element description of themselves.

UnmarshalXML decodes a single XML element beginning with the given start element. If it returns an error, the outer call to Unmarshal stops and returns that error. UnmarshalXML must consume exactly one XML element. One common implementation strategy is to unmarshal into a separate value with a layout matching the expected XML using d.DecodeElement, and then to copy the data from that value into the receiver. Another common strategy is to use d.Token to process the XML object one token at a time. UnmarshalXML may not use d.RawToken.

type UnmarshalerAttr ¶

type UnmarshalerAttr interface {
	UnmarshalXMLAttr(attr Attr) error
}

UnmarshalerAttr is the interface implemented by objects that can unmarshal an XML attribute description of themselves.

UnmarshalXMLAttr decodes a single XML attribute. If it returns an error, the outer call to Unmarshal stops and returns that error. UnmarshalXMLAttr is used only for struct fields with the "attr" option in the field tag.

type XMLParser ¶

type XMLParser interface {
	parser.Interface
	// Init initialises the parser with a byte buffer containing XML.
	Init([]byte) error
	Reset() error
}

XMLParser is an XML parser.

func NewXMLParser ¶

func NewXMLParser(options ...Option) XMLParser

NewXMLParser returns a new XML parser.

Notes ¶

Bugs ¶

Mapping between XML elements and data structures is inherently flawed: an XML element is an order-dependent collection of anonymous values, while a data structure is an order-independent collection of named values. See package json for a textual representation more suitable to data structures.
