Documentation ¶
Overview ¶
Package xml is an alternative to the standard library `encoding/xml` package.
This package uses buffers and reusable object instances during unmarshalling to reduce allocations, struct initialization, and the copy-by-value behavior of Go. This saves considerable resources on constrained systems.
The library is still incomplete; see the repository's README. It should nonetheless be ready for production use, assuming you currently unmarshal by manually extracting tokens from the decoder.
10-34% faster
76% less allocated memory
66% fewer memory allocations
Example (ManualDecodingWithTokens) ¶
This example demonstrates how to decode an XML file using manual tokenization into an object, and how to terminate the read-parse loop.
const data = `
	<msg id="123" desc="flying mammal">
		Bat
	</msg>
	<msg id="456" desc="baseball item">
		Bat
	</msg>
`

type Msg struct {
	ID       string
	Desc     string
	Contents string
}

var msgs []Msg
var msg Msg

d := xml.NewDecoder(strings.NewReader(data))
for {
	tok, err := d.Token()
	if err != nil {
		// Decoding completes when EOF is returned.
		if errors.Is(err, io.EOF) {
			break
		}
		log.Fatal(err)
		return
	}
	switch tok := tok.(type) {
	case *xml.StartTag:
		if tok.Name.Local() != "msg" {
			log.Fatalf("unexpected start tag: %s", tok.Name.Local())
		}
		for _, attr := range tok.Attr {
			switch attr.Name.Local() {
			case "id":
				msg.ID = attr.Value
			case "desc":
				msg.Desc = attr.Value
			}
		}
	case *xml.CloseTag:
		if tok.Name.Local() != "msg" {
			log.Fatalf("unexpected close tag: %s", tok.Name.Local())
		}
		msgs = append(msgs, msg)
		msg = Msg{}
	case *xml.CharData:
		msg.Contents = string(tok.Data)
	default:
		log.Fatalf("unexpected token: %T", tok)
	}
}

for _, m := range msgs {
	fmt.Printf("Msg{ID: '%s', Desc: '%s', Contents: '%s'}\n", m.ID, m.Desc, m.Contents)
}
Output:
Msg{ID: '123', Desc: 'flying mammal', Contents: ' Bat '}
Msg{ID: '456', Desc: 'baseball item', Contents: ' Bat '}
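For comparison, the same loop can be sketched against the standard library's `encoding/xml`, which this package is described as an alternative to. This is not this package's API: the standard decoder returns `xml.StartElement`, `xml.EndElement` and `xml.CharData` values rather than pointers to reusable instances, and `Name.Local` is a field rather than a method. The helper name `decodeMsgs` is an assumption made for illustration.

```go
package main

import (
	"encoding/xml"
	"errors"
	"fmt"
	"io"
	"log"
	"strings"
)

// Msg mirrors the struct used in the example above.
type Msg struct {
	ID       string
	Desc     string
	Contents string
}

// decodeMsgs (hypothetical helper) walks the token stream of the standard
// library decoder and collects one Msg per <msg> element.
func decodeMsgs(data string) ([]Msg, error) {
	var (
		msgs  []Msg
		msg   Msg
		inMsg bool
	)
	d := xml.NewDecoder(strings.NewReader(data))
	for {
		tok, err := d.Token()
		if err != nil {
			// Decoding completes when EOF is returned.
			if errors.Is(err, io.EOF) {
				return msgs, nil
			}
			return nil, err
		}
		switch tok := tok.(type) {
		case xml.StartElement:
			if tok.Name.Local != "msg" {
				return nil, fmt.Errorf("unexpected start tag: %s", tok.Name.Local)
			}
			inMsg = true
			for _, attr := range tok.Attr {
				switch attr.Name.Local {
				case "id":
					msg.ID = attr.Value
				case "desc":
					msg.Desc = attr.Value
				}
			}
		case xml.EndElement:
			msgs = append(msgs, msg)
			msg = Msg{}
			inMsg = false
		case xml.CharData:
			// Whitespace between elements is also reported as CharData,
			// so only text inside a <msg> element is kept.
			if inMsg {
				msg.Contents = string(tok)
			}
		}
	}
}

func main() {
	const data = ` <msg id="123" desc="flying mammal"> Bat </msg> <msg id="456" desc="baseball item"> Bat </msg> `
	msgs, err := decodeMsgs(data)
	if err != nil {
		log.Fatal(err)
	}
	for _, m := range msgs {
		fmt.Printf("Msg{ID: '%s', Desc: '%s', Contents: '%s'}\n", m.ID, m.Desc, m.Contents)
	}
}
```

Each standard library token is a fresh value, which is convenient but is also where the per-token allocations come from.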
Constants ¶
const (
	// UnexpectedChar is returned when an unexpected rune or character appears outside of an attribute
	// value or CharData token.
	UnexpectedChar decodeError = "unexpected char"
)
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Attr ¶
Attr is a tag attribute like <foo bar="baz">. This will store an Attr with name "bar" and value "baz".
type Comment ¶
type Comment struct {
	// Data contains the contents of the comment. It is empty by default.
	//
	// Enable `d.ReadComment` to include the contents in the token.
	Data []byte
}
Comment has the format <!-- -->
It can have two or more `-` at the beginning, but it must have two `-` at the end.
type Decoder ¶
type Decoder struct {
	// ReadComment enables reading and returning the comment contents. Otherwise the decoder returns
	// an empty node. Disabled by default.
	ReadComment bool
	// ReadDirective enables reading and returning the directive contents. Otherwise the decoder
	// returns an empty node. Disabled by default.
	//
	// Note that directives are NOT processed; the string within `<! ... >` is simply returned.
	ReadDirective bool
	// contains filtered or unexported fields
}
Decoder processes an XML input and either generates tokens or decodes into a given struct.
func NewDecoder ¶
NewDecoder instantiates a Decoder to process a Reader input.
type Directive ¶
type Directive struct {
	// Data contains the contents of the directive. It is empty by default.
	//
	// Enable `d.ReadDirective` to include the contents in the token.
	Data []byte
}
Directive has the format <! ... >
Note: We do NOT process the directive token. We only read it.
type Name ¶
type Name struct {
// contains filtered or unexported fields
}
Name stores an identifier name from either a tag or an attribute, like <foo bar="baz">. This will generate the names "foo" for the tag and "bar" for the attribute.
type Token ¶
type Token interface {
	// Copy the token into a new instance.
	//
	// Token instances are constantly modified by the decoding process. This function makes a copy
	// for the unlikely case when the token value must be stored, and for testing.
	Copy() Token
	// contains filtered or unexported methods
}
Token represents an XML Token:
StartTag: <foo> or <foo />
CloseTag: </foo> implicitly </foo> too
Comment: <!-- foo -->
ProcInst: <? foo ?>
Directive: <! foo >
CharData: Any string outside of angle brackets <>
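The note on `Copy` matters because token instances are reused between calls. The following sketch illustrates the aliasing problem with stand-in types. `charData`, `reusingSource` and `collect` are hypothetical names invented for this illustration and do not reflect the package's internals.

```go
package main

import "fmt"

// charData is a hypothetical stand-in for a token whose buffer is reused.
type charData struct {
	Data []byte
}

// Copy returns a new instance backed by its own byte slice.
func (c *charData) Copy() *charData {
	return &charData{Data: append([]byte(nil), c.Data...)}
}

// reusingSource hands out the same *charData on every call and overwrites
// its buffer in place, which avoids one allocation per token.
type reusingSource struct {
	tok   charData
	items []string
	pos   int
}

// Next returns the shared token filled with the next item, or false at the end.
func (s *reusingSource) Next() (*charData, bool) {
	if s.pos >= len(s.items) {
		return nil, false
	}
	s.tok.Data = append(s.tok.Data[:0], s.items[s.pos]...)
	s.pos++
	return &s.tok, true
}

// collect stores every token, either as the shared pointer or via Copy,
// and reports what the stored tokens contain once the loop has finished.
func collect(items []string, useCopy bool) []string {
	src := &reusingSource{items: items}
	var kept []*charData
	for {
		tok, ok := src.Next()
		if !ok {
			break
		}
		if useCopy {
			tok = tok.Copy()
		}
		kept = append(kept, tok)
	}
	out := make([]string, len(kept))
	for i, k := range kept {
		out[i] = string(k.Data)
	}
	return out
}

func main() {
	items := []string{"Bat", "Cat"}
	fmt.Println(collect(items, false)) // [Cat Cat]: both entries alias the shared token
	fmt.Println(collect(items, true))  // [Bat Cat]: each entry owns its data
}
```

The example in the overview sidesteps this by converting the data to a `string` straight away, which also copies the bytes.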