exml

Version: v0.0.0-...-f83b975
Published: May 11, 2014 | License: BSD-2-Clause | Imports: 5 | Imported by: 0

README

go-exml

The go-exml package provides an intuitive event-based XML parsing API that sits on top of the standard Go encoding/xml Decoder. It greatly simplifies parsing code while retaining the speed and low memory overhead of the underlying streaming engine, regardless of the size of the input. The package maintains context between event handlers for you, so you can concentrate on the actual structure of the XML document.

Installation

go get github.com/lucsky/go-exml

or

go get gopkg.in/lucsky/go-exml.v1

Usage

The best way to illustrate how go-exml makes parsing very easy is to look at actual examples. Consider the following contrived sample document:

<?xml version="1.0"?>
<address-book name="homies">
    <contact>
        <first-name>Tim</first-name>
        <last-name>Cook</last-name>
        <address>Cupertino</address>
    </contact>
    <contact>
        <first-name>Steve</first-name>
        <last-name>Ballmer</last-name>
        <address>Redmond</address>
    </contact>
    <contact>
        <first-name>Mark</first-name>
        <last-name>Zuckerberg</last-name>
        <address>Menlo Park</address>
    </contact>
</address-book>

Here is one way to parse it into a slice of Contact values using go-exml:

package main

import (
    "fmt"
    "os"

    // The package name (exml) differs from the last path element, so alias it explicitly.
    exml "github.com/lucsky/go-exml"
)

type AddressBook struct {
    Name     string
    Contacts []*Contact
}

type Contact struct {
    FirstName string
    LastName  string
    Address   string
}

func main() {
    reader, err := os.Open("example.xml")
    if err != nil {
        fmt.Fprintln(os.Stderr, err)
        os.Exit(1)
    }
    defer reader.Close()

    addressBook := AddressBook{}
    decoder := exml.NewDecoder(reader)

    decoder.On("address-book", func(attrs exml.Attrs) {
        addressBook.Name, _ = attrs.Get("name")

        decoder.On("contact", func(attrs exml.Attrs) {
            contact := &Contact{}
            addressBook.Contacts = append(addressBook.Contacts, contact)

            decoder.On("first-name", func(attrs exml.Attrs) {
                decoder.On("$text", func(text exml.CharData) {
                    contact.FirstName = string(text)
                })
            })

            decoder.On("last-name", func(attrs exml.Attrs) {
                decoder.On("$text", func(text exml.CharData) {
                    contact.LastName = string(text)
                })
            })

            decoder.On("address", func(attrs exml.Attrs) {
                decoder.On("$text", func(text exml.CharData) {
                    contact.Address = string(text)
                })
            })
        })
    })

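    // Report parse errors instead of silently ignoring them.
    decoder.OnError(func(err error) {
        fmt.Fprintln(os.Stderr, "parse error:", err)
    })

    decoder.Run()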

    fmt.Printf("Address book: %s\n", addressBook.Name)
    for _, c := range addressBook.Contacts {
        fmt.Printf("- %s %s @ %s\n", c.FirstName, c.LastName, c.Address)
    }
}
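
For comparison, here is a rough sketch of the same parse written directly against the encoding/xml token stream, without go-exml. It uses only the standard library. parseContacts is a hypothetical helper introduced for illustration, and the bookkeeping it does by hand (current element, current contact) is the part the nested handlers above take care of.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

type Contact struct {
	FirstName string
	LastName  string
	Address   string
}

// parseContacts (hypothetical helper) reads the address book by hand from
// the raw token stream and returns the book name and its contacts.
func parseContacts(doc string) (string, []*Contact, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	var (
		bookName string
		contacts []*Contact
		current  *Contact // contact being filled in, nil outside <contact>
		field    string   // name of the most recently opened element
	)
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return bookName, contacts, nil
		}
		if err != nil {
			return "", nil, err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			field = t.Name.Local
			switch field {
			case "address-book":
				for _, a := range t.Attr {
					if a.Name.Local == "name" {
						bookName = a.Value
					}
				}
			case "contact":
				current = &Contact{}
				contacts = append(contacts, current)
			}
		case xml.EndElement:
			field = ""
			if t.Name.Local == "contact" {
				current = nil
			}
		case xml.CharData:
			if current == nil {
				continue // text outside any <contact>
			}
			switch field {
			case "first-name":
				current.FirstName = string(t)
			case "last-name":
				current.LastName = string(t)
			case "address":
				current.Address = string(t)
			}
		}
	}
}

func main() {
	doc := `<address-book name="homies">
    <contact>
        <first-name>Tim</first-name>
        <last-name>Cook</last-name>
        <address>Cupertino</address>
    </contact>
</address-book>`
	name, contacts, err := parseContacts(doc)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Printf("Address book: %s\n", name)
	for _, c := range contacts {
		fmt.Printf("- %s %s @ %s\n", c.FirstName, c.LastName, c.Address)
	}
}
```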

To reduce the number and nesting depth of the event callbacks you have to write, go-exml provides stacked events. Here is how the previous example can be simplified using stacked events:

...
...

contact := &Contact{}

decoder.On("first-name/$text", func(text exml.CharData) {
    contact.FirstName = string(text)
})

decoder.On("last-name/$text", func(text exml.CharData) {
    contact.LastName = string(text)
})

decoder.On("address/$text", func(text exml.CharData) {
    contact.Address = string(text)
})

...
...
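
To make the idea of a stacked event path more concrete, the following is a minimal sketch of how a path such as "first-name/$text" could be matched against the stack of open elements. It is not go-exml's implementation: dispatchText is a hypothetical helper built on the standard library only, and it simplifies matters by matching paths globally as suffixes, whereas go-exml appears to register handlers relative to the enclosing handler.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// dispatchText (hypothetical, not part of go-exml) calls the handler
// registered under a path such as "first-name/$text" whenever text is found
// while the stack of open elements ends with that path.
func dispatchText(doc string, handlers map[string]func(string)) error {
	dec := xml.NewDecoder(strings.NewReader(doc))
	var stack []string // names of the currently open elements
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		switch t := tok.(type) {
		case xml.StartElement:
			stack = append(stack, t.Name.Local)
		case xml.EndElement:
			stack = stack[:len(stack)-1]
		case xml.CharData:
			full := "/" + strings.Join(stack, "/") + "/$text"
			for path, h := range handlers {
				// The leading "/" ensures whole path segments are compared.
				if strings.HasSuffix(full, "/"+path) {
					h(string(t))
				}
			}
		}
	}
}

func main() {
	doc := `<contact><first-name>Tim</first-name><last-name>Cook</last-name></contact>`
	var first, last string
	err := dispatchText(doc, map[string]func(string){
		"first-name/$text": func(s string) { first = s },
		"last-name/$text":  func(s string) { last = s },
	})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(first, last)
}
```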

Finally, since using a node's text content to initialize a struct field is a very common task, go-exml provides a shortcut for it. Let's revisit the previous example using this shortcut:

...
...

contact := &Contact{}
decoder.On("first-name/$text", decoder.Assign(&contact.FirstName))
decoder.On("last-name/$text", decoder.Assign(&contact.LastName))
decoder.On("address/$text", decoder.Assign(&contact.Address))

...
...
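
A shortcut of this kind can be pictured as a function that returns a closure over a string pointer. The sketch below is an assumption about the general shape, not the library's source: assign is a hypothetical stand-in for decoder.Assign, and CharData is redeclared locally as a byte slice so that the example runs on its own.

```go
package main

import "fmt"

// CharData is a local stand-in for the byte-slice text type passed to text handlers.
type CharData []byte

// assign (hypothetical stand-in for decoder.Assign) returns a text handler
// that copies the received text into *slot every time it is called.
func assign(slot *string) func(CharData) {
	return func(text CharData) {
		*slot = string(text)
	}
}

func main() {
	var firstName string
	handler := assign(&firstName)
	handler(CharData("Tim"))
	fmt.Println(firstName)
}
```

Because the handler writes through the pointer it was given, the target must outlive the handler. In the example above a fresh Contact is allocated for each contact element, so each set of handlers fills in its own struct.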

Another shortcut lets you accumulate text content from several nodes into a single slice:

...
...

info := []string{}
decoder.On("first-name/$text", decoder.Append(&info))
decoder.On("last-name/$text", decoder.Append(&info))
decoder.On("address/$text", decoder.Append(&info))

...
...
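
The reason this shortcut takes a pointer to the slice rather than the slice itself can be shown with a small sketch. appendTo is a hypothetical stand-in for decoder.Append, written under the assumption that it simply appends each text value; CharData is again redeclared locally.

```go
package main

import "fmt"

// CharData is a local stand-in for the byte-slice text type passed to text handlers.
type CharData []byte

// appendTo (hypothetical stand-in for decoder.Append) returns a text handler
// that appends the received text to *dst. The pointer is needed because
// append may return a new slice header, which has to be stored back.
func appendTo(dst *[]string) func(CharData) {
	return func(text CharData) {
		*dst = append(*dst, string(text))
	}
}

func main() {
	info := []string{}
	handler := appendTo(&info)
	for _, s := range []string{"Tim", "Cook", "Cupertino"} {
		handler(CharData(s))
	}
	fmt.Println(len(info), info)
}
```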

Benchmarks

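The included benchmarks show that go-exml can be far faster than standard unmarshaling. In the run below, the Decode benchmarks take roughly 40 to 150 times less time per operation than the corresponding Unmarshal benchmarks and make far fewer allocations. The difference would most likely be even greater for bigger inputs.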

% go test -bench . -benchmem
OK: 16 passed
PASS
Benchmark_UnmarshalSimple      50000         55643 ns/op        6141 B/op        128 allocs/op
Benchmark_UnmarshalText       100000         23253 ns/op        3522 B/op         64 allocs/op
Benchmark_UnmarshalCDATA      100000         24455 ns/op        3611 B/op         64 allocs/op
Benchmark_UnmarshalMixed      100000         29596 ns/op        4120 B/op         70 allocs/op
Benchmark_DecodeSimple       5000000           375 ns/op          57 B/op          3 allocs/op
Benchmark_DecodeText         5000000           541 ns/op         123 B/op          5 allocs/op
Benchmark_DecodeCDATA        5000000           541 ns/op         123 B/op          5 allocs/op
Benchmark_DecodeMixed        5000000           541 ns/op         123 B/op          5 allocs/op
ok      github.com/lucsky/go-exml   23.966s
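
To read the table more easily, the ratios between the Unmarshal and Decode rows can be computed directly from the figures above. The snippet below is only arithmetic on those published numbers; the result type and the speedup helper are names introduced here for illustration.

```go
package main

import "fmt"

// result holds one row of the benchmark output quoted above.
type result struct {
	name        string
	nsPerOp     float64
	allocsPerOp float64
}

// speedup reports how many times faster b is than a, by ns/op.
func speedup(a, b result) float64 {
	return a.nsPerOp / b.nsPerOp
}

func main() {
	unmarshal := []result{
		{"Simple", 55643, 128},
		{"Text", 23253, 64},
		{"CDATA", 24455, 64},
		{"Mixed", 29596, 70},
	}
	decode := []result{
		{"Simple", 375, 3},
		{"Text", 541, 5},
		{"CDATA", 541, 5},
		{"Mixed", 541, 5},
	}
	for i := range unmarshal {
		u, d := unmarshal[i], decode[i]
		fmt.Printf("%-6s %6.1fx faster, %5.1fx fewer allocs\n",
			u.name, speedup(u, d), u.allocsPerOp/d.allocsPerOp)
	}
}
```

For this particular run the time ratio is about 148 for the Simple case and between about 43 and 55 for the other three.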

License

The code is released under the BSD 2-Clause (NetBSD) license.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Attrs

type Attrs []xml.Attr

func (Attrs) Get

func (a Attrs) Get(name string) (string, error)
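
The signature suggests a lookup by attribute name that reports a missing attribute through the error value. A minimal sketch of such a lookup follows. It is an assumption, not the package's source: the lowercase get method, matching on the local name only, and the error text are all hypothetical.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Attrs mirrors the declaration above.
type Attrs []xml.Attr

// get (hypothetical sketch of a Get-style lookup) returns the value of the
// first attribute whose local name equals name, or an error if none matches.
func (a Attrs) get(name string) (string, error) {
	for _, attr := range a {
		if attr.Name.Local == name {
			return attr.Value, nil
		}
	}
	return "", fmt.Errorf("attribute %q not found", name)
}

// sample is the attribute list of <address-book name="homies">.
var sample = Attrs{{Name: xml.Name{Local: "name"}, Value: "homies"}}

func main() {
	if v, err := sample.get("name"); err == nil {
		fmt.Println("name =", v)
	}
	if _, err := sample.get("missing"); err != nil {
		fmt.Println("error:", err)
	}
}
```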

type CharData

type CharData xml.CharData

type Decoder

type Decoder struct {
	// contains filtered or unexported fields
}

func NewDecoder

func NewDecoder(r io.Reader) *Decoder

func (*Decoder) Append

func (d *Decoder) Append(a *[]string) func(CharData)

func (*Decoder) Assign

func (d *Decoder) Assign(slot *string) func(CharData)

func (*Decoder) On

func (d *Decoder) On(event string, handler Handler)

func (*Decoder) OnError

func (d *Decoder) OnError(handler ErrorHandler)

func (*Decoder) Run

func (d *Decoder) Run()

type ElemHandler

type ElemHandler func(Attrs)

type ErrorHandler

type ErrorHandler func(error)

type Handler

type Handler interface{}
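
Since Handler is an empty interface, On can accept both element and text callbacks, and the decoder has to tell them apart at run time. One plausible way to do that is a type switch, sketched below. kindOf is a hypothetical helper and the types are redeclared locally; the package's actual dispatch may differ.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Local copies of the types listed in this index.
type (
	Attrs       []xml.Attr
	CharData    xml.CharData
	ElemHandler func(Attrs)
	TextHandler func(CharData)
	Handler     interface{}
)

// kindOf (hypothetical) reports which kind of callback h holds. A plain
// function literal has the unnamed type func(Attrs), which is distinct from
// the named type ElemHandler, so both are listed in each case.
func kindOf(h Handler) string {
	switch h.(type) {
	case func(Attrs), ElemHandler:
		return "element"
	case func(CharData), TextHandler:
		return "text"
	default:
		return "unsupported"
	}
}

func main() {
	fmt.Println(kindOf(func(Attrs) {}))
	fmt.Println(kindOf(TextHandler(func(CharData) {})))
	fmt.Println(kindOf(42))
}
```

The trade-off of an empty-interface parameter is that a callback with the wrong signature is only detected at run time, not by the compiler.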

type TextHandler

type TextHandler func(CharData)
