springytools

package module
v0.0.3
Published: Jul 15, 2021 License: BSD-3-Clause Imports: 8 Imported by: 0

README

LibGuides Tools

A Golang package for working with LibGuides exported XML.

Introduction

There is a periodic need to work with exported LibGuides XML at Caltech Library. This is a Golang package for working with the exported data. Go provides a robust way of mapping simple data structures to and from XML (or JSON), which makes working with XML easy and consistent. It seemed time to move beyond my usual Bash/sed/Python scripts.

Two programs are currently provided with springytools: lgxml2json, which converts a LibGuides XML export file into JSON, and lglinkreport, which reports the links found in an export.

Installation

This is a Golang package providing two commands for working with LibGuides' exported XML. To compile you will need Go 1.16 or better, GNU Make, and Stephen Dolan's jq for browsing the JSON output.

Steps to compile from source
  1. clone the repository
  2. change into the clone directory
  3. test
  4. build the command line tool lgxml2json
  5. use lgxml2json and test output with jq
     • Replace "LibGuides_export_XXXXX.xml" with the file path to your exported LibGuides XML file
  6. install lgxml2json

Example commands to execute in the shell (e.g. Terminal on macOS, xterm on Linux)

git clone git@github.com:caltechlibrary/springytools
cd springytools
make
make test
make install

By default installation is to your $HOME/bin directory. This directory should be in your shell's "PATH".

You can get a brief description of the commands using the -h option with the command.

lgxml2json -h
lglinkreport -h

Known issues and limitations

This library is currently written to perform the LibGuides link analysis. It only provides the commands I needed to do the data analysis. It will grow as needed.

The XML exported from LibGuides may not be valid UTF-8, and UTF-8 encoding is required to parse the export file successfully. Looking at the raw XML markup in vim I noticed a number of control code sequences, and these corresponded to the errors raised when parsing the unsanitized XML file. The problem characters appear as ^A, ^K, ^L, ^S, ^C, ^R. These may be non-UTF-8 characters embedded as UTF-8 when rich text documents were pasted in via the LibGuides edit UI. My hunch is that they were pasted in or imported from Word documents. Removing the offending characters allowed the export to parse successfully. These edits are destructive, as some of the codes probably represent UTF-8 characters used in non-English European names or terminology.
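The cleanup described above can be sketched in Go. This is a minimal illustration, not the procedure used for the original analysis: stripControlChars is a hypothetical helper that is not part of springytools, and the sample string is made up. It drops ASCII control characters other than tab, newline and carriage return, which are the only ones XML 1.0 permits. Like the manual edits, it is destructive.

```go
package main

import (
	"fmt"
	"strings"
)

// stripControlChars removes ASCII control characters (below 0x20) except
// tab, newline and carriage return. Hypothetical helper for illustration;
// it is not part of the springytools package.
func stripControlChars(src string) string {
	return strings.Map(func(r rune) rune {
		if r < 0x20 && r != '\t' && r != '\n' && r != '\r' {
			return -1 // a negative value tells strings.Map to drop the rune
		}
		return r
	}, src)
}

func main() {
	// \x01 is ^A and \x0b is ^K, two of the characters seen in the export.
	dirty := "<name>Caf\x01e\x0b Guide</name>\n"
	fmt.Printf("%q\n", stripControlChars(dirty))
}
```

Running the cleaned text through a diff against the original before parsing is a cheap way to see exactly which characters were lost.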

Getting help

File an issue on GitHub.

License

Software produced by the Caltech Library is Copyright © 2021 California Institute of Technology. This software is freely distributed under a BSD/MIT type license. Please see the LICENSE file for more information.

Authors and history

  • R. S. Doiel, Software Developer, Digital Library Development, Caltech Library

Acknowledgments

This work was funded by the California Institute of Technology Library.

Documentation

Overview

expected.go is a set of testing functions

Author: R. S. Doiel <rsdoiel@caltech.edu>

Copyright (c) 2021, Caltech All rights not granted herein are expressly reserved by Caltech.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

extractors.go provides funcs for processing text and pulling out elements like URL links.

libguides.go implements the data structures for working with LibGuides exported XML.

reports.go provides the functions that take an input filename and an output filename and generate reports or data conversions.

tables.go provides XML, JSON and CSV renderings of the Table data structure.

Constants

const Version = "0.0.3"

Functions

func ExtractHTTPLinks(src string) ([]string, int)

ExtractHTTPLinks scans a string for URLs or href values, extracts the links, and returns the list of URLs found along with a count. If the count is zero, no URLs were found.

NOTE: This only extracts full URLs (e.g. starts with http://, https://)
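To make the behaviour in the NOTE concrete, here is a minimal sketch of an extractor with the same return shape. It is not the package's implementation: extractLinks and linkPattern are hypothetical names, and the regular expression is an assumption about what counts as the end of a URL.

```go
package main

import (
	"fmt"
	"regexp"
)

// linkPattern matches absolute http:// and https:// URLs, stopping at the
// next whitespace, quote or angle bracket. The pattern is an assumption
// made for this sketch.
var linkPattern = regexp.MustCompile(`https?://[^\s"'<>]+`)

// extractLinks is a hypothetical stand-in with the same return shape as
// ExtractHTTPLinks: the URLs found and how many there were.
func extractLinks(src string) ([]string, int) {
	links := linkPattern.FindAllString(src, -1)
	return links, len(links)
}

func main() {
	src := `<a href="https://library.example.edu/guide">Guide</a> and /relative/path`
	links, count := extractLinks(src)
	// The relative path is skipped because it has no http:// or https:// prefix.
	fmt.Println(count, links)
}
```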

func LibGuidesXMLFileToJSONFile added in v0.0.2

func LibGuidesXMLFileToJSONFile(srcName, destName string) error

LibGuidesXMLFileToJSONFile reads in a LibGuides XML export file and writes a JSON version of the file. It expects the name of the XML file in srcName and the name of the JSON file in destName. It returns an error if one is encountered.

func LinkReport added in v0.0.2

func LinkReport(srcName, destName, format string) error

LinkReport reads in a LibGuides XML export and generates a link report in the requested format. It accepts a srcName (the LibGuides XML export), a destName, and a format (csv, json, or xml). It returns an error if one is encountered.

Types

type Account

type Account struct {
	Id        int    `xml:"id" json:"id"`
	Email     string `xml:"email" json:"email"`
	FirstName string `xml:"first_name" json:"first_name"`
	LastName  string `xml:"last_name" json:"last_name"`
	Title     string `xml:"title" json:"title"`
	Nickname  string `xml:"nickname" json:"nickname"`
	Signature string `xml:"signature" json:"signature"`
	Image     string `xml:"image" json:"image"`
	Address   string `xml:"address" json:"address"`
	Phone     string `xml:"phone" json:"phone"`
	Skype     string `xml:"skype" json:"skype"`
	Website   string `xml:"website" json:"website"`
	Created   string `xml:"created" json:"created"`
	Updated   string `xml:"updated" json:"updated"`
}

type Asset

type Asset struct {
	Id   int    `xml:"id" json:"id"`
	Name string `xml:"name" json:"name"`
	Type string `xml:"type" json:"type"`
	// Description contains HTML encoded text, double encoding existing encoded text
	Description string `xml:"description" json:"description"`
	Url         string `xml:"url" json:"url"`
	Owner       Owner  `xml:"owner" json:"owner"`
	MapId       string `xml:"map_id" json:"map_id"`
	Position    int    `xml:"position" json:"position"`
	Created     string `xml:"created" json:"created"`
	Updated     string `xml:"updated" json:"updated"`
}

type Box

type Box struct {
	XMLName  xml.Name `xml:"box" json:"box"`
	Id       int      `xml:"id" json:"id"`
	Name     string   `xml:"name" json:"name"`
	Type     string   `xml:"type" json:"type"`
	MapId    string   `xml:"map_id" json:"map_id"`
	Column   int      `xml:"column" json:"column"`
	Position int      `xml:"position" json:"position"`
	Hidden   int      `xml:"hidden" json:"hidden"`
	Created  string   `xml:"created" json:"created"`
	Updated  string   `xml:"updated" json:"updated"`
	Assets   []*Asset `xml:"assets>asset" json:"assets"`
	Panes    []*Pane  `xml:"panes>pane,omitempty" json:"panes,omitempty"`
}

type Customer

type Customer struct {
	XMLName  xml.Name `xml:"customer" json:"-"`
	Id       int      `xml:"id" json:"id"`
	Type     string   `xml:"type" json:"type"`
	Name     string   `xml:"name" json:"name"`
	Url      string   `xml:"url" json:"url"`
	City     string   `xml:"city" json:"city"`
	State    string   `xml:"state" json:"state"`
	Country  string   `xml:"country" json:"country"`
	TimeZone string   `xml:"time_zone" json:"time_zone"`
	Created  string   `xml:"created" json:"created"`
	Updated  string   `xml:"updated" json:"updated"`
}

type Group

type Group struct {
	Id          int    `xml:"id" json:"id"`
	Type        string `xml:"type" json:"type"`
	Name        string `xml:"name" json:"name"`
	Url         string `xml:"url" json:"url"`
	Description string `xml:"description" json:"description"`
	Password    string `xml:"password" json:"password"`
	Created     string `xml:"created" json:"created"`
	Updated     string `xml:"updated" json:"updated"`
}

type Guide

type Guide struct {
	Id          int        `xml:"id" json:"id"`
	Type        string     `xml:"type" json:"type"`
	Name        string     `xml:"name" json:"name"`
	Description string     `xml:"description" json:"description"`
	Url         string     `xml:"url" json:"url"`
	Owner       Owner      `xml:"owner" json:"owner"`
	Group       Group      `xml:"group" json:"group"`
	Redirect    string     `xml:"redirect" json:"redirect"`
	Status      string     `xml:"status" json:"status"`
	Created     string     `xml:"created" json:"created"`
	Updated     string     `xml:"updated" json:"updated"`
	Modified    string     `xml:"modified" json:"modified"`
	Published   string     `xml:"published" json:"published"`
	Subjects    []*Subject `xml:"subjects>subject" json:"subjects"`
	Tags        []*Tag     `xml:"tags>tag" json:"tags"`
	Pages       []*Page    `xml:"pages>page" json:"pages"`
}

type LibGuides

type LibGuides struct {
	XMLName  xml.Name   `json:"-"`
	Customer *Customer  `xml:"customer" json:"customer"`
	Site     *Site      `xml:"site" json:"site"`
	Accounts []*Account `xml:"accounts>account" json:"accounts"`
	Groups   []*Group   `xml:"groups>group" json:"groups"`
	Subjects []*Subject `xml:"subjects>subject" json:"subjects"`
	Tags     []*Tag     `xml:"tags>tag" json:"tags"`
	Vendors  []*Vendor  `xml:"vendors>vendor" json:"vendors"`
	Guides   []*Guide   `xml:"guides>guide" json:"guides"`
}

func (*LibGuides) FromXML

func (lg *LibGuides) FromXML(src []byte) error

FromXML takes a []byte of XML source, populates the LibGuides object, and returns any error encountered.

func (*LibGuides) ToJSON

func (lg *LibGuides) ToJSON() ([]byte, error)

ToJSON renders the LibGuides object as JSON and returns the output along with any error.

type Owner

type Owner struct {
	XMLName   xml.Name `xml:"owner" json:"owner"`
	Id        int      `xml:"id" json:"id"`
	Email     string   `xml:"email" json:"email"`
	FirstName string   `xml:"first_name" json:"first_name"`
	LastName  string   `xml:"last_name" json:"last_name"`
	Image     string   `xml:"image" json:"image"`
}

type Page

type Page struct {
	Id           int    `xml:"id" json:"id"`
	Name         string `xml:"name" json:"name"`
	Description  string `xml:"description" json:"description"`
	Url          string `xml:"url" json:"url"`
	Redirect     string `xml:"redirect" json:"redirect"`
	SourcePageId int    `xml:"source_page_id" json:"source_page_id"`
	ParentPageId int    `xml:"parent_page_id" json:"parent_page_id"`
	Position     int    `xml:"position" json:"position"`
	Hidden       int    `xml:"hidden" json:"hidden"`
	Created      string `xml:"created" json:"created"`
	Updated      string `xml:"updated" json:"updated"`
	Modified     string `xml:"modified" json:"modified"`
	Boxes        []*Box `xml:"boxes>box" json:"boxes"`
}

type Pane

type Pane struct {
	Assets []*Asset `xml:"assets>asset" json:"assets"`
}

type Site

type Site struct {
	XMLName xml.Name `xml:"site" json:"-"`
	Id      int      `xml:"id" json:"id"`
	Type    string   `xml:"type" json:"type"`
	Name    string   `xml:"name" json:"name"`
	Domain  string   `xml:"domain" json:"domain"`
	Admin   string   `xml:"admin" json:"admin"`
	Created string   `xml:"created" json:"created"`
	Updated string   `xml:"updated" json:"updated"`
}

type Subject

type Subject struct {
	Id   int    `xml:"id" json:"id"`
	Name string `xml:"name" json:"name"`
	Url  string `xml:"url" json:"url"`
}

type TBody added in v0.0.2

type TBody struct {
	XMLName xml.Name   `xml:"tbody" json:"-"`
	Rows    [][]string `xml:"tr>td" json:"rows,omitempty"`
}

type THead added in v0.0.2

type THead struct {
	XMLName xml.Name `xml:"thead" json:"-"`
	Row     []string `xml:"tr>th" json:"columns,omitempty"`
}

type Table added in v0.0.2

type Table struct {
	XMLName xml.Name `xml:"table" json:"-"`
	Caption string   `xml:"caption" json:"caption,omitempty"`
	Head    THead    `xml:"thead" json:"head,omitempty"`
	Body    TBody    `xml:"tbody" json:"body,omitempty"`
}

func (*Table) AppendHeadings added in v0.0.2

func (t *Table) AppendHeadings(cells ...string)

func (*Table) AppendRow added in v0.0.2

func (t *Table) AppendRow(cells ...string)

func (*Table) SetCaption added in v0.0.2

func (t *Table) SetCaption(caption string)

func (*Table) ToCSVFile added in v0.0.2

func (t *Table) ToCSVFile(destName string, header bool) error

ToCSVFile creates a CSV version of Table. It is a destructive write: a file with the same name will be replaced. It accepts the filename and a header boolean. If header is true and the table's header is populated, it renders a header row at the start of the CSV output. It returns an error if one is encountered.
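The header behaviour described above can be sketched with the standard encoding/csv package. This is not the package's implementation: tableToCSV is a hypothetical helper that takes the headings and rows directly and returns bytes instead of writing a file.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// tableToCSV is a hypothetical helper, not part of springytools. It writes
// the heading row only when header is true and headings is non-empty, then
// writes every data row.
func tableToCSV(headings []string, rows [][]string, header bool) ([]byte, error) {
	buf := new(bytes.Buffer)
	w := csv.NewWriter(buf)
	if header && len(headings) > 0 {
		if err := w.Write(headings); err != nil {
			return nil, err
		}
	}
	// WriteAll writes the records and flushes the writer.
	if err := w.WriteAll(rows); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	headings := []string{"guide", "url"}
	rows := [][]string{{"Maps", "https://example.org/maps"}}
	out, err := tableToCSV(headings, rows, true)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(string(out))
}
```

Passing the resulting bytes to os.WriteFile would give the same replace-on-write behaviour, since that function truncates an existing file before writing.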

func (*Table) ToJSON added in v0.0.2

func (t *Table) ToJSON() ([]byte, error)

func (*Table) ToJSONFile added in v0.0.2

func (t *Table) ToJSONFile(destName string) error

ToJSONFile creates a JSON version of Table. It is a destructive write: a file with the same name will be replaced. It accepts the filename and returns an error if one is encountered.

func (*Table) ToXML added in v0.0.2

func (t *Table) ToXML() ([]byte, error)

func (*Table) ToXMLFile added in v0.0.2

func (t *Table) ToXMLFile(destName string) error

ToXMLFile creates an XML (HTML) version of Table. It is a destructive write: a file with the same name will be replaced. It accepts the filename and returns an error if one is encountered.

type Tag

type Tag struct {
	Id   int    `xml:"id" json:"id"`
	Name string `xml:"name" json:"name"`
}

type Vendor

type Vendor struct {
	Id   int    `xml:"id" json:"id"`
	Name string `xml:"name" json:"name"`
}

Directories

Path              Synopsis
cmd/lglinkreport  linkreport.go traverses all the fields that have links and reports where they are found.
cmd/lgxml2json    lgxml2json.go converts a LibGuides XML export into JSON.