triplestore

package module
v0.0.0-...-4099dd9 Latest Latest
Published: Feb 13, 2018 License: Apache-2.0 Imports: 18 Imported by: 64

README


Triple Store

Triple Store is a library to manipulate RDF triples in a fast and fluent fashion.

RDF triples can represent any data and its relations to other data. It is a very versatile concept, used in Linked Data, graph traversal and storage, and more.

Here the RDF triples implementation follows the W3C RDF concepts (note that reification is not implemented). More digestible information is available on the RDF Wikipedia page.

Features overview

  • Create and manage triples through a convenient DSL
  • Snapshot and query RDFGraphs
  • Binary encoding/decoding
  • Lenient NTriples encoding/decoding (see W3C Test suite in testdata/ntriples/w3c_suite/)
  • DOT encoding
  • Stream encoding/decoding (for binary & NTriples formats) for memory-conscious programs
  • CLI (Command line interface) utility to read and convert triples files.

Library

This library is written in Go. You need to install Go before using it.

Get it:

go get -u github.com/wallix/triplestore

Test it:

go test -v -cover -race github.com/wallix/triplestore

Bench it:

go test -run=none -bench=. -benchmem

Import it in your source code:

import (
	"github.com/wallix/triplestore"
	// tstore "github.com/wallix/triplestore" for less verbosity
)

Get the CLI with:

go get -u github.com/wallix/triplestore/cmd/triplestore

Concepts

A triple is made of 3 components:

subject -> predicate -> object

... or you can also view that as:

entity -> attribute -> value

So

  • A triple consists of a subject, a predicate and an object.
  • A subject is a unicode string.
  • A predicate is a unicode string.
  • An object is a resource (or IRI), a literal or a blank node.
  • A literal is a unicode string associated with a datatype (ex: string, integer, ...).
  • A resource, a.k.a. IRI, is a unicode string which points to another resource.

And

  • A source is a persistent yet mutable source or container of triples.
  • An RDFGraph is an immutable set of triples. It is a queryable snapshot of a source.
  • A dataset is basically a collection of RDFGraphs.
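To make the source/graph distinction concrete, here is a minimal, self-contained sketch in plain Go. The names triple, memSource and memGraph are hypothetical and are not the library's types; the sketch only illustrates the idea that a snapshot is a copy which later changes to the source cannot affect.

```go
package main

import "fmt"

// triple is a hypothetical stand-in: subject, predicate, object as plain strings.
type triple struct{ subj, pred, obj string }

// memSource is a mutable container of triples (illustrative only).
type memSource struct{ set map[triple]struct{} }

func newMemSource() *memSource { return &memSource{set: make(map[triple]struct{})} }

func (s *memSource) add(ts ...triple) {
	for _, t := range ts {
		s.set[t] = struct{}{}
	}
}

func (s *memSource) remove(ts ...triple) {
	for _, t := range ts {
		delete(s.set, t)
	}
}

// memGraph is an immutable snapshot: it exposes no mutating method.
type memGraph struct{ set map[triple]struct{} }

// snapshot copies the current triples, so the graph is isolated from the source.
func (s *memSource) snapshot() memGraph {
	cp := make(map[triple]struct{}, len(s.set))
	for t := range s.set {
		cp[t] = struct{}{}
	}
	return memGraph{set: cp}
}

func (g memGraph) contains(t triple) bool { _, ok := g.set[t]; return ok }
func (g memGraph) count() int             { return len(g.set) }

func main() {
	src := newMemSource()
	name := triple{"me", "name", "jsmith"}
	src.add(name)

	snap := src.snapshot()
	src.remove(name) // mutating the source after the snapshot

	fmt.Println(snap.contains(name), snap.count()) // true 1
	fmt.Println(src.snapshot().count())            // 0
}
```

Copying on snapshot is the simplest way to get immutability; a real implementation may well share data instead.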

You can also explore the library through the godoc.

Usage

Create triples

Although you can build triples however you like to model any data, they are usually built from known RDF vocabularies and namespaces (ex: foaf).

triples = append(triples,
	SubjPred("me", "name").StringLiteral("jsmith"),
 	SubjPred("me", "age").IntegerLiteral(26),
 	SubjPred("me", "male").BooleanLiteral(true),
 	SubjPred("me", "born").DateTimeLiteral(time.Now()),
 	SubjPred("me", "mother").Resource("mum#121287"),
)

or dynamically and even shorter with

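// SubjPredLit infers the literal type and returns (triple, error);
// errors are ignored here for brevity.
name, _ := SubjPredLit("me", "name", "jsmith")   // String literal object
age, _ := SubjPredLit("me", "age", 26)           // Integer literal object
male, _ := SubjPredLit("me", "male", true)       // Boolean literal object
born, _ := SubjPredLit("me", "born", time.Now()) // Datetime literal object

triples = append(triples,
	name, age, male, born,
	SubjPredRes("me", "mother", "mum#121287"), // Resource object
)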

or with blank nodes and language tag in literal

triples = append(triples,
 	SubjPred("me", "name").Bnode("jsmith"),
 	BnodePred("me", "name").StringLiteral("jsmith"),
 	SubjPred("me", "name").StringLiteralWithLang("jsmith", "en"),
)

Create triples from a struct

As a convenience, you can create triples from a single struct, where you control embedding through the bnode tag.

Here is an example.

type Address struct {
	Street string `predicate:"street"`
	City   string `predicate:"city"`
}

type Person struct {
	Name     string    `predicate:"name"`
	Age      int       `predicate:"age"`
	Size     int64     `predicate:"size"`
	Male     bool      `predicate:"male"`
	Birth    time.Time `predicate:"birth"`
	Surnames []string  `predicate:"surnames"`
	Addr     Address   `predicate:"address" bnode:"myaddress"` // empty bnode value will make bnode value random
}

addr := Address{...}
person := &Person{Addr: addr, ....}

tris := TriplesFromStruct("jsmith", person)

src := NewSource()
src.Add(tris...)
snap := src.Snapshot()

snap.Contains(SubjPred("jsmith", "name").StringLiteral("..."))
snap.Contains(SubjPred("jsmith", "size").IntegerLiteral(186))
snap.Contains(SubjPred("jsmith", "surnames").StringLiteral("..."))
snap.Contains(SubjPred("jsmith", "surnames").StringLiteral("..."))
snap.Contains(SubjPred("jsmith", "address").Bnode("myaddress"))
snap.Contains(BnodePred("myaddress", "street").StringLiteral("5th avenue"))
snap.Contains(BnodePred("myaddress", "city").StringLiteral("New York"))

Equality

me := SubjPred("me", "name").StringLiteral("jsmith")
you := SubjPred("me", "name").StringLiteral("fdupond")

if me.Equal(you) {
	...
}

Triple Source

A source is a persistent yet mutable source or container of triples.

src := tstore.NewSource()

src.Add(
	SubjPred("me", "name").StringLiteral("jsmith"),
	SubjPred("me", "born").DateTimeLiteral(time.Now()),
)
src.Remove(SubjPred("me", "name").StringLiteral("jsmith"))

RDFGraph

An RDFGraph is an immutable set of triples you can query. You get an RDFGraph by snapshotting a source:

graph := src.Snapshot()

tris := graph.WithSubject("me")
for _, tri := range tris {
	...
}

Codec

Triples can be encoded and decoded using either a simple binary format or a more standard text format such as NTriples.

Triples can therefore be persisted to disk, serialized or sent over the network.

For example

enc := NewBinaryEncoder(myWriter)
err := enc.Encode(triples...)
...

dec := NewBinaryDecoder(myReader)
triples, err := dec.Decode()

Create a file of triples under the lenient NTriples format:

f, err := os.Create("./triples.nt")
if err != nil {
	return err
}
defer f.Close()

enc := NewLenientNTEncoder(f)
if err := enc.Encode(triples...); err != nil {
	return err
}

Encode to a DOT graph

tris := []Triple{
        SubjPredRes("me", "rel", "you"),
        SubjPredRes("me", "rdf:type", "person"),
        SubjPredRes("you", "rel", "other"),
        SubjPredRes("you", "rdf:type", "child"),
        SubjPredRes("other", "any", "john"),
}

err := NewDotGraphEncoder(file, "rel").Encode(tris...)
...

// output
// digraph "rel" {
//  "me" -> "you";
//  "me" [label="me<person>"];
//  "you" -> "other";
//  "you" [label="you<child>"];
//}

Load a binary dataset (i.e. multiple RDFGraph) concurrently from given files:

path := filepath.Join(fmt.Sprintf("*%s", fileExt))
files, _ := filepath.Glob(path)

var readers []io.Reader
for _, f := range files {
	reader, err := os.Open(f)
	if err != nil {
		return fmt.Errorf("loading '%s': %s", f, err)
	}
	readers = append(readers, reader)
}

dec := tstore.NewDatasetDecoder(tstore.NewBinaryDecoder, readers...)
tris, err := dec.Decode()
if err != nil {
	return err
}
...

triplestore CLI

This CLI is mainly used for converting and inspecting triples files. Install it with go get github.com/wallix/triplestore/cmd/triplestore, then run triplestore -h for help.

Example of usage:

triplestore -in ntriples -out bin -files fuzz/ntriples/corpus/samples.nt 
triplestore -in bin -files fuzz/binary/corpus/samples.bin

RDFGraph as a Tree

A tree is defined from an RDFGraph given:

  • a specific predicate as an edge
  • and considering triples pointing to RDF resource Object

You can then navigate the tree using the existing API calls

tree := tstore.NewTree(myGraph, myPredicate)
tree.TraverseDFS(...)
tree.TraverseAncestors(...)
tree.TraverseSiblings(...)

Have a look at the godoc for more info.

Note that, at the moment, constructing a new tree from a graph does not verify that the tree is valid, namely that it has no cycle and that each child has at most one parent.

Documentation

Overview

Package triplestore provides APIs to manage, store and query triples, sources and RDFGraphs

Index

Constants

const XMLSchemaNamespace = "http://www.w3.org/2001/XMLSchema"

Variables

var (
	XsdString   = XsdType("xsd:string")
	XsdBoolean  = XsdType("xsd:boolean")
	XsdDateTime = XsdType("xsd:dateTime")

	// 64-bit floating point numbers
	XsdDouble = XsdType("xsd:double")
	// 32-bit floating point numbers
	XsdFloat = XsdType("xsd:float")

	// signed 32 or 64 bit
	XsdInteger = XsdType("xsd:integer")
	// signed (8 bit)
	XsdByte = XsdType("xsd:byte")
	// signed (16 bit)
	XsdShort = XsdType("xsd:short")

	// unsigned 32 or 64 bit
	XsdUinteger = XsdType("xsd:unsignedInt")
	// unsigned 8 bit
	XsdUnsignedByte = XsdType("xsd:unsignedByte")
	// unsigned 16 bit
	XsdUnsignedShort = XsdType("xsd:unsignedShort")
)

var RDFContext = &Context{
	Prefixes: map[string]string{
		"xsd":  "http://www.w3.org/2001/XMLSchema#",
		"rdf":  "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
		"rdfs": "http://www.w3.org/2000/01/rdf-schema#",
	},
}
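A prefix map like the one above is typically used to turn a prefixed name such as xsd:string into a full IRI. How the library applies its Context is not documented on this page, so the following is only a sketch of the general idea; the expand function is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// expand replaces a known "prefix:" with its namespace IRI.
// Names without a colon, or with an unknown prefix, are returned unchanged.
func expand(prefixes map[string]string, name string) string {
	i := strings.IndexByte(name, ':')
	if i < 0 {
		return name
	}
	if ns, ok := prefixes[name[:i]]; ok {
		return ns + name[i+1:]
	}
	return name
}

func main() {
	prefixes := map[string]string{
		"xsd": "http://www.w3.org/2001/XMLSchema#",
		"rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
	}
	fmt.Println(expand(prefixes, "xsd:string")) // http://www.w3.org/2001/XMLSchema#string
	fmt.Println(expand(prefixes, "rdf:type"))   // http://www.w3.org/1999/02/22-rdf-syntax-ns#type
	fmt.Println(expand(prefixes, "foaf:name"))  // foaf:name (unknown prefix, unchanged)
}
```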

Functions

func BnodePred

func BnodePred(s, p string) *tripleBuilder

func BnodePredRes

func BnodePredRes(s, p, r string) *triple

func IsNTFormat

func IsNTFormat(r io.Reader) (bool, io.Reader)

Loosely detects whether the content is in the ntriples format, as opposed to the binary format. Used for backward compatibility when changing the file format of existing stores. Detection works with the ntriples format as flushed by this library (i.e. no comments, no spaces, ...).

func ParseBoolean

func ParseBoolean(obj Object) (bool, error)

func ParseDateTime

func ParseDateTime(obj Object) (time.Time, error)

func ParseFloat32

func ParseFloat32(obj Object) (float32, error)

func ParseFloat64

func ParseFloat64(obj Object) (float64, error)

func ParseInt16

func ParseInt16(obj Object) (int16, error)

func ParseInt8

func ParseInt8(obj Object) (int8, error)

func ParseInteger

func ParseInteger(obj Object) (int, error)

func ParseLiteral

func ParseLiteral(obj Object) (interface{}, error)

func ParseString

func ParseString(obj Object) (string, error)

func ParseUint16

func ParseUint16(obj Object) (uint16, error)

func ParseUint8

func ParseUint8(obj Object) (uint8, error)

func ParseUinteger

func ParseUinteger(obj Object) (uint, error)

func SubjPred

func SubjPred(s, p string) *tripleBuilder

func SubjPredBnode

func SubjPredBnode(s, p, r string) *triple

func SubjPredLit

func SubjPredLit(s, p string, l interface{}) (*triple, error)

func SubjPredRes

func SubjPredRes(s, p, r string) *triple

Types

type Context

type Context struct {
	Base     string
	Prefixes map[string]string
}

func NewContext

func NewContext() *Context

type DecodeResult

type DecodeResult struct {
	Tri Triple
	Err error
}

type Decoder

type Decoder interface {
	Decode() ([]Triple, error)
}

func NewAutoDecoder

func NewAutoDecoder(r io.Reader) Decoder

Used for backward compatibility when changing the file format of existing stores.

func NewBinaryDecoder

func NewBinaryDecoder(r io.Reader) Decoder

func NewDatasetDecoder

func NewDatasetDecoder(fn func(io.Reader) Decoder, readers ...io.Reader) Decoder

NewDatasetDecoder - a dataset is basically a collection of RDFGraphs.

func NewLenientNTDecoder

func NewLenientNTDecoder(r io.Reader) Decoder

type Encoder

type Encoder interface {
	Encode(tris ...Triple) error
}

func NewBinaryEncoder

func NewBinaryEncoder(w io.Writer) Encoder

func NewDotGraphEncoder

func NewDotGraphEncoder(w io.Writer, predicate string) Encoder

func NewLenientNTEncoder

func NewLenientNTEncoder(w io.Writer) Encoder

func NewLenientNTEncoderWithContext

func NewLenientNTEncoderWithContext(w io.Writer, c *Context) Encoder

type Literal

type Literal interface {
	Type() XsdType
	Value() string
	Lang() string
}

Literal is a unicode string associated with a datatype (ex: string, integer, ...).
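As an illustration of how the three parts of a literal (value, datatype, language) fit together, here is a sketch that renders a literal in W3C N-Triples syntax. formatLiteral is a hypothetical helper, the datatype is passed as a full IRI, and only the most common escapes are handled; it is not the library's encoder.

```go
package main

import (
	"fmt"
	"strings"
)

// escaper handles backslash, double quote, newline and carriage return.
var escaper = strings.NewReplacer(`\`, `\\`, `"`, `\"`, "\n", `\n`, "\r", `\r`)

// formatLiteral renders "value"@lang when a language tag is set,
// "value"^^<datatype> when a datatype IRI is set, and "value" otherwise.
func formatLiteral(value, datatypeIRI, lang string) string {
	quoted := `"` + escaper.Replace(value) + `"`
	switch {
	case lang != "":
		return quoted + "@" + lang
	case datatypeIRI != "":
		return quoted + "^^<" + datatypeIRI + ">"
	default:
		return quoted
	}
}

func main() {
	const xsd = "http://www.w3.org/2001/XMLSchema#"
	fmt.Println(formatLiteral("jsmith", "", "en"))      // "jsmith"@en
	fmt.Println(formatLiteral("26", xsd+"integer", "")) // "26"^^<http://www.w3.org/2001/XMLSchema#integer>
	fmt.Println(formatLiteral(`say "hi"`, "", ""))      // "say \"hi\""
}
```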

type Object

type Object interface {
	Literal() (Literal, bool)
	Resource() (string, bool)
	Bnode() (string, bool)
	Equal(Object) bool
}

Object is a resource (i.e. IRI), a literal or a blank node.

func BooleanLiteral

func BooleanLiteral(bl bool) Object

func DateTimeLiteral

func DateTimeLiteral(tm time.Time) Object

func Float32Literal

func Float32Literal(i float32) Object

func Float64Literal

func Float64Literal(i float64) Object

func Int16Literal

func Int16Literal(i int16) Object

func Int8Literal

func Int8Literal(i int8) Object

func IntegerLiteral

func IntegerLiteral(i int) Object

func ObjectLiteral

func ObjectLiteral(i interface{}) (Object, error)

func Resource

func Resource(s string) Object

func StringLiteral

func StringLiteral(s string) Object

func StringLiteralWithLang

func StringLiteralWithLang(s, l string) Object

func Uint16Literal

func Uint16Literal(i uint16) Object

func Uint8Literal

func Uint8Literal(i uint8) Object

func UintegerLiteral

func UintegerLiteral(i uint) Object

type RDFGraph

type RDFGraph interface {
	Contains(Triple) bool
	Triples() []Triple
	Count() int
	WithSubject(s string) []Triple
	WithPredicate(p string) []Triple
	WithObject(o Object) []Triple
	WithSubjObj(s string, o Object) []Triple
	WithSubjPred(s, p string) []Triple
	WithPredObj(p string, o Object) []Triple
}

An RDFGraph is an immutable set of triples. It is a snapshot of a source and it is queryable.
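Because a graph never changes after the snapshot, lookups such as WithSubject or WithSubjPred can be served from indexes built once. The library's actual data structures are not documented here; the sketch below, with hypothetical triple and index types, shows one simple way to do it.

```go
package main

import "fmt"

// triple is a hypothetical stand-in with string components only.
type triple struct{ subj, pred, obj string }

// index groups triples by subject and by (subject, predicate).
type index struct {
	bySubj     map[string][]triple
	bySubjPred map[[2]string][]triple
}

// buildIndex makes a single pass over the triples of an immutable snapshot.
func buildIndex(ts []triple) *index {
	idx := &index{
		bySubj:     make(map[string][]triple),
		bySubjPred: make(map[[2]string][]triple),
	}
	for _, t := range ts {
		idx.bySubj[t.subj] = append(idx.bySubj[t.subj], t)
		key := [2]string{t.subj, t.pred}
		idx.bySubjPred[key] = append(idx.bySubjPred[key], t)
	}
	return idx
}

func (i *index) withSubject(s string) []triple { return i.bySubj[s] }

func (i *index) withSubjPred(s, p string) []triple {
	return i.bySubjPred[[2]string{s, p}]
}

func main() {
	idx := buildIndex([]triple{
		{"me", "name", "jsmith"},
		{"me", "age", "26"},
		{"you", "name", "fdupond"},
	})
	fmt.Println(len(idx.withSubject("me")))       // 2
	fmt.Println(idx.withSubjPred("you", "name"))  // [{you name fdupond}]
	fmt.Println(len(idx.withSubject("nobody")))   // 0
}
```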

type Source

type Source interface {
	Add(...Triple)
	Remove(...Triple)
	Snapshot() RDFGraph
	CopyTriples() []Triple
}

A source is a persistent yet mutable source or container of triples.

func NewSource

func NewSource() Source

A source is a persistent yet mutable source or container of triples

type StreamDecoder

type StreamDecoder interface {
	StreamDecode(context.Context) <-chan DecodeResult
}

func NewBinaryStreamDecoder

func NewBinaryStreamDecoder(r io.ReadCloser) StreamDecoder

func NewLenientNTStreamDecoder

func NewLenientNTStreamDecoder(r io.Reader) StreamDecoder

type StreamEncoder

type StreamEncoder interface {
	StreamEncode(context.Context, <-chan Triple) error
}

func NewBinaryStreamEncoder

func NewBinaryStreamEncoder(w io.Writer) StreamEncoder

func NewLenientNTStreamEncoder

func NewLenientNTStreamEncoder(w io.Writer) StreamEncoder

type Tree

type Tree struct {
	// contains filtered or unexported fields
}

A tree is defined from an RDFGraph when given a specific predicate as an edge and considering triples pointing to an RDF resource Object.

The tree defined by the graph/predicate should have no cycles, and each node should have at most one parent.

func NewTree

func NewTree(g RDFGraph, pred string) *Tree

func (*Tree) TraverseAncestors

func (t *Tree) TraverseAncestors(node string, each func(RDFGraph, string, int) error, depths ...int) error

Traverse all ancestors from the given node

func (*Tree) TraverseDFS

func (t *Tree) TraverseDFS(node string, each func(RDFGraph, string, int) error, depths ...int) error

Traverse the tree in pre-order depth first search

func (*Tree) TraverseSiblings

func (t *Tree) TraverseSiblings(node string, siblingCriteriaFunc func(RDFGraph, string) (string, error), each func(RDFGraph, string, int) error) error

Traverse siblings of the given node. The passed function allows you to output the sibling criteria.

type Triple

type Triple interface {
	Subject() string
	Predicate() string
	Object() Object
	Equal(Triple) bool
}

A Triple consists of a subject, a predicate and an object.

func TriplesFromStruct

func TriplesFromStruct(sub string, i interface{}, bnodes ...bool) (out []Triple)

Convert a Struct or ptr to Struct into triples using field tags. For each struct's field a triple is created:

  • Subject: function first argument
  • Predicate: tag value
  • Literal: actual field value according to field's type

Unsupported types are ignored.
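The tag-driven conversion described above can be sketched with the reflect package. This is not the library's code: fromStruct, triple and person are hypothetical, objects are rendered as plain strings, only a few kinds are handled, and bnode embedding is left out.

```go
package main

import (
	"fmt"
	"reflect"
)

// triple is a hypothetical stand-in with the object rendered as a string.
type triple struct{ subj, pred, obj string }

type person struct {
	Name     string   `predicate:"name"`
	Age      int      `predicate:"age"`
	Surnames []string `predicate:"surnames"`
	Note     string   // no tag: ignored
	Height   float64  `predicate:"height"` // kind not handled by this sketch: ignored
}

// fromStruct emits one triple per tagged field of a struct or pointer to
// struct. String slices emit one triple per element; other kinds are skipped.
func fromStruct(subj string, v interface{}) []triple {
	val := reflect.Indirect(reflect.ValueOf(v))
	if val.Kind() != reflect.Struct {
		return nil
	}
	var out []triple
	typ := val.Type()
	for i := 0; i < typ.NumField(); i++ {
		pred, ok := typ.Field(i).Tag.Lookup("predicate")
		f := val.Field(i)
		if !ok || pred == "" || !f.CanInterface() {
			continue // untagged or unexported field
		}
		switch f.Kind() {
		case reflect.String, reflect.Bool, reflect.Int, reflect.Int64:
			out = append(out, triple{subj, pred, fmt.Sprint(f.Interface())})
		case reflect.Slice:
			for j := 0; j < f.Len(); j++ {
				if e := f.Index(j); e.Kind() == reflect.String {
					out = append(out, triple{subj, pred, e.String()})
				}
			}
		}
	}
	return out
}

func main() {
	p := &person{Name: "John", Age: 26, Surnames: []string{"Jo", "Johnny"}, Height: 1.86}
	for _, t := range fromStruct("jsmith", p) {
		fmt.Println(t.subj, t.pred, t.obj)
	}
}
```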

type Triples

type Triples []Triple

func (Triples) Equal

func (ts Triples) Equal(others Triples) bool

func (Triples) Map

func (ts Triples) Map(fn func(Triple) string) (out []string)

func (Triples) Sort

func (ts Triples) Sort()

func (Triples) String

func (ts Triples) String() string

type UnsupportedLiteralTypeError

type UnsupportedLiteralTypeError struct {
	// contains filtered or unexported fields
}

func (UnsupportedLiteralTypeError) Error

type XsdType

type XsdType string

func (XsdType) NTriplesNamespaced

func (x XsdType) NTriplesNamespaced() string

Directories

Path Synopsis
cmd
fuzz
