protoscan

package
v1.15.0
Warning

This package is not in the latest version of its module.

Published: Feb 19, 2018 License: Apache-2.0 Imports: 22 Imported by: 0

Documentation

Overview

Package protoscan provides the tools & APIs needed to find, extract, version and build the dependency trees of all the protobuf schemas that have been instantiated by one or more protobuf libraries (golang/protobuf, gogo/protobuf...).

This is a fairly low-level package, used to power the deeper innards of `protein`; as such, it should very rarely be of use to the end-user.


Constants

const (
	TEST_TSKnownName = ".test.TestSchema"
	TEST_DEKnownName = ".test.TestSchema.DepsEntry"
	TEST_GTKnownName = ".test.TestSchema.GhostType"

	TEST_TSKnownHashSingle = "PROT-00ede76a8940ef0f5d9022ecbca679d9"
	TEST_DEKnownHashSingle = "PROT-624796f94565bcdd2e785ef24a037ebb"
	TEST_GTKnownHashSingle = "PROT-f4d9460136ed7169a701ac2bab5a642b"

	TEST_TSKnownHashRecurse = "PROT-8b244a1a35e88f1e1aad8915dd603021"
	TEST_DEKnownHashRecurse = "PROT-4f6928d2737ba44dac0e3df123f80284"
	TEST_GTKnownHashRecurse = "PROT-3fecf73710581dfb3f46718988b9316e"
)

`TestSchema` and `DepsEntry` are both immutable, for-testing-purposes schemas and, as such, both their respective single & recursive hashes can be known in advance and shouldn't ever change.

If either of these schemas were ever modified, you would have to update these expected values in order to fix the tests. That is, if you're sure about what you're doing.

Variables

This section is empty.

Functions

func BindProtofileSymbols

func BindProtofileSymbols() (map[string]*map[string][]byte, error)

BindProtofileSymbols finds the currently running executable, then parses its symbol table in order to find every instantiated `proto.protoFiles` global variable.

These `proto.protoFiles` variables are maintained by the various protobuf libraries out there (e.g. golang/protobuf, gogo/protobuf & other implementations) in order to keep track of the `FileDescriptorProto`s that have been loaded at boot-time (see `proto.RegisterFile`). This essentially means that each and every protobuf schema known to the currently running program is stored in one of these maps.

There are two main issues that need to be worked around for this little trick to work though:

A:

`proto.protoFiles` is a package-level private variable and, as such,
cannot (AFAIK) be accessed by any means except by forking the original
package, which is not a viable option here.

B:

Because of how vendoring and mangling work, there can actually be an
arbitrary number of `proto.protoFiles` variables instantiated at runtime,
and we must get hold of each and every one of them.

Considering the above issues, doing some hacking with the symbols seems to be the smart(er) way to go here. As `proto.protoFiles` variables are declared as package-level globals, their respective virtual addresses are known at compile-time and stored in the binary: we find those addresses, then apply some `unsafe` magic in order to create local pointers that point to them.

And, voilà!
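The last step (turning a known address back into a usable pointer) can be illustrated with a minimal sketch. It is not the package's actual code: `fakeProtoFiles`, `addressOf` and `mapAt` are hypothetical names, and the address is taken with `&` here, whereas the real function reads it from the executable's symbol table.

```go
package main

import (
	"fmt"
	"unsafe"
)

// fakeProtoFiles stands in for a library's private `proto.protoFiles`
// registry (hypothetical; the real variable lives in another package).
var fakeProtoFiles = map[string][]byte{"test.proto": {0x1f, 0x8b}}

// addressOf plays the role of the symbol-table lookup. In the real package
// the address would come from the binary's symbols, not from `&`.
func addressOf() uintptr {
	return uintptr(unsafe.Pointer(&fakeProtoFiles))
}

// mapAt reinterprets a raw address as a pointer to a registry map.
// This relies on package-level globals having a fixed address.
func mapAt(addr uintptr) *map[string][]byte {
	return (*map[string][]byte)(unsafe.Pointer(addr))
}

func main() {
	m := mapAt(addressOf())
	for name, blob := range *m {
		fmt.Printf("%s: %d bytes\n", name, len(blob))
	}
}
```

Note that `go vet` flags the `uintptr` to `unsafe.Pointer` conversion as a possible misuse, which is a fair summary of how fragile this technique is.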

func MD5

func MD5(bss ByteSSlice) ([]byte, error)

MD5 implements a Hasher using the MD5 hashing algorithm.

func NewDescriptorTrees

func NewDescriptorTrees(
	hasher Hasher, hashPrefix string,
	fdps map[string]*descriptor.FileDescriptorProto,
) (map[string]*DescriptorTree, error)

NewDescriptorTrees builds all the dependency trees it can compute from the specified protobuf file descriptors then returns the resulting `DescriptorTree`s as a map arranged by their respective schemaUIDs (computed using the user-specified `hasher` function).

func SHA1

func SHA1(bss ByteSSlice) ([]byte, error)

SHA1 implements a Hasher using the SHA1 hashing algorithm.

func SHA256

func SHA256(bss ByteSSlice) ([]byte, error)

SHA256 implements a Hasher using the SHA256 hashing algorithm.

func SHA512

func SHA512(bss ByteSSlice) ([]byte, error)

SHA512 implements a Hasher using the SHA512 hashing algorithm.

func UnzipAndUnmarshal

func UnzipAndUnmarshal(b []byte) (*descriptor.FileDescriptorProto, error)

UnzipAndUnmarshal gunzips the given binary blob, then unmarshals it into a `FileDescriptorProto`.

It is typically used to decode the binary blobs generated by the protobuf compiler.
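As a rough illustration of the first half of that process, the sketch below decompresses a gzip blob using only the standard library. The helper names `gunzip` and `gzipBytes` are assumptions made for this example; the unmarshalling step is left out because it depends on whichever protobuf library is in use.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// gunzip decompresses a gzip-compressed blob and returns the raw bytes.
func gunzip(b []byte) ([]byte, error) {
	r, err := gzip.NewReader(bytes.NewReader(b))
	if err != nil {
		return nil, err
	}
	defer r.Close()
	return io.ReadAll(r)
}

// gzipBytes compresses b; it exists only to produce input for the demo.
func gzipBytes(b []byte) ([]byte, error) {
	var buf bytes.Buffer
	w := gzip.NewWriter(&buf)
	if _, err := w.Write(b); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	zipped, err := gzipBytes([]byte("serialized descriptor bytes"))
	if err != nil {
		panic(err)
	}
	raw, err := gunzip(zipped)
	if err != nil {
		panic(err)
	}
	// In the real function, raw would now be unmarshalled
	// into a FileDescriptorProto.
	fmt.Printf("%d bytes decompressed\n", len(raw))
}
```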

func XXHash added in v1.8.2

func XXHash(bss ByteSSlice) ([]byte, error)

XXHash implements a Hasher using the xxHash hashing algorithm.

Types

type ByteSSlice

type ByteSSlice [][]byte

ByteSSlice is a sortable slice of byte-slices.

It is used to compute the schema hashes in `DescriptorTree`'s implementation.

func (ByteSSlice) Len

func (bss ByteSSlice) Len() int

func (ByteSSlice) Less

func (bss ByteSSlice) Less(i, j int) bool

func (ByteSSlice) Sort

func (bss ByteSSlice) Sort()

func (ByteSSlice) Swap

func (bss ByteSSlice) Swap(i, j int)
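The four methods above match the shape of `sort.Interface` plus a convenience `Sort`. A minimal sketch of such a type follows, assuming lexicographic byte ordering (the package's actual `Less` is not documented here); `byteSSlice` is a local stand-in, not the exported type.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// byteSSlice is a local stand-in for the package's ByteSSlice.
type byteSSlice [][]byte

func (bss byteSSlice) Len() int { return len(bss) }

// Less orders entries lexicographically (an assumption for this sketch).
func (bss byteSSlice) Less(i, j int) bool {
	return bytes.Compare(bss[i], bss[j]) < 0
}

func (bss byteSSlice) Swap(i, j int) { bss[i], bss[j] = bss[j], bss[i] }

// Sort sorts the slice in place.
func (bss byteSSlice) Sort() { sort.Sort(bss) }

func main() {
	bss := byteSSlice{[]byte("b"), []byte("a"), []byte("c")}
	bss.Sort()
	for _, bs := range bss {
		fmt.Println(string(bs))
	}
}
```

Sorting the inputs before hashing makes the resulting hash independent of the order in which the entries were collected.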

type DescriptorTree

type DescriptorTree struct {
	// contains filtered or unexported fields
}

`DescriptorTree` is a dependency tree of Message/Enum type protobuf descriptors.

It is the main data structure used to generate versioning hashes for protobuf schemas and their dependency trees.

func (*DescriptorTree) DependencyUIDs

func (dt *DescriptorTree) DependencyUIDs() []string

DependencyUIDs recursively walks through the dependencies of `dt` and returns a sorted list of the (optionally prefixed) hexadecimal representations of their respective recursive hashes.

This should only be called once both the linking & recursive hashing computation have been done.

func (DescriptorTree) Descr

func (dn DescriptorTree) Descr() proto.Message

func (*DescriptorTree) FQName

func (dt *DescriptorTree) FQName() string

FQName returns the fully-qualified name of the underlying protobuf descriptor `dt.descr` (e.g. .google.protobuf.Timestamp).

func (*DescriptorTree) UID

func (dt *DescriptorTree) UID() string

UID returns a unique, deterministic, versioned identifier for this particular `DescriptorTree`.

This identifier is computed from `dt`'s protobuf schema as well as its dependencies' schemas. This means that modifying any schema in the dependency tree, not just `dt`'s, will result in a new identifier being generated.

The returned string is the hexadecimal representation of `dt`'s internal recursive hash (computed via the user-specified hasher), prefixed by `dt.hashPrefix` (also specified by the end-user).
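To make the idea concrete, here is a minimal sketch of a recursive, prefixed identifier. Everything in it is an assumption made for illustration: the `node` type, the choice of MD5, the way the schema bytes and dependency hashes are combined, and the absence of dependency cycles. It is not the package's actual algorithm.

```go
package main

import (
	"bytes"
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"sort"
)

// node is a hypothetical, simplified dependency-tree node.
type node struct {
	schema []byte  // serialized schema of this node
	deps   []*node // direct dependencies (assumed acyclic)
}

// recursiveHash hashes the node's own schema together with the sorted
// recursive hashes of its dependencies.
func recursiveHash(n *node) []byte {
	depHashes := make([][]byte, 0, len(n.deps))
	for _, d := range n.deps {
		depHashes = append(depHashes, recursiveHash(d))
	}
	sort.Slice(depHashes, func(i, j int) bool {
		return bytes.Compare(depHashes[i], depHashes[j]) < 0
	})
	h := md5.New()
	h.Write(n.schema) // hash.Hash writes never return an error
	for _, dh := range depHashes {
		h.Write(dh)
	}
	return h.Sum(nil)
}

// uid returns the prefixed hexadecimal form of the recursive hash.
func uid(prefix string, n *node) string {
	return prefix + hex.EncodeToString(recursiveHash(n))
}

func main() {
	leaf := &node{schema: []byte("leaf v1")}
	root := &node{schema: []byte("root"), deps: []*node{leaf}}
	fmt.Println(uid("PROT-", root))

	// Changing only the dependency still changes the root's identifier.
	leaf.schema = []byte("leaf v2")
	fmt.Println(uid("PROT-", root))
}
```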

type Hasher

type Hasher func(bss ByteSSlice) ([]byte, error)

Hasher takes a pre-sorted slice of byte-slices as input and outputs a hashed representation of this data as a result.

This Hasher, provided by the end-user, will be used to version every schema and associated dependencies found by the `protoscan` package.

This package provides some basic, ready-to-use hashers: `MD5`, `SHA1`, `SHA256`, `SHA512`, `XXHash`.
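A custom hasher only needs to match the `Hasher` signature. The sketch below shows one plausible way to write one with SHA-256; the local `byteSSlice` and `hasher` types mirror the documented ones, and feeding the slices to the hash one after another is an assumption, not necessarily what the built-in hashers do.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Local stand-ins mirroring the package's ByteSSlice and Hasher types.
type byteSSlice [][]byte
type hasher func(bss byteSSlice) ([]byte, error)

// sha256Hasher feeds every byte-slice, in order, into a single SHA-256
// digest and returns the resulting 32-byte sum.
func sha256Hasher(bss byteSSlice) ([]byte, error) {
	h := sha256.New()
	for _, bs := range bss {
		if _, err := h.Write(bs); err != nil {
			return nil, err
		}
	}
	return h.Sum(nil), nil
}

func main() {
	var hf hasher = sha256Hasher
	sum, err := hf(byteSSlice{[]byte("schema-a"), []byte("schema-b")})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%x\n", sum)
}
```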
