archivista

Version: v0.4.0
Published: Feb 7, 2024 License: Apache-2.0 Imports: 15 Imported by: 0

README

Archivista

Archivista is a graph and storage service for in-toto attestations. Archivista enables the discovery and retrieval of attestations for software artifacts.

Archivista enables you to:

  • Store and retrieve in-toto attestations
  • Query for relationships between attestations via a GraphQL API
  • Validate Witness policy without the need to manually list expected attestations

Archivista is a trusted store for supply chain metadata:

  • It creates a graph of supply chain metadata while storing attestations that can be later used for policy validation and flexible querying.
  • It is designed to be horizontally scalable, so it can store a large number of attestations.
  • It supports deployment on major cloud service and infrastructure providers, making it a versatile and flexible solution for securing software supply chains.
  • It stores only signed attestations, to further enhance security and increase trust.

Key Features

  • Native support for storing attestations created by Witness
  • A GraphQL API endpoint and playground
  • Support for MySQL and Postgres database backends
  • Support for S3-compatible object storage
  • A Helm Chart for deployment in Kubernetes environments
  • The ability to download and export attestations to transfer across an air gap
  • Support for Darwin and Windows platforms and for ARM architectures

How Archivista Works

When an attestation is uploaded, Archivista stores the entire attestation in a configured object store. It also extracts some data from the attestation and stores it in a queryable metadata store. This metadata is exposed through a GraphQL API, which enables queries such as finding all attestations related to an artifact with a specified hash, or finding all attestations that recorded the use of a specific dependency.

Archivista uses Subjects on the in-toto Statement as edges on this graph. Producers of attestations (such as Witness) can use these subjects to expose relationships between attestations.

For example, when attesting that an artifact was compiled, the compiled artifact may be a subject, as well as the git commit hash the artifact was built from. This allows traversing the graph by the commit hash to find other relevant attestations, such as those describing code reviews, testing, and scanning that happened on that git commit.
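The idea of subjects acting as edges can be sketched in a few lines of Go. This is an illustrative in-memory model, not Archivista's implementation: the `Statement`, `Subject`, and `SubjectIndex` types, the `BuildIndex` and `Related` helpers, and the predicate type strings are all hypothetical names chosen for the example.

```go
package main

import "fmt"

// Subject is a reduced stand-in for an in-toto subject: a name plus digests.
type Subject struct {
	Name   string
	Digest map[string]string // algorithm -> value
}

// Statement is a reduced stand-in for an in-toto Statement (hypothetical shape).
type Statement struct {
	PredicateType string
	Subjects      []Subject
}

// SubjectIndex maps "algorithm:value" to every statement listing that digest.
type SubjectIndex map[string][]*Statement

// BuildIndex records one entry per subject digest, so statements that share
// a digest become reachable from each other.
func BuildIndex(stmts []*Statement) SubjectIndex {
	idx := SubjectIndex{}
	for _, s := range stmts {
		for _, sub := range s.Subjects {
			for alg, val := range sub.Digest {
				key := alg + ":" + val
				idx[key] = append(idx[key], s)
			}
		}
	}
	return idx
}

// Related returns the predicate types of all statements that share the digest.
func (idx SubjectIndex) Related(alg, val string) []string {
	var out []string
	for _, s := range idx[alg+":"+val] {
		out = append(out, s.PredicateType)
	}
	return out
}

func main() {
	commit := Subject{Name: "commit", Digest: map[string]string{"sha1": "abc123"}}
	binary := Subject{Name: "app", Digest: map[string]string{"sha256": "def456"}}
	idx := BuildIndex([]*Statement{
		{PredicateType: "example/build", Subjects: []Subject{binary, commit}},
		{PredicateType: "example/test", Subjects: []Subject{commit}},
	})
	// Starting from the commit digest, both attestations are discoverable.
	fmt.Println(idx.Related("sha1", "abc123")) // [example/build example/test]
}
```

Looking up the commit digest returns both the build and the test statement, which is the kind of traversal the paragraph above describes.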

Running Archivista

A public instance of Archivista is running here for testing purposes. The data in this instance is open to the world and there are currently no SLAs defined for this instance.

Archivista requires a MySQL database as well as a compatible file store. Compatible file stores include a local directory or any S3-compatible store.

A docker compose file is included in the repository that will run a local instance of Archivista along with the necessary services for it to operate. These include Minio and MySQL. Simply cloning the repo and running

docker compose up --build -d

is enough to get a local instance of Archivista up and running. Archivista will be listening at http://localhost:8082 by default with this docker compose file.

Configuration

Archivista is currently configured through environment variables.

| Variable | Default Value | Description |
| --- | --- | --- |
| ARCHIVISTA_LISTEN_ON | tcp://127.0.0.1:8082 | URL endpoint for Archivista to listen on |
| ARCHIVISTA_LOG_LEVEL | INFO | Log level. Options are DEBUG, INFO, WARN, ERROR |
| ARCHIVISTA_CORS_ALLOW_ORIGINS | | Comma separated list of origins to allow CORS requests from |
| ARCHIVISTA_SQL_STORE_CONNECTION_STRING | root:example@tcp(db)/testify | SQL store connection string |
| ARCHIVISTA_STORAGE_BACKEND | | Backend to use for attestation storage. Options are FILE, BLOB, or empty string for disabled. |
| ARCHIVISTA_FILE_SERVE_ON | | What address to serve files on. Only valid when using FILE storage backend. |
| ARCHIVISTA_FILE_DIR | /tmp/archivista/ | Directory to store and serve files. Only valid when using FILE storage backend. |
| ARCHIVISTA_BLOB_STORE_ENDPOINT | 127.0.0.1:9000 | URL endpoint for blob storage. Only valid when using BLOB storage backend. |
| ARCHIVISTA_BLOB_STORE_CREDENTIAL_TYPE | | Blob store credential type. Options are IAM or ACCESS_KEY. |
| ARCHIVISTA_BLOB_STORE_ACCESS_KEY_ID | | Blob store access key id. Only valid when using BLOB storage backend. |
| ARCHIVISTA_BLOB_STORE_SECRET_ACCESS_KEY_ID | | Blob store secret access key id. Only valid when using BLOB storage backend. |
| ARCHIVISTA_BLOB_STORE_USE_TLS | TRUE | Use TLS for BLOB storage backend. Only valid when using BLOB storage backend. |
| ARCHIVISTA_BLOB_STORE_BUCKET_NAME | | Bucket to use for storage. Only valid when using BLOB storage backend. |
| ARCHIVISTA_ENABLE_GRAPHQL | TRUE | Enable GraphQL Endpoint |
| ARCHIVISTA_GRAPHQL_WEB_CLIENT_ENABLE | TRUE | Enable GraphiQL, the GraphQL web client |
| ARCHIVISTA_ENABLE_ARTIFACT_STORE | FALSE | Enable Artifact Store Endpoints |
| ARCHIVISTA_ARTIFACT_STORE_CONFIG | /tmp/artifacts/config.yaml | Location of the config describing available artifacts |

Using Archivista

Archivista exposes two HTTP endpoints to upload or download attestations:

POST /upload - Uploads an attestation to Archivista. The attestation must be sent in the request body
GET /download/:gitoid: - Downloads an attestation with provided gitoid from Archivista

Additionally, Archivista exposes a GraphQL API. By default, the GraphQL playground is enabled and available at the root path.

archivistactl is a CLI tool in this repository for interacting with an Archivista instance. archivistactl can upload and download attestations and run some basic queries, such as finding all attestations with a specified subject and retrieving all subjects for a specified attestation.

Navigating the Graph

As previously mentioned, Archivista offers a GraphQL API that enables users to discover attestations. When Archivista ingests an attestation, some metadata is stored in the SQL metadata store. This metadata is exposed through the GraphQL API. Archivista uses Relay connections for querying and pagination.

Here is an entity relationship diagram of the metadata that is currently available.

erDiagram
dsse ||--|| statement : Contains
statement ||--o{ subject : has
subject ||--|{ subjectDigest : has
statement ||--o| attestationCollection : contains
attestationCollection ||--|{ attestation : contains
dsse ||--|{ payloadDigest : has
dsse ||--|{ signature : has
signature ||--o{ timestamp : has

dsse {
    string gitoidSha256
    string payloadType
}

statement {
    string predicate
}

subject {
    string name
}

subjectDigest {
    string algorithm
    string value
}

attestationCollection {
    string name
}

attestation {
    string type
}

payloadDigest {
    string algorithm
    string value
}

signature {
    string keyID
    string signature
}

timestamp {
    string type
    time timestamp
}

What's Next

We would like to expand the types of data Archivista can ingest as well as the metadata Archivista collects about ingested data. If you have ideas or use cases for Archivista, feel free to contact us or create an issue!

Contributing

See CONTRIBUTING.md for information on how to contribute to Archivista.

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func NewExecutableSchema

func NewExecutableSchema(cfg Config) graphql.ExecutableSchema

NewExecutableSchema creates an ExecutableSchema from the ResolverRoot interface.

func NewSchema

func NewSchema(client *ent.Client) graphql.ExecutableSchema

NewSchema creates a graphql executable schema.

Types

type ComplexityRoot

type ComplexityRoot struct {
	Attestation struct {
		AttestationCollection func(childComplexity int) int
		ID                    func(childComplexity int) int
		Type                  func(childComplexity int) int
	}

	AttestationCollection struct {
		Attestations func(childComplexity int) int
		ID           func(childComplexity int) int
		Name         func(childComplexity int) int
		Statement    func(childComplexity int) int
	}

	Dsse struct {
		GitoidSha256   func(childComplexity int) int
		ID             func(childComplexity int) int
		PayloadDigests func(childComplexity int) int
		PayloadType    func(childComplexity int) int
		Signatures     func(childComplexity int) int
		Statement      func(childComplexity int) int
	}

	DsseConnection struct {
		Edges      func(childComplexity int) int
		PageInfo   func(childComplexity int) int
		TotalCount func(childComplexity int) int
	}

	DsseEdge struct {
		Cursor func(childComplexity int) int
		Node   func(childComplexity int) int
	}

	PageInfo struct {
		EndCursor       func(childComplexity int) int
		HasNextPage     func(childComplexity int) int
		HasPreviousPage func(childComplexity int) int
		StartCursor     func(childComplexity int) int
	}

	PayloadDigest struct {
		Algorithm func(childComplexity int) int
		Dsse      func(childComplexity int) int
		ID        func(childComplexity int) int
		Value     func(childComplexity int) int
	}

	Query struct {
		Dsses    func(childComplexity int, after *entgql.Cursor[int], first *int, before *entgql.Cursor[int], last *int, where *ent.DsseWhereInput) int
		Node     func(childComplexity int, id int) int
		Nodes    func(childComplexity int, ids []int) int
		Subjects func(childComplexity int, after *entgql.Cursor[int], first *int, before *entgql.Cursor[int], last *int, where *ent.SubjectWhereInput) int
	}

	Signature struct {
		Dsse       func(childComplexity int) int
		ID         func(childComplexity int) int
		KeyID      func(childComplexity int) int
		Signature  func(childComplexity int) int
		Timestamps func(childComplexity int) int
	}

	Statement struct {
		AttestationCollections func(childComplexity int) int
		Dsse                   func(childComplexity int) int
		ID                     func(childComplexity int) int
		Predicate              func(childComplexity int) int
		Subjects               func(childComplexity int, after *entgql.Cursor[int], first *int, before *entgql.Cursor[int], last *int, where *ent.SubjectWhereInput) int
	}

	Subject struct {
		ID             func(childComplexity int) int
		Name           func(childComplexity int) int
		Statement      func(childComplexity int) int
		SubjectDigests func(childComplexity int) int
	}

	SubjectConnection struct {
		Edges      func(childComplexity int) int
		PageInfo   func(childComplexity int) int
		TotalCount func(childComplexity int) int
	}

	SubjectDigest struct {
		Algorithm func(childComplexity int) int
		ID        func(childComplexity int) int
		Subject   func(childComplexity int) int
		Value     func(childComplexity int) int
	}

	SubjectEdge struct {
		Cursor func(childComplexity int) int
		Node   func(childComplexity int) int
	}

	Timestamp struct {
		ID        func(childComplexity int) int
		Signature func(childComplexity int) int
		Timestamp func(childComplexity int) int
		Type      func(childComplexity int) int
	}
}
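
Each field of ComplexityRoot is a function that estimates the cost of resolving one GraphQL field from the cost of its children. The type names on this page match what the gqlgen code generator emits, and with gqlgen a common choice for paginated fields is to multiply the child cost by the requested page size. The sketch below assumes that setup and uses a reduced, hypothetical `complexityRoot` stand-in (cursor and filter arguments are dropped so it compiles without the ent packages). `pageCost` and `defaultPageSize` are illustrative names, not part of this package.

```go
package main

import "fmt"

// complexityRoot is a reduced, hypothetical stand-in for ComplexityRoot.
// The real Dsses function also receives cursor and filter arguments.
type complexityRoot struct {
	Query struct {
		Dsses func(childComplexity int, first *int, last *int) int
	}
}

// defaultPageSize is an assumed page size used when neither first nor last is set.
const defaultPageSize = 50

// pageCost charges the child cost once per requested item.
func pageCost(childComplexity int, first, last *int) int {
	n := defaultPageSize
	switch {
	case first != nil:
		n = *first
	case last != nil:
		n = *last
	}
	return n * childComplexity
}

func main() {
	var c complexityRoot
	c.Query.Dsses = pageCost

	first := 10
	fmt.Println(c.Query.Dsses(3, &first, nil)) // 30
}
```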

type Config

type Config struct {
	Schema     *ast.Schema
	Resolvers  ResolverRoot
	Directives DirectiveRoot
	Complexity ComplexityRoot
}

type DirectiveRoot

type DirectiveRoot struct {
}

type QueryResolver

type QueryResolver interface {
	Node(ctx context.Context, id int) (ent.Noder, error)
	Nodes(ctx context.Context, ids []int) ([]ent.Noder, error)
	Dsses(ctx context.Context, after *entgql.Cursor[int], first *int, before *entgql.Cursor[int], last *int, where *ent.DsseWhereInput) (*ent.DsseConnection, error)
	Subjects(ctx context.Context, after *entgql.Cursor[int], first *int, before *entgql.Cursor[int], last *int, where *ent.SubjectWhereInput) (*ent.SubjectConnection, error)
}

type Resolver

type Resolver struct {
	// contains filtered or unexported fields
}

Resolver is the resolver root.

func (*Resolver) Query

func (r *Resolver) Query() QueryResolver

Query returns QueryResolver implementation.

type ResolverRoot

type ResolverRoot interface {
	Query() QueryResolver
}
