timeliner: github.com/mholt/timeliner

package timeliner

import "github.com/mholt/timeliner"

Index

Package Files

account.go datasource.go db.go itemfiles.go itemgraph.go mapmutex.go oauth2.go persons.go processing.go ratelimit.go timeliner.go wrappedclient.go

Variables

var (
    RelReplyTo  = Relation{Label: "reply_to", Bidirectional: false}      // "<from> is in reply to <to>"
    RelAttached = Relation{Label: "attached", Bidirectional: true}       // "<to|from> is attached to <from|to>"
    RelQuotes   = Relation{Label: "quotes", Bidirectional: false}        // "<from> quotes <to>"
    RelCCed     = Relation{Label: "carbon_copied", Bidirectional: false} // "<from_item> is carbon-copied to <to_person>"
)

These are the standard relationships that Timeliner recognizes. Using these known relationships is not required, but it makes it easier to translate them to human-friendly phrases when visualizing the timeline.
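As a rough illustration of how a known label might be turned into a human-friendly phrase, the sketch below uses a local copy of the Relation type and a lookup table. The `phrases` map and the `describe` helper are hypothetical and not part of this package; the phrases are taken from the comments on the variables above.

```go
package main

import "fmt"

// Relation mirrors the documented timeliner.Relation so that
// this sketch compiles on its own.
type Relation struct {
	Label         string
	Bidirectional bool
}

// Local copies of two of the standard relations documented above.
var (
	RelReplyTo  = Relation{Label: "reply_to", Bidirectional: false}
	RelAttached = Relation{Label: "attached", Bidirectional: true}
)

// phrases is a hypothetical lookup table (not part of the package)
// from relation labels to human-friendly verbs.
var phrases = map[string]string{
	"reply_to":      "is in reply to",
	"attached":      "is attached to",
	"quotes":        "quotes",
	"carbon_copied": "is carbon-copied to",
}

// describe renders a relation between two named things.
// Unknown labels fall back to the raw label.
func describe(from string, rel Relation, to string) string {
	verb, ok := phrases[rel.Label]
	if !ok {
		verb = rel.Label
	}
	return fmt.Sprintf("%s %s %s", from, verb, to)
}

func main() {
	fmt.Println(describe("tweet 2", RelReplyTo, "tweet 1"))
	fmt.Println(describe("photo", RelAttached, "post"))
}
```

A custom label still works with this approach; it is simply printed as-is because no phrase is known for it.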

var OAuth2AppSource func(providerID string, scopes []string) (oauth2client.App, error)

OAuth2AppSource returns an oauth2client.App for the OAuth2 provider with the given ID. Programs using data sources that authenticate with OAuth2 MUST set this variable, or the program will panic.

func Checkpoint

func Checkpoint(ctx context.Context, checkpoint []byte)

Checkpoint saves a checkpoint for the processing associated with the provided context. It overwrites any previous checkpoint. Any errors are logged.

func FakeCloser

func FakeCloser(r io.Reader) io.ReadCloser

FakeCloser turns an io.Reader into an io.ReadCloser where the Close() method does nothing.

func MarshalGob

func MarshalGob(v interface{}) ([]byte, error)

MarshalGob is a convenient way to gob-encode v.

func RegisterDataSource

func RegisterDataSource(ds DataSource) error

RegisterDataSource registers ds as a data source.

func UnmarshalGob

func UnmarshalGob(data []byte, v interface{}) error

UnmarshalGob is a convenient way to gob-decode data into v.
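A gob round trip of the kind these two helpers provide can be sketched with encoding/gob directly. The lower-case `marshalGob` and `unmarshalGob` functions below are local approximations, and `pageState` is a hypothetical checkpoint payload; neither is taken from the package source.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// marshalGob gob-encodes v into a byte slice.
func marshalGob(v interface{}) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(v); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// unmarshalGob gob-decodes data into v, which must be a pointer.
func unmarshalGob(data []byte, v interface{}) error {
	return gob.NewDecoder(bytes.NewReader(data)).Decode(v)
}

// pageState is a hypothetical value a data source might want to
// persist, such as checkpoint state. Fields must be exported for
// gob to encode them.
type pageState struct {
	NextPageToken string
	ItemsSeen     int
}

func main() {
	enc, err := marshalGob(pageState{NextPageToken: "abc", ItemsSeen: 42})
	if err != nil {
		panic(err)
	}
	var out pageState
	if err := unmarshalGob(enc, &out); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", out)
}
```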

type Account

type Account struct {
    ID           int64
    DataSourceID string
    UserID       string
    // contains filtered or unexported fields
}

Account represents an account with a service.

func (Account) NewHTTPClient

func (acc Account) NewHTTPClient() (*http.Client, error)

NewHTTPClient returns an HTTP client that is suitable for use with an API associated with the account's data source. If OAuth2 is configured for the data source, the client has OAuth2 credentials. If a rate limit is configured, this client is rate limited. A sane default timeout is set, and any fields on the returned Client value can be modified as needed.

func (Account) NewOAuth2HTTPClient

func (acc Account) NewOAuth2HTTPClient() (*http.Client, error)

NewOAuth2HTTPClient returns a new HTTP client which performs HTTP requests that are authenticated with an oauth2.Token stored with the account acc.

func (Account) NewRateLimitedRoundTripper

func (acc Account) NewRateLimitedRoundTripper(rt http.RoundTripper) http.RoundTripper

NewRateLimitedRoundTripper adds rate limiting to rt based on the rate limiting policy registered by the data source associated with acc.

type AuthenticateFn

type AuthenticateFn func(userID string) ([]byte, error)

AuthenticateFn is a function that authenticates userID with a service. It returns the authorization or credentials needed to operate. The return value should be byte-encoded so it can be stored in the DB to be reused. To store arbitrary types, encode the value as a gob, for example.

type CheckpointFn

type CheckpointFn func(checkpoint []byte) error

CheckpointFn is a function that saves a checkpoint.

type Client

type Client interface {
    // ListItems lists the items on the account. Items should be
    // sent on itemChan as they are discovered, but related items
    // should be combined onto a single ItemGraph so that their
    // relationships can be stored. If the relationships are not
    // discovered until later, that's OK: item processing is
    // idempotent, so repeating an item from earlier will have no
    // adverse effects (this is possible because a unique ID is
    // required for each item).
    //
    // Implementations must honor the context's cancellation. If
    // ctx.Done() is closed, the function should return. Typically,
    // this is done by having an outer loop select over ctx.Done()
    // and default, where the next page or set of items is handled
    // in the default case.
    //
    // ListItems MUST close itemChan when returning. A
    // `defer close(itemChan)` will usually suffice. Closing
    // this channel signals to the processing goroutine that
    // no more items are coming.
    //
    // Further options for listing items may be passed in opt.
    //
    // If opt.Filename is specified, the implementation is expected
    // to open and list items from that file. If this is not
    // supported, an error should be returned. Conversely, if a
    // filename is not specified but required, an error should be
    // returned.
    //
    // opt.Timeframe consists of two optional timestamp and/or item
    // ID values. If set, item listings should be bounded in the
    // respective direction by that timestamp / item ID. (Items
    // are assumed to be part of a chronology; both timestamp and
    // item ID *may be* provided, when possible, to accommodate
    // data sources which do not constrain by timestamp but which
    // do by item ID instead.) The respective time and item ID
    // fields, if set, will not be in conflict, so either may be
    // used if both are present. While it should be documented if
    // timeframes are not supported, an error need not be returned
    // if they cannot be honored.
    //
    // opt.Checkpoint consists of the last checkpoint for this
    // account if the last call to ListItems did not finish and
    // if a checkpoint was saved. If not nil, the checkpoint
    // should be used to resume the listing instead of starting
    // over from the beginning. Checkpoint values usually consist
    // of page tokens or whatever state is required to resume. Call
    // timeliner.Checkpoint to set a checkpoint. Checkpoints are not
    // required, but if the implementation sets checkpoints, it
    // should be able to resume from one, too.
    ListItems(ctx context.Context, itemChan chan<- *ItemGraph, opt Options) error
}

Client is a type that can interact with a data source.

type Collection

type Collection struct {
    // The ID of the collection as given
    // by the service; for example, the
    // album ID. If the service does not
    // provide an ID for the collection,
    // invent one such that the next time
    // the collection is encountered and
    // processed, its ID will be the same.
    // An ID is necessary here to ensure
    // uniqueness.
    //
    // REQUIRED.
    OriginalID string

    // The name of the collection as
    // given by the service; for example,
    // the album title.
    //
    // Optional.
    Name *string

    // The description, caption, or any
    // other relevant text describing
    // the collection.
    //
    // Optional.
    Description *string

    // The items for the collection;
    // if ordering is significant,
    // specify each item's Position
    // field; the order of elements
    // of this slice will not be
    // considered important.
    Items []CollectionItem
}

Collection represents a group of items.

type CollectionItem

type CollectionItem struct {
    // The item to add to the collection.
    Item Item

    // Specify if ordering is important.
    Position int
    // contains filtered or unexported fields
}

CollectionItem represents an item stored in a collection.

type DataSource

type DataSource struct {
    // A snake_cased name of the service
    // that uniquely identifies it from
    // all others.
    ID  string

    // The human-readable or brand name of
    // the service.
    Name string

    // If the service authenticates with
    // OAuth2, fill out this field.
    OAuth2 OAuth2

    // Otherwise, if the service uses some
    // other form of authentication,
    // Authenticate is a function which
    // returns the credentials needed to
    // access an account on the service.
    Authenticate AuthenticateFn

    // If the service enforces a rate limit,
    // specify it here. You can abide it by
    // getting an http.Client from the
    // Account passed into NewClient.
    RateLimit RateLimit

    // NewClient is a function which takes
    // information about the account and
    // returns a type which can facilitate
    // transactions with the service.
    NewClient NewClientFn
}

DataSource has information about a data source that can be registered.

type Item

type Item interface {
    // The unique ID of the item assigned by the service.
    // If the service does not assign one, then invent
    // one such that the ID is unique to the content or
    // substance of the item (for example, an ID derived
    // from timestamp or from the actual content of the
    // item -- whatever makes it unique). The ID need
    // only be unique for the account it is associated
    // with, although more unique is, of course, acceptable.
    //
    // REQUIRED.
    ID() string

    // The originating timestamp of the item, which
    // may be different from when the item was posted
    // or created. For example, a photo may be taken
    // one day but uploaded a week later. Prefer the
    // time when the original item content was captured.
    //
    // REQUIRED.
    Timestamp() time.Time

    // A classification of the item's kind.
    //
    // REQUIRED.
    Class() ItemClass

    // The user/account ID of the owner or
    // originator of the content, along with their
    // username or real name. The ID is used to
    // relate the item with the person behind it;
    // the name is used to make the person
    // recognizable to the human reader. If the
    // ID is nil, the current account owner will
    // be assumed. (Use the ID as given by the
    // data source.) If the data source only
    // provides a name but no ID, you may return
    // the name as the ID with the understanding
    // that a different name will be counted as a
    // different person. You may also return the
    // name as the name and leave the ID nil and
    // have correct results if it is safe to assume
    // the name belongs to the current account owner.
    Owner() (id *string, name *string)

    // Returns the text of the item, if any.
    // This field is indexed in the DB, so don't
    // use for unimportant metadata or huge
    // swaths of text; if there is a large
    // amount of text, use an item file instead.
    DataText() (*string, error)

    // For primary content which is not text or
    // which is too large to be stored well in a
    // database, the content can be downloaded
    // into a file. If so, the following methods
    // should return the necessary information,
    // if available from the service, so that a
    // data file can be obtained, stored, and
    // later read successfully.
    //
    // DataFileName returns the filename (NOT full
    // path or URL) of the file; prefer the original
    // filename if it originated as a file. If the
    // filename is not unique on disk when downloaded,
    // it will be made unique by modifying it. If
    // this value is nil/empty, a filename will be
    // generated from the item's other data.
    //
    // DataFileReader returns a way to read the data.
    // It will be closed when the read is completed.
    //
    // DataFileHash returns the checksum of the
    // content as provided by the service. If the
    // service (or data source) does not provide a
    // hash, leave this field empty, but note that
    // later it will be impossible to efficiently
    // know whether the content has changed on the
    // service from what is stored locally.
    //
    // DataFileMIMEType returns the MIME type of
    // the data file, if known.
    DataFileName() *string
    DataFileReader() (io.ReadCloser, error)
    DataFileHash() []byte
    DataFileMIMEType() *string

    // Metadata returns any optional metadata.
    // Feel free to leave as many fields empty
    // as you'd like: the fewer fields that are
    // filled out, the smaller the storage size.
    // Metadata is not indexed by the DB but is
    // rendered in projections and queries
    // according to the item's classification.
    Metadata() (*Metadata, error)

    // Location returns an item's location,
    // if known. For now, only Earth
    // coordinates are accepted, but we can
    // improve this later.
    Location() (*Location, error)
}

Item is the central concept of a piece of content from a service or data source. Take note of which methods are required to return non-empty values.

The actual content of an item is stored either in the database or on disk as a file. Generally, content that is text-encoded can and should be stored in the database where it will be indexed. However, if the item's content (for example, the bytes of a photo or video) is not text, or if the text is too large to store well in a database (for example, an entire novel), it should be stored on disk; this interface has methods to accommodate both. Note that an item may have both text and non-text content, too: for example, photos and videos may have descriptions that are as much "content" as the media itself. One part of an item is not mutually exclusive with any other.

type ItemClass

type ItemClass int

ItemClass classifies an item.

const (
    ClassUnknown ItemClass = iota
    ClassImage
    ClassVideo
    ClassAudio
    ClassPost
    ClassLocation
    ClassEmail
    ClassPrivateMessage
    ClassMessage
)

Various classes of items.

type ItemGraph

type ItemGraph struct {
    // The node item. This can be nil, but note that
    // Edges will not be traversed if Node is nil,
    // because there must be a node on both ends of
    // an edge.
    //
    // Optional.
    Node Item

    // Edges are represented as 1:many relations
    // to other "graphs" (nodes in the graph).
    // Fill this out to add multiple items to the
    // timeline at once, while drawing the
    // designated relationships between them.
    // Useful when processing related items in
    // batches.
    //
    // Directional relationships go from Node to
    // the map key.
    //
    // If the items involved in a relationship are
    // not efficiently available at the same time
    // (i.e. if loading both items involved in the
    // relationship would take a non-trivial amount
    // of time or API calls), you can use the
    // Relations field instead, but only after the
    // items have been added to the timeline.
    //
    // Optional.
    Edges map[*ItemGraph][]Relation

    // If items in the graph belong to a collection,
    // specify them here. If the collection does not
    // exist (by row ID or AccountID+OriginalID), it
    // will be created. If it already exists, the
    // collection in the DB will be unioned with the
    // collection specified here. Collections are
    // processed regardless of Node and Edges.
    //
    // Optional.
    Collections []Collection

    // Relationships between existing items in the
    // timeline can be represented here in a list
    // of item IDs that are connected by a label.
    // This field is useful when relationships and
    // the items involved in them are not discovered
    // at the same time. Relations in this list will
    // be added to the timeline, joined by the item
    // IDs described in the RawRelations, only if
    // the items having those IDs (as provided by
    // the data source; we're not talking about DB
    // row IDs here) already exist in the timeline.
    // In other words, this is a best-effort field;
    // useful for forming relationships of existing
    // items, but without access to the actual items
    // themselves. If you have the items involved in
    // the relationships, use Edges instead.
    //
    // Optional.
    Relations []RawRelation
}

ItemGraph is an item with optional connections to other items. All ItemGraph values should be pointers to ensure consistency. The usual weird/fun thing about representing graph data structures in memory is that a graph is a node, and a node is a graph. 🤓

func NewItemGraph

func NewItemGraph(node Item) *ItemGraph

NewItemGraph returns a new node/graph.

func (*ItemGraph) Add

func (ig *ItemGraph) Add(item Item, rel Relation)

Add adds item to the graph ig by making an edge described by rel from the node ig to a new node for item.

This method is for simple inserts, where the only thing to add to the graph at this moment is a single item, since the graph it inserts contains only a single node populated by item. To add a full graph with multiple items (i.e. a graph with edges), call ig.Connect directly.

func (*ItemGraph) Connect

func (ig *ItemGraph) Connect(node *ItemGraph, rel Relation)

Connect is a simple convenience function that adds a graph (node) to ig by an edge described by rel.
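To make the relationship between NewItemGraph, Add, and Connect concrete, here is a self-contained sketch built on reduced local copies of the documented types. The method bodies are guesses at behavior consistent with the descriptions above, not the package's source, and the `post` type is a hypothetical Item with only an ID.

```go
package main

import "fmt"

// Reduced local copies of the documented types.
type Relation struct {
	Label         string
	Bidirectional bool
}

// Item is cut down to a single method for this sketch.
type Item interface{ ID() string }

type ItemGraph struct {
	Node  Item
	Edges map[*ItemGraph][]Relation
}

// NewItemGraph returns a graph consisting of a single node.
func NewItemGraph(node Item) *ItemGraph {
	return &ItemGraph{Node: node}
}

// Connect adds an edge described by rel from ig to node.
func (ig *ItemGraph) Connect(node *ItemGraph, rel Relation) {
	if ig.Edges == nil {
		ig.Edges = make(map[*ItemGraph][]Relation)
	}
	ig.Edges[node] = append(ig.Edges[node], rel)
}

// Add wraps item in a new single-node graph and connects it.
func (ig *ItemGraph) Add(item Item, rel Relation) {
	ig.Connect(NewItemGraph(item), rel)
}

// post is a hypothetical item type.
type post string

func (p post) ID() string { return string(p) }

func main() {
	root := NewItemGraph(post("post-1"))
	root.Add(post("photo-1"), Relation{Label: "attached", Bidirectional: true})
	root.Add(post("post-0"), Relation{Label: "reply_to"})

	for node, rels := range root.Edges {
		for _, rel := range rels {
			fmt.Printf("%s -[%s]-> %s\n", root.Node.ID(), rel.Label, node.Node.ID())
		}
	}
}
```

Because Edges is keyed by *ItemGraph, connecting the same node twice with different relations accumulates both relations on one edge entry in this sketch.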

type ItemRow

type ItemRow struct {
    ID         int64
    AccountID  int64
    OriginalID string
    PersonID   int64
    Timestamp  time.Time
    Stored     time.Time
    Modified   *time.Time
    Class      ItemClass
    MIMEType   *string
    DataText   *string
    DataFile   *string
    DataHash   *string // base64-encoded SHA-256
    Metadata   *Metadata
    Location
    // contains filtered or unexported fields
}

ItemRow has the structure of an item's row in our DB.

type Location

type Location struct {
    Latitude  *float64
    Longitude *float64
}

Location contains location information.

type Metadata

type Metadata struct {
    // A hash or etag provided by the service to
    // make it easy to know if it has changed
    ServiceHash []byte

    // Locations
    LocationAccuracy int
    Altitude         int // meters
    AltitudeAccuracy int
    Heading          int // degrees
    Velocity         int

    GeneralArea string // natural language description of a location

    // Photos and videos
    EXIF map[string]interface{}

    Width  int
    Height int

    // TODO: Google Photos (how many of these belong in EXIF?)
    CameraMake      string
    CameraModel     string
    FocalLength     float64
    ApertureFNumber float64
    ISOEquivalent   int
    ExposureTime    time.Duration

    FPS float64 // Frames Per Second

    // Posts (Facebook so far)
    Link        string
    Description string
    Name        string
    ParentID    string
    StatusType  string
    Type        string

    Shares int // aka "Retweets" or "Reshares"
    Likes  int
}

Metadata is a unified structure for storing item metadata in the DB.

type NewClientFn

type NewClientFn func(acc Account) (Client, error)

NewClientFn is a function that returns a client which, given the account passed in, can interact with a service provider.

type OAuth2

type OAuth2 struct {
    // The ID of the service must be recognized
    // by the OAuth2 app configuration.
    ProviderID string

    // The list of scopes to ask for during auth.
    Scopes []string
}

OAuth2 defines which OAuth2 provider a service uses and which scopes it requires.

type Options

type Options struct {
    // A file from which to read the data.
    Filename string

    // Time bounds on which data to retrieve.
    // The respective time and item ID fields
    // which are set must never conflict.
    Timeframe Timeframe

    // A checkpoint from which to resume
    // item retrieval.
    Checkpoint []byte
}

Options specifies parameters for listing items from a data source. Some data sources might not be able to honor all fields.

type Person

type Person struct {
    ID         int64
    Name       string
    Identities []PersonIdentity
}

Person represents a person.

type PersonIdentity

type PersonIdentity struct {
    ID           int64
    PersonID     string
    DataSourceID string
    UserID       string
}

PersonIdentity is a way to map a user ID on a service to a person.

type RateLimit

type RateLimit struct {
    RequestsPerHour int
    BurstSize       int
    // contains filtered or unexported fields
}

RateLimit describes a rate limit.

type RawRelation

type RawRelation struct {
    FromItemID       string
    ToItemID         string
    FromPersonUserID string
    ToPersonUserID   string
    Relation
}

RawRelation represents a relationship between two items or people (or both) from the same data source (but not necessarily the same accounts; we assume that a data source's item IDs are globally unique across accounts). The item IDs should be those which are assigned/provided by the data source, NOT a database row ID. Likewise, the persons' user IDs should be the IDs of the user as associated with the data source, NOT their row IDs.

type Relation

type Relation struct {
    Label         string
    Bidirectional bool
}

Relation describes how two nodes in a graph are related. It's essentially an edge on a graph.

type Timeframe

type Timeframe struct {
    Since, Until             *time.Time
    SinceItemID, UntilItemID *string
}

Timeframe represents a start and end time and/or a start and end item, where any value may be nil, which means unbounded in that direction. When items are used as the timeframe boundaries, the ItemID fields will be populated. It is not guaranteed that any particular field will be set or unset just because other fields are set or unset. However, if both Since fields or both Until fields are set, that means the timestamp and item are correlated; i.e. the Since timestamp is (approx.) that of the Since item ID. Or, put another way: there will never be conflicts among the fields which are non-nil.

type Timeline

type Timeline struct {
    // contains filtered or unexported fields
}

Timeline represents an opened timeline repository. The zero value is NOT valid; use Open() to obtain a valid value.

func Open

func Open(repo string) (*Timeline, error)

Open creates/opens a timeline at the given repository directory. Timelines should always be Close()'d for a clean shutdown when done.

func (*Timeline) AddAccount

func (t *Timeline) AddAccount(dataSourceID, userID string) error

AddAccount authenticates userID with the service identified within the application by dataSourceID, and then stores it in the database. The account must not yet exist.

func (*Timeline) Authenticate

func (t *Timeline) Authenticate(dataSourceID, userID string) error

Authenticate gets authentication for userID with dataSourceID. If the account already exists in the database, it will be updated with the latest authorization.

func (*Timeline) Close

func (t *Timeline) Close() error

Close frees up resources allocated from Open.

func (*Timeline) NewClient

func (t *Timeline) NewClient(dataSourceID, userID string) (WrappedClient, error)

NewClient returns a new Client that is ready to interact with the data source for the account uniquely specified by the data source ID and the user ID for that data source. The Client is actually wrapped by a type with unexported fields that are necessary for internal use.

type WrappedClient

type WrappedClient struct {
    Client
    // contains filtered or unexported fields
}

WrappedClient wraps a Client instance with unexported fields that contain necessary state for performing data collection operations. Do not craft this type manually; use Timeline.NewClient() to obtain one.

func (*WrappedClient) DataSourceID

func (wc *WrappedClient) DataSourceID() string

DataSourceID returns the ID of the data source wc was created from.

func (*WrappedClient) DataSourceName

func (wc *WrappedClient) DataSourceName() string

DataSourceName returns the name of the data source wc was created from.

func (*WrappedClient) GetAll

func (wc *WrappedClient) GetAll(ctx context.Context, reprocess, prune, integrity bool) error

GetAll gets all the items using wc. If reprocess is true, items that are already in the timeline will be re-processed. If prune is true, items that are not listed on the data source by wc will be removed from the timeline at the end of the listing. If integrity is true, all items that are listed by wc that exist in the timeline and which consist of a data file will be opened and checked for integrity; if the file has changed, it will be reprocessed.

func (*WrappedClient) GetLatest

func (wc *WrappedClient) GetLatest(ctx context.Context) error

GetLatest gets the most recent items from wc. It does not prune or reprocess; only meant for a quick pull. If there are no items pulled yet, all items will be pulled.

func (*WrappedClient) Import

func (wc *WrappedClient) Import(ctx context.Context, filename string, reprocess, prune, integrity bool) error

Import is like GetAll but for a locally-stored archive or export file that can simply be opened and processed, rather than needing to run over a network. See the godoc for GetAll. This is only for data sources that support Import.

func (*WrappedClient) UserID

func (wc *WrappedClient) UserID() string

UserID returns the ID of the user associated with this client.

Directories

Path                        Synopsis
datasources/facebook        Package facebook implements the Facebook service using the Graph API: https://developers.facebook.com/docs/graph-api
datasources/googlelocation  Package googlelocation implements a Timeliner data source for importing data from the Google Location History (aka Google Maps Timeline).
datasources/googlephotos    Package googlephotos implements the Google Photos service using its API, documented at https://developers.google.com/photos/.
datasources/instagram       Package instagram implements a Timeliner data source for importing data from Instagram archive files.
datasources/twitter         Package twitter implements a Timeliner service for importing and downloading data from Twitter.
oauth2client
oauth2client/oauth2proxy

Package timeliner imports 23 packages and is imported by 5 packages. Updated 2019-07-15.