pathgtfsrt

Version: v0.0.0-...-3018eca
Published: Feb 19, 2024 License: MIT Imports: 20 Imported by: 0

README

GTFS Realtime for the PATH Train

This repository hosts a simple Go application that reads PATH train realtime data from Matt Razza's public API and outputs a feed of the data in the GTFS Realtime format. Some important notes:

  • You don't need to run the application yourself. The GTFS Realtime feed produced by this software can be accessed at https://path.transitdata.nyc/gtfsrt. It's updated every 5 seconds.

  • The output is compatible with the official GTFS Static data published by the Port Authority: the stop IDs and route IDs match. The feed should therefore work correctly with software that integrates realtime and static data.

  • Unfortunately the Port Authority doesn't distribute the full realtime data set, and so the GTFS Realtime feed has some big missing pieces:

    • There is no trip data: all the Port Authority communicates are stops, and arrival times at those stops. There is no easy way to connect arrival times for the same train at multiple stops. So, in the GTFS Realtime feed, the "trips" are dummy trips with a random ID and a single stop time update. This should be sufficient for consumers that want to show arrival times at stops, but of course prevents other uses like tracking trains through the system.
    • The GTFS Static feed describes all the tracks/platforms at each of the PATH stations, but in the realtime data we don't know which platform a train will stop at. In the realtime feed, all of the trains stop at the "station" stop (i.e., the stop in the static feed with location type 1).

Running the application

The application is an HTTP server with the GTFS Realtime feed available at the /gtfsrt path.

There are 2 options for the data source to use for PATH arrival times:

  1. The path-data API (default), which fetches the data that the RidePATH app uses.
  2. The PANYNJ JSON API, which powers the PATH train schedules web page.

In the background, the program periodically retrieves data from the selected API and updates the feed. By default, this update occurs every 5 seconds for the path-data API and every 15 seconds for the PANYNJ JSON API.

The binary accepts the following flags:

  • --port <int>: the port to bind the HTTP server to (default 8080)

  • --timeout_period <duration>: the maximum duration to wait for a response from the source API (default 5s)

  • --update_period <duration>: how often to update the feed (default 5s for the path-data API, 15s for the PANYNJ API). Remember that the more frequently you update, the more stress you place on the source API, so be nice.

  • --use_http_source_api: use the HTTP path-data API instead of the default gRPC API.

  • --use_panynj_api: use the PANYNJ JSON API instead of the path-data API.

Running using Docker

The CI process (using GitHub Actions) builds a Docker image and stores it at the jamespfennell/path-train-gtfs-realtime:latest tag on Docker Hub. You can also build the Docker image locally by running docker build . in the root of the repo.

It is generally simplest to run the application using Docker. The only thing you need to do is port forward the HTTP server's port outside of the container. This is a functioning Docker compose configuration that does this:

version: '3.5'

services:
  path-train-gtfs-realtime:
    image: jamespfennell/path-train-gtfs-realtime:latest
    ports:
      - "8080:8080"
    restart: always

Running using go run

When doing dev work it is generally necessary to run the application on "bare metal", which you can do simply with go run cmd/pathgtfsrt.go.

The source gRPC API and the GTFS Realtime format are both built on proto files. Getting these proto files and compiling them to Go files is a bit of a pain, so the generated files are kept in source control. To regenerate them, it's probably simplest to use the Docker build process.

Error handling and exit codes

A number of errors can prevent the application from running 100% correctly, with the main source of errors being network failures when hitting the source API. At start-up, the application downloads static and realtime data from the API; if this fails, the application will exit.

After start-up, any further errors encountered are handled gracefully, and the server will not exit until interrupted. If, during a particular update, the realtime data for a specific stop cannot be retrieved, or is malformed, then the previously retrieved data will be used.

Monitoring

The application exports metrics in Prometheus format on the /metrics endpoint. See cmd/pathgtfsrt.go for the metric definitions.

Licence notes

  • All the code in the root directory of the repo is released under the MIT License (see LICENSE).

  • The proto files in the sourceapi directory are sourced from the mrazza/path-data GitHub repo, are released under the MIT License and are copyright Matthew Razza.

  • The proto files in the gtfsrt directory are sourced from the google/transit GitHub repo, are released under the Apache License 2.0 and are copyright Google Inc.

  • My understanding is that the proto copyrights extend to the compiled Go files.

Documentation

Overview

Package pathgtfsrt contains a GTFS realtime feed generator for the PATH train.

Constants

const (
	ViaHobokenSuffix = "via hoboken"
)

Variables

var BuildNumber string

Set via flags on Go build

Functions

This section is empty.

Types

type Destination

type Destination struct {
	Label    string    `json:"label"`
	Messages []Message `json:"messages"`
}

Destination is a labeled direction with associated trains

type Feed

type Feed struct {
	// contains filtered or unexported fields
}

Feed periodically generates GTFS Realtime data for the PATH train and makes it available through the `Get` method.

Feed also satisfies the http.Handler interface, and simply responds to all requests with the most recent GTFS realtime data.
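
The pattern of a type that stores the latest bytes and also acts as an http.Handler can be sketched with the standard library alone. The latestBytes type and the serveOnce helper below are hypothetical stand-ins, not the package's Feed; the sketch only shows the shape of the idea.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// latestBytes is a hypothetical stand-in for Feed: it stores the most recent
// serialized feed and serves it in response to every request.
type latestBytes struct {
	mu   sync.RWMutex
	data []byte
}

// Set replaces the stored data.
func (l *latestBytes) Set(b []byte) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.data = b
}

// Get returns the most recently stored data.
func (l *latestBytes) Get() []byte {
	l.mu.RLock()
	defer l.mu.RUnlock()
	return l.data
}

// ServeHTTP ignores the request details and writes the latest data.
func (l *latestBytes) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	_, _ = w.Write(l.Get())
}

// serveOnce sends one in-memory GET request to h and returns the body.
func serveOnce(h http.Handler, path string) string {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Body.String()
}

func main() {
	feed := &latestBytes{}
	feed.Set([]byte("placeholder feed bytes"))

	// Mount the handler at the path used by the application.
	mux := http.NewServeMux()
	mux.Handle("/gtfsrt", feed)
	fmt.Println(serveOnce(mux, "/gtfsrt"))
}
```

With the real package, the value returned by NewFeed would presumably take the place of latestBytes in the mux.Handle call.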

func NewFeed

func NewFeed(ctx context.Context, clock clock.Clock, updatePeriod time.Duration, sourceClient SourceClient, callback UpdateCallback) (*Feed, error)

NewFeed creates a new feed.

This function gets static and realtime data from the source API and creates the first version of the GTFS realtime feed before returning. It then, in the background, periodically updates the realtime data following the provided update period.

After each update, including the first synchronous update, the provided callback is invoked.

func (*Feed) Get

func (f *Feed) Get() []byte

Get returns the most recent GTFS realtime data.

func (*Feed) ServeHTTP

func (f *Feed) ServeHTTP(w http.ResponseWriter, r *http.Request)

ServeHTTP responds to all requests with the most recent GTFS realtime data.

type GrpcSourceClient

type GrpcSourceClient struct {
	// contains filtered or unexported fields
}

GrpcSourceClient is a source client that gets data using the Razza gRPC API.

func NewGrpcSourceClient

func NewGrpcSourceClient(timeoutPeriod time.Duration) (*GrpcSourceClient, error)

func (*GrpcSourceClient) Close

func (client *GrpcSourceClient) Close() error

func (*GrpcSourceClient) GetRouteToRouteId

func (client *GrpcSourceClient) GetRouteToRouteId(ctx context.Context) (routeToRouteId map[sourceapi.Route]string, err error)

func (*GrpcSourceClient) GetStationToStopId

func (client *GrpcSourceClient) GetStationToStopId(ctx context.Context) (stationToStopId map[sourceapi.Station]string, err error)

func (*GrpcSourceClient) GetTrainsAtStation

func (client *GrpcSourceClient) GetTrainsAtStation(ctx context.Context, station sourceapi.Station) ([]Train, error)

type HttpClient

type HttpClient interface {
	Get(url string) (resp *http.Response, err error)
}
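
Since the standard library's *http.Client has a Get method with this signature, it satisfies HttpClient, and a small fake can stand in for it in tests. The sketch below copies the interface from above; cannedClient, fetchBody, and the example URL are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"strings"
)

// HttpClient is copied from the package documentation.
type HttpClient interface {
	Get(url string) (resp *http.Response, err error)
}

// Compile-time check that *http.Client satisfies the interface.
var _ HttpClient = (*http.Client)(nil)

// cannedClient is a hypothetical test double that returns a fixed body for
// every URL.
type cannedClient struct {
	body string
}

func (c cannedClient) Get(url string) (*http.Response, error) {
	return &http.Response{
		StatusCode: http.StatusOK,
		Body:       io.NopCloser(strings.NewReader(c.body)),
	}, nil
}

// fetchBody reads the whole response body using any HttpClient.
func fetchBody(client HttpClient, url string) (string, error) {
	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	b, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// The URL is a placeholder; cannedClient never touches the network.
	body, err := fetchBody(cannedClient{body: `{"results":[]}`}, "https://example.invalid/arrivals")
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```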

type HttpSourceClient

type HttpSourceClient struct {
	// contains filtered or unexported fields
}

HttpSourceClient is a source client that gets data using the Razza HTTP API.

func NewHttpSourceClient

func NewHttpSourceClient(httpClient HttpClient) *HttpSourceClient

func (*HttpSourceClient) GetRouteToRouteId

func (client *HttpSourceClient) GetRouteToRouteId(_ context.Context) (map[sourceapi.Route]string, error)

func (*HttpSourceClient) GetStationToStopId

func (client *HttpSourceClient) GetStationToStopId(_ context.Context) (map[sourceapi.Station]string, error)

func (*HttpSourceClient) GetTrainsAtStation

func (client *HttpSourceClient) GetTrainsAtStation(_ context.Context, station sourceapi.Station) ([]Train, error)

type Message

type Message struct {
	Target             string `json:"target"`
	SecondsToArrival   string `json:"secondsToArrival"`
	ArrivalTimeMessage string `json:"arrivalTimeMessage"`
	LineColor          string `json:"lineColor"`
	HeadSign           string `json:"headSign"`
	LastUpdated        string `json:"lastUpdated"`
}

Message contains information about a single train

type PaNyNjClient

type PaNyNjClient struct {
	// contains filtered or unexported fields
}

PaNyNjClient is a source client that gets data from the Port Authority of New York and New Jersey. This API powers the official realtime schedules on the PATH website: https://www.panynj.gov/path/en/index.html

func NewPaNyNjSourceClient

func NewPaNyNjSourceClient(httpClient HttpClient, clock clock.Clock) *PaNyNjClient

func (*PaNyNjClient) GetRouteToRouteId

func (client *PaNyNjClient) GetRouteToRouteId(_ context.Context) (map[sourceapi.Route]string, error)

func (*PaNyNjClient) GetStationToStopId

func (client *PaNyNjClient) GetStationToStopId(_ context.Context) (map[sourceapi.Station]string, error)

func (*PaNyNjClient) GetTrainsAtStation

func (client *PaNyNjClient) GetTrainsAtStation(_ context.Context, station sourceapi.Station) ([]Train, error)

type Result

type Result struct {
	ConsideredStation string        `json:"consideredStation"`
	Destinations      []Destination `json:"destinations"`
}

Result represents all destinations for a given station

type RidePathResponse

type RidePathResponse struct {
	Results []Result `json:"results"`
}

RidePathResponse contains information about all incoming trains at all stations
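
To show how the Message, Destination, Result and RidePathResponse structs fit together, here is a sketch that decodes a payload with encoding/json and converts the string-typed secondsToArrival field to an integer. The struct definitions are copied from this page; the sample payload is invented for illustration (it is not captured from a real API), and arrivalSeconds is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

type Message struct {
	Target             string `json:"target"`
	SecondsToArrival   string `json:"secondsToArrival"`
	ArrivalTimeMessage string `json:"arrivalTimeMessage"`
	LineColor          string `json:"lineColor"`
	HeadSign           string `json:"headSign"`
	LastUpdated        string `json:"lastUpdated"`
}

type Destination struct {
	Label    string    `json:"label"`
	Messages []Message `json:"messages"`
}

type Result struct {
	ConsideredStation string        `json:"consideredStation"`
	Destinations      []Destination `json:"destinations"`
}

type RidePathResponse struct {
	Results []Result `json:"results"`
}

// samplePayload is made up; every value in it is a placeholder.
const samplePayload = `{
  "results": [{
    "consideredStation": "STATION_A",
    "destinations": [{
      "label": "Direction 1",
      "messages": [{
        "target": "TARGET_X",
        "secondsToArrival": "125",
        "arrivalTimeMessage": "2 min",
        "lineColor": "FF0000",
        "headSign": "Example Terminal",
        "lastUpdated": "2024-02-19T12:00:00Z"
      }]
    }]
  }]
}`

// arrivalSeconds decodes a payload and returns, for each station, the
// seconds until arrival of every message.
func arrivalSeconds(payload []byte) (map[string][]int, error) {
	var resp RidePathResponse
	if err := json.Unmarshal(payload, &resp); err != nil {
		return nil, err
	}
	out := make(map[string][]int)
	for _, result := range resp.Results {
		for _, dest := range result.Destinations {
			for _, msg := range dest.Messages {
				secs, err := strconv.Atoi(msg.SecondsToArrival)
				if err != nil {
					return nil, fmt.Errorf("station %s: bad secondsToArrival %q: %w",
						result.ConsideredStation, msg.SecondsToArrival, err)
				}
				out[result.ConsideredStation] = append(out[result.ConsideredStation], secs)
			}
		}
	}
	return out, nil
}

func main() {
	secs, err := arrivalSeconds([]byte(samplePayload))
	if err != nil {
		panic(err)
	}
	fmt.Println(secs["STATION_A"])
}
```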

type SourceClient

type SourceClient interface {
	// Return a map from source API station code to GTFS static stop ID
	GetStationToStopId(context.Context) (map[sourceapi.Station]string, error)
	// Return a map from source API route code to GTFS static route ID
	GetRouteToRouteId(context.Context) (map[sourceapi.Route]string, error)
	// List all upcoming trains at a station
	GetTrainsAtStation(context.Context, sourceapi.Station) ([]Train, error)
}

SourceClient describes the methods that the feed generator requires from the source API in order to build the feed.

type Train

Train contains data about a PATH train at a specific station.

type UpdateCallback

type UpdateCallback func(msg *gtfs.FeedMessage, requestErrs []error)

UpdateCallback is the type of callback that the feed runs after each update.

The first argument is the GTFS realtime message that was just built. The second argument is the list of all errors that occurred when getting realtime data from the source API.

Directories

Path Synopsis
proto
