Published: Jan 24, 2020 License: Apache-2.0 Imports: 17 Imported by: 0

tweet-provider

Cloud Run Twitter search service, configured with service account identity, invoked by Cloud Scheduler, and persisting query state across invocations in Firestore. This service also publishes search results to Cloud PubSub for further consumption downstream, and records its tweet throughput as custom metrics in Stackdriver.

Prerequisites

GCP Project and gcloud SDK

If you don't have one already, start by creating a new project and configuring the Google Cloud SDK. Similarly, if you have not done so already, you will have to set up Cloud Run.

Twitter

To query the Twitter API you will need to obtain Twitter Consumer and OAuth tokens. Good instructions on how to obtain these are located here. Once you have them, export them as environment variables:

export T_CONSUMER_KEY="***"
export T_CONSUMER_SECRET="***"
export T_ACCESS_TOKEN="***"
export T_ACCESS_SECRET="***"
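
As a minimal sketch of how a Go service could consume these four variables at startup, the example below reads them and reports every missing one in a single error. The twitterCreds type and the loadTwitterCreds function are hypothetical names used for illustration only; they are not taken from this project's source.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// twitterCreds holds the four values exported above.
// Hypothetical type, not part of the tweet-provider source.
type twitterCreds struct {
	ConsumerKey    string
	ConsumerSecret string
	AccessToken    string
	AccessSecret   string
}

// loadTwitterCreds reads the T_* variables through getenv and returns
// an error naming every variable that is empty or unset.
func loadTwitterCreds(getenv func(string) string) (twitterCreds, error) {
	var missing []string
	get := func(name string) string {
		v := getenv(name)
		if v == "" {
			missing = append(missing, name)
		}
		return v
	}
	c := twitterCreds{
		ConsumerKey:    get("T_CONSUMER_KEY"),
		ConsumerSecret: get("T_CONSUMER_SECRET"),
		AccessToken:    get("T_ACCESS_TOKEN"),
		AccessSecret:   get("T_ACCESS_SECRET"),
	}
	if len(missing) > 0 {
		return twitterCreds{}, fmt.Errorf("missing environment variables: %s", strings.Join(missing, ", "))
	}
	return c, nil
}

func main() {
	if _, err := loadTwitterCreds(os.Getenv); err != nil {
		fmt.Println("configuration error:", err)
		return
	}
	fmt.Println("Twitter credentials loaded")
}
```

Passing the lookup function as a parameter keeps the check easy to exercise without touching the real process environment.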

Setup

Build Container Image

Cloud Run runs container images. To build one we are going to use the included Dockerfile and submit the build job to Cloud Build using the bin/image script.

Note: review each of the provided scripts to see the complete commands they run.

bin/image

Service Account and IAM Policies

In this example we are going to follow the principle of least privilege (POLP) to ensure our Cloud Scheduler job and Cloud Run service have only the necessary rights and nothing more:

  • run.invoker - required to execute Cloud Run service
  • pubsub.editor - required to create and publish to Cloud PubSub
  • datastore.user - required to create and write/read to Firestore collection
  • logging.logWriter - required for Stackdriver logging
  • cloudtrace.agent - required for Stackdriver tracing
  • monitoring.metricWriter - required to write custom metrics to Stackdriver

To do that we will create a GCP service account and assign the necessary IAM policies and roles using the bin/account script:

bin/account

Cloud Run Service

Once you have configured the GCP accounts, you can deploy a new Cloud Run service, set it to run under that account, and prevent unauthenticated access using the bin/service script:

bin/service

Notice the use of the Twitter environment variables as configuration values using the --set-env-vars argument.

Since Cloud Run services are stateless, we are going to store the service state (the last tweet ID, used as the starting point for subsequent searches) in a Firestore collection (by default twitter-query-state).
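
The idea of carrying the last tweet ID from one invocation to the next can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the stateStore interface, the in-memory memoryStore standing in for Firestore, the fakeSearch function standing in for the Twitter API, and runOnce are all hypothetical names.

```go
package main

import "fmt"

// tweet is a reduced, hypothetical view of a search result.
type tweet struct {
	ID   int64
	Text string
}

// stateStore abstracts where the last tweet ID is kept between
// invocations. The service described here uses Firestore; this
// interface is an assumption made for the sketch.
type stateStore interface {
	LastID(query string) (int64, error)
	SaveLastID(query string, id int64) error
}

// memoryStore is an in-memory stand-in for the Firestore collection.
type memoryStore map[string]int64

func (m memoryStore) LastID(q string) (int64, error)      { return m[q], nil }
func (m memoryStore) SaveLastID(q string, id int64) error { m[q] = id; return nil }

// searchFunc returns tweets whose ID is greater than sinceID.
type searchFunc func(query string, sinceID int64) ([]tweet, error)

// runOnce performs one invocation: load the saved ID, search from it,
// and persist the highest ID seen so the next run starts after it.
func runOnce(store stateStore, search searchFunc, query string) (int, error) {
	sinceID, err := store.LastID(query)
	if err != nil {
		return 0, err
	}
	tweets, err := search(query, sinceID)
	if err != nil {
		return 0, err
	}
	maxID := sinceID
	for _, t := range tweets {
		if t.ID > maxID {
			maxID = t.ID
		}
	}
	if maxID > sinceID {
		if err := store.SaveLastID(query, maxID); err != nil {
			return 0, err
		}
	}
	return len(tweets), nil
}

// fakeSearch stands in for the Twitter API with three fixed tweets.
func fakeSearch(query string, sinceID int64) ([]tweet, error) {
	all := []tweet{{101, "first"}, {102, "second"}, {103, "third"}}
	var out []tweet
	for _, t := range all {
		if t.ID > sinceID {
			out = append(out, t)
		}
	}
	return out, nil
}

func main() {
	store := memoryStore{}
	for i := 1; i <= 2; i++ {
		n, err := runOnce(store, fakeSearch, "serverless AND knative")
		if err != nil {
			fmt.Println("search failed:", err)
			return
		}
		fmt.Printf("invocation %d: %d new tweets\n", i, n)
	}
}
```

The first invocation reports 3 new tweets and the second reports 0, because the saved ID (103) excludes everything already seen.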

Cloud Scheduler

The Cloud Run service will search Twitter for the provided query. To invoke that service on a regular basis, we are going to configure Cloud Scheduler to execute it every 10 minutes using the bin/schedule script.

bin/schedule

You can change the search term used to query Twitter either in the Cloud Scheduler UI after the job is created, or by changing the --message-body parameter in the above command. By default, the query is:

{ "query": "serverless AND knative" }
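
For illustration, a service receiving this message body could decode and validate it along the following lines. This is a sketch, assuming the body carries a single "query" field as shown above; the searchRequest type and the parseSearchRequest function are hypothetical names, not taken from the project's source.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// searchRequest mirrors the JSON message body sent by the scheduler job.
// Hypothetical type used only for this sketch.
type searchRequest struct {
	Query string `json:"query"`
}

// parseSearchRequest decodes the body and rejects an empty query.
func parseSearchRequest(body []byte) (searchRequest, error) {
	var req searchRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return searchRequest{}, fmt.Errorf("invalid message body: %w", err)
	}
	if strings.TrimSpace(req.Query) == "" {
		return searchRequest{}, errors.New("message body has no query")
	}
	return req, nil
}

func main() {
	req, err := parseSearchRequest([]byte(`{ "query": "serverless AND knative" }`))
	if err != nil {
		fmt.Println("bad request:", err)
		return
	}
	fmt.Println(req.Query)
}
```

Rejecting an empty query up front avoids spending a Twitter API call on a misconfigured job.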

Monitoring

You can monitor the throughput of retrieved tweets in Stackdriver using the Metrics Explorer.

Cleanup

To clean up all resources created by this sample, execute the bin/cleanup script.

bin/cleanup

Disclaimer

This is my personal project and it does not represent my employer. I take no responsibility for issues caused by this code. I do my best to ensure that everything works, but if something goes wrong, my apologies are all you will get.
