phish-food

module
v0.3.3
Published: Dec 11, 2021 License: GPL-3.0

README

PhishFood

Like the Ben & Jerry's Ice Cream

About

PhishFood is the source code for the ETL pipeline and associated cloud infrastructure required to produce the TheKettle database. This project aims to provide quality, reliable data that an end user can have confidence in. At its core, the pipeline and database attempt to collect and summarize what is "Hot" on Reddit's most popular trading subreddits.

Below is an example database entry (NoSQL version):

{
    "id": "wallstreetbets_20210408",
    "hour": 18,
    "data": [
        {
            "Stock": {
                "Symbol": "GME",
                "FullName": "GameStop Corporation Common Stock",
                "Exchange": "NYSE"
            },
            "Count": {
                "PostScore": 16852,
                "CommentScore": 3306,
                "TotalScore": 975.1710719570256,
                "PostMentions": 2,
                "CommentMentions": 50
            }
        }
    ]
}
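
For readers consuming the NoSQL entries from Go, the example above could be decoded roughly as follows. This is a minimal sketch, not the project's actual code: the type names (Entry, Item, Stock, Count) and the helpers ParseEntry and SplitID are hypothetical, the field types are inferred from the single sample entry, and the "<subreddit>_<YYYYMMDD>" reading of the id is an assumption based on that sample.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
	"time"
)

// Stock mirrors the "Stock" object in the sample entry (hypothetical type).
type Stock struct {
	Symbol   string `json:"Symbol"`
	FullName string `json:"FullName"`
	Exchange string `json:"Exchange"`
}

// Count mirrors the "Count" object; numeric types are inferred from the sample.
type Count struct {
	PostScore       int     `json:"PostScore"`
	CommentScore    int     `json:"CommentScore"`
	TotalScore      float64 `json:"TotalScore"`
	PostMentions    int     `json:"PostMentions"`
	CommentMentions int     `json:"CommentMentions"`
}

// Item is one element of the "data" array.
type Item struct {
	Stock Stock `json:"Stock"`
	Count Count `json:"Count"`
}

// Entry is one database document.
type Entry struct {
	ID   string `json:"id"`
	Hour int    `json:"hour"`
	Data []Item `json:"data"`
}

// ParseEntry decodes a single JSON document into an Entry.
func ParseEntry(raw []byte) (Entry, error) {
	var e Entry
	if err := json.Unmarshal(raw, &e); err != nil {
		return Entry{}, fmt.Errorf("decode entry: %w", err)
	}
	return e, nil
}

// SplitID splits an id at its last underscore into a subreddit name and a date.
// Assumption: ids follow "<subreddit>_<YYYYMMDD>", as in the sample entry.
func SplitID(id string) (string, time.Time, error) {
	i := strings.LastIndex(id, "_")
	if i < 0 {
		return "", time.Time{}, fmt.Errorf("id %q has no underscore", id)
	}
	day, err := time.Parse("20060102", id[i+1:])
	if err != nil {
		return "", time.Time{}, fmt.Errorf("id %q: %w", id, err)
	}
	return id[:i], day, nil
}

func main() {
	// Shortened copy of the sample entry shown above.
	raw := []byte(`{"id":"wallstreetbets_20210408","hour":18,"data":[
		{"Stock":{"Symbol":"GME","FullName":"GameStop Corporation Common Stock","Exchange":"NYSE"},
		 "Count":{"PostScore":16852,"CommentScore":3306,"TotalScore":975.1710719570256,
		          "PostMentions":2,"CommentMentions":50}}]}`)

	entry, err := ParseEntry(raw)
	if err != nil {
		panic(err)
	}
	subreddit, day, err := SplitID(entry.ID)
	if err != nil {
		panic(err)
	}
	for _, item := range entry.Data {
		fmt.Printf("%s %s hour %d: %s total score %.2f\n",
			subreddit, day.Format("2006-01-02"), entry.Hour,
			item.Stock.Symbol, item.Count.TotalScore)
	}
}
```

Splitting at the last underscore, rather than the first, keeps the sketch usable for subreddit names that themselves contain underscores.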

Currently there are 3 supported subreddits:

  • stocks
  • wallstreetbets
  • investing

Get the SQLite Version of the database here

Why?

It was widely reported during the GameStop hype that hedge funds were setting up or buying applications to scrape Reddit for the latest trending stock data. I thought it would be helpful to a retail trader to have access to the same data.

Directories

Path Synopsis
cmd
internal
etl
stocks
TODO: This needs refactoring
TODO: This needs refactoring
