libsuger

package
v0.0.0-...-9d82ea4
Published: Jan 4, 2016 | License: BSD-2-Clause | Imports: 9 | Imported by: 0

Documentation

Overview

Libsuger is a micro-library for the suger executable. Suger is a tool that crawls and scrapes the film classification database of Singapore's Media Development Authority (MDA).

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Crawler

type Crawler struct {
	http.Client
	// contains filtered or unexported fields
}

Crawler is a type that embeds an http.Client and holds state information.

func NewCrawler

func NewCrawler() (*Crawler, error)

NewCrawler returns a pointer to a new Crawler.

func (*Crawler) Crawl

func (c *Crawler) Crawl(j Job, results chan<- Result, jobs chan<- Job)

The Crawl method takes a Job and two channels. Results are sent on the results channel as they are crawled. The Job is sent on the jobs channel when an error occurs or when the Job is done.
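
The two-channel contract described above can be sketched without the real crawler. The block below is only an illustration under stated assumptions: job, result, fakeCrawl, and run are hypothetical stand-ins, not part of libsuger, and the failure is simulated rather than caused by a network request.

```go
package main

import (
	"errors"
	"fmt"
)

// job and result are local stand-ins for libsuger's Job and Result.
// Their fields are assumptions made for this sketch only.
type job struct {
	start, count int
	err          error
}

type result struct {
	row int
}

// fakeCrawl mimics the documented contract of Crawl: it sends each result
// on results and then reports the job on jobs, either finished or carrying
// an error. A failAt value greater than zero simulates a failure at that row.
func fakeCrawl(j job, failAt int, results chan<- result, jobs chan<- job) {
	for row := j.start; row < j.start+j.count; row++ {
		if row == failAt {
			j.err = errors.New("simulated fetch failure")
			break
		}
		results <- result{row: row}
	}
	jobs <- j
}

// run starts one worker and collects everything it reports until the
// job comes back on the jobs channel.
func run(j job, failAt int) ([]result, job) {
	results := make(chan result)
	jobs := make(chan job)
	go fakeCrawl(j, failAt, results, jobs)

	var got []result
	for {
		select {
		case r := <-results:
			got = append(got, r)
		case done := <-jobs:
			return got, done
		}
	}
}

func main() {
	got, done := run(job{start: 1, count: 3}, 0)
	fmt.Println("results:", len(got), "error:", done.err)

	got, done = run(job{start: 1, count: 5}, 3)
	fmt.Println("results:", len(got), "error:", done.err)
}
```

A caller of the real Crawl method would presumably inspect the Error field of the returned Job to decide whether to retry it; that retry policy is not described in the documentation and is left out here.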

type Job

type Job struct {
	Error error
	// contains filtered or unexported fields
}

Job is a type that stores certain state information used by the Crawl method of the Crawler type. Its only exported field is Error, which contains the last error recorded by the Crawl method.

func NewJob

func NewJob(start int, count int) (Job, error)

NewJob creates a Job from the first result you want to crawl (start) and the number of results (count) that you want to crawl. It returns an error if either start or count is less than one.
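
The validation rule above can be illustrated with a small stand-in constructor. This is a sketch, not the library's code: the job type, its fields, and the error text are assumptions, since the real Job keeps its fields unexported.

```go
package main

import (
	"errors"
	"fmt"
)

// job is a hypothetical stand-in for libsuger's Job.
type job struct {
	start, count int
}

// newJob applies the documented rule: both start and count must be
// at least one, otherwise an error is returned.
func newJob(start, count int) (job, error) {
	if start < 1 || count < 1 {
		return job{}, errors.New("start and count must each be at least one")
	}
	return job{start: start, count: count}, nil
}

func main() {
	j, err := newJob(1, 100)
	fmt.Println(j, err)

	_, err = newJob(0, 100)
	fmt.Println(err)
}
```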

func (Job) IsDone

func (j Job) IsDone() bool

IsDone returns true if there are no more results to crawl (i.e., all have been successfully crawled).

func (Job) Partition

func (j Job) Partition(n int) ([]Job, error)

Partition returns a slice of non-overlapping Jobs of roughly equal count.
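
One way to split a range into non-overlapping pieces of roughly equal count is shown below. It is a sketch under assumptions: span and partition are hypothetical names, the argument n is taken to mean the number of pieces, and clamping n to the available count is a choice made here, not documented behaviour of the library.

```go
package main

import (
	"errors"
	"fmt"
)

// span is a hypothetical stand-in for a Job's range: the first result
// and the number of results.
type span struct {
	start, count int
}

// partition splits s into at most n consecutive, non-overlapping spans.
// The first count%n spans receive one extra item, so sizes differ by at
// most one.
func partition(s span, n int) ([]span, error) {
	if n < 1 || s.count < 1 {
		return nil, errors.New("n and count must each be at least one")
	}
	if n > s.count {
		n = s.count // assumption: never produce empty spans
	}
	base, extra := s.count/n, s.count%n
	parts := make([]span, 0, n)
	next := s.start
	for i := 0; i < n; i++ {
		size := base
		if i < extra {
			size++
		}
		parts = append(parts, span{start: next, count: size})
		next += size
	}
	return parts, nil
}

func main() {
	parts, err := partition(span{start: 1, count: 10}, 3)
	fmt.Println(parts, err)
}
```

Each piece could then be handed to its own goroutine, which is presumably why the real method exists, although the documentation does not say so.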

type Rating

type Rating struct {
	Rating   string
	Decision string
}

Rating is a simple type to hold a single rating (e.g. "No Children Under 16") and decision (e.g. "Passed Clean").

type Result

type Result struct {
	URL  string // get-able URL of result page
	HTML []byte // html of the result page
	Page int    // search result page the result was found on
	Row  int    // search result row the result was found on
}

Result is a type returned through a channel by the Crawl method of the Crawler type. It holds the HTML of a classification database title page.

type Title

type Title struct {
	Name    string
	Ratings []Rating
	URL     string
}

Title is a simple type to hold the Name, URL, and various Ratings for a title in the database.

func NewTitleFromHTML

func NewTitleFromHTML(html []byte) (*Title, error)

func (*Title) MaxRating

func (t *Title) MaxRating() (string, bool)

MaxRating returns the "highest" rating a title has been given. Its bool return value is false if the Title has no ratings (the comma-ok pattern).
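
The comma-ok return described above might look like the following sketch. The rating and title types are local stand-ins, and the order slice is an assumption: it uses the short Singapore classification codes from least to most restrictive, whereas the strings the library actually stores and how it ranks them are not documented here.

```go
package main

import "fmt"

// rating and title are hypothetical stand-ins for libsuger's Rating and Title.
type rating struct {
	Rating   string
	Decision string
}

type title struct {
	Ratings []rating
}

// order is an assumed ranking, least restrictive first.
var order = []string{"G", "PG", "PG13", "NC16", "M18", "R21"}

// maxRating returns the most restrictive known rating and true, or
// "" and false when the title has no ratings that appear in order.
func maxRating(t title) (string, bool) {
	best := -1
	for _, r := range t.Ratings {
		for i, name := range order {
			if r.Rating == name && i > best {
				best = i
			}
		}
	}
	if best < 0 {
		return "", false
	}
	return order[best], true
}

func main() {
	t := title{Ratings: []rating{{Rating: "PG"}, {Rating: "M18"}}}
	if r, ok := maxRating(t); ok {
		fmt.Println("highest rating:", r)
	}
	if _, ok := maxRating(title{}); !ok {
		fmt.Println("no ratings recorded")
	}
}
```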
