package sqrape

import "github.com/cathalgarvey/sqrape"

Package sqrape provides a way to fill struct objects from raw HTML using CSS struct tags.


func ExtractHTMLReader

func ExtractHTMLReader(reader io.Reader, dest interface{}, context ...interface{}) (err error)

ExtractHTMLReader provides an entry point for parsing an HTML document, supplied as an io.Reader, into a destination struct.

func ExtractHTMLString

func ExtractHTMLString(document string, dest interface{}, context ...interface{}) error

ExtractHTMLString provides an entry point for parsing an HTML document, supplied as a string, into a destination struct.

type FieldSelecter

type FieldSelecter interface {
    SqrapeFieldSelect(fieldName string, context ...interface{}) (bool, error)
}

FieldSelecter is an optional method set. If it is defined, the method is called before data is collected for each field from the scraped content. It should return (true, nil) to fill that field, (false, nil) to skip that field, and (_, error) to cancel scraping. The context argument is entirely user-defined and is not Sqrape's business: it is the variadic optional argument given to Sqrape's entry functions, passed through exactly as received.
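The return-value contract described above can be illustrated with a minimal sketch. It does not import sqrape: the interface is redeclared locally from the signature shown above, and the method is called by hand to stand in for the scraper. The `Article` struct and the `skipFields` context type are hypothetical names invented for this illustration.

```go
package main

import "fmt"

// FieldSelecter is redeclared from the documented signature so that this
// sketch compiles on its own, without importing sqrape.
type FieldSelecter interface {
	SqrapeFieldSelect(fieldName string, context ...interface{}) (bool, error)
}

// Article is a hypothetical destination struct.
type Article struct {
	Title string
	Body  string
}

// skipFields is a hypothetical user-defined context type: the names of
// fields that should be left unfilled.
type skipFields map[string]bool

// SqrapeFieldSelect skips any field named in a skipFields context value.
// It returns an error (cancelling the scrape) if a context value has a
// type this struct does not understand.
func (a *Article) SqrapeFieldSelect(fieldName string, context ...interface{}) (bool, error) {
	for _, c := range context {
		skip, ok := c.(skipFields)
		if !ok {
			return false, fmt.Errorf("unexpected context type %T", c)
		}
		if skip[fieldName] {
			return false, nil // skip this field
		}
	}
	return true, nil // fill this field
}

// Compile-time check that *Article satisfies the interface.
var _ FieldSelecter = (*Article)(nil)

func main() {
	a := &Article{}
	ctx := skipFields{"Body": true}
	// Call the method directly, in place of the scraper.
	for _, name := range []string{"Title", "Body"} {
		fill, err := a.SqrapeFieldSelect(name, ctx)
		fmt.Println(name, fill, err)
	}
}
```

Because the context is opaque to Sqrape, the type assertion is the implementer's responsibility. Returning an error on an unknown type is one possible design choice; silently ignoring it would be another.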

type PostFlighter

type PostFlighter interface {
    SqrapePostFlight(context ...interface{}) error
}

PostFlighter is an optional method for performing post-scrape operations on a struct. For example, a scrape might pull a set of data into one field, and a post-flight operation might summarise those data and set another field. More specifically, a scraper might harvest view, favourite, and repost counts for social media posts, then summarise the three counts post-flight as "interactions". The context arguments are exactly as passed to Sqrape's entry functions, and are there for user-defined behaviours. This method is only called for finalised objects, so it is not called if an error or some other behaviour cancels the scrape.
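The "interactions" example above might look roughly like the following sketch. As before, it avoids importing sqrape: the interface is redeclared locally from the documented signature and the method is invoked by hand, where the library would call it after a successful scrape. The `Post` struct and its field names are hypothetical, and the negative-count check is an added assumption rather than documented behaviour.

```go
package main

import "fmt"

// PostFlighter is redeclared from the documented signature so that this
// sketch compiles on its own, without importing sqrape.
type PostFlighter interface {
	SqrapePostFlight(context ...interface{}) error
}

// Post is a hypothetical destination struct. The first three counts would
// be filled by the scrape; Interactions is derived afterwards.
type Post struct {
	Views        int
	Favourites   int
	Reposts      int
	Interactions int
}

// SqrapePostFlight summarises the three scraped counts into Interactions.
// The context arguments are unused in this sketch.
func (p *Post) SqrapePostFlight(context ...interface{}) error {
	if p.Views < 0 || p.Favourites < 0 || p.Reposts < 0 {
		return fmt.Errorf("negative count in post: %+v", *p)
	}
	p.Interactions = p.Views + p.Favourites + p.Reposts
	return nil
}

// Compile-time check that *Post satisfies the interface.
var _ PostFlighter = (*Post)(nil)

func main() {
	// Pretend these counts were just scraped.
	p := &Post{Views: 120, Favourites: 7, Reposts: 3}
	if err := p.SqrapePostFlight(); err != nil {
		fmt.Println("post-flight failed:", err)
		return
	}
	fmt.Println(p.Interactions) // prints 130
}
```

The method uses a pointer receiver so that the derived field is set on the struct being filled and not on a copy.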



