scrape

package
Version: v0.0.0-...-631d120
Warning

This package is not in the latest version of its module.

Published: Apr 21, 2024 | License: MIT | Imports: 15 | Imported by: 2

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func RawEach

func RawEach(s *goquery.Selection) (a []*goquery.Selection)

RawEach returns the selections in s as a slice, which is useful with a for-range loop.

func RecursiveChildFiltered

func RecursiveChildFiltered(s *goquery.Selection, filters ...string) *goquery.Selection

Types

type Scraper

type Scraper struct {
	Cookies []*network.CookieParam // required: Name, Value, Domain: ".ope.ee"
	Timeout time.Duration          // 0: disabled

	// Only works after InitScraper()
	InitExtraAllocatorOpts    []chromedp.ExecAllocatorOption
	InitGlobalConcurrentLimit int

	Ctx context.Context
	// contains filtered or unexported fields
}

Use InitScraper to initialize the internal values of a Scraper.

func InitScraper

func InitScraper(ctx context.Context, raw *Scraper) *Scraper

InitScraper populates the internal values of a Scraper.

func (*Scraper) DoRaw

func (s *Scraper) DoRaw(urlS, method string, data []byte) (body []byte, _ error)

func (*Scraper) DownloadFile

func (s *Scraper) DownloadFile(urlS, outdir string) (suggested, filename, newURL string, _ error)

DownloadFile is based on https://github.com/chromedp/examples/blob/3384adb2158f6df7e6a48458875a3a5f24aea0c3/download_file/main.go

A timeout of 0 disables the timeout.

func (*Scraper) Get

func (s *Scraper) Get(urlS, sel string) (_ *goquery.Selection, newURL string, _ error)

sel is a goquery selector.

func (*Scraper) GetRaw

func (s *Scraper) GetRaw(urlS string) ([]byte, error)

type ScraperOpts

type ScraperOpts struct {
	Executable string
}
