fetch.cli

command
v0.0.0-...-d33463d
Published: Jun 12, 2020 License: BSD-3-Clause Imports: 12 Imported by: 0

Documentation

Overview

The Fetcher CLI of Dataflow kit downloads HTML content from web pages via the Fetcher service endpoint.

Two types of fetcher are currently available: the Headless Chrome Fetcher and the Base Fetcher.

The Base Fetcher downloads an HTML page using the Go standard library's net/http package.

The Chrome Fetcher connects to Headless Chrome, which renders JavaScript-driven pages.
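
To make the Base Fetcher idea concrete, here is a minimal sketch of downloading a page with net/http. It is not the Dataflow kit implementation: fetchHTML and newDemoServer are hypothetical names introduced for illustration, and the local test server only exists so the example runs without network access.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetchHTML is a hypothetical helper (not part of Dataflow kit) that
// downloads a page body with a plain HTTP GET. No JavaScript is executed,
// which is the limitation the Chrome Fetcher addresses.
func fetchHTML(client *http.Client, url string) (string, error) {
	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status: %s", resp.Status)
	}

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

// newDemoServer starts a local server that returns a fixed HTML page,
// so the example does not depend on an external site.
func newDemoServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "<html><body>hello</body></html>")
	}))
}

func main() {
	srv := newDemoServer()
	defer srv.Close()

	// A timeout keeps a slow page from blocking the caller indefinitely.
	client := &http.Client{Timeout: 10 * time.Second}

	html, err := fetchHTML(client, srv.URL)
	if err != nil {
		fmt.Println("fetch failed:", err)
		return
	}
	fmt.Println(html)
}
```

A page whose content is produced by client-side scripts would come back from such a request without that content, which is the case where the Chrome Fetcher is the better choice.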

Accessing Fetcher endpoints

Examples

./fetch.cli --URL http://example.com
./fetch.cli --URL http://example.com --FETCHER_TYPE base
./fetch.cli -u http://example.com -t base
./fetch.cli -u http://example.com -t chrome

Flags and configuration settings

DFK_FETCH: HTTP listen address of Fetch service (defaults to "127.0.0.1:8000")
FETCHER_TYPE: DFK Fetcher type: chrome, base (defaults to base)
URL: URL to be fetched
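
The settings above could be resolved roughly as in the following sketch, which uses only the standard flag package. How fetch.cli actually parses its configuration is not shown on this page, so the Config type, loadConfig, envOr, and the precedence order (flag, then environment variable, then default) are assumptions made for illustration. The defaults and the flag names are taken from the list and examples above.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
	"os"
)

// Config is a hypothetical container for the documented settings.
type Config struct {
	FetchAddr   string // DFK_FETCH
	FetcherType string // FETCHER_TYPE: "base" or "chrome"
	URL         string // URL
}

// envOr returns the environment value for key, or fallback when unset.
func envOr(getenv func(string) string, key, fallback string) string {
	if v := getenv(key); v != "" {
		return v
	}
	return fallback
}

// loadConfig resolves settings with an assumed precedence of
// command-line flag > environment variable > documented default.
func loadConfig(args []string, getenv func(string) string) (Config, error) {
	var cfg Config
	fs := flag.NewFlagSet("fetch.cli", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep parse errors out of stderr in this sketch

	fs.StringVar(&cfg.FetchAddr, "DFK_FETCH",
		envOr(getenv, "DFK_FETCH", "127.0.0.1:8000"), "address of the Fetch service")

	typeDefault := envOr(getenv, "FETCHER_TYPE", "base")
	fs.StringVar(&cfg.FetcherType, "FETCHER_TYPE", typeDefault, "fetcher type: chrome, base")
	fs.StringVar(&cfg.FetcherType, "t", typeDefault, "shorthand for FETCHER_TYPE")

	fs.StringVar(&cfg.URL, "URL", "", "URL to be fetched")
	fs.StringVar(&cfg.URL, "u", "", "shorthand for URL")

	if err := fs.Parse(args); err != nil {
		return Config{}, err
	}
	if cfg.FetcherType != "base" && cfg.FetcherType != "chrome" {
		return Config{}, fmt.Errorf("unknown fetcher type %q", cfg.FetcherType)
	}
	if cfg.URL == "" {
		return Config{}, errors.New("URL is required")
	}
	return cfg, nil
}

func main() {
	// Fixed arguments mirror one of the documented invocations.
	cfg, err := loadConfig([]string{"-u", "http://example.com", "-t", "chrome"}, os.Getenv)
	if err != nil {
		fmt.Println("configuration error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```

Passing getenv as a parameter keeps the function easy to exercise without touching the real process environment.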

Request parameters for fetchers

FORMDATA: a string of form data parameters to send with the request.
For example, the following string may be passed when fetching pages
that require authentication:
"auth_key=880ea6a14ea49e853634fbdc5015a024&referer=http%3A%2F%2Fexample.com%2F&ips_username=user&ips_password=userpassword&rememberMe=1"
COOKIES: cookies to be added to the request before it is sent
to the browser.
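
The FORMDATA string is in URL-encoded form, so it can be decoded with the standard net/url package. The sketch below shows what the example string contains once decoded and how it might be attached to a POST request. parseFormData and buildFormRequest are hypothetical helpers, and sending the data as a POST body is an assumption; how the Fetcher service itself submits form data is not described on this page.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// parseFormData decodes a URL-encoded form string into key/value pairs.
func parseFormData(formData string) (url.Values, error) {
	values, err := url.ParseQuery(formData)
	if err != nil {
		return nil, fmt.Errorf("invalid form data: %w", err)
	}
	return values, nil
}

// buildFormRequest creates a POST request carrying the form data in its
// body. Using POST here is an assumption made for illustration.
func buildFormRequest(target, formData string) (*http.Request, error) {
	values, err := parseFormData(formData)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, target, strings.NewReader(values.Encode()))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	return req, nil
}

func main() {
	formData := "auth_key=880ea6a14ea49e853634fbdc5015a024" +
		"&referer=http%3A%2F%2Fexample.com%2F" +
		"&ips_username=user&ips_password=userpassword&rememberMe=1"

	values, err := parseFormData(formData)
	if err != nil {
		fmt.Println(err)
		return
	}
	// Percent-encoded values are decoded: %3A%2F%2F becomes "://".
	fmt.Println("referer:", values.Get("referer"))
	fmt.Println("username:", values.Get("ips_username"))

	req, err := buildFormRequest("http://example.com", formData)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(req.Method, req.URL, req.Header.Get("Content-Type"))
}
```

Decoding first and re-encoding with values.Encode() normalizes the string, so a malformed value is reported before any request is sent.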
