fisher

Version: v0.0.5
Published: May 27, 2021 | License: BSD-3-Clause | Imports: 14 | Imported by: 0

README

fisher

Crawling utilities.

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func GetSocks5Proxy

func GetSocks5Proxy(ctx context.Context, url string) (ret string, err error)

GetSocks5Proxy gets a SOCKS5 proxy address.

func RecordData

func RecordData(data interface{}) (err error)

RecordData writes a byte array or string that is already JSON-serialized, or an object that can be JSON-serialized.
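The contract described for RecordData (pre-serialized bytes and strings pass through, other values are JSON-encoded) can be sketched with the standard library alone. The helper name `toJSONBytes` is hypothetical and is not part of fisher; how the package actually dispatches on the argument type is not shown in this documentation, so treat this as an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// toJSONBytes is a hypothetical helper, not part of fisher. It mirrors the
// documented contract: byte slices and strings are treated as already
// serialized JSON, anything else goes through json.Marshal.
func toJSONBytes(data interface{}) ([]byte, error) {
	switch v := data.(type) {
	case []byte:
		return v, nil
	case string:
		return []byte(v), nil
	default:
		return json.Marshal(v)
	}
}

func main() {
	// item is an illustrative record type invented for this example.
	type item struct {
		Title string `json:"title"`
	}
	inputs := []interface{}{
		[]byte(`{"title":"a"}`), // already serialized bytes
		`{"title":"b"}`,         // already serialized string
		item{Title: "c"},        // value that still needs encoding
	}
	for _, in := range inputs {
		out, err := toJSONBytes(in)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(string(out))
	}
}
```

Note that in this sketch a string or byte slice is not validated as JSON; whether fisher validates it is not stated.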

func RunWithCrawler

func RunWithCrawler(crawler Crawler, opts ...chromedp.ExecAllocatorOption)

func SetStorageTarget

func SetStorageTarget(file *os.File)

To store data in a file, call SetStorageTarget first to set the file handle used for writing.

func StartStorageServer

func StartStorageServer(ctx context.Context)

StartStorageServer must be run in its own goroutine using the go keyword. Data can be stored only after this service has been started.
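The ordering described above (set a target, start the storage service in a goroutine, then record data) can be illustrated with a small stand-alone sketch. Everything here is an assumption made for illustration: `lineStore`, `serve`, `record`, and `run` are hypothetical names, the channel-based design is a guess at the general pattern rather than fisher's implementation, and an `io.Writer` replaces the `*os.File` target so the example runs without touching disk.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"io"
)

// lineStore is a hypothetical stand-in for a storage service; its internals
// are assumed, not taken from fisher.
type lineStore struct {
	target io.Writer
	queue  chan []byte
}

func newLineStore(target io.Writer) *lineStore {
	return &lineStore{target: target, queue: make(chan []byte)}
}

// serve drains the queue until ctx is cancelled, writing one record per line.
func (s *lineStore) serve(ctx context.Context) {
	for {
		select {
		case <-ctx.Done():
			return
		case rec := <-s.queue:
			fmt.Fprintf(s.target, "%s\n", rec)
		}
	}
}

// record hands a record to the serving goroutine. It blocks until serve
// receives it, which is why the service has to be running first.
func (s *lineStore) record(rec []byte) { s.queue <- rec }

// run wires the pieces together in the documented order and returns
// everything that was written to the target.
func run(records ...string) string {
	var buf bytes.Buffer
	store := newLineStore(&buf) // 1. set the write target

	ctx, cancel := context.WithCancel(context.Background())
	done := make(chan struct{})
	go func() { // 2. start the service in its own goroutine
		defer close(done)
		store.serve(ctx)
	}()

	for _, r := range records { // 3. record data
		store.record([]byte(r))
	}

	cancel() // stop the service, then wait for it to exit
	<-done
	return buf.String()
}

func main() {
	fmt.Print(run(`{"id":1}`, `{"id":2}`))
}
```

Cancelling the context is what stops the service in this sketch, which matches the `ctx context.Context` parameter in the signature, though the documentation does not spell out the shutdown behavior.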

func ToCrawl

func ToCrawl(action CrawlAction)

Types

type CrawlAction

type CrawlAction = func(proxyReqUrl string, isHeadless bool, customMap map[string]string, outputFile *os.File)

func GetAction

func GetAction(crawler Crawler) CrawlAction

type Crawler

type Crawler interface {
	Crawl(ctx context.Context) error
	ParseArgs(argMap map[string]string) error
}
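A minimal sketch of a type that satisfies this interface is shown below. The interface is copied locally so the example compiles on its own; `titleCrawler`, the `"url"` argument key, and the order in which the two methods are called are all assumptions, since the documentation does not describe how fisher invokes them.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Crawler is copied from the documented interface so the sketch is
// self-contained.
type Crawler interface {
	Crawl(ctx context.Context) error
	ParseArgs(argMap map[string]string) error
}

// titleCrawler is a hypothetical implementation used only for illustration.
type titleCrawler struct {
	startURL string
}

// ParseArgs reads the start URL from the argument map. The "url" key is an
// assumed name, not one defined by fisher.
func (c *titleCrawler) ParseArgs(argMap map[string]string) error {
	u, ok := argMap["url"]
	if !ok || u == "" {
		return errors.New(`missing "url" argument`)
	}
	c.startURL = u
	return nil
}

// Crawl is where a real crawler would drive the browser. This stub only
// honors context cancellation and reports what it would do.
func (c *titleCrawler) Crawl(ctx context.Context) error {
	if err := ctx.Err(); err != nil {
		return err
	}
	fmt.Println("would crawl", c.startURL)
	return nil
}

// Compile-time check that *titleCrawler satisfies Crawler.
var _ Crawler = (*titleCrawler)(nil)

func main() {
	c := &titleCrawler{}
	if err := c.ParseArgs(map[string]string{"url": "https://example.com"}); err != nil {
		fmt.Println("bad args:", err)
		return
	}
	if err := c.Crawl(context.Background()); err != nil {
		fmt.Println("crawl failed:", err)
	}
}
```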

type ProxyInfo

type ProxyInfo struct {
	IP   string `json:"ip"`
	Port int    `json:"port"`
}

ProxyInfo defines the struct layout for the JSON format.

func GetProxyInfo

func GetProxyInfo(ctx context.Context, url string) (ret ProxyInfo, err error)

GetProxyInfo gets the proxy address information.
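The documentation does not state what string format GetSocks5Proxy returns or how it relates to ProxyInfo. As a rough sketch under the assumption that an address is built from the IP and port fields, a `socks5://host:port` URL could be formed as follows. The helper `socks5URL` is hypothetical, and the `socks5://` scheme prefix is an assumption.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// ProxyInfo mirrors the documented struct so the sketch is self-contained.
type ProxyInfo struct {
	IP   string `json:"ip"`
	Port int    `json:"port"`
}

// socks5URL is a hypothetical helper, not part of fisher. It joins the host
// and port with net.JoinHostPort, which also brackets IPv6 addresses.
func socks5URL(info ProxyInfo) string {
	return "socks5://" + net.JoinHostPort(info.IP, strconv.Itoa(info.Port))
}

func main() {
	fmt.Println(socks5URL(ProxyInfo{IP: "127.0.0.1", Port: 1080}))
	fmt.Println(socks5URL(ProxyInfo{IP: "::1", Port: 1080}))
}
```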

type ProxyResponse

type ProxyResponse struct {
	Code    int           `json:"code"`
	Data    [16]ProxyInfo `json:"data"`
	Msg     string        `json:"msg"`
	Success bool          `json:"success"`
}
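Because Data is a fixed-size array of 16 entries rather than a slice, decoding behaves slightly differently from the usual case: encoding/json zero-fills array elements the payload does not provide and discards any beyond the sixteenth. The sketch below mirrors the two documented structs locally; the helper `decodeProxyResponse` and the sample payload are invented for illustration and do not come from a real proxy service.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ProxyInfo and ProxyResponse mirror the documented structs so the sketch
// compiles on its own.
type ProxyInfo struct {
	IP   string `json:"ip"`
	Port int    `json:"port"`
}

type ProxyResponse struct {
	Code    int           `json:"code"`
	Data    [16]ProxyInfo `json:"data"`
	Msg     string        `json:"msg"`
	Success bool          `json:"success"`
}

// decodeProxyResponse is a hypothetical helper, not part of fisher.
func decodeProxyResponse(body []byte) (ProxyResponse, error) {
	var resp ProxyResponse
	err := json.Unmarshal(body, &resp)
	return resp, err
}

func main() {
	// Made-up payload with a single proxy entry.
	body := []byte(`{"code":0,"data":[{"ip":"10.0.0.1","port":1080}],"msg":"ok","success":true}`)

	resp, err := decodeProxyResponse(body)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}

	// Entries the payload did not supply stay at their zero value, so a
	// caller has to skip empty slots explicitly.
	for i, p := range resp.Data {
		if p.IP == "" {
			continue
		}
		fmt.Printf("slot %d: %s:%d\n", i, p.IP, p.Port)
	}
}
```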

type Storage

type Storage struct {
	// contains filtered or unexported fields
}

func GetStorageInstance

func GetStorageInstance() *Storage

func (*Storage) SetTarget

func (storage *Storage) SetTarget(file *os.File)

func (*Storage) StoreData

func (storage *Storage) StoreData(ctx context.Context)

func (*Storage) WriteData

func (storage *Storage) WriteData(data []byte)
