package httpsyet

v0.2.2
Published: Jun 30, 2018 License: MIT Imports: 11 Imported by: 0

README

v4 - Teach traffic to be gentler and easier to use

Overview

Proper use of the new traffic struct is still awkward for crawling.

Thus, it is time to teach traffic to behave better and more robustly. Specifically:

  • new constructor New(): the client is no longer bothered with initialisation - and thus no longer needs to import "sync"
  • have Processor return a signal channel that broadcasts "traffic has subsided and nothing is left to be processed"
  • lazy initialisation of this mechanism upon the first Feed, done only once via sync.Once
  • new method Done() - just a convenience for receiving the broadcast channel another way
  • wrap the crawl function passed to Processor and have the wrapper register that the site has left - thus crawling no longer needs to do so in its crawl method (see the sketch after this list)
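
To make this concrete, here is a rough sketch of how such a traffic type could look. Only the names New, Feed, Processor and Done are taken from this README; the site type, the WaitGroup-based counting and all signatures are assumptions for illustration, not the actual traffic.go:

package traffic

import "sync"

// site stands in for the (now private) site type generated via genny.go.
type site struct{ URL string }

// Traffic sketches the improved type; the real fields may differ.
type Traffic struct {
	sites   chan site      // the network of sites still to be visited
	done    chan struct{}  // closed once traffic has subsided
	pending sync.WaitGroup // counts sites fed but not yet crawled
	once    sync.Once      // guards the lazy initialisation of the closer
}

// New keeps all initialisation away from the client - no "sync" import needed there.
func New() *Traffic {
	return &Traffic{sites: make(chan site), done: make(chan struct{})}
}

// Feed hands sites into the network. The closing mechanism is set up lazily,
// on the first call only, via sync.Once.
func (t *Traffic) Feed(sites ...site) {
	t.pending.Add(len(sites))
	t.once.Do(func() {
		go func() {
			t.pending.Wait() // traffic has subsided and nothing is left to be processed ...
			close(t.sites)
			close(t.done) // ... broadcast it
		}()
	})
	go func() {
		for _, s := range sites {
			t.sites <- s
		}
	}()
}

// Processor wraps crawl so that every site is registered as "left" when crawl
// returns - the client no longer does that itself - and returns the signal channel.
// (A single worker loop here; the real version may fan out in parallel.)
func (t *Traffic) Processor(crawl func(s site)) <-chan struct{} {
	go func() {
		for s := range t.sites {
			crawl(s)
			t.pending.Done()
		}
	}()
	return t.done
}

// Done returns the same broadcast channel another way - familiar from "context".
func (t *Traffic) Done() <-chan struct{} { return t.done }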

So, crawling is now 20% shorter and more focused on its own subject, is it not?

Please also note: "launch the results closer" now happily happens before the first "feed initial urls" - no more need to worry about something like "goWaitAndClose is to be used after initial traffic has been added".

The client (crawling) is free to use the channel returned from Processor (as it does now), or may even use <-crawling.Done() at any time it sees fit (even before(!) the Processor is built).

And: Done() is a method familiar e.g. from the "context" package - thus easy to use and understand. Easier than <-sites.SiteDoneWait(c.Travel, c), is it not?
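
For illustration, in terms of the hypothetical sketch above, the client side could then read roughly like this (names and signatures are, again, assumptions):

// A sketch of the client side, in the same package as the Traffic sketch above.
func exampleCrawl() {
	t := New()
	crawl := func(s site) {
		// visit s.URL; further links could be fed back via t.Feed(...)
	}
	done := t.Processor(crawl)

	// "launch the results closer" - safe to do even before the first Feed:
	go func() {
		<-done
		// close(results) would go here
	}()

	t.Feed(site{URL: "http://example.com"}) // feed initial urls

	<-t.Done() // blocks until traffic has subsided; same broadcast as done
}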


Some remarks regarding changes to source files compared with the previous version:

traffic.go

Implements the above-mentioned improvements in a straightforward way. Note: the network itself remains as is.

genny.go in traffic/

Just changed to use the (now private) site type.

site.go

Just makes its (previously public) methods (Attr & Print) private.

crawling.go

Much more focused and compact now.

crawler_test.go

Just the import path.

crawler.go

No need to touch.


Back to Overview

Documentation

Overview

Package httpsyet provides the configuration and execution for crawling a list of sites for links that can be updated to HTTPS.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Crawler

type Crawler struct {
	Sites    []string                             // At least one URL.
	Out      io.Writer                            // Required. Writes one detected site per line.
	Log      *log.Logger                          // Required. Errors are reported here.
	Depth    int                                  // Optional. Limit depth. Set to >= 1.
	Parallel int                                  // Optional. Set how many sites to crawl in parallel.
	Delay    time.Duration                        // Optional. Set delay between crawls.
	Get      func(string) (*http.Response, error) // Optional. Defaults to http.Get.
	Verbose  bool                                 // Optional. If set, status updates are written to logger.
}

Crawler is used as configuration for Run. It is validated in Run().

func (Crawler) Run

func (c Crawler) Run() error

Run the crawler. Can return validation errors. All crawling errors are reported via logger. Output is written to writer. Crawls sites recursively and reports all external links that can be changed to HTTPS. Also reports broken links via error logger.
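
A minimal usage sketch; the import path is a placeholder (adjust it to the real module path), the field values are arbitrary examples, and only the fields documented above are used:

package main

import (
	"log"
	"os"
	"time"

	"example.com/httpsyet" // assumed import path - replace with the real module path
)

func main() {
	c := httpsyet.Crawler{
		Sites:    []string{"http://example.com"},        // at least one URL
		Out:      os.Stdout,                             // one detected site per line
		Log:      log.New(os.Stderr, "", log.LstdFlags), // errors are reported here
		Depth:    2,                                     // optional: limit crawl depth
		Parallel: 4,                                     // optional: crawl 4 sites in parallel
		Delay:    time.Second,                           // optional: delay between crawls
		Verbose:  true,                                  // optional: log status updates
	}
	if err := c.Run(); err != nil {
		log.Fatal(err) // only validation errors are returned; crawl errors go to Log
	}
}

Get is left at its default (http.Get) here; it can be set to inject a custom HTTP client.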

Directories

