sites

package
v0.2.2 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 30, 2018 License: MIT Imports: 5 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func SiteMakeChan added in v0.2.0

func SiteMakeChan() (out chan Site)

SiteMakeChan returns a new open channel (simply a 'chan Site' that is). Note: No 'Site-producer' is launched here yet! (as is in all the other functions).

This is useful to easily create corresponding variables such as:

var mySitePipelineStartsHere := SiteMakeChan() // ... lot's of code to design and build Your favourite "mySiteWorkflowPipeline"

// ...
// ... *before* You start pouring data into it, e.g. simply via:
for drop := range water {

mySitePipelineStartsHere <- drop

}

close(mySitePipelineStartsHere)

Hint: especially helpful, if Your piping library operates on some hidden (non-exported) type
(or on a type imported from elsewhere - and You don't want/need or should(!) have to care.)

Note: as always (except for SitePipeBuffer) the channel is unbuffered.

func SiteTubeFunc added in v0.2.0

func SiteTubeFunc(act func(a Site) Site) (tube func(inp SiteFrom) (out SiteFrom))

SiteTubeFunc returns a closure around PipeSiteFunc (_, act).

Types

type Site

type Site struct {
	URL    *url.URL
	Parent *url.URL
	Depth  int
}

Site represents what travels: an URL which may have a Parent URL, and a Depth.

type SiteFrom added in v0.2.0

type SiteFrom <-chan Site

SiteFrom is a receive-only Site channel

func SiteChan added in v0.2.0

func SiteChan(inp ...Site) (out SiteFrom)

SiteChan returns a channel to receive all inputs before close.

func SiteChanFuncErr added in v0.2.0

func SiteChanFuncErr(gen func() (Site, error)) (out SiteFrom)

SiteChanFuncErr returns a channel to receive all results of generator `gen` until `err != nil` before close.

func SiteChanFuncNok added in v0.2.0

func SiteChanFuncNok(gen func() (Site, bool)) (out SiteFrom)

SiteChanFuncNok returns a channel to receive all results of generator `gen` until `!ok` before close.

func SiteChanSlice added in v0.2.0

func SiteChanSlice(inp ...[]Site) (out SiteFrom)

SiteChanSlice returns a channel to receive all inputs before close.

func (SiteFrom) SiteDone added in v0.2.0

func (inp SiteFrom) SiteDone() (done <-chan struct{})

SiteDone returns a channel to receive one signal upon close and after `inp` has been drained.

func (SiteFrom) SiteDoneFunc added in v0.2.0

func (inp SiteFrom) SiteDoneFunc(act func(a Site)) (done <-chan struct{})

SiteDoneFunc will apply `act` to every `inp` and returns a channel to receive one signal upon close.

func (SiteFrom) SiteDoneLeave added in v0.2.0

func (inp SiteFrom) SiteDoneLeave(wg SiteWaiter) (done <-chan struct{})

SiteDoneLeave returns a channel to receive one signal after all throughput on `inp` has been registered as departure on the given `sync.WaitGroup` before close.

func (SiteFrom) SiteDoneSlice added in v0.2.0

func (inp SiteFrom) SiteDoneSlice() (done <-chan []Site)

SiteDoneSlice returns a channel to receive a slice with every Site received on `inp` upon close.

Note: Unlike SiteDone, SiteDoneSlice sends the fully accumulated slice, not just an event, once upon close of inp.

func (SiteFrom) SiteFanIn2 added in v0.2.0

func (inp SiteFrom) SiteFanIn2(inp2 SiteFrom) (out SiteFrom)

SiteFanIn2 returns a channel to receive all from both `inp` and `inp2` before close.

func (SiteFrom) SiteFini added in v0.2.0

func (inp SiteFrom) SiteFini() func(inp SiteFrom) (done <-chan struct{})

SiteFini returns a closure around `SiteDone()`.

func (SiteFrom) SiteFiniFunc added in v0.2.0

func (inp SiteFrom) SiteFiniFunc(act func(a Site)) func(inp SiteFrom) (done <-chan struct{})

SiteFiniFunc returns a closure around `SiteDoneFunc(act)`.

func (SiteFrom) SiteFiniLeave added in v0.2.0

func (inp SiteFrom) SiteFiniLeave(wg SiteWaiter) func(inp SiteFrom) (done <-chan struct{})

SiteFiniLeave returns a closure around `SiteDoneLeave(wg)` registering throughput as departure on the given `sync.WaitGroup`.

func (SiteFrom) SiteFiniSlice added in v0.2.0

func (inp SiteFrom) SiteFiniSlice() func(inp SiteFrom) (done <-chan []Site)

SiteFiniSlice returns a closure around `SiteDoneSlice()`.

func (SiteFrom) SiteFork added in v0.2.0

func (inp SiteFrom) SiteFork() (out1, out2 SiteFrom)

SiteFork returns two channels either of which is to receive every result of inp before close.

func (SiteFrom) SiteForkSeen added in v0.2.0

func (inp SiteFrom) SiteForkSeen() (new, old SiteFrom)

SiteForkSeen returns two channels, `new` and `old`, where `new` is to receive all `inp` not been seen before and `old` all `inp` seen before (internally growing a `sync.Map` to discriminate) until close.

func (SiteFrom) SiteForkSeenAttr added in v0.2.0

func (inp SiteFrom) SiteForkSeenAttr(attr func(a Site) interface{}) (new, old SiteFrom)

SiteForkSeenAttr returns two channels, `new` and `old`, where `new` is to receive all `inp` whose attribute `attr` has not been seen before and `old` all `inp` seen before (internally growing a `sync.Map` to discriminate) until close.

func (SiteFrom) SitePair added in v0.2.0

func (inp SiteFrom) SitePair() (out1, out2 SiteFrom)

SitePair returns a pair of channels to receive every result of inp before close.

Note: Yes, it is a VERY simple fanout - but sometimes all You need.

func (SiteFrom) SitePipeAdjust added in v0.2.0

func (inp SiteFrom) SitePipeAdjust(sizes ...int) (out SiteFrom)

SitePipeAdjust returns a channel to receive all `inp` buffered by a SiteSendProxy process before close.

func (SiteFrom) SitePipeEnter added in v0.2.0

func (inp SiteFrom) SitePipeEnter(wg SiteWaiter) (out SiteFrom)

SitePipeEnter returns a channel to receive all `inp` and registers throughput as arrival on the given `sync.WaitGroup` until close.

func (SiteFrom) SitePipeFunc added in v0.2.0

func (inp SiteFrom) SitePipeFunc(act func(a Site) Site) (out SiteFrom)

SitePipeFunc returns a channel to receive every result of action `act` applied to `inp` before close. Note: it 'could' be SitePipeMap for functional people, but 'map' has a very different meaning in go lang.

func (SiteFrom) SitePipeLeave added in v0.2.0

func (inp SiteFrom) SitePipeLeave(wg SiteWaiter) (out SiteFrom)

SitePipeLeave returns a channel to receive all `inp` and registers throughput as departure on the given `sync.WaitGroup` until close.

func (SiteFrom) SitePipeSeen added in v0.2.0

func (inp SiteFrom) SitePipeSeen() (out SiteFrom)

SitePipeSeen returns a channel to receive all `inp` not been seen before while silently dropping everything seen before (internally growing a `sync.Map` to discriminate) until close. Note: SitePipeFilterNotSeenYet might be a better name, but is fairly long.

func (SiteFrom) SitePipeSeenAttr added in v0.2.0

func (inp SiteFrom) SitePipeSeenAttr(attr func(a Site) interface{}) (out SiteFrom)

SitePipeSeenAttr returns a channel to receive all `inp` whose attribute `attr` has not been seen before while silently dropping everything seen before (internally growing a `sync.Map` to discriminate) until close. Note: SitePipeFilterAttrNotSeenYet might be a better name, but is fairly long.

func (SiteFrom) SiteStrew added in v0.2.0

func (inp SiteFrom) SiteStrew(size int) (outS []SiteFrom)

SiteStrew returns a slice (of size = size) of channels one of which shall receive each inp before close.

func (SiteFrom) SiteTubeAdjust added in v0.2.0

func (inp SiteFrom) SiteTubeAdjust(sizes ...int) (tube func(inp SiteFrom) (out SiteFrom))

SiteTubeAdjust returns a closure around SitePipeAdjust (_, sizes ...int).

func (SiteFrom) SiteTubeEnter added in v0.2.0

func (inp SiteFrom) SiteTubeEnter(wg SiteWaiter) (tube func(inp SiteFrom) (out SiteFrom))

SiteTubeEnter returns a closure around SitePipeEnter (wg) registering throughput as arrival on the given `sync.WaitGroup`.

func (SiteFrom) SiteTubeLeave added in v0.2.0

func (inp SiteFrom) SiteTubeLeave(wg SiteWaiter) (tube func(inp SiteFrom) (out SiteFrom))

SiteTubeLeave returns a closure around SitePipeLeave (wg) registering throughput as departure on the given `sync.WaitGroup`.

func (SiteFrom) SiteTubeSeen added in v0.2.0

func (inp SiteFrom) SiteTubeSeen() (tube func(inp SiteFrom) (out SiteFrom))

SiteTubeSeen returns a closure around SitePipeSeen() (silently dropping every Site seen before).

func (SiteFrom) SiteTubeSeenAttr added in v0.2.0

func (inp SiteFrom) SiteTubeSeenAttr(attr func(a Site) interface{}) (tube func(inp SiteFrom) (out SiteFrom))

SiteTubeSeenAttr returns a closure around SitePipeSeenAttr(attr) (silently dropping every Site whose attribute `attr` was seen before).

type SiteInto added in v0.2.0

type SiteInto chan<- Site

SiteInto is a send-only Site channel

func SiteSendProxy added in v0.2.0

func SiteSendProxy(out SiteInto, sizes ...int) (send SiteInto)

SiteSendProxy returns a channel to serve as a sending proxy to 'out'. Uses a goroutine to receive values from 'out' and store them in an expanding buffer, so that sending to 'out' never blocks.

Note: the expanding buffer is implemented via "container/ring"

Note: SiteSendProxy is kept for the Sieve example and other dynamic use to be discovered even so it does not fit the pipe tube pattern as SitePipeAdjust does.

func (SiteInto) SiteDoneWait added in v0.2.0

func (out SiteInto) SiteDoneWait(wg SiteWaiter) (done <-chan struct{})

SiteDoneWait returns a channel to receive one signal after wg.Wait() has returned and out has been closed before close.

Note: Use only *after* You've started flooding the facilities.

func (SiteInto) SiteFiniWait added in v0.2.0

func (out SiteInto) SiteFiniWait(wg SiteWaiter) func(out SiteInto) (done <-chan struct{})

SiteFiniWait returns a closure around `SiteDoneWait(wg)`.

type SiteWaiter

type SiteWaiter interface {
	Add(delta int)
	Done()
	Wait()
}

SiteWaiter - as implemented by `*sync.WaitGroup` - attends Flapdoors and keeps counting who enters and who leaves.

Use SiteDoneWait to learn about when the facilities are closed.

Note: You may also use Your provided `*sync.WaitGroup.Wait()` to know when to close the facilities. Just: SiteDoneWait is more convenient as it also closes the primary channel for You.

Just make sure to have _all_ entrances and exits attended, and `Wait()` only *after* You've started flooding the facilities.

type Traffic

type Traffic struct {
	// contains filtered or unexported fields
}

Traffic goes around inside a circular Site pipe network, e. g. a crawling Crawler.

func New

func New() (t *Traffic)

New returns a new and operational Traffic processor.

func (*Traffic) Done

func (t *Traffic) Done() (done <-chan struct{})

Done returns a channel which will be signalled and closed when traffic has subsided and nothing is left to be processed and consequently all goroutines have terminated.

Note: Done() here is a convenience method.
It is well known from the "context" package.
Just: no need to use it as `Processor` returns same.

func (*Traffic) Feed

func (t *Traffic) Feed(urls []*url.URL, parent *url.URL, depth int)

Feed registers new entries.

func (*Traffic) Processor

func (t *Traffic) Processor(crawl func(s Site), parallel int) (done <-chan struct{})

Processor builds the site traffic processing network; it is cirular if crawl uses Feed to provide feedback.

returned is a channel which will be signalled and closed when traffic has subsided and nothing is left to be processed and consequently all goroutines have terminated - as is from `Done()`.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL