webcrawler

command module
v0.0.0-...-90f9a36 Latest
Published: Jan 8, 2016 License: GPL-2.0 Imports: 5 Imported by: 0

README

webcrawler

A microservice that crawls web pages and generates a sitemap listing each page's links and assets.

### Build

go get github.com/cleitonmarx/webcrawler
cd $GOPATH/src/github.com/cleitonmarx/webcrawler

Make sure godep is installed (go get github.com/tools/godep), then build with:

godep restore
godep go build -a

### Build Docker image

go get github.com/cleitonmarx/webcrawler
cd $GOPATH/src/github.com/cleitonmarx/webcrawler
docker build -t="webcrawler" .

### Run Docker image

docker run -p 3333:3333 -d --name webcrawler webcrawler

## How to use

#### Get current version:

curl -X GET http://127.0.0.1:3333/
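The version request can also be made from Go. This is a minimal sketch that assumes the service is listening on 127.0.0.1:3333 as in the Docker example; the helper name `fetchVersion` is hypothetical, and because the response format is not documented here, the body is returned as raw text.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

// client uses a short timeout so a stopped service fails fast.
var client = &http.Client{Timeout: 5 * time.Second}

// fetchVersion issues GET <baseURL>/ and returns the raw response body.
func fetchVersion(baseURL string) (string, error) {
	resp, err := client.Get(baseURL + "/")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status %s", resp.Status)
	}
	return string(body), nil
}

func main() {
	version, err := fetchVersion("http://127.0.0.1:3333")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println(version)
}
```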

#### Crawling a website:

curl -X POST http://127.0.0.1:3333/crawler -d "url=http://www.digitalocean.com&depth=3&timeout=3s"
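The same crawl request can be sent from Go with `http.PostForm`, which submits an `application/x-www-form-urlencoded` body just as `curl -d` does. This is a sketch under assumptions: the helper names `crawlForm` and `requestCrawl` are hypothetical, the address is taken from the Docker example, and the response is printed as raw text because its format is not described here.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strconv"
)

// crawlForm builds the form body from the three documented parameters.
// timeout is passed through as text (for example "3s" or "5m").
func crawlForm(target string, depth int, timeout string) url.Values {
	form := url.Values{}
	form.Set("url", target)
	form.Set("depth", strconv.Itoa(depth))
	form.Set("timeout", timeout)
	return form
}

// requestCrawl posts the form to <base>/crawler and returns the raw body.
func requestCrawl(base string, form url.Values) (string, error) {
	resp, err := http.PostForm(base+"/crawler", form)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("crawler returned %s: %s", resp.Status, body)
	}
	return string(body), nil
}

func main() {
	form := crawlForm("http://www.digitalocean.com", 3, "3s")
	result, err := requestCrawl("http://127.0.0.1:3333", form)
	if err != nil {
		fmt.Println("crawl failed:", err)
		return
	}
	fmt.Println(result)
}
```

Unlike the curl command, `url.Values` percent-encodes the target URL, which avoids ambiguity when the target itself contains `&` or `=`.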

##### Parameters:

url - mandatory (e.g. http://www.digitalocean.com, http://www.londondrugs.com/shop/electronics)
depth - optional - default: 5
timeout - optional - default: 5m (e.g. 1s for one second, 4m for four minutes)
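The parameter rules above can be sketched as a small parser. This is an illustration, not the service's actual code: the `crawlParams` type and `parseCrawlQuery` helper are hypothetical, and it assumes the timeout values follow Go's `time.ParseDuration` syntax, which the `1s` and `4m` examples are consistent with.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strconv"
	"time"
)

// crawlParams mirrors the three documented form parameters.
type crawlParams struct {
	URL     string
	Depth   int
	Timeout time.Duration
}

// parseCrawlQuery parses a form-encoded string and applies the documented
// defaults: depth 5 and timeout 5m. A missing url is an error.
func parseCrawlQuery(raw string) (crawlParams, error) {
	p := crawlParams{Depth: 5, Timeout: 5 * time.Minute}

	form, err := url.ParseQuery(raw)
	if err != nil {
		return p, err
	}
	p.URL = form.Get("url")
	if p.URL == "" {
		return p, errors.New("url is mandatory")
	}
	if v := form.Get("depth"); v != "" {
		d, err := strconv.Atoi(v)
		if err != nil {
			return p, fmt.Errorf("invalid depth %q: %w", v, err)
		}
		p.Depth = d
	}
	if v := form.Get("timeout"); v != "" {
		t, err := time.ParseDuration(v)
		if err != nil {
			return p, fmt.Errorf("invalid timeout %q: %w", v, err)
		}
		p.Timeout = t
	}
	return p, nil
}

func main() {
	p, err := parseCrawlQuery("url=http://www.digitalocean.com&depth=3&timeout=3s")
	if err != nil {
		fmt.Println("bad request:", err)
		return
	}
	fmt.Println(p.URL, p.Depth, p.Timeout)
}
```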

Documentation

There is no documentation for this package.

Directories

| Path | Synopsis |
| --- | --- |
| Godeps | |
| _workspace/src/github.com/PuerkitoBio/goquery | Package goquery implements features similar to jQuery, including the chainable syntax, to manipulate and query an HTML document. |
| _workspace/src/github.com/andybalholm/cascadia | The cascadia package is an implementation of CSS selectors. |
| _workspace/src/github.com/codegangsta/negroni | Package negroni is an idiomatic approach to web middleware in Go. |
| _workspace/src/golang.org/x/net/html | Package html implements an HTML5-compliant tokenizer and parser. |
| _workspace/src/golang.org/x/net/html/atom | Package atom provides integer codes (also known as atoms) for a fixed set of frequently occurring HTML strings: tag names and attribute keys such as "p" and "id". |
| _workspace/src/golang.org/x/net/html/charset | Package charset provides common text encodings for HTML documents. |
