scrape

package module
Published: Aug 31, 2016 License: MIT Imports: 2 Imported by: 0

README

scrape

A jQuery-like interface for website scraping in Go.

Usage

package main

import (
    "fmt"
    "net/http"

    "golang.org/x/net/html"

    "github.com/VirrageS/scrape"
)

func main() {
    response, err := http.Get("https://github.com/trending")
    if err != nil {
        return
    }
    defer response.Body.Close()

    root, err := html.Parse(response.Body)
    if err != nil {
        return
    }

    repos := scrape.Find(root, ".repo-list-item")
    for _, repo := range repos {
        // the first link inside the repository name cell carries the URL
        links := scrape.Find(repo, ".repo-list-name a")
        if len(links) == 0 {
            continue
        }
        link := links[0]
        url := "https://github.com" + scrape.Attr(link, "href")

        // the link text is the repository name
        name := scrape.Text(link)

        fmt.Printf("[REPO] name: %s; url: %s\n", name, url)
    }
}

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Attr

func Attr(node *html.Node, key string) string

Attr returns the value of an HTML attribute.
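A minimal sketch of how Attr can be combined with Find on a small parsed fragment; the fragment and the ".link" class are illustrative and not part of the package:

package main

import (
    "fmt"
    "strings"

    "golang.org/x/net/html"

    "github.com/VirrageS/scrape"
)

func main() {
    // parse an illustrative fragment; html.Parse wraps it in a full document
    root, err := html.Parse(strings.NewReader(`<a class="link" href="/docs">Docs</a>`))
    if err != nil {
        return
    }

    links := scrape.Find(root, ".link")
    if len(links) > 0 {
        // read the href attribute of the first matching node
        fmt.Println(scrape.Attr(links[0], "href")) // /docs
    }
}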

func Closest

func Closest(node *html.Node, selector string) (*html.Node, bool)

Closest searches up HTML tree from the current node until either a match is found or the top is hit.
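A short sketch of Closest walking up from a matched node to an enclosing element; the fragment and the class names are illustrative:

package main

import (
    "fmt"
    "strings"

    "golang.org/x/net/html"

    "github.com/VirrageS/scrape"
)

func main() {
    fragment := `<div class="row"><span class="cell">42</span></div>`
    root, err := html.Parse(strings.NewReader(fragment))
    if err != nil {
        return
    }

    cells := scrape.Find(root, ".cell")
    if len(cells) == 0 {
        return
    }

    // walk up from the cell until an ancestor matching ".row" is found
    if row, ok := scrape.Closest(cells[0], ".row"); ok {
        fmt.Println(scrape.Text(row)) // 42
    }
}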

func Find

func Find(node *html.Node, selector string) []*html.Node

Find returns all nodes which match the selector.
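A compact sketch of Find collecting every node that matches a selector; the list markup is illustrative:

package main

import (
    "fmt"
    "strings"

    "golang.org/x/net/html"

    "github.com/VirrageS/scrape"
)

func main() {
    fragment := `<ul><li class="item">a</li><li class="item">b</li></ul>`
    root, err := html.Parse(strings.NewReader(fragment))
    if err != nil {
        return
    }

    // Find returns every node in the parsed tree matching the selector
    for _, item := range scrape.Find(root, ".item") {
        fmt.Println(scrape.Text(item)) // prints each item's text
    }
}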

func Text

func Text(node *html.Node) string

Text searches for the text nodes under a node and concatenates the separate strings.
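A small sketch of Text gathering the text of a node's subtree; the markup and the ".msg" class are illustrative:

package main

import (
    "fmt"
    "strings"

    "golang.org/x/net/html"

    "github.com/VirrageS/scrape"
)

func main() {
    fragment := `<p class="msg">Hello, <b>world</b></p>`
    root, err := html.Parse(strings.NewReader(fragment))
    if err != nil {
        return
    }

    msgs := scrape.Find(root, ".msg")
    if len(msgs) > 0 {
        // Text concatenates the text found anywhere under the node
        fmt.Println(scrape.Text(msgs[0]))
    }
}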

Types

This section is empty.
