webutil

v0.0.0-...-d82d3e1
Published: Jun 15, 2016 | License: MIT | Imports: 6 | Imported by: 0

README

webutil

webutil provides Go functions at a level of abstraction that better matches how I work when I'm using JSON APIs and scraping the web.

Features

  • GET a URL as a []byte. Non-200 responses (404, etc.) are errors.
  • Same, but return an io.ReadCloser.
  • Same, but parse with net/html and return the html.Node. Useful for using Cascadia to extract parts.
  • Get text from an html.Node.
  • Walk an html.Node.

Usage

go get sethwklein.net/go/webutil

import "sethwklein.net/go/webutil"

http://godoc.org/sethwklein.net/go/webutil

Rationale

I know better than to publish this package. Someone has said (I don't have a link handy :( ) that packages with util in the name are almost always a bad idea. And it makes no sense to use a package that pulls in an HTML parser when all you need is a []byte. But I can't seem to stop using it, so I'm publishing it even though it is probably a bad idea.

Documentation

Overview

Package webutil is for working with the web at a higher level of abstraction than net/http and such.

Functions

func Attr

func Attr(n *html.Node, name string) (string, bool)

Attr returns the value of the first attribute named name and a boolean indicating whether any was found.

func GetBytes

func GetBytes(url string) (buf []byte, err error)

GetBytes returns a slice of bytes containing the response body. It returns an error on any non-200 response.
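
The behavior described for GetBytes can be approximated with the standard library alone. The following is a minimal sketch, not the package's actual implementation: getBytes and fetchFromTestServer are hypothetical names, and the error text is an assumption. A local httptest server stands in for a real site.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// getBytes is a hypothetical stand-in for webutil.GetBytes: it fetches url
// and returns the body, treating any status other than 200 as an error.
func getBytes(url string) ([]byte, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		// Assumed error format; the real package returns an *HTTPError.
		return nil, fmt.Errorf("GET %s: %s", url, resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// fetchFromTestServer starts a local test server that serves "hello" at /ok
// and 404 for every other path, then fetches the given path with getBytes.
func fetchFromTestServer(path string) ([]byte, error) {
	mux := http.NewServeMux()
	mux.HandleFunc("/ok", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "hello")
	})
	srv := httptest.NewServer(mux)
	defer srv.Close()
	return getBytes(srv.URL + path)
}
```

Calling fetchFromTestServer("/ok") yields the body, while any other path yields an error instead of an empty or partial body, which is the point of the helper: callers check one error value rather than both the error and the status code.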

func GetHTML

func GetHTML(url string) (doc *html.Node, err error)

GetHTML returns the parsed HTML response body as a pointer to a go.net/html.Node. It returns an error on any non-200 response.

func GetReadCloser

func GetReadCloser(url string) (io.ReadCloser, error)

GetReadCloser returns the response body as an io.ReadCloser. The caller is responsible for closing it. GetReadCloser returns an error on any non-200 response.
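
The ownership rule above (the caller closes the body) is easiest to see from the calling side. This is a sketch under assumptions: getReadCloser is a hypothetical standard-library stand-in for GetReadCloser, and countBodyBytes and countFromTestServer are illustrative helpers that are not part of the package.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// getReadCloser is a hypothetical stand-in for webutil.GetReadCloser.
// On a 200 response the open body is handed to the caller, who must close it.
func getReadCloser(url string) (io.ReadCloser, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode != http.StatusOK {
		resp.Body.Close() // no body is returned, so it is closed here
		return nil, fmt.Errorf("GET %s: %s", url, resp.Status)
	}
	return resp.Body, nil
}

// countBodyBytes streams a response body without buffering it in memory
// and reports how many bytes were read.
func countBodyBytes(url string) (int64, error) {
	rc, err := getReadCloser(url)
	if err != nil {
		return 0, err
	}
	defer rc.Close() // the caller owns the body
	return io.Copy(io.Discard, rc)
}

// countFromTestServer serves the 11-byte body "hello world" from a local
// test server and counts it with countBodyBytes.
func countFromTestServer() (int64, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "hello world")
	}))
	defer srv.Close()
	return countBodyBytes(srv.URL)
}
```

The streaming form is useful when the body is large or is passed straight to a decoder; when the whole body is needed as a slice, GetBytes avoids the Close bookkeeping.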

func Text

func Text(n *html.Node) string

Text returns the text in the nodes.

func Walk

func Walk(n *html.Node, f func(*html.Node))

Walk calls f once for the node and each of its descendants. This function probably belongs in go.net/html and may go away if it is added there.
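
A traversal of this kind can be sketched as a short recursion over FirstChild and NextSibling links. The example below uses a stand-in node type so that it runs without the HTML parser; html.Node has fields with the same names and roles. The visiting order (parent before children) is an assumption, since the documentation only promises that each node is visited once.

```go
package main

// node is a stand-in for html.Node that keeps only the fields a tree walk
// needs. It is hypothetical and exists only for this sketch.
type node struct {
	Data        string
	FirstChild  *node
	NextSibling *node
}

// walk calls f for n and then for each of its descendants, depth first.
func walk(n *node, f func(*node)) {
	f(n)
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		walk(c, f)
	}
}

// visitOrder builds a small tree (html contains head and body, body
// contains p) and returns the node names in the order walk visits them.
func visitOrder() []string {
	p := &node{Data: "p"}
	body := &node{Data: "body", FirstChild: p}
	head := &node{Data: "head", NextSibling: body}
	root := &node{Data: "html", FirstChild: head}

	var order []string
	walk(root, func(n *node) { order = append(order, n.Data) })
	return order
}
```

With this parent-first order, visitOrder returns html, head, body, p. A callback that collects matching nodes or concatenates text nodes fits the same shape.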

Types

type HTTPError

type HTTPError struct {
	Status string
	Code   int
	URL    string
}

HTTPError represents an undesired HTTP response. It is usually used by functions that want to return an error on non-200 responses.

func NewHTTPError

func NewHTTPError(resp *http.Response) *HTTPError

NewHTTPError constructs an HTTPError from the provided http.Response.

func (*HTTPError) Error

func (e *HTTPError) Error() string

Error returns a human readable version.
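
Because HTTPError exposes Code as a field, callers can branch on the status after a failed request. The sketch below copies the documented struct, but the constructor body and the message format are guesses, not the package's code; newHTTPError, statusCodeOf, and sampleError are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/url"
)

// HTTPError mirrors the documented struct.
type HTTPError struct {
	Status string
	Code   int
	URL    string
}

// newHTTPError is a guess at what NewHTTPError does: copy the status line,
// the status code, and the request URL out of the response.
func newHTTPError(resp *http.Response) *HTTPError {
	e := &HTTPError{Status: resp.Status, Code: resp.StatusCode}
	if resp.Request != nil && resp.Request.URL != nil {
		e.URL = resp.Request.URL.String()
	}
	return e
}

// Error returns a human readable message. The exact format is assumed.
func (e *HTTPError) Error() string {
	return fmt.Sprintf("%s: %s", e.URL, e.Status)
}

// statusCodeOf reports the HTTP status code carried by err, if err is
// (or wraps) an *HTTPError.
func statusCodeOf(err error) (int, bool) {
	var he *HTTPError
	if errors.As(err, &he) {
		return he.Code, true
	}
	return 0, false
}

// sampleError builds an HTTPError from a hand-made 404 response.
func sampleError() error {
	u, _ := url.Parse("http://example.com/missing")
	resp := &http.Response{
		Status:     "404 Not Found",
		StatusCode: http.StatusNotFound,
		Request:    &http.Request{URL: u},
	}
	return newHTTPError(resp)
}
```

A caller might use statusCodeOf to treat a 404 as "no such item" while reporting every other failure as a real error.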
