htmlstrip

v0.0.0-...-922f784 Latest
Published: Feb 9, 2019 License: MIT Imports: 3 Imported by: 0

README

htmlstrip

Strips HTML from the input and outputs plain text. The input is processed as a stream, in real time, without loading the whole document first.

  • Easy to use Writer interface:
    io.Copy(&htmlstrip.Writer{W: os.Stdout}, os.Stdin)

  • All it does is strip HTML into plain text.

  • Should never use excessive memory, because it does not buffer the whole document.

  • Script, style and head tags are removed entirely, as they are not part of the page's text.

  • The provided command strips HTML from standard input or from the specified files and writes plain text to standard output.
    go install github.com/millerlogic/htmlstrip/cmd/htmlstrip

  • Could be used as an extremely basic, non-interactive text browser:
    curl -s -S https://en.wikipedia.org/wiki/Chinchilla | htmlstrip | less

Documentation

Types

type Writer

type Writer struct {
	W io.Writer
	// contains filtered or unexported fields
}

Writer strips HTML from any data written to it and calls W.Write() with the resulting plain text.

func (*Writer) Write

func (p *Writer) Write(data []byte) (int, error)

Directories

Path Synopsis
cmd
