readability

Version: v0.0.0-...-c2dce56
Published: May 19, 2022 | License: Unlicense | Imports: 9 | Imported by: 10

README

go-readability

go-readability is a library for extracting the main content from an HTML page. It implements the readability algorithm created by Arc90 Labs and was heavily inspired by https://github.com/cantino/ruby-readability.

Installation

go get github.com/mauidude/go-readability

Example

import (
	"github.com/mauidude/go-readability"
)

...

// html is the page source as a string.
doc, err := readability.NewDocument(html)
if err != nil {
	// handle the error ...
}

content := doc.Content()
// do something with the extracted content

Tests

To run the tests:

go test github.com/mauidude/go-readability

Documentation

Variables

var (
	Logger = log.New(ioutil.Discard, "[readability] ", log.LstdFlags)
)
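
Because Logger is created with a discarding writer, the package logs nothing by default. Since log.New returns a *log.Logger, output can presumably be enabled with a call such as readability.Logger.SetOutput(os.Stderr). The sketch below illustrates that pattern with a local stand-in so it runs without the module: the logger variable and the captureLog helper are hypothetical, and the flags are set to 0 (instead of log.LstdFlags) only to keep the output free of timestamps.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
)

// logger is a local stand-in for readability.Logger (an assumption made so
// this sketch is self-contained). It discards everything by default.
var logger = log.New(io.Discard, "[readability] ", 0)

// captureLog routes logger to an in-memory buffer, writes msg, and then
// restores the discarding writer. It returns whatever was logged.
func captureLog(msg string) string {
	var buf bytes.Buffer
	logger.SetOutput(&buf)
	defer logger.SetOutput(io.Discard)
	logger.Print(msg)
	return buf.String()
}

func main() {
	// Discarded: the default writer drops the message.
	logger.Print("this message is dropped")

	// Captured: the same logger now writes to a buffer.
	fmt.Print(captureLog("scoring candidates"))
}
```

In a real program the buffer would typically be replaced by os.Stderr or a file.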

Types

type Document

type Document struct {
	RemoveUnlikelyCandidates bool
	WeightClasses            bool
	CleanConditionally       bool
	BestCandidateHasImage    bool
	RetryLength              int
	MinTextLength            int
	RemoveEmptyNodes         bool
	WhitelistTags            []string
	WhitelistAttrs           map[string][]string
	// contains filtered or unexported fields
}
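
The listing above gives only the field types. As a rough illustration of the shape of values that WhitelistTags and WhitelistAttrs accept, the sketch below builds both and checks an attribute against the map. Reading the map key as a tag name and the slice as the attribute names kept for that tag is an assumption, not documented behavior, and allowedAttr is a hypothetical helper that is not part of the package.

```go
package main

import "fmt"

// allowedAttr reports whether attr is listed for tag in a whitelist shaped
// like Document.WhitelistAttrs. Treating the key as a tag name is an
// assumption; the package listing documents only the type.
func allowedAttr(whitelist map[string][]string, tag, attr string) bool {
	for _, a := range whitelist[tag] {
		if a == attr {
			return true
		}
	}
	return false
}

func main() {
	// Values with the same types as the exported Document fields.
	whitelistTags := []string{"div", "p", "a"}
	whitelistAttrs := map[string][]string{
		"a": {"href"},
	}

	fmt.Println(whitelistTags)
	fmt.Println(allowedAttr(whitelistAttrs, "a", "href"))
	fmt.Println(allowedAttr(whitelistAttrs, "a", "onclick"))
}
```

Because the fields are exported, values like these could be assigned to a *Document returned by NewDocument before calling Content.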

func NewDocument

func NewDocument(s string) (*Document, error)

func (*Document) Content

func (d *Document) Content() string
