htmlizer

Version: v0.0.0-...-003fcb5
Published: Feb 18, 2018 | License: MIT | Imports: 4 | Imported by: 1

README

Extracts only the human-readable content from an HTML DOM.

Example

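package main

import (
  "fmt"
  "log"

  "github.com/gpestana/htmlizer"
)

func main() {
  html := `
    <html>
     <body>
       <h1>Heading H1</h1>
       <p>This is the first text</p>
       <h2>heading h2</h2>
       <p>This is the second text</p>
     </body>
     <script>console.log("scripts are discarded")</script>
   </html>`

  // Runes listed here are trimmed from the extracted text (tabs, in this case).
  ignore := []rune{'\t'}

  // New returns (Htmlizer, error), so the error must be checked.
  hizer, err := htmlizer.New(ignore)
  if err != nil {
    log.Fatal(err)
  }

  // Load parses the HTML string and also reports an error.
  if err := hizer.Load(html); err != nil {
    log.Fatal(err)
  }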

  fmt.Println(">> Struct:")
  fmt.Println(hizer)

  fmt.Println(">> Human readable content:")
  fmt.Println(hizer.HumanReadable())
}

Output:

>> Struct:
{[Heading H1 heading h2], [This is the first text This is the second text]}
>> Human readable content:
Heading H1
This is the first text
heading h2
This is the second text
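
The `ignore` argument in the example lists runes that are trimmed from the extracted text. How htmlizer does this internally is not shown on this page; the following is only a minimal sketch of that kind of rune filtering using the standard library. The helper name `stripRunes` is hypothetical and is not part of the htmlizer API.

```go
package main

import (
	"fmt"
	"strings"
)

// stripRunes is a hypothetical helper (not part of htmlizer).
// It returns s with every rune listed in ignore removed.
func stripRunes(s string, ignore []rune) string {
	drop := make(map[rune]bool, len(ignore))
	for _, r := range ignore {
		drop[r] = true
	}
	return strings.Map(func(r rune) rune {
		if drop[r] {
			return -1 // a negative value tells strings.Map to drop the rune
		}
		return r
	}, s)
}

func main() {
	ignore := []rune{'\t'}
	fmt.Printf("%q\n", stripRunes("\t\tThis is the first text", ignore))
}
```

Building a set from the slice keeps each lookup cheap when the ignore list grows; for a single rune, `strings.ReplaceAll` would work just as well.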

Contribute

Fork the repository and open a pull request. Use issues for bug reports, feature requests, and general comments.

gpestana © MIT

Documentation

Types

type Htmlizer

type Htmlizer struct {
	Tags []Tag
	// contains filtered or unexported fields
}

func New

func New(ignore []rune) (Htmlizer, error)

func (*Htmlizer) GetValues

func (h *Htmlizer) GetValues(tagType string) ([]Tag, error)

GetValues returns all tags whose type is `tagType`.
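
To illustrate what selecting tags by type could look like, here is a minimal sketch that works on a local copy of the documented `Tag` struct. It is not the library's implementation: `valuesOfType` is a hypothetical function, and the assumption that `Type` holds an element name such as `"p"` or `"h1"` is not confirmed by this page. The real `GetValues` also returns an error, whose conditions are not documented here.

```go
package main

import "fmt"

// Tag is a local stand-in that mirrors the documented htmlizer.Tag fields.
type Tag struct {
	Type  string
	Value string
}

// valuesOfType is a hypothetical helper (not part of htmlizer).
// It returns the tags whose Type equals tagType, in their original order.
func valuesOfType(tags []Tag, tagType string) []Tag {
	var out []Tag
	for _, t := range tags {
		if t.Type == tagType {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	// Assumption: Type holds the HTML element name.
	tags := []Tag{
		{Type: "h1", Value: "Heading H1"},
		{Type: "p", Value: "This is the first text"},
		{Type: "h2", Value: "heading h2"},
		{Type: "p", Value: "This is the second text"},
	}
	for _, t := range valuesOfType(tags, "p") {
		fmt.Println(t.Value)
	}
}
```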

func (*Htmlizer) HumanReadable

func (h *Htmlizer) HumanReadable() string

func (*Htmlizer) Load

func (h *Htmlizer) Load(s string) error

type Tag

type Tag struct {
	Type  string
	Value string
}

func (*Tag) String

func (t *Tag) String() string
