article

Version: v0.0.0-...-92c3d03 (latest). Published: Jan 17, 2016. License: LGPL-3.0. Imports: 6. Imported by: 0.

Repository

github.com/golibri/article

README ¶

golibri/article

Get text content from HTML. An Article is constructed by processing an HTML page: the relevant content is stripped of markup and surrounding junk and stored as Fulltext. It works best with blog posts or news articles, but even a tweet should suffice.

Given the HTML string of any content site, this module fills three fields:

Fulltext: the relevant paragraphs, extracted as plain text
Language: the detected language of the text (fallback: en)
Description: a short summary snippet of up to 3 sentences
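
To make the three fields concrete, here is a minimal sketch of the "up to 3 sentences" and "fallback: en" rules. It is not the library's code: the source does not describe how golibri/article summarizes or detects language, so `firstSentences` and `build` are hypothetical helpers, the sentence splitting is deliberately naive, and the local `Article` type only mirrors the documented fields.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// Article mirrors the three fields documented for article.Article.
type Article struct {
	Language    string
	Description string
	Fulltext    string
}

// firstSentences is a hypothetical helper. It returns at most max sentences
// from text, treating '.', '!' or '?' followed by whitespace (or the end of
// the input) as the end of a sentence. This is an assumption for
// illustration, not the library's summarization method.
func firstSentences(text string, max int) string {
	text = strings.TrimSpace(text)
	if max <= 0 {
		return ""
	}
	runes := []rune(text)
	count := 0
	for i, r := range runes {
		if r != '.' && r != '!' && r != '?' {
			continue
		}
		atEnd := i == len(runes)-1
		if atEnd || unicode.IsSpace(runes[i+1]) {
			count++
			if count == max {
				return string(runes[:i+1])
			}
		}
	}
	// Fewer than max sentences: keep the whole text.
	return text
}

// build is a hypothetical constructor. The detected language is passed in by
// the caller; an empty value falls back to "en", as the README states.
func build(fulltext, detected string) Article {
	if detected == "" {
		detected = "en"
	}
	return Article{
		Language:    detected,
		Description: firstSentences(fulltext, 3),
		Fulltext:    fulltext,
	}
}

func main() {
	a := build("One. Two! Three? Four.", "")
	fmt.Println(a.Language)    // en
	fmt.Println(a.Description) // One. Two! Three?
}
```

A real summarizer would need to handle abbreviations and other punctuation edge cases; the sketch only shows how the three fields relate to one another.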

installation

go get -u github.com/golibri/article

usage

package main

import (
    "fmt"

    "github.com/golibri/article"
)

func main() {
    // Obtain the HTML string from somewhere, e.g. with golibri/fetch.
    a := article.Parse("website-html-string")
    // a is an Article value; see "data fields" below.
    fmt.Println(a.Language, a.Description, a.Fulltext)
}

data fields

type Article struct {
    Language    string
    Description string
    Fulltext    string
}

license

LGPLv3. (You can use it in commercial projects as you like, but improvements/bugfixes must flow back to this lib.)

Documentation ¶

Overview ¶

Extract text content from an HTML string.

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type Article ¶

type Article struct {
	Language    string
	Description string
	Fulltext    string
}

func Parse ¶

func Parse(s string) Article

Source Files ¶

article.go
