firstlink

v0.0.0-...-4acceec
Warning

This package is not in the latest version of its module.

Published: Mar 5, 2020 | License: MIT | Imports: 11 | Imported by: 0

README

firstlink: library to extract the first article link of a Wikipedia article

Documentation

Constants

This section is empty.

Variables

var FirstLinkError = errors.New("first link not found")
var LimitError = errors.New("limit reached")
var LoopError = errors.New("loop detected")
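
These three variables are sentinel errors, so callers would normally test for them with errors.Is. The sketch below re-declares them locally so that it compiles on its own (the module's import path is not shown here); describe is a hypothetical helper, and whether ArticleHopCountError wraps these values is not stated in this documentation.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins that mirror the package's exported sentinel errors.
var (
	FirstLinkError = errors.New("first link not found")
	LimitError     = errors.New("limit reached")
	LoopError      = errors.New("loop detected")
)

// describe is a hypothetical helper that maps an error to a short label.
// errors.Is also matches when the sentinel has been wrapped with %w.
func describe(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, FirstLinkError):
		return "no first link"
	case errors.Is(err, LimitError):
		return "limit"
	case errors.Is(err, LoopError):
		return "loop"
	default:
		return "other"
	}
}

func main() {
	wrapped := fmt.Errorf("walking article: %w", LoopError)
	fmt.Println(describe(wrapped))
	fmt.Println(describe(nil))
}
```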

Functions

func ExtractLanguage

func ExtractLanguage(link string) (string, error)

ExtractLanguage extracts the language code from a Wikipedia article URL. For example, the language code of https://de.wikipedia.org/wiki/Whatever is de.
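
To illustrate the idea, here is a minimal sketch of how a language code could be read from the host name using only the standard library. languageOf is a hypothetical stand-in, not the package's implementation; in particular, rejecting hosts such as de.m.wikipedia.org is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// languageOf is a hypothetical stand-in for ExtractLanguage. It expects a
// host of the form <lang>.wikipedia.org and returns <lang>.
func languageOf(link string) (string, error) {
	u, err := url.Parse(link)
	if err != nil {
		return "", err
	}
	host := u.Hostname()
	const suffix = ".wikipedia.org"
	if !strings.HasSuffix(host, suffix) {
		return "", errors.New("not a Wikipedia URL: " + link)
	}
	lang := strings.TrimSuffix(host, suffix)
	// Assumption: extra subdomain levels (e.g. "de.m") are treated as errors.
	if lang == "" || strings.Contains(lang, ".") {
		return "", errors.New("no plain language code in host: " + host)
	}
	return lang, nil
}

func main() {
	lang, err := languageOf("https://de.wikipedia.org/wiki/Whatever")
	fmt.Println(lang, err) // de <nil>
}
```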

func ExtractLinkName

func ExtractLinkName(a *html.Node) (string, error)

ExtractLinkName extracts the link name (the text content) of an <a> tag. Example: <a href="foo.com">Foo</a> yields "Foo".

func ExtractLinkNameURL

func ExtractLinkNameURL(articleURL string) (string, error)

ExtractLinkNameURL extracts the article name from a Wikipedia article URL, e.g. "https://en.wikipedia.org/wiki/Computer" becomes "Computer".
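
A comparable result can be sketched with net/url by trimming the /wiki/ path prefix. articleName is a hypothetical stand-in; how the real function treats percent-encoded names, fragments, or other path layouts is not documented here, so those choices are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// articleName is a hypothetical stand-in for ExtractLinkNameURL. It returns
// whatever follows the "/wiki/" prefix of the URL path.
func articleName(articleURL string) (string, error) {
	u, err := url.Parse(articleURL)
	if err != nil {
		return "", err
	}
	const prefix = "/wiki/"
	// u.Path is already percent-decoded by net/url.
	if !strings.HasPrefix(u.Path, prefix) {
		return "", errors.New("not an article URL: " + articleURL)
	}
	name := strings.TrimPrefix(u.Path, prefix)
	if name == "" {
		return "", errors.New("empty article name: " + articleURL)
	}
	return name, nil
}

func main() {
	name, err := articleName("https://en.wikipedia.org/wiki/Computer")
	fmt.Println(name, err) // Computer <nil>
}
```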

func FilterChildren

func FilterChildren(n *html.Node, predicate func(n *html.Node) bool) []*html.Node

FilterChildren filters the children of Node n recursively by applying the given predicate function.

func FilterChildrenTerminate

func FilterChildrenTerminate(n *html.Node,
	terminate, predicate func(n *html.Node) bool) []*html.Node

FilterChildrenTerminate filters the children of Node n recursively by applying the given predicate function. The traversal does not descend into the children of any node for which the terminate predicate returns true.
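
The traversal described above can be sketched on a simplified tree. The node type below is a hypothetical stand-in for *html.Node so that the example needs no external module. Whether a node matched by terminate is itself still tested against predicate is an assumption of this sketch.

```go
package main

import "fmt"

// node is a simplified, hypothetical stand-in for *html.Node.
type node struct {
	tag      string
	children []*node
}

// filterTerminate collects, in document order, every descendant of n that
// satisfies predicate. It does not descend into nodes for which terminate
// returns true (assumption: such a node is still tested itself).
func filterTerminate(n *node, terminate, predicate func(*node) bool) []*node {
	var out []*node
	for _, c := range n.children {
		if predicate(c) {
			out = append(out, c)
		}
		if !terminate(c) {
			out = append(out, filterTerminate(c, terminate, predicate)...)
		}
	}
	return out
}

func main() {
	// <div><p><a/></p><table><a/></table><a/></div>
	root := &node{tag: "div", children: []*node{
		{tag: "p", children: []*node{{tag: "a"}}},
		{tag: "table", children: []*node{{tag: "a"}}},
		{tag: "a"},
	}}
	isTable := func(n *node) bool { return n.tag == "table" }
	isLink := func(n *node) bool { return n.tag == "a" }
	// The link inside the table is skipped, so two links remain.
	fmt.Println(len(filterTerminate(root, isTable, isLink)))
}
```

Passing a terminate function that always returns false makes the sketch behave like a plain recursive filter, which is presumably how FilterChildren relates to this function.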

func FindFirstLink

func FindFirstLink(link string) (string, error)

FindFirstLink finds the first link from a given Wikipedia article to another article.

func IsSubSliceOf

func IsSubSliceOf(subslice, slice []string) bool

IsSubSliceOf tests whether or not subslice is contained in slice.
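
The description does not say whether "contained" means a contiguous run or merely the same elements in any position. The sketch below assumes a contiguous, in-order run and treats an empty subslice as contained; isSubSlice is a hypothetical stand-in, not the package's code.

```go
package main

import "fmt"

// isSubSlice reports whether sub occurs in s as a contiguous, in-order run.
// Assumption: an empty sub is contained in every slice.
func isSubSlice(sub, s []string) bool {
	if len(sub) == 0 {
		return true
	}
	for i := 0; i+len(sub) <= len(s); i++ {
		match := true
		for j := range sub {
			if s[i+j] != sub[j] {
				match = false
				break
			}
		}
		if match {
			return true
		}
	}
	return false
}

func main() {
	path := []string{"a", "b", "c", "d"}
	fmt.Println(isSubSlice([]string{"b", "c"}, path)) // true
	fmt.Println(isSubSlice([]string{"a", "c"}, path)) // false
}
```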

func OutputRecordsToCSV

func OutputRecordsToCSV(records []TestOutputRecord, w io.Writer) error

OutputRecordsToCSV writes the records slice to w in CSV format, using the record structure lang,source,target,expected,actual,result.
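
The column layout can be reproduced with encoding/csv. In this sketch, inputRecord and outputRecord are local copies of the documented struct fields, writeCSV and csvString are hypothetical helpers, and the sample values are made up. It also assumes that no header row is written, which the documentation does not specify.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// Local copies of the documented record fields.
type inputRecord struct {
	Lang, Source, Target string
	Expected             int
}

type outputRecord struct {
	inputRecord
	Actual int
	Result string
}

// writeCSV writes one row per record in the column order
// lang,source,target,expected,actual,result (assumption: no header row).
func writeCSV(records []outputRecord, w io.Writer) error {
	cw := csv.NewWriter(w)
	for _, r := range records {
		row := []string{
			r.Lang, r.Source, r.Target,
			strconv.Itoa(r.Expected), strconv.Itoa(r.Actual), r.Result,
		}
		if err := cw.Write(row); err != nil {
			return err
		}
	}
	cw.Flush()
	return cw.Error()
}

// csvString is a convenience wrapper that renders the records to a string.
func csvString(records []outputRecord) (string, error) {
	var sb strings.Builder
	err := writeCSV(records, &sb)
	return sb.String(), err
}

func main() {
	// Made-up sample values for illustration only.
	recs := []outputRecord{
		{inputRecord{"en", "Computer", "Philosophy", 5}, 5, "ok"},
	}
	out, err := csvString(recs)
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```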

func RemoveParens

func RemoveParens(paragraph string) string

RemoveParens removes all parentheses and square brackets with their content.
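
One way to implement this kind of removal is a single pass with a nesting counter, sketched below. stripBrackets is a hypothetical stand-in. The handling of nested or unbalanced brackets, and leaving the surrounding whitespace untouched, are assumptions rather than documented behaviour.

```go
package main

import (
	"fmt"
	"strings"
)

// stripBrackets drops every '(' ... ')' and '[' ... ']' group, including
// nested groups. Assumptions: both bracket kinds share one depth counter,
// and an unmatched closing bracket is dropped.
func stripBrackets(s string) string {
	var b strings.Builder
	depth := 0
	for _, r := range s {
		switch r {
		case '(', '[':
			depth++
		case ')', ']':
			if depth > 0 {
				depth--
			}
		default:
			if depth == 0 {
				b.WriteRune(r)
			}
		}
	}
	return b.String()
}

func main() {
	// Whitespace around a removed group is kept, so two spaces remain
	// between "computer" and "is".
	fmt.Printf("%q\n", stripBrackets("A computer (from Latin [1]) is a machine[2]."))
}
```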

func RenderHTML

func RenderHTML(n *html.Node) string

RenderHTML renders the given HTML node as plain text.

Types

type ArticleHopCountError

type ArticleHopCountError struct {
	// contains filtered or unexported fields
}

func ArticleHopCount

func ArticleHopCount(lang, source, target string, limit uint8) (int, *ArticleHopCountError)

ArticleHopCount counts the number of first-article-link hops needed to get from the source article to the target article. If the limit is reached, -1 and an error are returned.
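
The hop counting can be sketched on an in-memory link table instead of live Wikipedia pages. Everything here is hypothetical: hopCount, toyLinks, and toyNext are illustrative names, a plain error replaces *ArticleHopCountError, and the exact limit semantics (whether the limit-th hop itself is still allowed) are an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errLimit  = errors.New("limit reached")
	errLoop   = errors.New("loop detected")
	errNoLink = errors.New("first link not found")
)

// toyLinks maps each article to its (made-up) first link.
var toyLinks = map[string]string{"A": "B", "B": "C", "C": "D", "X": "Y", "Y": "X"}

// toyNext stands in for a first-link lookup such as FindFirstLink.
func toyNext(article string) (string, error) {
	n, ok := toyLinks[article]
	if !ok {
		return "", errNoLink
	}
	return n, nil
}

// hopCount follows first links from source until target is reached.
// It returns -1 and an error on a lookup failure, a revisited article,
// or once limit hops have been made without reaching target.
func hopCount(source, target string, limit uint8, next func(string) (string, error)) (int, error) {
	visited := map[string]bool{source: true}
	current := source
	for hops := 0; ; hops++ {
		if current == target {
			return hops, nil
		}
		if hops == int(limit) {
			return -1, errLimit
		}
		n, err := next(current)
		if err != nil {
			return -1, err
		}
		if visited[n] {
			return -1, errLoop
		}
		visited[n] = true
		current = n
	}
}

func main() {
	fmt.Println(hopCount("A", "D", 10, toyNext)) // 3 <nil>
	fmt.Println(hopCount("A", "D", 2, toyNext))  // -1 limit reached
	fmt.Println(hopCount("X", "Z", 10, toyNext)) // -1 loop detected
}
```

Note that the documented signature returns the concrete type *ArticleHopCountError rather than error. In Go, storing a nil pointer of that type in a variable of type error produces a non-nil interface value, so it is safer to compare the returned pointer against nil directly.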

func (*ArticleHopCountError) Error

func (err *ArticleHopCountError) Error() string

type TestInputRecord

type TestInputRecord struct {
	Lang     string `json:"lang"`
	Source   string `json:"source"`
	Target   string `json:"target"`
	Expected int    `json:"expected"`
}

TestInputRecord describes the parameters needed to run an article hop count test.

func InputRecordsFromCSV

func InputRecordsFromCSV(r io.Reader) ([]TestInputRecord, error)

InputRecordsFromCSV converts the CSV data behind the given reader with the record structure of lang,source,target,expected into a slice of test input records.
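
Reading that four-column layout can be sketched with encoding/csv as follows. inputRecord is a local copy of the documented fields, and readRecords and readRecordsFromString are hypothetical helpers. The sketch assumes there is no header row and that every row has exactly four fields; the sample rows in main are made up.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// inputRecord is a local copy of the documented TestInputRecord fields.
type inputRecord struct {
	Lang, Source, Target string
	Expected             int
}

// readRecords parses rows of the form lang,source,target,expected.
// Assumption: no header row; rows with another field count are rejected.
func readRecords(r io.Reader) ([]inputRecord, error) {
	cr := csv.NewReader(r)
	cr.FieldsPerRecord = 4
	rows, err := cr.ReadAll()
	if err != nil {
		return nil, err
	}
	records := make([]inputRecord, 0, len(rows))
	for _, row := range rows {
		expected, err := strconv.Atoi(row[3])
		if err != nil {
			return nil, fmt.Errorf("expected column %q: %w", row[3], err)
		}
		records = append(records, inputRecord{row[0], row[1], row[2], expected})
	}
	return records, nil
}

// readRecordsFromString is a convenience wrapper for in-memory CSV data.
func readRecordsFromString(s string) ([]inputRecord, error) {
	return readRecords(strings.NewReader(s))
}

func main() {
	// Made-up sample rows for illustration only.
	recs, err := readRecordsFromString("en,Computer,Philosophy,5\nde,Computer,Philosophie,7\n")
	if err != nil {
		panic(err)
	}
	for _, r := range recs {
		fmt.Printf("%+v\n", r)
	}
}
```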

func (TestInputRecord) Equals

func (r TestInputRecord) Equals(other TestInputRecord) bool

func (TestInputRecord) String

func (r TestInputRecord) String() string

type TestOutputRecord

type TestOutputRecord struct {
	TestInputRecord
	Actual int    `json:"actual"`
	Result string `json:"result"`
}

TestOutputRecord describes the input parameters of a hop count test combined with its Result.

func ProcessArticleHopTests

func ProcessArticleHopTests(input []TestInputRecord, limit uint8) []TestOutputRecord

ProcessArticleHopTests processes the test cases from the input using the ArticleHopCount function and returns the result report.

func (TestOutputRecord) EqualInput

func (r TestOutputRecord) EqualInput(input TestInputRecord) bool

func (TestOutputRecord) Equals

func (r TestOutputRecord) Equals(other TestOutputRecord) bool

func (TestOutputRecord) String

func (r TestOutputRecord) String() string
