xurls

Version: v1.1.0
Warning

This package is not in the latest version of its module.

Published: Jan 25, 2017 | License: BSD-3-Clause | Imports: 1 | Imported by: 0

README

xurls

Extract URLs from text using regular expressions.

go get -u github.com/mvdan/xurls
import "github.com/mvdan/xurls"

func main() {
	xurls.Relaxed.FindString("Do gophers live in golang.org?")
	// "golang.org"
	xurls.Strict.FindAllString("foo.com is http://foo.com/.", -1)
	// []string{"http://foo.com/"}
}

Relaxed is roughly four to seven times slower than Strict in the benchmarks below, since it does more work to find URLs without relying on the scheme:

BenchmarkStrictEmpty-4           1000000              1885 ns/op
BenchmarkStrictSingle-4           200000              8356 ns/op
BenchmarkStrictMany-4             100000             22547 ns/op
BenchmarkRelaxedEmpty-4           200000              7284 ns/op
BenchmarkRelaxedSingle-4           30000             58557 ns/op
BenchmarkRelaxedMany-4             10000            130251 ns/op

cmd/xurls

go get -u github.com/mvdan/xurls/cmd/xurls
$ echo "Do gophers live in http://golang.org?" | xurls
http://golang.org
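
The command reads text on standard input and prints each URL it finds on its own line. A minimal sketch of that idea follows. It uses only the standard library, and `urlPattern` and `extractURLs` are hypothetical names: the pattern is a deliberately simplified stand-in that only handles http and https, not the expression the package actually compiles.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
)

// urlPattern is an assumed, simplified stand-in for the package's Strict
// expression. It only recognises http and https URLs.
var urlPattern = regexp.MustCompile(`(?i)\bhttps?://[a-z0-9.-]+(?:/\S*)?`)

// extractURLs returns every match of urlPattern in text, in order.
func extractURLs(text string) []string {
	return urlPattern.FindAllString(text, -1)
}

func main() {
	// Read stdin line by line and print one URL per output line.
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() {
		for _, u := range extractURLs(scanner.Text()) {
			fmt.Println(u)
		}
	}
	if err := scanner.Err(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

With the real package, the same loop could call `xurls.Strict.FindAllString` in place of the hypothetical `extractURLs`.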

Documentation

Overview

Package xurls extracts URLs from plain text using regular expressions.

Example
package main

import (
	"fmt"

	"github.com/mvdan/xurls"
)

func main() {
	fmt.Println(xurls.Relaxed.FindString("Do gophers live in http://golang.org?"))
	fmt.Println(xurls.Relaxed.FindAllString("foo.com is http://foo.com/.", -1))
}
Output:

http://golang.org
[foo.com http://foo.com/]

Variables

var (
	// Relaxed matches all the urls it can find.
	Relaxed = regexp.MustCompile(relaxed)
	// Strict only matches urls with a scheme to avoid false positives.
	Strict = regexp.MustCompile(strict)
)
var PseudoTLDs = []string{
	`bit`,
	`example`,
	`exit`,
	`gnu`,
	`i2p`,
	`invalid`,
	`local`,
	`localhost`,
	`test`,
	`zkey`,
}

PseudoTLDs is a sorted list of some widely used unofficial TLDs.

var SchemesNoAuthority = []string{
	`bitcoin`,
	`file`,
	`magnet`,
	`mailto`,
	`sms`,
	`tel`,
	`xmpp`,
}

SchemesNoAuthority is a sorted list of some well-known url schemes that are followed by ":" instead of "://". Since these are more prone to false positives, we limit their matching.
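
One way to read "limit their matching" is that any scheme is accepted before "://", while only an explicit allow-list is accepted before a bare ":". The sketch below illustrates that idea with the standard library. The pattern is an assumption made for illustration and is much simpler than the expression the package builds; `schemesNoAuthority` and `buildPattern` are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// schemesNoAuthority is a local copy of the SchemesNoAuthority list.
var schemesNoAuthority = []string{
	"bitcoin", "file", "magnet", "mailto", "sms", "tel", "xmpp",
}

// buildPattern returns a simplified expression (an assumption, not the
// package's own): any scheme followed by "://", or one of the listed
// schemes followed by ":", and then a run of non-space characters.
func buildPattern(noAuthority []string) *regexp.Regexp {
	quoted := make([]string, len(noAuthority))
	for i, s := range noAuthority {
		quoted[i] = regexp.QuoteMeta(s)
	}
	expr := `(?i)\b(?:[a-z][a-z0-9+.-]*://|(?:` +
		strings.Join(quoted, "|") + `):)\S+`
	return regexp.MustCompile(expr)
}

func main() {
	re := buildPattern(schemesNoAuthority)
	fmt.Println(re.FindString("write to mailto:gopher@example.org now"))
	fmt.Println(re.FindString("visit https://golang.org/doc"))
	// "note" is not in the list, so "note:" is not treated as a scheme.
	fmt.Printf("%q\n", re.FindString("note: see docs"))
}
```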

var TLDs = []string{}/* 1554 elements not displayed */

TLDs is a sorted list of all public top-level domains.

Functions

func StrictMatchingScheme added in v0.8.0

func StrictMatchingScheme(exp string) (*regexp.Regexp, error)

StrictMatchingScheme produces a regexp that matches URLs the way Strict does, but only those whose scheme matches the given regular expression. The error is returned when the resulting expression fails to compile.
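
A function with this signature can be sketched by splicing the caller's scheme expression into a larger pattern and returning any compile error. The code below is a rough, standard-library-only illustration: `matchingScheme` is a hypothetical name, the body pattern `\S+` is a crude stand-in for the package's real URL grammar, and the sketch assumes the scheme expression includes its separator (for example `https?://`).

```go
package main

import (
	"fmt"
	"regexp"
)

// matchingScheme compiles a pattern that matches the scheme expression exp
// followed by a run of non-space characters. It is a hypothetical,
// simplified analogue of StrictMatchingScheme, not the package's code.
func matchingScheme(exp string) (*regexp.Regexp, error) {
	return regexp.Compile(`(?i)\b(?:` + exp + `)\S+`)
}

func main() {
	re, err := matchingScheme(`https?://`)
	if err != nil {
		fmt.Println("bad scheme expression:", err)
		return
	}
	text := "foo.com is http://foo.com/x and ftp://bar.org"
	fmt.Println(re.FindAllString(text, -1))

	// An invalid scheme expression surfaces as an error, not a panic.
	if _, err := matchingScheme(`[`); err != nil {
		fmt.Println("invalid expression rejected")
	}
}
```

Returning the error instead of calling `regexp.MustCompile` fits a function that takes a caller-supplied expression, since a malformed input should not crash the program.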

Directories

Path Synopsis
cmd
generate
