urlverifier

Version: v1.0.1
Published: Jan 2, 2024 License: MIT Imports: 6 Imported by: 7

README

url-verifier

🔗 A Go library for URL validation and verification: does this URL actually work?

Features

  • URL Validation: validates whether a string is a valid URL.
  • Different Validation Types: validates a URL against a "human" definition of a correct URL, strict compliance with RFC 3986 (Uniform Resource Identifier (URI): Generic Syntax), and/or compliance with RFC 3986 plus the requirement of a scheme, e.g. https.
  • Reachability: verifies whether the URL is actually reachable via an HTTP GET request and provides the status code returned.

Rationale

There are several ways to validate URLs in Go, depending on what you're trying to achieve. Strict, technical validation needs only a call to url.Parse in Go's standard library. A more "human" definition of a valid URL is available through govalidator (which is what this library uses internally for syntax verification).

However, both approaches accept all types of URLs, from relative paths through to hostnames without a scheme. When building user-facing applications, what we often want instead is a way to check whether the URL provided will actually work, i.e. it's valid, it resolves, and it can be loaded in a web browser.
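
To illustrate how permissive plain parsing is, the following minimal sketch uses only the standard library. The describe helper is hypothetical and not part of url-verifier; it simply reports what url.Parse extracts from each input.

```go
package main

import (
	"fmt"
	"net/url"
)

// describe is a hypothetical helper (not part of url-verifier) that returns
// the scheme, host, and path url.Parse extracts from raw.
func describe(raw string) (scheme, host, path string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", "", err
	}
	return u.Scheme, u.Host, u.Path, nil
}

func main() {
	inputs := []string{
		"https://example.com/", // absolute URL with a scheme
		"example.com",          // hostname without a scheme
		"/just/a/path",         // relative path
	}
	for _, in := range inputs {
		scheme, host, path, err := describe(in)
		if err != nil {
			fmt.Printf("%q: error: %v\n", in, err)
			continue
		}
		// None of these inputs produce an error. The bare hostname is
		// parsed as a path, with an empty scheme and host.
		fmt.Printf("%q: scheme=%q host=%q path=%q\n", in, scheme, host, path)
	}
}
```

All three inputs parse without an error, which is why a successful url.Parse call alone says little about whether a link will load in a browser.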

Install

Use go get to install this package.

go get -u github.com/davidmytton/url-verifier

Usage

Basic usage

Use Verify to check whether a URL is correct:

package main

import (
 "fmt"

 urlverifier "github.com/davidmytton/url-verifier"
)

func main() {
 url := "https://example.com/"

 verifier := urlverifier.NewVerifier()
 ret, err := verifier.Verify(url)

 if err != nil {
  // fmt.Errorf only builds an error value; print the error and stop instead.
  fmt.Printf("Error: %s\n", err)
  return
 }

 fmt.Printf("Result: %+v\n", ret)
 /*
   Result: &{
    URL:https://example.com/
    URLComponents:https://example.com/
    IsURL:true
    IsRFC3986URL:true
    IsRFC3986URI:true
    HTTP:<nil>
   }
 */
}

URL reachability check

Call EnableHTTPCheck() to issue a GET request to the HTTP or HTTPS URL and check whether it is reachable and returns a successful response, meaning a success (2xx) or success-like (3xx) status code. Non-HTTP(S) URLs will return an error.

package main

import (
 "fmt"

 urlverifier "github.com/davidmytton/url-verifier"
)

func main() {
 url := "https://example.com/"

 verifier := urlverifier.NewVerifier()
 verifier.EnableHTTPCheck()
 ret, err := verifier.Verify(url)

 if err != nil {
  // fmt.Errorf only builds an error value; print the error and stop instead.
  fmt.Printf("Error: %s\n", err)
  return
 }

 fmt.Printf("Result: %+v\n", ret)
 fmt.Printf("HTTP: %+v\n", ret.HTTP)

 // Guard against a nil HTTP result before reading its fields.
 if ret.HTTP != nil && ret.HTTP.IsSuccess {
  fmt.Println("The URL is reachable with status code", ret.HTTP.StatusCode)
 }
 /*
   Result: &{
    URL:https://example.com/
    URLComponents:https://example.com/
    IsURL:true
    IsRFC3986URL:true
    IsRFC3986URI:true
    HTTP:0x140000b6a50
   }
   HTTP: &{
    Reachable:true
    StatusCode:200
    IsSuccess:true
   }
   The URL is reachable with status code 200
 */
}
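
The relationship between the three HTTP fields in the output above can be sketched as follows. This is an illustration of the documented semantics only, not the library's implementation: httpResult and classify are hypothetical names, and the sketch assumes a response was received, so Reachable is always true here.

```go
package main

import "fmt"

// httpResult is a hypothetical stand-in mirroring the fields printed above;
// it is not the library's own type.
type httpResult struct {
	Reachable  bool
	StatusCode int
	IsSuccess  bool
}

// classify builds a result for a status code that a server actually returned.
// Any response counts as reachable, even an error such as 500, while only
// 2xx and 3xx codes count as success.
func classify(statusCode int) httpResult {
	return httpResult{
		Reachable:  true,
		StatusCode: statusCode,
		IsSuccess:  statusCode >= 200 && statusCode < 400,
	}
}

func main() {
	for _, code := range []int{200, 301, 404, 500} {
		fmt.Printf("%d: %+v\n", code, classify(code))
	}
}
```

Keeping Reachable separate from IsSuccess lets a caller distinguish "the server answered with an error" from "nothing answered at all".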

HTTP checks against internal URLs

By default, the reachability checks are only executed if the host resolves to a non-internal IP address. An internal IP address is defined as any of: private, loopback, link-local unicast, link-local multicast, interface-local multicast, or unspecified.

This is one layer of protection against Server-Side Request Forgery (SSRF).
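
The definition of an internal IP address above maps closely onto methods of net.IP in the standard library. The sketch below is a hypothetical illustration of that definition, not the library's actual code. The names isInternalIP and isInternalAddr are assumptions, and treating an unparsable address as internal is a fail-closed design choice made for this example.

```go
package main

import (
	"fmt"
	"net"
)

// isInternalIP is a hypothetical helper (not part of url-verifier) that
// reports whether ip falls into any of the internal categories listed above.
// net.IP.IsPrivate requires Go 1.17 or later.
func isInternalIP(ip net.IP) bool {
	return ip.IsPrivate() ||
		ip.IsLoopback() ||
		ip.IsLinkLocalUnicast() ||
		ip.IsLinkLocalMulticast() ||
		ip.IsInterfaceLocalMulticast() ||
		ip.IsUnspecified()
}

// isInternalAddr parses s as an IP address and classifies it. An unparsable
// address is treated as internal so that callers fail closed.
func isInternalAddr(s string) bool {
	ip := net.ParseIP(s)
	if ip == nil {
		return true
	}
	return isInternalIP(ip)
}

func main() {
	for _, s := range []string{"127.0.0.1", "10.0.0.5", "169.254.169.254", "0.0.0.0", "::1", "8.8.8.8"} {
		fmt.Printf("%-16s internal=%v\n", s, isInternalAddr(s))
	}
}
```

A check like this only helps if it is applied to the address the request actually connects to. A hostname that resolves differently between the check and the request would bypass it, which is one reason the text describes this as a single layer of protection.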

To allow internal HTTP checks, call verifier.AllowHTTPCheckInternal():

urlToCheck := "http://localhost:3000"

verifier := urlverifier.NewVerifier()
verifier.EnableHTTPCheck()
// Danger: Makes SSRF easier!
verifier.AllowHTTPCheckInternal()
ret, err := verifier.Verify(urlToCheck)
...

Credits

This library is heavily inspired by email-verifier.

License

This package is licensed under the MIT License.

Documentation

Overview

SPDX-License-Identifier: MIT

Package urlverifier is a Go library for URL validation and verification: does this URL actually work?

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type HTTP

type HTTP struct {
	Reachable  bool `json:"reachable"`   // Whether the URL is reachable via HTTP. This may be true even if the response is an HTTP error e.g. a 500 error.
	StatusCode int  `json:"status_code"` // The HTTP status code
	IsSuccess  bool `josn:"is_success"`  // Whether the HTTP response is a success (2xx) or success-like code (3xx)
}

HTTP is the result of an HTTP check

type Result

type Result struct {
	URL           string   `json:"url"`            // The URL that was checked
	URLComponents *url.URL `json:"url_components"` // The URL components, if the URL is valid
	IsURL         bool     `json:"is_url"`         // Whether the URL is valid
	IsRFC3986URL  bool     `json:"is_rfc3986_url"` // Whether the URL is a valid URL according to RFC 3986. This is the same as IsRFC3986URI but with a check for a scheme.
	IsRFC3986URI  bool     `json:"is_rfc3986_uri"` // Whether the URL is a valid URI according to RFC 3986
	HTTP          *HTTP    `json:"http"`           // The result of a HTTP check, if enabled
}

Result is the result of a URL verification

type Verifier

type Verifier struct {
	// contains filtered or unexported fields
}

Verifier is a URL Verifier. Create one using NewVerifier()

func NewVerifier

func NewVerifier() *Verifier

NewVerifier creates a new URL Verifier

func (*Verifier) AllowHTTPCheckInternal added in v0.2.0

func (v *Verifier) AllowHTTPCheckInternal()

AllowHTTPCheckInternal allows checking internal URLs

func (*Verifier) CheckHTTP

func (v *Verifier) CheckHTTP(urlToCheck string) (*HTTP, error)

CheckHTTP checks if the URL is reachable via HTTP

func (*Verifier) DisableHTTPCheck

func (v *Verifier) DisableHTTPCheck()

DisableHTTPCheck disables checking if the URL is reachable via HTTP

func (*Verifier) DisallowHTTPCheckInternal added in v0.2.0

func (v *Verifier) DisallowHTTPCheckInternal()

DisallowHTTPCheckInternal disallows checking internal URLs

func (*Verifier) EnableHTTPCheck

func (v *Verifier) EnableHTTPCheck()

EnableHTTPCheck enables checking if the URL is reachable via HTTP

func (*Verifier) IsRequestURI

func (v *Verifier) IsRequestURI(rawURL string) bool

IsRequestURI checks if the string rawURL, assuming it was received in an HTTP request, is an absolute URI or an absolute path. Implemented from govalidator: https://github.com/asaskevich/govalidator/blob/f21760c49a8d602d863493de796926d2a5c1138d/validator.go#L144

func (*Verifier) IsRequestURL

func (v *Verifier) IsRequestURL(rawURL string) bool

IsRequestURL checks if the string rawURL, assuming it was received in an HTTP request, is a valid URL conforming to RFC 3986. Implemented from govalidator: https://github.com/asaskevich/govalidator/blob/f21760c49a8d602d863493de796926d2a5c1138d/validator.go#L130
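
The difference between the two checks can be sketched with the standard library alone. The sketch assumes both checks are thin wrappers around url.ParseRequestURI, with the URL variant additionally requiring a scheme. That is an assumption about the linked govalidator code, not a copy of it, and the lowercase function names below are hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
)

// isRequestURI is a hypothetical sketch: it accepts an absolute URI or an
// absolute path, as url.ParseRequestURI does.
func isRequestURI(rawURL string) bool {
	_, err := url.ParseRequestURI(rawURL)
	return err == nil
}

// isRequestURL is a hypothetical sketch: the same check as isRequestURI,
// plus the requirement that a scheme is present.
func isRequestURL(rawURL string) bool {
	u, err := url.ParseRequestURI(rawURL)
	if err != nil {
		return false
	}
	return u.Scheme != ""
}

func main() {
	for _, in := range []string{"https://example.com/", "/foo/bar", "example.com"} {
		fmt.Printf("%-22q uri=%v url=%v\n", in, isRequestURI(in), isRequestURL(in))
	}
}
```

Under this assumption, an absolute path such as /foo/bar passes the URI check but not the URL check, while a bare hostname such as example.com fails both.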

func (*Verifier) Verify

func (v *Verifier) Verify(rawURL string) (*Result, error)

Verify verifies a URL. It checks if the URL is valid, parses it if so, and checks if it is valid according to RFC 3986 (as a URI without a scheme and a URL with a scheme). If the HTTP check is enabled, it also checks if the URL is reachable via HTTP.
