urlutil

package
v0.0.92
Published: Apr 30, 2024 License: MIT Imports: 11 Imported by: 172

README

urlutil

The package contains various helpers for working with URLs.

URL Parsing Methods

Function | Description | Type | Behavior
Parse(inputURL string) | Standard URL parsing (+ some edge cases) | Both relative and absolute URLs | NA
ParseURL(inputURL string, unsafe bool) | Standard + unsafe URL parsing (+ edge cases) | Both relative and absolute URLs | NA
ParseRelativePath(inputURL string, unsafe bool) | Standard + unsafe URL parsing (+ edge cases) | Only relative URLs | error if absolute URL is given
ParseRawRelativePath(inputURL string, unsafe bool) | Standard + unsafe URL parsing | Only relative URLs | error if absolute URL is given
ParseAbsoluteURL(inputURL string, unsafe bool) | Standard + unsafe URL parsing (+ edge cases) | Only absolute URLs | error if relative URL is given
Known Edge Cases / Changes from url.URL
  • Query parameters are ordered
  • Invalid Unicode characters and invalid URL encodings are allowed in unsafe mode
  • u.Path is always / prefixed if not empty (except with ParseRawRelativePath)
  • Invalid values / encodings are allowed in the URL path
  • Characters in query parameters are not encoded, except for reserved characters (see: Raw Params)
  • Near-complete parsing of a URL into its parts (scheme, host, path, query, fragment); a known limitation is manually added hostnames such as mydomain (no . in the hostname)

More details on each edge case / behavior are given below.

Difference between net/url.URL and utils/url/URL

  • url.URL caters to a wide variety of URLs, and for that reason its parsing is not accurate under some conditions

  • utils/url/URL is a wrapper around url.URL that handles the edge cases below and is able to parse complex URLs (i.e. URLs that are not RFC compliant but are required in infosec).

  • url.URL allows u.Path without a / prefix; utils/url/URL does not, and autocorrects the path if the / prefix is missing

  • Parsing URLs without scheme

// If the URLs below are parsed with url.Parse(), the URL parts (scheme, host, path etc.) are not properly classified
scanme.sh
scanme.sh:443/port
scanme.sh/with/path
  • Encoding of parameters (url.Values)

    • url.URL encodes all reserved characters (as per the RFCs) in parameter key-value pairs (i.e. url.Values{})
    • If reserved/special characters are URL encoded, the integrity of specially crafted payloads (LFI, XSS, SQLi) is lost.
    • utils/url/URL uses utils/url/Params to store/handle parameters, so the integrity of all such payloads is preserved
    • utils/url/URL also provides options to customize URL encoding using a global variable and function params
  • Parsing Unsafe/Invalid Paths

    • While parsing URLs, url.Parse() either discards or re-encodes some specially crafted payloads
    • If an invalid URL encoding is given in the URL (ex: scanme.sh/%invalid), url.Parse() returns an error and the URL is not parsed
    • Such cases are handled implicitly if unsafe is true
// Example URLs for the above condition
scanme.sh/?some'param=`'+OR+ORDER+BY+1--
scanme.sh/?some[param]=<script>alert(1)</script>
scanme.sh/%invalid/path
  • utils/url/URL has some extra methods

    • .TrimPort()
    • .MergePath(newrelpath string, unsafe bool)
    • .UpdateRelPath(newrelpath string, unsafe bool)
    • .Clone() and more
  • Dealing with double URL encoding of chars like %0A when .Path is directly updated

    When url.Parse is used to parse a URL like https://127.0.0.1/%0A, it internally calls u.setPath, which decodes %0A to \n and saves it in u.Path. When the final URL is created at the time of writing to the connection in http.Request, the path is escaped again, so \n becomes %0A and the final URL is https://127.0.0.1/%0A, which is the expected/required behavior.

    If u.Path is changed/updated directly after url.Parse (ex: u.Path = "%0A"), then at the time of writing to the connection in http.Request the path is escaped again, so %0A becomes %250A and the final URL becomes https://127.0.0.1/%250A, which is not the expected/required behavior. To avoid this, u.Path is manually unescaped/decoded, i.e. u.Path = unescape(u.Path), which takes care of this edge case.

    This is how utils/url/URL handles this edge case when u.Path is directly updated.
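The double-encoding behavior described above can be reproduced with the standard library alone. The sketch below is not this package's code: the helper names renderAfterPathUpdate and renderAfterDecodedPathUpdate are made up for illustration, and url.PathUnescape stands in for the unescape step mentioned in the text.

```go
package main

import (
	"fmt"
	"net/url"
)

// renderAfterPathUpdate parses rawURL, overwrites u.Path with newPath as-is
// and returns the string net/url renders for the result.
func renderAfterPathUpdate(rawURL, newPath string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	u.Path = newPath
	return u.String(), nil
}

// renderAfterDecodedPathUpdate decodes newPath before assigning it, so the
// escaping applied at render time restores the original encoding.
func renderAfterDecodedPathUpdate(rawURL, newPath string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	decoded, err := url.PathUnescape(newPath)
	if err != nil {
		return "", err
	}
	u.Path = decoded
	return u.String(), nil
}

func main() {
	direct, _ := renderAfterPathUpdate("https://127.0.0.1/", "/%0A")
	fixed, _ := renderAfterDecodedPathUpdate("https://127.0.0.1/", "/%0A")
	fmt.Println(direct) // https://127.0.0.1/%250A
	fmt.Println(fixed)  // https://127.0.0.1/%0A
}
```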

Note

utils/url/URL embeds url.URL and thus inherits and exposes all url.URL methods and variables. It is OK to use any method from url.URL (directly or indirectly) except url.URL.Query() and url.URL.String(), due to parameter encoding issues. If this rule cannot be followed (ex: when directly updating/referencing http.Request.URL), call the .Update() method before accessing them; it updates the url.URL instance for this edge case. This is not required if the rule above is followed.

Documentation

Constants

const (
	HTTP  = "http"
	HTTPS = "https"

	// Deny all protocols
	// Allow:
	// websocket + websocket over ssl
	WEBSOCKET     = "ws"
	WEBSOCKET_SSL = "wss"
	FTP           = "ftp"

	SchemeSeparator  = "://"
	DefaultHTTPPort  = "80"
	DefaultHTTPSPort = "443"
)

Variables

var AllowLegacySeperator bool = false

The legacy separator (i.e. `;`) was used as a separator for parameters; support for it was removed in Go >= 1.17.
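For context, this is how the standard library treats `;` on Go 1.17 or later. It is a minimal sketch using only net/url; parseQuery is a hypothetical helper, and nothing here shows how AllowLegacySeperator itself is implemented.

```go
package main

import (
	"fmt"
	"net/url"
)

// parseQuery reports how many keys net/url extracts from raw and whether
// it returned an error.
func parseQuery(raw string) (int, error) {
	values, err := url.ParseQuery(raw)
	return len(values), err
}

func main() {
	n, err := parseQuery("a=1&b=2")
	fmt.Println(n, err) // two keys, no error

	// On Go >= 1.17 the semicolon is rejected instead of splitting the pair.
	n, err = parseQuery("a=1;b=2")
	fmt.Println(n, err) // zero keys and a non-nil error
}
```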

var MustEscapeCharSet []rune = []rune{'?', '#', '@', ';', '&', ',', '[', ']', '^'}

MustEscapeCharSet are special chars that are always escaped; they are based on the reserved chars from the RFC. Some of the reserved chars from the RFC were excluded and some were added for various reasons. The goal here is to encode parameter keys and values only.

var RFCEscapeCharSet []rune = []rune{'!', '*', '\'', '(', ')', ';', ':', '@', '&', '=', '+', '$', ',', '/', '?', '%', '#', '[', ']'}

Reserved chars from RFC: ! * ' ( ) ; : @ & = + $ , / ? % # [ ]

Functions

func AutoMergeRelPaths added in v0.0.6

func AutoMergeRelPaths(path1 string, path2 string) (string, error)

AutoMergeRelPaths merges two relative paths including parameters and returns final string

func ParamEncode added in v0.0.4

func ParamEncode(data string) string

ParamEncode encodes key characters only. Key characters include whitespace, non-printable characters and non-ASCII characters. It does not double encode characters that are already encoded.

func PercentEncoding added in v0.0.4

func PercentEncoding(data string) string

PercentEncoding encodes all characters to percent-encoded format, just like the Burp Suite decoder.
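To illustrate what "all characters" means here, the following standalone sketch percent-encodes every byte of its input. percentEncodeAll is a hypothetical helper and not this package's implementation; the choice of uppercase hex digits is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// percentEncodeAll is a hypothetical helper (not the package's code) that
// writes every byte of data as %XX, using uppercase hex digits.
func percentEncodeAll(data string) string {
	var b strings.Builder
	b.Grow(len(data) * 3)
	for i := 0; i < len(data); i++ {
		fmt.Fprintf(&b, "%%%02X", data[i])
	}
	return b.String()
}

func main() {
	fmt.Println(percentEncodeAll("a/b")) // %61%2F%62
}
```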

func URLEncodeWithEscapes added in v0.0.4

func URLEncodeWithEscapes(data string, charset ...rune) string

URLEncodeWithEscapes URL encodes data with the given special characters escaped (similar to Burp Suite Intruder). Note: `MustEscapeCharSet` is not included.

Types

type OrderedParams added in v0.0.40

type OrderedParams struct {

	// IncludeEquals is used to include = in encoded parameters, default is false
	IncludeEquals bool
	// contains filtered or unexported fields
}

OrderedParams is a map that preserves the order of elements
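One common way to build an order-preserving parameter store is to pair a map with a slice of keys. The sketch below shows that idea only; orderedMultiMap and its methods are hypothetical names, and the unexported fields of OrderedParams may be organized differently.

```go
package main

import (
	"fmt"
	"strings"
)

// orderedMultiMap is a hypothetical stand-in for an insertion-ordered
// parameter store: the slice records key order, the map holds the values.
type orderedMultiMap struct {
	keys   []string
	values map[string][]string
}

func newOrderedMultiMap() *orderedMultiMap {
	return &orderedMultiMap{values: make(map[string][]string)}
}

// Add appends vals to key, recording the key the first time it is seen.
func (m *orderedMultiMap) Add(key string, vals ...string) {
	if _, seen := m.values[key]; !seen {
		m.keys = append(m.keys, key)
	}
	m.values[key] = append(m.values[key], vals...)
}

// Join renders key=value pairs in insertion order, without any escaping.
func (m *orderedMultiMap) Join() string {
	var pairs []string
	for _, k := range m.keys {
		for _, v := range m.values[k] {
			pairs = append(pairs, k+"="+v)
		}
	}
	return strings.Join(pairs, "&")
}

func main() {
	m := newOrderedMultiMap()
	m.Add("z", "1")
	m.Add("a", "2")
	m.Add("z", "3")
	fmt.Println(m.Join()) // z=1&z=3&a=2
}
```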

func NewOrderedParams added in v0.0.40

func NewOrderedParams() *OrderedParams

NewOrderedParams creates a new ordered params

func (*OrderedParams) Add added in v0.0.40

func (o *OrderedParams) Add(key string, value ...string)

Add adds parameters to the store

func (*OrderedParams) Clone added in v0.0.40

func (o *OrderedParams) Clone() *OrderedParams

Clone returns a copy of the ordered params

func (*OrderedParams) Decode added in v0.0.40

func (o *OrderedParams) Decode(raw string)

Decode is the opposite of Encode(): a string such as ("bar=baz&foo=quux") is parsed. Parameters are loosely parsed to allow any scenario.

func (*OrderedParams) Del added in v0.0.40

func (o *OrderedParams) Del(key string)

Del deletes values associated with key

func (*OrderedParams) Encode added in v0.0.40

func (o *OrderedParams) Encode() string

Encode returns encoded parameters by preserving order

func (*OrderedParams) Get added in v0.0.40

func (o *OrderedParams) Get(key string) string

Get returns first value of given key

func (*OrderedParams) GetAll added in v0.0.40

func (o *OrderedParams) GetAll(key string) []string

GetAll returns all values of given key or returns empty slice if key doesn't exist

func (*OrderedParams) Has added in v0.0.40

func (o *OrderedParams) Has(key string) bool

Has reports whether the given key exists

func (*OrderedParams) IsEmpty added in v0.0.40

func (o *OrderedParams) IsEmpty() bool

IsEmpty checks if the OrderedParams is empty

func (*OrderedParams) Iterate added in v0.0.40

func (o *OrderedParams) Iterate(f func(key string, value []string) bool)

Iterate iterates over the OrderedParams

func (*OrderedParams) Merge added in v0.0.40

func (o *OrderedParams) Merge(raw string)

Merge merges the given paramset into the existing one with base as priority

func (*OrderedParams) Set added in v0.0.40

func (o *OrderedParams) Set(key string, value string)

Set sets the key to value and replaces if already exists

func (*OrderedParams) Update added in v0.0.40

func (o *OrderedParams) Update(key string, value []string)

Update is similar to Set but it takes value as slice (similar to internal implementation of url.Values)

type Params added in v0.0.4

type Params map[string][]string

func GetParams added in v0.0.4

func GetParams(query url.Values) Params

GetParams returns a Params value built from url.Values

func NewParams added in v0.0.4

func NewParams() Params

func (Params) Add added in v0.0.4

func (p Params) Add(key string, value ...string)

Add adds parameters to the store

func (Params) Decode added in v0.0.4

func (p Params) Decode(raw string)

Decode is the opposite of Encode(): a string such as ("bar=baz&foo=quux") is parsed. Parameters are loosely parsed to allow any scenario.

func (Params) Del added in v0.0.4

func (p Params) Del(key string)

Del deletes values associated with key

func (Params) Encode added in v0.0.4

func (p Params) Encode() string

Encode URL encodes and returns values ("bar=baz&foo=quux") sorted by key.
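For comparison, the standard library's url.Values.Encode also sorts by key and escapes reserved characters, which is the behavior the README contrasts with ordered, payload-preserving parameters. The sketch uses net/url only; stdlibEncode is a name chosen for illustration and says nothing about how Params.Encode escapes its output.

```go
package main

import (
	"fmt"
	"net/url"
)

// stdlibEncode round-trips a raw query through net/url's url.Values.
func stdlibEncode(rawQuery string) (string, error) {
	values, err := url.ParseQuery(rawQuery)
	if err != nil {
		return "", err
	}
	return values.Encode(), nil
}

func main() {
	out, err := stdlibEncode("z=<script>alert(1)</script>&a=1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Keys come back sorted and the payload is percent-encoded:
	// a=1&z=%3Cscript%3Ealert%281%29%3C%2Fscript%3E
	fmt.Println(out)
}
```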

func (Params) Get added in v0.0.4

func (p Params) Get(key string) string

Get returns first value of given key

func (Params) Has added in v0.0.4

func (p Params) Has(key string) bool

Has reports whether the given key exists

func (Params) Merge added in v0.0.4

func (p Params) Merge(x Params)

Merge merges the given paramset into the existing one with base as priority

func (Params) Set added in v0.0.4

func (p Params) Set(key string, value string)

Set sets the key to value and replaces if already exists

type URL

type URL struct {
	*url.URL

	Original   string         // original or given url(without params if any)
	Unsafe     bool           // If request is unsafe (skip validation)
	IsRelative bool           // If URL is relative
	Params     *OrderedParams // Query Parameters
	// contains filtered or unexported fields
}

URL is a wrapper around net/url.URL

func Parse

func Parse(inputURL string) (*URL, error)

Parse parses and returns a URL (can be relative or absolute)
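The README notes that scheme-less inputs are not classified properly by url.Parse. The sketch below shows what the standard library does with the README's sample inputs; classify is a hypothetical helper, and the output of this package's Parse is deliberately not shown.

```go
package main

import (
	"fmt"
	"net/url"
)

// classify returns the scheme, host, path and opaque parts that net/url
// assigns to rawURL.
func classify(rawURL string) (string, string, string, string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", "", "", "", err
	}
	return u.Scheme, u.Host, u.Path, u.Opaque, nil
}

func main() {
	inputs := []string{"scanme.sh", "scanme.sh:443/port", "scanme.sh/with/path"}
	for _, in := range inputs {
		scheme, host, path, opaque, err := classify(in)
		// Host stays empty for every input; "scanme.sh:443/port" even ends
		// up with "scanme.sh" as its scheme.
		fmt.Printf("%-20s scheme=%q host=%q path=%q opaque=%q err=%v\n",
			in, scheme, host, path, opaque, err)
	}
}
```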

func ParseAbsoluteURL added in v0.0.69

func ParseAbsoluteURL(inputURL string, unsafe bool) (*URL, error)

ParseAbsoluteURL parses and returns an absolute URL. It should be preferred over the others when the input is known to be an absolute URL, since this reduces normalization and autocorrection related to relative paths. It returns an error if the input is a relative path.

func ParseRawRelativePath added in v0.0.69

func ParseRawRelativePath(inputURL string, unsafe bool) (*URL, error)

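ParseRawRelativePath parses and returns a relative path without the edge-case handling applied by ParseRelativePath (for example, u.Path is not / prefixed). It returns an error if the input is an absolute URL.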

func ParseRelativePath added in v0.0.8

func ParseRelativePath(inputURL string, unsafe bool) (*URL, error)

ParseRelativePath parses and returns a relative path. It should be preferred over the others when the input is known to be a relative path, since this reduces normalization and autocorrection related to absolute paths. It returns an error if the input is an absolute path.

func ParseURL added in v0.0.4

func ParseURL(inputURL string, unsafe bool) (*URL, error)

ParseURL parses and returns a URL (can be relative or absolute)
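The unsafe flag matters for inputs that the standard library rejects outright, such as the invalid percent-encoding mentioned in the README. The sketch below demonstrates only the net/url side of that claim; stdlibAccepts is a hypothetical helper.

```go
package main

import (
	"fmt"
	"net/url"
)

// stdlibAccepts reports whether net/url parses rawURL without an error.
func stdlibAccepts(rawURL string) bool {
	_, err := url.Parse(rawURL)
	return err == nil
}

func main() {
	fmt.Println(stdlibAccepts("https://scanme.sh/%41/path"))      // true: %41 is a valid escape
	fmt.Println(stdlibAccepts("https://scanme.sh/%invalid/path")) // false: "%in" is not a valid escape
}
```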

func (*URL) Clone added in v0.0.6

func (u *URL) Clone() *URL

Clone

func (*URL) EscapedString added in v0.0.8

func (u *URL) EscapedString() string

EscapedString returns a string that can be used as filename (i.e stripped of / and params etc)

func (*URL) GetRelativePath added in v0.0.6

func (u *URL) GetRelativePath() string

GetRelativePath ex: /some/path?param=true#fragment

func (*URL) MergePath added in v0.0.6

func (u *URL) MergePath(newrelpath string, unsafe bool) error

MergePath merges the given relative path

func (*URL) Query added in v0.0.6

func (u *URL) Query() *OrderedParams

Query returns Query Params

func (*URL) String

func (u *URL) String() string

String

func (*URL) TrimPort added in v0.0.8

func (u *URL) TrimPort()

TrimPort trims the port, if any

func (*URL) Update added in v0.0.6

func (u *URL) Update()

Update updates the internal wrapped url.URL with any changes made to the query parameters

func (*URL) UpdatePort added in v0.0.6

func (u *URL) UpdatePort(newport string)

UpdatePort updates the port

func (*URL) UpdateRelPath added in v0.0.8

func (u *URL) UpdateRelPath(newrelpath string, unsafe bool) error

UpdateRelPath updates relative path with new path (existing params are not removed)
