checker

package
v0.0.0-...-6ff5fd0 Latest
Warning

This package is not in the latest version of its module.

Published: Dec 8, 2013 License: MIT Imports: 2 Imported by: 1

Documentation

Overview

Package checker provides utility functions for checking the validity of HTML5 tags, attribute names, and attribute values.

Constants

const (
	UnicodeAmpersand  = '\u0026'
	UnicodeSemicolon  = '\u003B'
	UnicodeQuoteMark  = '\u0022'
	UnicodeApostrophe = '\u0027'
)

Unicode characters

const ControlCharacters string = "\u0000\u0007\u0008\u000B\u001B\u007F"

ControlCharacters are the non-SpaceCharacters ASCII control characters NUL, BEL, BS, VT, ESC, and DEL.

const InvalidAttributeNameCharacters string = "\u0022\u0027\u003E\u002F\u003D"

InvalidAttributeNameCharacters are the characters not valid in an attribute's name, minus the SpaceCharacters and the ControlCharacters.

const InvalidAttributeValueUnquotedCharacters string = "\u0022\u0027\u003C\u003D\u003E\u0060"

InvalidAttributeValueUnquotedCharacters are the characters not valid in an unquoted attribute's value.

const SpaceCharacters string = "\u0020\u0009\u000A\u000C\u000D"

SpaceCharacters are space, tab, line feed, form feed, and carriage return.

Functions

func HasAmbiguousAmpersand

func HasAmbiguousAmpersand(val string) bool

HasAmbiguousAmpersand returns true if the argument contains a substring that is an ambiguous ampersand.

An ambiguous ampersand is a U+0026 AMPERSAND character (&) that is
followed by one or more alphanumeric ASCII characters, followed by
a ";" (U+003B) character, where these characters do not match any
of the names given in the named character references section.

It is ambiguous if it looks like a named character reference but is NOT one: "&ambiguous;" is ambiguous, but "&amp;" is not because "&amp;" is a valid reference. See also IsCharacterReferenceName and IsCharacterReference.

Example
fmt.Println(HasAmbiguousAmpersand("this & that"))
fmt.Println(HasAmbiguousAmpersand("this &amp; that"))
fmt.Println(HasAmbiguousAmpersand("nothing here"))
fmt.Println(HasAmbiguousAmpersand("Can &what; be one?"))
Output:

false
false
false
true
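
The definition above can be sketched in a few lines of standard-library Go. This is not the package's implementation: hasAmbiguousAmpersand, refPattern, and knownNames are hypothetical names, and knownNames is a tiny stand-in for the full HTML5 list of named character references.

```go
package main

import (
	"fmt"
	"regexp"
)

// knownNames is a small, hypothetical subset of the HTML5 named character
// references. A real check would need the complete list.
var knownNames = map[string]bool{"amp": true, "AMP": true, "lt": true, "gt": true, "quot": true}

// refPattern matches an ampersand, one or more ASCII alphanumerics, and a
// semicolon. The name itself is captured in group 1.
var refPattern = regexp.MustCompile(`&([0-9A-Za-z]+);`)

// hasAmbiguousAmpersand reports whether s contains a reference-shaped
// substring whose name is not in knownNames.
func hasAmbiguousAmpersand(s string) bool {
	for _, m := range refPattern.FindAllStringSubmatch(s, -1) {
		if !knownNames[m[1]] {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasAmbiguousAmpersand("this & that"))        // false: no name follows the ampersand
	fmt.Println(hasAmbiguousAmpersand("this &amp; that"))    // false: "amp" is a known name
	fmt.Println(hasAmbiguousAmpersand("Can &what; be one?")) // true: "what" is not a known name
}
```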

func IsCharacterReference

func IsCharacterReference(ref string) bool

IsCharacterReference returns true if the argument is a valid character reference. See also IsCharacterReferenceName.

Example
fmt.Println(IsCharacterReference("&amp;"))
fmt.Println(IsCharacterReference("&AMP;"))
fmt.Println(IsCharacterReference("amp"))   // Needs & and ;
fmt.Println(IsCharacterReference("&Amp;")) // Case sensitive
Output:

true
true
false
false

func IsCharacterReferenceName

func IsCharacterReferenceName(name string) bool

IsCharacterReferenceName returns true if the argument is a valid character reference name according to this list:

http://www.w3.org/TR/html5/syntax.html#named-character-references

See also IsCharacterReference.

Example
fmt.Println(IsCharacterReferenceName("amp"))
fmt.Println(IsCharacterReferenceName("AMP"))
fmt.Println(IsCharacterReferenceName("Amp")) // Names are case sensitive.
fmt.Println(IsCharacterReferenceName("&"))
Output:

true
true
false
false

func IsHTMLTagName

func IsHTMLTagName(name string) bool

IsHTMLTagName returns true if the argument is a defined HTML5 tag, case insensitive.

Note: In the interest of optimizing for the common case, the argument is assumed to be entirely lowercase or entirely uppercase. If `name` is mixed case, this function may return a false negative. Use IsHTMLTagNameSafe for a version that won't return a false negative.

Example
fmt.Println(IsHTMLTagName("strong"))
fmt.Println(IsHTMLTagName("tuesday"))
Output:

true
false

func IsHTMLTagNameSafe

func IsHTMLTagNameSafe(name string) bool

IsHTMLTagNameSafe returns true if the argument is a defined HTML5 tag. Unlike IsHTMLTagName, this version lowercases the argument first to prevent mixed-case false negatives.
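
The trade-off between the two lookups can be illustrated with a minimal sketch. It assumes a table keyed by all-lowercase and all-uppercase names, which is a guess at one possible design rather than the package's actual data structure. The names tags, isTagFast, and isTagSafe are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// tags is a small, hypothetical subset of the HTML5 element names, stored in
// both all-lowercase and all-uppercase form.
var tags = map[string]bool{"strong": true, "STRONG": true, "div": true, "DIV": true}

// isTagFast does a single map lookup with no case conversion.
func isTagFast(name string) bool { return tags[name] }

// isTagSafe lowercases the argument first, so mixed-case input still matches.
func isTagSafe(name string) bool { return tags[strings.ToLower(name)] }

func main() {
	fmt.Println(isTagFast("strong")) // true
	fmt.Println(isTagFast("Strong")) // false: the mixed-case false negative
	fmt.Println(isTagSafe("Strong")) // true
}
```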

func IsValidAttributeName

func IsValidAttributeName(name string) bool

IsValidAttributeName returns true if the argument is an HTML5-valid attribute name, as defined here: http://www.w3.org/TR/html5/syntax.html#attributes-0

Attribute names must consist of one or more characters other than the
space characters, U+0000 NULL, U+0022 QUOTATION MARK ("), U+0027
APOSTROPHE ('), ">" (U+003E), "/" (U+002F), and "=" (U+003D) characters,
the control characters, and any characters that are not defined by
Unicode.

This is merely a syntax check. This function makes no judgments about the semantic validity of the argument.

Example
fmt.Println(IsValidAttributeName("example"))
fmt.Println(IsValidAttributeName("not valid"))
Output:

true
false
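
A rough sketch of this syntax check, built from the character sets documented in the Constants section, might look like the following. The function name isValidAttributeName (lowercase) and the local constant names are hypothetical, and utf8.ValidString is only an approximation of "characters that are not defined by Unicode".

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// These values mirror the constants documented for the package.
const (
	spaceCharacters   = "\u0020\u0009\u000A\u000C\u000D"
	controlCharacters = "\u0000\u0007\u0008\u000B\u001B\u007F"
	invalidNameChars  = "\u0022\u0027\u003E\u002F\u003D"
)

// isValidAttributeName reports whether name is non-empty, valid UTF-8, and
// free of space, control, and otherwise prohibited characters.
func isValidAttributeName(name string) bool {
	if name == "" || !utf8.ValidString(name) {
		return false
	}
	return !strings.ContainsAny(name, spaceCharacters+controlCharacters+invalidNameChars)
}

func main() {
	fmt.Println(isValidAttributeName("example"))   // true
	fmt.Println(isValidAttributeName("not valid")) // false: contains a space
	fmt.Println(isValidAttributeName("a=b"))       // false: contains "="
}
```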

func IsValidAttributeValue

func IsValidAttributeValue(val string) bool

IsValidAttributeValue returns true if the argument is a valid attribute value, with the caveat that additional rules apply to unquoted, single-quoted, and double-quoted attribute values.

Attribute values are a mixture of text and character references,
except with the additional restriction that the text cannot contain
an ambiguous ampersand.

Definition: http://www.w3.org/TR/html5/syntax.html#attributes-0

Note: It is almost certainly a bad idea to call this function directly. Use one of IsValidAttributeValueUnquoted, IsValidAttributeValueSingleQuoted, or IsValidAttributeValueDoubleQuoted instead.

func IsValidAttributeValueDoubleQuoted

func IsValidAttributeValueDoubleQuoted(val string) bool

IsValidAttributeValueDoubleQuoted returns true if the argument is a valid double-quoted attribute value. Note that the argument must not contain the double quotes.

the attribute value, which, in addition to the requirements given above
for attribute values, must not contain any literal """ (U+0022)
characters

From http://www.w3.org/TR/html5/syntax.html#attributes-0

Example
fmt.Println(IsValidAttributeValueDoubleQuoted("yes"))
fmt.Println(IsValidAttributeValueDoubleQuoted(`"no"`))
Output:

true
false

func IsValidAttributeValueSingleQuoted

func IsValidAttributeValueSingleQuoted(val string) bool

IsValidAttributeValueSingleQuoted returns true if the argument is a valid single-quoted attribute value. Note that the argument must not contain the single quotes.

the attribute value, which, in addition to the requirements given above
for attribute values, must not contain any literal "'" (U+0027)
characters

From http://www.w3.org/TR/html5/syntax.html#attributes-0

Example
fmt.Println(IsValidAttributeValueSingleQuoted("yes"))
fmt.Println(IsValidAttributeValueSingleQuoted("'no'"))
Output:

true
false
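
The quote-specific part of the two quoted variants reduces to a single character check. The sketch below shows only that part. It omits the general attribute-value rule (no ambiguous ampersand), and validQuoted is a hypothetical helper, not a function of this package.

```go
package main

import (
	"fmt"
	"strings"
)

// validQuoted reports whether val is free of the quote character that would
// delimit it. The general attribute-value rules are not checked here.
func validQuoted(val string, quote rune) bool {
	return !strings.ContainsRune(val, quote)
}

func main() {
	fmt.Println(validQuoted("yes", '"'))       // true
	fmt.Println(validQuoted(`"no"`, '"'))      // false: contains the delimiter
	fmt.Println(validQuoted(`say "hi"`, '\'')) // true: double quotes are fine inside single quotes
}
```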

func IsValidAttributeValueUnquoted

func IsValidAttributeValueUnquoted(val string) bool

IsValidAttributeValueUnquoted returns true if the argument is a valid unquoted attribute value. For example, "email" in <input type=email>.

The attribute name, followed by zero or more space characters, followed
by a single U+003D EQUALS SIGN character, followed by zero or more space
characters, followed by the attribute value, which, in addition to the
requirements given above for attribute values, must not contain any
literal space characters, any U+0022 QUOTATION MARK characters ("),
U+0027 APOSTROPHE characters ('), "=" (U+003D) characters, "<" (U+003C)
characters, ">" (U+003E) characters, or "`" (U+0060) characters, and
must not be the empty string

From http://www.w3.org/TR/html5/syntax.html#attributes-0
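
The unquoted-specific rules quoted above can be sketched with the documented character sets. This is an illustration under stated assumptions, not the package's code: isValidUnquoted and the local constant names are hypothetical, and the ambiguous-ampersand rule that applies to all attribute values is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// These values mirror SpaceCharacters and
// InvalidAttributeValueUnquotedCharacters as documented for the package.
const (
	spaceCharacters = "\u0020\u0009\u000A\u000C\u000D"
	invalidUnquoted = "\u0022\u0027\u003C\u003D\u003E\u0060"
)

// isValidUnquoted reports whether val is non-empty and contains no space
// characters and none of the characters prohibited in unquoted values.
func isValidUnquoted(val string) bool {
	return val != "" && !strings.ContainsAny(val, spaceCharacters+invalidUnquoted)
}

func main() {
	fmt.Println(isValidUnquoted("email")) // true
	fmt.Println(isValidUnquoted("a b"))   // false: contains a space
	fmt.Println(isValidUnquoted(""))      // false: empty string
}
```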

func IsValidCss3IdValue

func IsValidCss3IdValue(val string) bool

IsValidCss3IdValue returns true if the argument is a valid CSS 3 ID value.

From http://www.w3.org/TR/css3-selectors/#id-selectors

An ID selector contains a "number sign" (U+0023, #) immediately followed
by the ID value, which must be an CSS identifiers.

See also IsValidCss3Identifier.

func IsValidCss3Identifier

func IsValidCss3Identifier(val string) bool

IsValidCss3Identifier returns true if the argument is a valid CSS3 identifier. Identifiers include element names, and the selector part of ID and class names.

Prohibited characters can be included using escaped values, and this function takes escaped values into account.

From http://www.w3.org/TR/CSS21/syndata.html#value-def-identifier

In CSS, identifiers ... can contain only:

- the characters [a-zA-Z0-9] and
- ISO 10646 characters U+00A0 and higher, plus
- the hyphen (-) and
- the underscore (_);
- they cannot start with a digit, two hyphens, or a hyphen followed by a
  digit.

Also, special characters can be escaped:

Any character (except a hexadecimal digit, linefeed, carriage return, or
form feed) can be escaped with a backslash to remove its special meaning.

or

Third, backslash escapes allow authors to refer to characters they cannot
easily put in a document. In this case, the backslash is followed by at
most six hexadecimal digits (0..9A..F), which stand for the ISO 10646
([ISO10646]) character with that number, which must not be zero.

BUG(dr): IsValidCss3Identifier uses the CSS 2.1 spec, which the CSS 3 spec links to when it refers to "identifiers". Is this the most up-to-date?

Example
fmt.Println(IsValidCss3Identifier("hullo"))
Output:

true
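
For identifiers that contain no backslash escapes, the quoted CSS 2.1 rules can be sketched as follows. Unlike the function documented here, this sketch does not handle escapes at all. It follows only the quoted prose, and rejecting the empty string is an assumption. The names isIdentChar, isDigit, and isPlainCSSIdentifier are hypothetical.

```go
package main

import "fmt"

func isDigit(r rune) bool { return r >= '0' && r <= '9' }

// isIdentChar reports whether r may appear unescaped in a CSS identifier.
func isIdentChar(r rune) bool {
	return r == '-' || r == '_' || r >= 0x00A0 || isDigit(r) ||
		(r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z')
}

// isPlainCSSIdentifier checks an escape-free identifier against the quoted
// rules: allowed characters only, and no leading digit, double hyphen, or
// hyphen followed by a digit.
func isPlainCSSIdentifier(s string) bool {
	rs := []rune(s)
	if len(rs) == 0 {
		return false
	}
	for _, r := range rs {
		if !isIdentChar(r) {
			return false
		}
	}
	if isDigit(rs[0]) {
		return false
	}
	if rs[0] == '-' && len(rs) > 1 && (rs[1] == '-' || isDigit(rs[1])) {
		return false
	}
	return true
}

func main() {
	fmt.Println(isPlainCSSIdentifier("hullo"))  // true
	fmt.Println(isPlainCSSIdentifier("9lives")) // false: starts with a digit
	fmt.Println(isPlainCSSIdentifier("-9"))     // false: hyphen followed by a digit
}
```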

func IsValidHTMLTagName

func IsValidHTMLTagName(name string) bool

IsValidHTMLTagName returns true if the argument *can be* a valid HTML 5 tag name.

Tags contain a tag name, giving the element's name. HTML elements all
have names that only use alphanumeric ASCII characters. In the HTML
syntax, tag names, even those for foreign elements, may be written with
any mix of lower- and uppercase letters that, when converted to
all-lowercase, matches the element's tag name; tag names are
case-insensitive.

This function checks the structural syntax of the argument (i.e. alphanumeric ASCII characters). It does not check if the argument is a pre-defined HTML5 tag name. Use IsHTMLTagName to see if it is a pre-defined name.
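
The difference between a structural check and a lookup against defined names can be made concrete with a short sketch. It assumes the structural rule is "non-empty, ASCII letters and digits only". Rejecting the empty string is an assumption, and isTagNameSyntax is a hypothetical name.

```go
package main

import "fmt"

// isTagNameSyntax reports whether name is non-empty and consists only of
// ASCII letters and digits. It knows nothing about which tags HTML5 defines.
func isTagNameSyntax(name string) bool {
	if name == "" {
		return false
	}
	for i := 0; i < len(name); i++ {
		c := name[i]
		if !(c >= '0' && c <= '9' || c >= 'A' && c <= 'Z' || c >= 'a' && c <= 'z') {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isTagNameSyntax("tuesday")) // true: well-formed, though not a defined HTML5 tag
	fmt.Println(isTagNameSyntax("my-tag"))  // false: the hyphen is not alphanumeric
}
```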

func IsValidHtml4IdValue

func IsValidHtml4IdValue(val string) bool

IsValidHtml4IdValue returns true if the argument is a valid HTML 4 ID attribute value.

Note: HTML 4 is more strict than HTML 5.

Note: HTML 4 ID values are strict enough that a valid ID value will be valid also for a quoted or unquoted attribute value.

Note: This function only checks the syntax. Additional restrictions on ID values, such as global uniqueness, also apply.

From http://www.w3.org/TR/html401/types.html#type-name

ID and NAME tokens must begin with a letter ([A-Za-z]) and may be
followed by any number of letters, digits ([0-9]), hyphens ("-"),
underscores ("_"), colons (":"), and periods (".").

Example
fmt.Println(IsValidHtml4IdValue("introduction"))
fmt.Println(IsValidHtml4IdValue("last remarks"))
Output:

true
false
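
The HTML 4 rule quoted above maps directly onto a regular expression. The sketch below is one way to express it with the standard regexp package. The names html4ID and isValidHTML4ID are hypothetical, and nothing here is taken from the package's source.

```go
package main

import (
	"fmt"
	"regexp"
)

// html4ID encodes the quoted rule: a letter, then any number of letters,
// digits, hyphens, underscores, colons, and periods.
var html4ID = regexp.MustCompile(`^[A-Za-z][-A-Za-z0-9_:.]*$`)

// isValidHTML4ID reports whether val matches the HTML 4 ID token syntax.
func isValidHTML4ID(val string) bool {
	return html4ID.MatchString(val)
}

func main() {
	fmt.Println(isValidHTML4ID("introduction")) // true
	fmt.Println(isValidHTML4ID("last remarks")) // false: contains a space
	fmt.Println(isValidHTML4ID("9to5"))         // false: does not begin with a letter
}
```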

func IsValidHtml5IdValue

func IsValidHtml5IdValue(val string) bool

IsValidHtml5IdValue returns true if the argument is a valid HTML 5 ID value.

Note: This is much more permissive than HTML 4 ID values and CSS3 ID values, so don't get too wacky.

Note: Additional restrictions of HTML attribute values apply.

From http://dev.w3.org/html5/markup/global-attributes.html#common.attrs.id

Any string, with the following restrictions:
 - must be at least one character long
 - must not contain any space characters

Example
fmt.Println(IsValidHtml5IdValue("introduction"))
fmt.Println(IsValidHtml5IdValue("last remarks"))
Output:

true
false
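
The two HTML 5 restrictions listed above amount to a very small check, sketched here with the space characters documented in the Constants section. The name isValidHTML5ID is hypothetical, and the additional attribute-value restrictions mentioned in the notes are not covered.

```go
package main

import (
	"fmt"
	"strings"
)

// spaceCharacters mirrors the package's documented SpaceCharacters constant.
const spaceCharacters = "\u0020\u0009\u000A\u000C\u000D"

// isValidHTML5ID reports whether val is at least one character long and
// contains no space characters.
func isValidHTML5ID(val string) bool {
	return val != "" && !strings.ContainsAny(val, spaceCharacters)
}

func main() {
	fmt.Println(isValidHTML5ID("introduction")) // true
	fmt.Println(isValidHTML5ID("last remarks")) // false: contains a space
	fmt.Println(isValidHTML5ID("9to5"))         // true: allowed here, unlike in HTML 4
}
```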

Types

type NamedReferenceScanner

type NamedReferenceScanner struct {
	Value     string // The string to be scanned.
	LastIndex int    // The stopping point where Next last finished, or -1 to start from the beginning.
}

INTERNAL USE ONLY. NO API GUARANTEES.

NamedReferenceScanner is a utility for scanning through strings and looking for named character references.

The Next method only checks for the structure: an ampersand, one or more alphanumeric characters, and a semicolon. Use IsCharacterReferenceName to see if the returned values are actually valid character reference names.

func NewNamedReferenceScanner

func NewNamedReferenceScanner(value string) *NamedReferenceScanner

NewNamedReferenceScanner creates a new scanner with the given value.

func (*NamedReferenceScanner) Next

func (scanner *NamedReferenceScanner) Next() (name string, ampIndex int)

Next returns the next named character reference (just the alphanumeric part, skipping the ampersand and semicolon) and the byte index of the leading ampersand.

It returns an empty string and -1 if no further named character reference is found.

func (*NamedReferenceScanner) Reset

func (scanner *NamedReferenceScanner) Reset()

Reset resets the scanner to the beginning of the Value string.
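
The scanning behavior described for Next can be approximated with a short hand-written loop. The type below is a hypothetical stand-in written for illustration. It copies the documented contract (name without ampersand and semicolon, byte index of the ampersand, "" and -1 when nothing is left) but is not the package's NamedReferenceScanner.

```go
package main

import "fmt"

// refScanner is a hypothetical stand-in for NamedReferenceScanner.
type refScanner struct {
	value string
	last  int // index where the previous call stopped, or -1 before the first call
}

func isAlnum(b byte) bool {
	return (b >= '0' && b <= '9') || (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z')
}

// next returns the name of the next reference-shaped substring and the byte
// index of its ampersand, or "" and -1 when none remains.
func (s *refScanner) next() (string, int) {
	for i := s.last + 1; i < len(s.value); i++ {
		if s.value[i] != '&' {
			continue
		}
		j := i + 1
		for j < len(s.value) && isAlnum(s.value[j]) {
			j++
		}
		// Require at least one alphanumeric character and a closing semicolon.
		if j > i+1 && j < len(s.value) && s.value[j] == ';' {
			s.last = j
			return s.value[i+1 : j], i
		}
	}
	s.last = len(s.value)
	return "", -1
}

func main() {
	s := &refScanner{value: "a &amp; b &what; c", last: -1}
	for name, at := s.next(); at >= 0; name, at = s.next() {
		fmt.Println(name, at) // prints "amp 2", then "what 10"
	}
}
```

Because the scanner only checks structure, a caller would still pass each returned name to a name check such as IsCharacterReferenceName to decide whether it is a real reference.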

Notes

Bugs

  • IsValidCss3Identifier uses the CSS 2.1 spec, which the CSS 3 spec links to when it refers to "identifiers". Is this the most up-to-date?
