Documentation ¶
Overview ¶
Package checker provides utility functions for checking the validity of HTML5 tag names, attribute names, and attribute values.
Index ¶
- Constants
- func HasAmbiguousAmpersand(val string) bool
- func IsCharacterReference(ref string) bool
- func IsCharacterReferenceName(name string) bool
- func IsHTMLTagName(name string) bool
- func IsHTMLTagNameSafe(name string) bool
- func IsValidAttributeName(name string) bool
- func IsValidAttributeValue(val string) bool
- func IsValidAttributeValueDoubleQuoted(val string) bool
- func IsValidAttributeValueSingleQuoted(val string) bool
- func IsValidAttributeValueUnquoted(val string) bool
- func IsValidCss3IdValue(val string) bool
- func IsValidCss3Identifier(val string) bool
- func IsValidHTMLTagName(name string) bool
- func IsValidHtml4IdValue(val string) bool
- func IsValidHtml5IdValue(val string) bool
- type NamedReferenceScanner
- Bugs
Constants ¶
const (
	UnicodeAmpersand  = '\u0026'
	UnicodeSemicolon  = '\u003B'
	UnicodeQuoteMark  = '\u0022'
	UnicodeApostrophe = '\u0027'
)
Unicode characters
const ControlCharacters string = "\u0000\u0007\u0008\u000B\u001B\u007F"
ControlCharacters are the non-SpaceCharacters ASCII control characters NUL, BEL, BS, VT, ESC, and DEL.
const InvalidAttributeNameCharacters string = "\u0022\u0027\u003E\u002F\u003D"
InvalidAttributeNameCharacters are the characters not valid in an attribute's name, minus the SpaceCharacters and the ControlCharacters.
const InvalidAttributeValueUnquotedCharacters string = "\u0022\u0027\u003C\u003D\u003E\u0060"
InvalidAttributeValueUnquotedCharacters are the characters not valid in an unquoted attribute's value.
const SpaceCharacters string = "\u0020\u0009\u000A\u000C\u000D"
SpaceCharacters are space, tab, linefeed, formfeed, and carriagereturn.
Functions ¶
func HasAmbiguousAmpersand ¶
HasAmbiguousAmpersand returns true if the argument contains a substring that is an ambiguous ampersand.
An ambiguous ampersand is a U+0026 AMPERSAND character (&) that is followed by one or more alphanumeric ASCII characters, followed by a ";" (U+003B) character, where these characters do not match any of the names given in the named character references section.
It is ambiguous if it looks like a named character reference but is NOT one: "&ambiguous;" is ambiguous, but "&amp;" is not because "&amp;" is a valid reference. See also IsCharacterReferenceName and IsCharacterReference.
Example ¶
fmt.Println(HasAmbiguousAmpersand("this &amp; that"))
fmt.Println(HasAmbiguousAmpersand("this & that"))
fmt.Println(HasAmbiguousAmpersand("nothing here"))
fmt.Println(HasAmbiguousAmpersand("Can &what; be one?"))
Output:
false
false
false
true
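The rule above can be sketched in a few lines. This is a minimal illustration, not the package's implementation: `knownNames` is a hypothetical four-entry stand-in for the full named character references list, and `hasAmbiguousAmpersand` and `candidate` are names invented for this sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

// knownNames is a tiny, hypothetical stand-in for the full HTML5 list
// of named character references.
var knownNames = map[string]bool{"amp": true, "AMP": true, "lt": true, "gt": true}

// candidate matches the structure of a named reference: an ampersand,
// one or more ASCII alphanumerics, and a semicolon.
var candidate = regexp.MustCompile(`&([0-9A-Za-z]+);`)

// hasAmbiguousAmpersand reports whether val contains a reference-shaped
// substring whose name is not in knownNames.
func hasAmbiguousAmpersand(val string) bool {
	for _, m := range candidate.FindAllStringSubmatch(val, -1) {
		if !knownNames[m[1]] {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasAmbiguousAmpersand("this &amp; that"))    // false
	fmt.Println(hasAmbiguousAmpersand("this & that"))        // false
	fmt.Println(hasAmbiguousAmpersand("Can &what; be one?")) // true
}
```

A bare ampersand followed by a space never matches the pattern, which is why it is not reported as ambiguous.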
func IsCharacterReference ¶
IsCharacterReference returns true if the argument is a valid character reference. See also IsCharacterReferenceName.
Example ¶
fmt.Println(IsCharacterReference("&amp;"))
fmt.Println(IsCharacterReference("&AMP;"))
fmt.Println(IsCharacterReference("amp")) // Needs & and ;
fmt.Println(IsCharacterReference("&Amp;")) // Case sensitive
Output:
true
true
false
false
func IsCharacterReferenceName ¶
IsCharacterReferenceName returns true if the argument is a valid character reference name according to this list:
http://www.w3.org/TR/html5/syntax.html#named-character-references
See also IsCharacterReference.
Example ¶
fmt.Println(IsCharacterReferenceName("amp"))
fmt.Println(IsCharacterReferenceName("AMP"))
fmt.Println(IsCharacterReferenceName("Amp")) // Names are case sensitive.
fmt.Println(IsCharacterReferenceName("&amp;"))
Output:
true
true
false
false
func IsHTMLTagName ¶
IsHTMLTagName returns true if the argument is a defined HTML5 tag, case insensitive.
Note: In the interest of optimizing for the common case, the argument is assumed to be lowercase or uppercase. If `name` is mixed case, this function may return a false negative. Use IsHTMLTagNameSafe for a version that won't return a false negative.
Example ¶
fmt.Println(IsHTMLTagName("strong"))
fmt.Println(IsHTMLTagName("tuesday"))
Output:
true
false
func IsHTMLTagNameSafe ¶
IsHTMLTagNameSafe returns true if the argument is a defined HTML5 tag. Unlike IsHTMLTagName, this version downcases the argument first to prevent mixed-case false negatives.
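The trade-off between the two lookups might look like the sketch below. It assumes, purely for illustration, that the fast path is a single map lookup against a table holding all-lowercase and all-uppercase spellings; `tags`, `isTagFast`, and `isTagSafe` are hypothetical names, and the table is a two-element subset.

```go
package main

import (
	"fmt"
	"strings"
)

// tags is a small, hypothetical subset of the HTML5 element names,
// stored in both all-lowercase and all-uppercase form.
var tags = map[string]bool{
	"strong": true, "STRONG": true,
	"div": true, "DIV": true,
}

// isTagFast does a single lookup and so misses mixed-case input.
func isTagFast(name string) bool { return tags[name] }

// isTagSafe lowercases first, trading a little work for correctness.
func isTagSafe(name string) bool { return tags[strings.ToLower(name)] }

func main() {
	fmt.Println(isTagFast("STRONG"), isTagFast("Strong")) // true false
	fmt.Println(isTagSafe("Strong"))                      // true
}
```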
func IsValidAttributeName ¶
IsValidAttributeName returns true if the argument is an HTML5-valid attribute name, as defined here: http://www.w3.org/TR/html5/syntax.html#attributes-0
Attribute names must consist of one or more characters other than the space characters, U+0000 NULL, U+0022 QUOTATION MARK ("), U+0027 APOSTROPHE ('), ">" (U+003E), "/" (U+002F), and "=" (U+003D) characters, the control characters, and any characters that are not defined by Unicode.
This is merely a syntax check. This function makes no judgments about the semantic validity of the argument.
Example ¶
fmt.Println(IsValidAttributeName("example"))
fmt.Println(IsValidAttributeName("not valid"))
Output:
true
false
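As a rough sketch, the syntax rule quoted above reduces to a non-empty check plus a forbidden-character scan. The constants below copy the values documented in the Constants section; `isValidAttributeName` is an illustrative name, and treating "characters not defined by Unicode" as "invalid UTF-8" is a simplifying assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// These mirror the constant values documented for this package.
const (
	spaceCharacters                = "\u0020\u0009\u000A\u000C\u000D"
	controlCharacters              = "\u0000\u0007\u0008\u000B\u001B\u007F"
	invalidAttributeNameCharacters = "\u0022\u0027\u003E\u002F\u003D"
)

// isValidAttributeName reports whether name is non-empty, valid UTF-8,
// and free of space, control, and other forbidden characters.
func isValidAttributeName(name string) bool {
	if name == "" || !utf8.ValidString(name) {
		return false
	}
	forbidden := spaceCharacters + controlCharacters + invalidAttributeNameCharacters
	return !strings.ContainsAny(name, forbidden)
}

func main() {
	fmt.Println(isValidAttributeName("example"))   // true
	fmt.Println(isValidAttributeName("not valid")) // false
	fmt.Println(isValidAttributeName("a=b"))       // false
}
```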
func IsValidAttributeValue ¶
IsValidAttributeValue returns true if the argument is a valid attribute value, with the caveat that additional rules apply to unquoted, single-quoted, and double-quoted attribute values.
Attribute values are a mixture of text and character references, except with the additional restriction that the text cannot contain an ambiguous ampersand.
Definition: http://www.w3.org/TR/html5/syntax.html#attributes-0
Note: It is almost certainly a bad idea to call this function directly. Use one of IsValidAttributeValueUnquoted, IsValidAttributeValueSingleQuoted, or IsValidAttributeValueDoubleQuoted instead.
func IsValidAttributeValueDoubleQuoted ¶
IsValidAttributeValueDoubleQuoted returns true if the argument is a valid double-quoted attribute value. Note that the argument must not include the surrounding double quotes.
the attribute value, which, in addition to the requirements given above for attribute values, must not contain any literal """ (U+0022) characters
From http://www.w3.org/TR/html5/syntax.html#attributes-0
Example ¶
fmt.Println(IsValidAttributeValueDoubleQuoted("yes"))
fmt.Println(IsValidAttributeValueDoubleQuoted(`"no"`))
Output:
true
false
func IsValidAttributeValueSingleQuoted ¶
IsValidAttributeValueSingleQuoted returns true if the argument is a valid single-quoted attribute value. Note that the argument must not include the surrounding single quotes.
the attribute value, which, in addition to the requirements given above for attribute values, must not contain any literal "'" (U+0027) characters
From http://www.w3.org/TR/html5/syntax.html#attributes-0
Example ¶
fmt.Println(IsValidAttributeValueSingleQuoted("yes"))
fmt.Println(IsValidAttributeValueSingleQuoted("'no'"))
Output:
true
false
func IsValidAttributeValueUnquoted ¶
IsValidAttributeValueUnquoted returns true if the argument is a valid unquoted attribute value. For example, "email" in <input type=email>.
The attribute name, followed by zero or more space characters, followed by a single U+003D EQUALS SIGN character, followed by zero or more space characters, followed by the attribute value, which, in addition to the requirements given above for attribute values, must not contain any literal space characters, any U+0022 QUOTATION MARK characters ("), U+0027 APOSTROPHE characters ('), "=" (U+003D) characters, "<" (U+003C) characters, ">" (U+003E) characters, or "`" (U+0060) characters, and must not be the empty string
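The character restrictions in the quoted rule can be sketched as follows. The constants copy the documented values; `isValidUnquoted` is a hypothetical name, and this sketch deliberately skips the general attribute value requirement about ambiguous ampersands.

```go
package main

import (
	"fmt"
	"strings"
)

// These mirror the constant values documented for this package.
const (
	spaceCharacters                         = "\u0020\u0009\u000A\u000C\u000D"
	invalidAttributeValueUnquotedCharacters = "\u0022\u0027\u003C\u003D\u003E\u0060"
)

// isValidUnquoted reports whether val is non-empty and contains no
// space characters and none of the characters forbidden in unquoted
// values. It does not check for ambiguous ampersands.
func isValidUnquoted(val string) bool {
	forbidden := spaceCharacters + invalidAttributeValueUnquotedCharacters
	return val != "" && !strings.ContainsAny(val, forbidden)
}

func main() {
	fmt.Println(isValidUnquoted("email"))     // true
	fmt.Println(isValidUnquoted("two words")) // false
	fmt.Println(isValidUnquoted(""))          // false
}
```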
func IsValidCss3IdValue ¶
IsValidCss3IdValue returns true if the argument is a valid CSS 3 ID value.
From http://www.w3.org/TR/css3-selectors/#id-selectors
An ID selector contains a "number sign" (U+0023, #) immediately followed by the ID value, which must be an CSS identifiers.
See also IsValidCss3Identifier.
func IsValidCss3Identifier ¶
IsValidCss3Identifier returns true if the argument is a valid CSS3 identifier. Identifiers include element names, and the selector part of ID and class names.
Prohibited characters can be included using escaped values, and this function takes escaped values into account.
From http://www.w3.org/TR/CSS21/syndata.html#value-def-identifier
In CSS, identifiers ... can contain only: - the characters [a-zA-Z0-9] and - ISO 10646 characters U+00A0 and higher, plus - the hyphen (-) and - the underscore (_); - they cannot start with a digit, two hyphens, or a hyphen followed by a digit.
Also, special characters can be escaped:
Any character (except a hexadecimal digit, linefeed, carriage return, or form feed) can be escaped with a backslash to remove its special meaning.
or
Third, backslash escapes allow authors to refer to characters they cannot easily put in a document. In this case, the backslash is followed by at most six hexadecimal digits (0..9A..F), which stand for the ISO 10646 ([ISO10646]) character with that number, which must not be zero.
BUG(dr): IsValidCss3Identifier uses the CSS 2.1 spec, which the CSS 3 spec links to when it refers to "identifiers". Is this the most up-to-date?
Example ¶
fmt.Println(IsValidCss3Identifier("hullo"))
Output:
true
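The unescaped part of the CSS 2.1 identifier rule can be sketched as below. This is a simplified illustration that ignores backslash escapes entirely, so it is stricter than the documented function; `isPlainCSSIdentifier`, `isNameStart`, and `isNameChar` are names invented for this sketch.

```go
package main

import "fmt"

// isNameStart reports whether r may begin an identifier (after an
// optional single leading hyphen).
func isNameStart(r rune) bool {
	return r == '_' || r >= 'a' && r <= 'z' || r >= 'A' && r <= 'Z' || r >= 0x00A0
}

// isNameChar reports whether r may appear after the first character.
func isNameChar(r rune) bool {
	return isNameStart(r) || r == '-' || r >= '0' && r <= '9'
}

// isPlainCSSIdentifier checks the identifier rule without escapes.
func isPlainCSSIdentifier(val string) bool {
	rs := []rune(val)
	if len(rs) > 0 && rs[0] == '-' {
		rs = rs[1:] // one leading hyphen is allowed
	}
	if len(rs) == 0 || !isNameStart(rs[0]) {
		return false
	}
	for _, r := range rs[1:] {
		if !isNameChar(r) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isPlainCSSIdentifier("hullo"))    // true
	fmt.Println(isPlainCSSIdentifier("-moz-box")) // true
	fmt.Println(isPlainCSSIdentifier("1st"))      // false
	fmt.Println(isPlainCSSIdentifier("--x"))      // false
}
```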
func IsValidHTMLTagName ¶
IsValidHTMLTagName returns true if the argument *can be* a valid HTML 5 tag name.
Tags contain a tag name, giving the element's name. HTML elements all have names that only use alphanumeric ASCII characters. In the HTML syntax, tag names, even those for foreign elements, may be written with any mix of lower- and uppercase letters that, when converted to all-lowercase, matches the element's tag name; tag names are case-insensitive.
This function checks the structural syntax of the argument (i.e. alphanumeric ASCII characters). It does not check if the argument is a pre-defined HTML5 tag name. Use IsHTMLTagName to see if it is a pre-defined name.
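A structural check of this kind can be sketched as a byte loop over ASCII alphanumerics. `isTagNameSyntax` is a hypothetical name, and rejecting the empty string is an assumption of this sketch rather than documented behavior.

```go
package main

import "fmt"

// isTagNameSyntax reports whether name consists only of ASCII letters
// and digits. The empty string is rejected (an assumption).
func isTagNameSyntax(name string) bool {
	if name == "" {
		return false
	}
	for i := 0; i < len(name); i++ {
		c := name[i]
		isAlnum := c >= '0' && c <= '9' || c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z'
		if !isAlnum {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isTagNameSyntax("h1"))      // true
	fmt.Println(isTagNameSyntax("tuesday")) // true: well-formed, though not a defined tag
	fmt.Println(isTagNameSyntax("my-tag"))  // false
}
```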
func IsValidHtml4IdValue ¶
IsValidHtml4IdValue returns true if the argument is a valid HTML 4 ID attribute value.
Note: HTML 4 is more strict than HTML 5.
Note: HTML 4 ID values are strict enough that a valid ID value will be valid also for a quoted or unquoted attribute value.
Note: This function only checks the syntax. Additional restrictions on ID values, such as global uniqueness, also apply.
From http://www.w3.org/TR/html401/types.html#type-name
ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods (".").
Example ¶
fmt.Println(IsValidHtml4IdValue("introduction"))
fmt.Println(IsValidHtml4IdValue("last remarks"))
Output:
true
false
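The HTML 4 token rule quoted above maps directly onto a regular expression. The sketch below is one way to express it; `html4ID` and `isValidHTML4ID` are illustrative names, not part of this package.

```go
package main

import (
	"fmt"
	"regexp"
)

// html4ID encodes the rule: a letter, then any number of letters,
// digits, hyphens, underscores, colons, and periods.
var html4ID = regexp.MustCompile(`^[A-Za-z][A-Za-z0-9_:.-]*$`)

// isValidHTML4ID reports whether val matches the HTML 4 ID syntax.
func isValidHTML4ID(val string) bool {
	return html4ID.MatchString(val)
}

func main() {
	fmt.Println(isValidHTML4ID("introduction")) // true
	fmt.Println(isValidHTML4ID("last remarks")) // false
	fmt.Println(isValidHTML4ID("9lives"))       // false
}
```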
func IsValidHtml5IdValue ¶
IsValidHtml5IdValue returns true if the argument is a valid HTML 5 ID value.
Note: This is much more permissive than HTML 4 ID values and CSS3 ID values, so don't get too wacky.
Note: Additional restrictions of HTML attribute values apply.
From http://dev.w3.org/html5/markup/global-attributes.html#common.attrs.id
Any string, with the following restrictions: - must be at least one character long - must not contain any space characters
Example ¶
fmt.Println(IsValidHtml5IdValue("introduction"))
fmt.Println(IsValidHtml5IdValue("last remarks"))
Output:
true
false
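The two HTML 5 restrictions quoted above are small enough to sketch in one expression. `isValidHTML5ID` is a hypothetical name, the constant copies the documented SpaceCharacters value, and the additional attribute value restrictions mentioned in the note are not checked here.

```go
package main

import (
	"fmt"
	"strings"
)

// spaceCharacters mirrors the documented SpaceCharacters constant.
const spaceCharacters = "\u0020\u0009\u000A\u000C\u000D"

// isValidHTML5ID reports whether val is at least one character long
// and contains no space characters.
func isValidHTML5ID(val string) bool {
	return val != "" && !strings.ContainsAny(val, spaceCharacters)
}

func main() {
	fmt.Println(isValidHTML5ID("introduction")) // true
	fmt.Println(isValidHTML5ID("last remarks")) // false
	fmt.Println(isValidHTML5ID("9#!"))          // true: far looser than HTML 4
}
```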
Types ¶
type NamedReferenceScanner ¶
type NamedReferenceScanner struct {
	Value     string // The string to be scanned.
	LastIndex int    // The stopping point where Next last finished, or -1 to start from the beginning.
}
INTERNAL USE ONLY. NO API GUARANTEES.
NamedReferenceScanner is a utility for scanning through strings and looking for named character references.
The Next method only checks for the structure: an ampersand, one or more alphanumeric values, and a semicolon. Use IsCharacterReferenceName to see if the returned values are actually valid character reference names.
func NewNamedReferenceScanner ¶
func NewNamedReferenceScanner(value string) *NamedReferenceScanner
NewNamedReferenceScanner creates a new scanner with the given value.
func (*NamedReferenceScanner) Next ¶
func (scanner *NamedReferenceScanner) Next() (name string, ampIndex int)
Next returns the next named character reference (just the alphanumeric part, skipping the ampersand and semicolon) and the byte index of the leading ampersand. If no named character reference is found, it returns the empty string and -1.
func (*NamedReferenceScanner) Reset ¶
func (scanner *NamedReferenceScanner) Reset()
Reset resets the scanner to the beginning of the Value string.
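The scanning behavior described above might be approximated as in the sketch below. It is not the package's code: the lowercase `scanner` type, `refPattern`, and the choice to store the index just past the semicolon in `LastIndex` are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// refPattern matches an ampersand, one or more ASCII alphanumerics,
// and a semicolon. It checks structure only, not name validity.
var refPattern = regexp.MustCompile(`&([0-9A-Za-z]+);`)

// scanner is a hypothetical stand-in with the documented field names.
type scanner struct {
	Value     string
	LastIndex int // -1 means start from the beginning
}

// Next returns the next reference-shaped name and the byte index of
// its ampersand, or "" and -1 when nothing more is found.
func (s *scanner) Next() (string, int) {
	start := s.LastIndex
	if start < 0 {
		start = 0
	}
	if start > len(s.Value) {
		return "", -1
	}
	loc := refPattern.FindStringSubmatchIndex(s.Value[start:])
	if loc == nil {
		return "", -1
	}
	s.LastIndex = start + loc[1] // resume just past the semicolon
	return s.Value[start+loc[2] : start+loc[3]], start + loc[0]
}

// Reset moves the scanner back to the beginning of Value.
func (s *scanner) Reset() { s.LastIndex = -1 }

func main() {
	s := &scanner{Value: "a &amp; b &what;", LastIndex: -1}
	for name, i := s.Next(); i >= 0; name, i = s.Next() {
		fmt.Println(name, i) // prints "amp 2", then "what 10"
	}
}
```

Each returned name would still need to be passed through a check such as IsCharacterReferenceName, since the pattern accepts any alphanumeric run.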
Notes ¶
Bugs ¶
IsValidCss3Identifier uses the CSS 2.1 spec, which the CSS 3 spec links to when it refers to "identifiers". Is this the most up-to-date?