microformats

Version: v1.2.0 (not the latest version of its module)
Published: Jan 31, 2023 License: MIT Imports: 12 Imported by: 22

README

microformats

microformats is a Go library and tool for parsing microformats, supporting both classic v1 and v2 syntax. It is based on Andy Leap's original library.

Usage

The simplest way to see this package in action is to install the command-line app and use it to fetch and parse a web page that contains microformats:

% go install willnorris.com/go/microformats/cmd/gomf@latest
% gomf https://indieweb.org

To use it in your own code, import the package:

import "willnorris.com/go/microformats"

If you have the HTML contents of a page in an io.Reader, call Parse, as in this example:

content := `<article class="h-entry"><h1 class="p-name">Hello</h1></article>`
r := strings.NewReader(content)

data := microformats.Parse(r, nil)

// do something with data, or just print it out as JSON:
enc := json.NewEncoder(os.Stdout)
enc.SetIndent("", "  ")
enc.Encode(data)

Alternatively, if you have already parsed the page and have an html.Node, call ParseNode. For example, you might want to select a subset of the DOM and parse only that subset for microformats. An example of doing this with the goquery package can be seen in cmd/gomf/main.go.

To see that in action using the gomf app installed above, you can parse the microformats from indieweb.org that appear within the #content element:

% gomf https://indieweb.org "#content"

{
  "items": [
    {
      "id": "content",
      "type": [
        "h-entry"
      ],
      "properties": ...
      "children": ...
    }
  ],
  "rels": {},
  "rel-urls": {}
}

Documentation

Overview

Package microformats provides a microformats parser, supporting both v1 and v2 syntax.

Usage:

import "willnorris.com/go/microformats"

Retrieve the HTML contents of a page, and call Parse or ParseNode, depending on what input you have (an io.Reader or an html.Node).

To parse only a section of an HTML document, use a package like goquery to select the root node to parse from. For example, see cmd/gomf/main.go.

See also: http://microformats.org/wiki/microformats2

Types

type Data

type Data struct {
	// Items includes all top-level microformats found on the page.
	Items []*Microformat `json:"items"`

	// Rels includes all related URLs found on the page (<a> or <link>
	// elements with a "rel" value).  Map keys are the rel value, mapped to
	// a slice of URLs with that relation.  For example:
	//
	//     map[string][]string{
	//         "author": {"http://example.com/a", "http://example.com/b"},
	//         "alternate": {"http://example.com/fr"},
	//     }
	//
	// Relative URL values are resolved to absolute URLs using the base URL
	// of the page.
	Rels map[string][]string `json:"rels"`

	// RelURLs maps related URLs found on the page to additional metadata
	// about that relationship. If a URL is linked to more than once, only
	// the metadata for the first link is included here.  Relative URL
	// values are resolved to absolute URLs using the base URL of the page.
	RelURLs map[string]*RelURL `json:"rel-urls"`
}

Data specifies all of the microformats and data parsed from a single HTML page.
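
Because the struct tags above determine the JSON shape, output in that shape (such as what gomf prints) can be consumed by another program. The following is a minimal sketch, not part of this package: the item and page types and the decodePage helper are hypothetical stand-ins that mirror only a few of the documented fields, so the example runs with the standard library alone. Real code would normally use microformats.Data directly.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// item and page are hypothetical stand-ins mirroring a subset of the
// documented JSON fields of Microformat and Data.
type item struct {
	ID   string   `json:"id,omitempty"`
	Type []string `json:"type"`
}

type page struct {
	Items []item              `json:"items"`
	Rels  map[string][]string `json:"rels"`
}

// decodePage parses JSON in the shape produced by encoding a Data value.
func decodePage(b []byte) (page, error) {
	var p page
	err := json.Unmarshal(b, &p)
	return p, err
}

func main() {
	// Illustrative input, not real output from any site.
	in := []byte(`{
	  "items": [{"id": "content", "type": ["h-entry"]}],
	  "rels": {"author": ["http://example.com/a", "http://example.com/b"]}
	}`)
	p, err := decodePage(in)
	if err != nil {
		panic(err)
	}
	for _, it := range p.Items {
		fmt.Println(it.ID, it.Type)
	}
	// Map iteration order is random, so sort the rel names for stable output.
	rels := make([]string, 0, len(p.Rels))
	for rel := range p.Rels {
		rels = append(rels, rel)
	}
	sort.Strings(rels)
	for _, rel := range rels {
		fmt.Println(rel, p.Rels[rel])
	}
}
```

Fields that the stand-in types omit (for example "properties" and "rel-urls") are simply ignored by json.Unmarshal.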

func Parse

func Parse(r io.Reader, baseURL *url.URL) *Data

Parse parses the microformats found in the HTML document read from r. baseURL is the URL this document was retrieved from and is used to expand any relative URLs. If baseURL is nil and the base URL is not referenced in the document, relative URLs are not expanded.

func ParseNode

func ParseNode(doc *html.Node, baseURL *url.URL) *Data

ParseNode parses the microformats found in doc. baseURL is the URL this document was retrieved from and is used to expand any relative URLs. If baseURL is nil and the base URL is not referenced in the document, relative URLs are not expanded.

type Microformat

type Microformat struct {
	ID         string           `json:"id,omitempty"`
	Value      string           `json:"value,omitempty"`
	HTML       string           `json:"html,omitempty"`
	Type       []string         `json:"type"`
	Properties map[string][]any `json:"properties"`
	Shape      string           `json:"shape,omitempty"`
	Coords     string           `json:"coords,omitempty"`
	Children   []*Microformat   `json:"children,omitempty"`
	// contains filtered or unexported fields
}

Microformat specifies a single microformat object and its properties. It may represent a person, an address, a blog post, etc.
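
Since Properties is typed as map[string][]any, reading a value requires a type assertion. The sketch below shows one way to pull out the first plain string value of a property. The firstString helper is hypothetical and not part of this package, and the sample map is hand-built: which concrete non-string types the parser stores is not stated here, so the non-string entry is only a placeholder.

```go
package main

import "fmt"

// firstString is a hypothetical helper, not part of the package. It returns
// the first value of the named property that is a plain string.
func firstString(props map[string][]any, name string) (string, bool) {
	for _, v := range props[name] {
		if s, ok := v.(string); ok {
			return s, true
		}
	}
	return "", false
}

func main() {
	// Hand-built values in the shape of Microformat.Properties; the
	// non-string entry is arbitrary and only shows that it gets skipped.
	props := map[string][]any{
		"name":   {"Hello"},
		"author": {map[string]string{"value": "ignored"}, "Jane"},
	}
	if name, ok := firstString(props, "name"); ok {
		fmt.Println(name) // Hello
	}
	if author, ok := firstString(props, "author"); ok {
		fmt.Println(author) // Jane
	}
	_, ok := firstString(props, "summary")
	fmt.Println(ok) // false
}
```

Returning a boolean alongside the string lets callers tell a missing property apart from one whose value is the empty string.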

type RelURL

type RelURL struct {
	Rels     []string `json:"rels,omitempty"`
	Text     string   `json:"text,omitempty"`
	Media    string   `json:"media,omitempty"`
	HrefLang string   `json:"hreflang,omitempty"`
	Title    string   `json:"title,omitempty"`
	Type     string   `json:"type,omitempty"`
}

RelURL represents the attributes of a URL. The URL value itself is the map key in the RelURLs field of the Data type.
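
Rels is keyed by rel value, while RelURLs is keyed by URL. To make that relationship concrete, the following sketch inverts a map in the shape of Data.Rels into a URL-to-rels map. The relsByURL helper is hypothetical and is shown only for illustration; the package already populates RelURLs, including metadata that cannot be derived from Rels alone.

```go
package main

import (
	"fmt"
	"sort"
)

// relsByURL is a hypothetical helper that inverts a rel -> URLs map into a
// URL -> rels map, with each rel list sorted for deterministic output.
func relsByURL(rels map[string][]string) map[string][]string {
	out := make(map[string][]string)
	for rel, urls := range rels {
		for _, u := range urls {
			out[u] = append(out[u], rel)
		}
	}
	for u := range out {
		sort.Strings(out[u])
	}
	return out
}

func main() {
	// Hand-built input in the shape of Data.Rels.
	rels := map[string][]string{
		"author":    {"http://example.com/a", "http://example.com/b"},
		"alternate": {"http://example.com/fr"},
		"me":        {"http://example.com/a"},
	}
	byURL := relsByURL(rels)
	fmt.Println(byURL["http://example.com/a"]) // [author me]
}
```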

Directories

Path Synopsis
cmd/gomf: The gomf tool is a command line tool which parses microformats from the specified URL.
cmd/gomfweb: The gomfweb command runs a simple web server that demonstrates the use of the go microformats library.
ptd: Package ptd implements Post Type Discovery as defined by https://www.w3.org/TR/post-type-discovery/
