Documentation ¶
Overview ¶
Package sherlock is a library for extracting metadata from web pages. It uses as many methods as possible to extract page data, including:
- ActivityStreams/JSON-LD
- Open Graph
- Microformats2

Coming soon:
- HTML Meta Tags
- oEmbed
- JSON-LD
- Twitter Cards?
Index ¶
Constants ¶
const ContentType = "Content-Type"
ContentType is the string used in the HTTP header to designate a MIME type
const ContentTypeActivityPub = "application/activity+json"
ContentTypeActivityPub is the standard MIME type for ActivityPub content
const ContentTypeAtom = "application/atom+xml"
ContentTypeAtom is the standard MIME Type for Atom Feeds
const ContentTypeForm = "application/x-www-form-urlencoded"
ContentTypeForm is the standard MIME Type for Form encoded content
const ContentTypeHTML = "text/html"
ContentTypeHTML is the standard MIME type for HTML content
const ContentTypeJSON = "application/json"
ContentTypeJSON is the standard MIME Type for JSON content
const ContentTypeJSONFeed = "application/feed+json"
ContentTypeJSONFeed is the standard MIME Type for JSON Feed content: https://en.wikipedia.org/wiki/JSON_Feed
const ContentTypeJSONLD = "application/ld+json"
ContentTypeJSONLD is the standard MIME Type for JSON-LD content: https://en.wikipedia.org/wiki/JSON-LD
const ContentTypeJSONResourceDescriptor = "application/jrd+json"
ContentTypeJSONResourceDescriptor is the standard MIME Type for JSON Resource Descriptor content which is used by WebFinger: https://datatracker.ietf.org/doc/html/rfc7033#section-10.2
const ContentTypePlain = "text/plain"
ContentTypePlain is the default plaintext MIME type
const ContentTypeRSS = "application/rss+xml"
ContentTypeRSS is the standard MIME Type for RSS Feeds
const ContentTypeXML = "application/xml"
ContentTypeXML is the standard MIME Type for XML content
const FormatActivityStream = "ACTIVITYSTREAM"
const FormatJSONFeed = "JSONFEED"
const FormatMicroFormats = "MICROFORMATS"
const FormatRSS = "RSS"
const HTTPHeaderAccept = "Accept"
HTTPHeaderAccept is the string used in the HTTP header to request a response be encoded as a MIME type
const HTTPHeaderCacheControl = "Cache-Control"
const HTTPHeaderLink = "Link"
const LinkRelationAlternate = "alternate"
const LinkRelationFeed = "feed"
const LinkRelationHub = "hub"
const LinkRelationIcon = "icon"
const LinkRelationSelf = "self"
const LoadDocumentTypeActor = 1
const LoadDocumentTypeCollection = 2
const LoadDocumentTypeDocument = 3
const LoadDocumentTypeUnknown = 0
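The header and MIME type constants above are plain strings, so they can be passed directly to net/http. The following is a minimal sketch, not code from this package: the constants are re-declared locally so the example compiles on its own, and newRequest and the example URL are hypothetical names introduced only for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// Local copies of values documented above, so the sketch has no dependencies.
const (
	HTTPHeaderAccept       = "Accept"
	ContentTypeActivityPub = "application/activity+json"
	ContentTypeJSONLD      = "application/ld+json"
	ContentTypeHTML        = "text/html"
)

// newRequest is a hypothetical helper (not part of sherlock). It builds a
// GET request whose Accept header lists several acceptable MIME types.
func newRequest(url string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}

	accept := strings.Join([]string{
		ContentTypeActivityPub,
		ContentTypeJSONLD,
		ContentTypeHTML,
	}, ", ")

	req.Header.Set(HTTPHeaderAccept, accept)
	return req, nil
}

func main() {
	// The URL is a placeholder; no network call is made here.
	req, err := newRequest("https://example.com/@alice")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Header.Get(HTTPHeaderAccept))
}
```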
Variables ¶
This section is empty.
Functions ¶
func IsValidAddress ¶ added in v0.6.5
IsValidAddress returns TRUE for all values that Sherlock THINKS it SHOULD be able to process. This includes @username@host.tld and https://host.tld/username addresses. IMPORTANT: Just because this function returns TRUE does NOT mean that the address is valid. It only means that the value looks like a valid format; the address still needs to be checked.
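To illustrate what a format-only check of this kind might look like, here is a rough sketch. It is not the package's implementation: looksLikeAddress is a hypothetical function, and its rules (https URLs with a host, or a handle with a user part and a dotted host) are assumptions based only on the two address shapes named above.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// looksLikeAddress is a hypothetical stand-in for a shape-only check.
// It never contacts the network, so a true result says nothing about
// whether the address actually resolves.
func looksLikeAddress(value string) bool {
	// URL form: https://host.tld/username
	if strings.HasPrefix(value, "https://") {
		parsed, err := url.Parse(value)
		return err == nil && parsed.Host != ""
	}

	// Handle form: @username@host.tld
	// (this sketch also tolerates a missing leading "@").
	handle := strings.TrimPrefix(value, "@")
	if strings.ContainsAny(handle, " /") {
		return false
	}

	username, host, found := strings.Cut(handle, "@")
	return found && username != "" && strings.Contains(host, ".")
}

func main() {
	for _, value := range []string{
		"@alice@example.social",
		"https://example.social/alice",
		"alice",
	} {
		fmt.Printf("%q looks like an address: %v\n", value, looksLikeAddress(value))
	}
}
```

As with the documented function, a true result here is only a hint that a lookup is worth attempting.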
Types ¶
type Client ¶
type Client struct {
	UserAgent     string          // User-Agent string to send with every request
	RemoteOptions []remote.Option // Additional options to pass to the remote library
}
Client implements the hannibal/streams.Client interface, and is used to load JSON-LD documents from remote servers. The sherlock client maps additional metadata into a standard ActivityStreams document.
func NewClient ¶
func NewClient(options ...ClientOption) Client
NewClient returns a fully initialized Client object
func (Client) Load ¶
Load retrieves a document from a remote server and returns it as a streams.Document. It uses either the "Actor" or "Document" method to generate its ActivityStreams result. "Document" treats the URL as a single ActivityStreams document, translating OpenGraph, MicroFormats, and JSON-LD into an ActivityStreams equivalent. "Actor" treats the URL as an Actor, translating RSS, Atom, JSON, and MicroFormats feeds into an ActivityStreams equivalent.
func (*Client) WithOptions ¶ added in v0.6.0
func (client *Client) WithOptions(options ...ClientOption)
WithOptions applies one or more ClientOption functions to the client
type ClientOption ¶ added in v0.6.0
type ClientOption func(*Client)
ClientOption defines a functional option that modifies a Client object
func WithRemoteOptions ¶ added in v0.6.0
func WithRemoteOptions(middleware ...remote.Option) ClientOption
WithRemoteOptions is a ClientOption that appends one or more remote.Option objects to the Client object. RemoteOptions are executed on every remote request.
func WithUserAgent ¶ added in v0.6.0
func WithUserAgent(userAgent string) ClientOption
WithUserAgent is a ClientOption that sets the UserAgent property on the Client object
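NewClient, WithOptions, and the ClientOption functions follow the functional options pattern. The sketch below re-creates that pattern in a standalone program so it can be run without the package. Only the signatures come from the documentation above; the function bodies are assumptions, the RemoteOptions field is omitted to avoid the remote dependency, and the user agent strings are made up.

```go
package main

import "fmt"

// Client is a trimmed-down stand-in for the documented type.
// RemoteOptions is left out so the sketch has no external imports.
type Client struct {
	UserAgent string
}

// ClientOption matches the documented signature.
type ClientOption func(*Client)

// WithUserAgent returns an option that sets the UserAgent field.
// The body is an assumption about how such an option would work.
func WithUserAgent(userAgent string) ClientOption {
	return func(client *Client) {
		client.UserAgent = userAgent
	}
}

// WithOptions applies each option to the client, in order.
func (client *Client) WithOptions(options ...ClientOption) {
	for _, option := range options {
		option(client)
	}
}

// NewClient builds a Client and applies the given options.
// Any defaults the real constructor sets are not modeled here.
func NewClient(options ...ClientOption) Client {
	client := Client{}
	client.WithOptions(options...)
	return client
}

func main() {
	client := NewClient(WithUserAgent("example-bot/1.0"))
	fmt.Println(client.UserAgent)

	// Options can also be applied after construction.
	client.WithOptions(WithUserAgent("example-bot/2.0"))
	fmt.Println(client.UserAgent)
}
```

Note that NewClient returns a value while WithOptions has a pointer receiver, so later changes must be made on an addressable variable, as in main above.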
type LoadConfig ¶ added in v0.6.0
func NewLoadConfig ¶ added in v0.6.0
func NewLoadConfig(options ...any) LoadConfig
type LoadOption ¶ added in v0.6.0
type LoadOption func(*LoadConfig)
func AsActor ¶ added in v0.6.0
func AsActor() LoadOption
func AsCollection ¶ added in v0.6.0
func AsCollection() LoadOption
func AsDocument ¶ added in v0.6.0
func AsDocument() LoadOption
func WithDefaultValue ¶ added in v0.6.0
func WithDefaultValue(defaultValue map[string]any) LoadOption
func WithMaximumRedirects ¶ added in v0.6.0
func WithMaximumRedirects(maximumRedirects int) LoadOption
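The LoadOption functions above have no descriptions, so the following standalone sketch shows one way such options could combine into a LoadConfig. It is an illustration under stated assumptions, not the package's code: the LoadConfig fields (DocumentType, DefaultValue, MaximumRedirects) are hypothetical, as are all function bodies, and the choice to skip arguments that are not LoadOption values in NewLoadConfig is a guess prompted by its ...any parameter.

```go
package main

import "fmt"

// Values copied from the Constants section above.
const (
	LoadDocumentTypeUnknown    = 0
	LoadDocumentTypeActor      = 1
	LoadDocumentTypeCollection = 2
	LoadDocumentTypeDocument   = 3
)

// LoadConfig is a stand-in; every field name here is hypothetical.
type LoadConfig struct {
	DocumentType     int
	DefaultValue     map[string]any
	MaximumRedirects int
}

// LoadOption matches the documented signature.
type LoadOption func(*LoadConfig)

// AsActor marks the target as an Actor (assumed behavior).
func AsActor() LoadOption {
	return func(config *LoadConfig) {
		config.DocumentType = LoadDocumentTypeActor
	}
}

// AsDocument marks the target as a Document (assumed behavior).
func AsDocument() LoadOption {
	return func(config *LoadConfig) {
		config.DocumentType = LoadDocumentTypeDocument
	}
}

// WithMaximumRedirects stores a redirect limit (assumed behavior).
func WithMaximumRedirects(maximumRedirects int) LoadOption {
	return func(config *LoadConfig) {
		config.MaximumRedirects = maximumRedirects
	}
}

// NewLoadConfig applies every argument that is a LoadOption and
// silently skips anything else (an assumption, see the lead-in).
func NewLoadConfig(options ...any) LoadConfig {
	config := LoadConfig{DocumentType: LoadDocumentTypeUnknown}
	for _, option := range options {
		if apply, ok := option.(LoadOption); ok {
			apply(&config)
		}
	}
	return config
}

func main() {
	config := NewLoadConfig(AsActor(), WithMaximumRedirects(3))
	fmt.Println(config.DocumentType == LoadDocumentTypeActor, config.MaximumRedirects)
}
```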
Source Files ¶
- actor-.go
- actor-WebFinger.go
- actor-activityStreams.go
- actor-feed-.go
- actor-feed-JSON.go
- actor-feed-RSS.go
- actor-feed-icon.go
- actor-feed-links.go
- actor-feed-microFormats.go
- client-.go
- client-applyLinks.go
- client-clientOption.go
- client-loadOption.go
- constants.go
- document-.go
- document-activityStream.go
- document-html-.go
- document-html-jsonld-.go
- document-html-jsonld-embedded.go
- document-html-jsonld-linked.go
- document-html-microformats.go
- document-html-oembed.go
- document-html-opengraph.go
- document-html-wordpress.go
- sherlock-extras.go
- sherlock.go
- utils.go