iconscraper

Version: v0.1.0
Published: Aug 3, 2023 License: MIT Imports: 19 Imported by: 0

README

Icon-Scraper Package Documentation

iconscraper is a Go package that provides a robust solution to get icons from domains.

Usage

Get icons from multiple domains

package main

import (
	"fmt"

	"github.com/MeVitae/iconscraper"
)

func main() {
	config := iconscraper.Config{
		SquareOnly:            true,
		TargetHeight:          128,
		MaxConcurrentRequests: 32,
		AllowSvg:              false,
	}

	domains := []string{"mevitae.com", "example.com", "gov.uk", "golang.org", "rust-lang.org"}

	// GetIcons returns a map from domain to the best icon found for it.
	icons := iconscraper.GetIcons(config, domains)

	for domain, icon := range icons {
		fmt.Println("Domain: " + domain + ", Icon URL: " + icon.URL)
	}
}

Handle errors and warnings

Failures to decode an image, and resources that are missing on a web server even though the connection itself succeeded, are reported as warnings instead of errors.

By default, errors and warnings are only written to the default logger. To handle errors yourself, set your own channel in the config, for example:

package main

import (
	"fmt"
	"log"

	"github.com/MeVitae/iconscraper"
)

func main() {
	config := iconscraper.Config{
		SquareOnly:            true,
		TargetHeight:          128,
		MaxConcurrentRequests: 32,
		AllowSvg:              false,
		Errors:                make(chan error),
	}

	// Receive errors in a separate goroutine so that sends on the channel do not block.
	go func() {
		for err := range config.Errors {
			// Handle err; here it is simply logged.
			log.Println(err)
		}
	}()

	domains := []string{"mevitae.com", "example.com", "gov.uk", "golang.org", "rust-lang.org"}

	icons := iconscraper.GetIcons(config, domains)

	for domain, icon := range icons {
		fmt.Println("Domain: " + domain + ", Icon URL: " + icon.URL)
	}
}

Warnings can be handled in the same way using the `Warnings` field.

Get icon from a single domain

Icons can be scraped for a single domain using GetIcon. Errors and warnings are handled in the same way.

Documentation

Overview

Package iconscraper provides a robust solution to get icons from domains.

Icon Sources

- `/favicon.ico`
- [Icon (`<link rel="icon" href="favicon.ico">`)](https://developer.mozilla.org/en-US/docs/Web/HTML/Attributes/rel#icon)
- [Web app manifest (`<link rel="manifest" href="manifest.json">`)](https://developer.mozilla.org/en-US/docs/Web/Manifest)
- [`link rel="shortcut icon"`](https://stackoverflow.com/questions/13211206/html5-link-rel-shortcut-icon)
- [`link rel="apple-touch-icon"`](https://developer.mozilla.org/en-US/docs/Web/HTML/Attributes/rel#non-standard_values)
- [`link rel="msapplication-TileImage"`](https://stackoverflow.com/questions/61686919/what-is-the-use-of-the-msapplication-tileimage-meta-tag)
- [`link rel="mask-icon"`](http://microformats.org/wiki/existing-rel-values)
- [`link rel="image_src"`](http://microformats.org/wiki/existing-rel-values) (also [this post](https://www.niallkennedy.com/blog/2009/03/enhanced-social-share.html))
- [`meta itemprop="image"`](https://schema.org/image)
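
To illustrate one of the sources listed above, the sketch below reads the `icons` member of a web app manifest and resolves each `src` against the manifest's own URL. It is not taken from iconscraper: `manifestIcon` and `manifestIconURLs` are hypothetical names, and how the package actually processes manifests is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// manifestIcon mirrors one entry of the "icons" member of a web app manifest.
// Hypothetical type, not part of iconscraper.
type manifestIcon struct {
	Src   string `json:"src"`
	Sizes string `json:"sizes"`
	Type  string `json:"type"`
}

// manifestIconURLs parses manifest JSON and returns the absolute URL of each
// icon, resolving relative src values against the manifest URL.
// Hypothetical helper, not part of iconscraper.
func manifestIconURLs(manifestURL string, data []byte) ([]string, error) {
	base, err := url.Parse(manifestURL)
	if err != nil {
		return nil, err
	}
	var manifest struct {
		Icons []manifestIcon `json:"icons"`
	}
	if err := json.Unmarshal(data, &manifest); err != nil {
		return nil, err
	}
	urls := make([]string, 0, len(manifest.Icons))
	for _, icon := range manifest.Icons {
		ref, err := url.Parse(icon.Src)
		if err != nil {
			continue // skip entries whose src cannot be parsed
		}
		urls = append(urls, base.ResolveReference(ref).String())
	}
	return urls, nil
}

func main() {
	data := []byte(`{"icons": [
		{"src": "/img/icon-192.png", "sizes": "192x192", "type": "image/png"},
		{"src": "icon.svg", "sizes": "any"}
	]}`)
	urls, err := manifestIconURLs("https://example.com/static/manifest.json", data)
	if err != nil {
		panic(err)
	}
	for _, u := range urls {
		fmt.Println(u)
	}
}
```

Resolving against the manifest URL rather than the page URL matters when the manifest lives in a subdirectory, as in the second entry of the example.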

Other sources

These aren't currently scraped, but might be of interest:

- [`link rel="apple-touch-startup-image"`](http://microformats.org/wiki/existing-rel-values)
- [`meta property="og:image"`](https://ogp.me/)

Variables

var UserAgent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Safari/605.1.15"

Functions

func GetIcons

func GetIcons(config Config, domains []string) map[string]Icon

GetIcons scrapes icons from the provided domains concurrently and returns the results as a map from domain to the best image based on the given target.

For each domain, it finds the smallest icon taller than TargetHeight or, if there are none, the tallest icon.

If no icon is found for a domain (or no square icon if SquareOnly is true), that domain is omitted from the output map.
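
The selection rule described above can be sketched on plain heights. This is only an illustration, not the package's code: `pickHeight` is a hypothetical helper, and it assumes "taller than" means strictly greater than the target, which the documentation does not spell out.

```go
package main

import "fmt"

// pickHeight returns the index of the smallest height strictly greater than
// target or, if there is none, the index of the tallest height. It returns -1
// for an empty slice. Hypothetical helper, not part of iconscraper.
func pickHeight(heights []int, target int) int {
	best := -1
	for i, h := range heights {
		if best == -1 {
			best = i
			continue
		}
		b := heights[best]
		switch {
		case h > target && (b <= target || h < b):
			// h is above the target and either the current best is not,
			// or h is a tighter fit than the current best.
			best = i
		case h <= target && b <= target && h > b:
			// Nothing above the target so far: prefer the taller one.
			best = i
		}
	}
	return best
}

func main() {
	heights := []int{16, 32, 180, 512}
	fmt.Println(heights[pickHeight(heights, 128)])  // 180
	fmt.Println(heights[pickHeight(heights, 1024)]) // 512
}
```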

Types

type Config

type Config struct {
	// SquareOnly determines if only square icons are considered.
	SquareOnly bool

	// TargetHeight of the icon to be fetched. The shortest image larger than this size will be
	// returned and, if none are available, the tallest image smaller than this will be returned.
	TargetHeight int

	// AllowSvg allows SVGs to be returned. An SVG will always supersede a non-vector image.
	AllowSvg bool

	// MaxConcurrentRequests sets the maximum number of concurrent HTTP requests.
	MaxConcurrentRequests int

	// Errors is the channel for receiving errors.
	//
	// If nil, errors will instead be logged to the default logger.
	//
	// The channel must not block.
	Errors chan error

	// Warnings is the channel for receiving warnings. Errors related to decoding images or resources
	// not being found on a web server (but the connection being ok) will be reported as warnings
	// instead of errors.
	//
	// If nil, warnings will instead be logged to the default logger.
	//
	// The channel must not block.
	Warnings chan error
}

Config is the config used for GetIcons and GetIcon.

type Icon

type Icon struct {
	// URL is the source location from which the data was fetched or derived.
	URL string

	// Type is the sniffed MIME type of the image.
	Type string

	// ImageConfig holds the parsed image config. It is not populated for SVGs (type image/svg+xml).
	ImageConfig image.Config

	// Source is the image source as downloaded.
	Source []byte
}

Icon is a scraped icon: its source URL, sniffed MIME type, parsed image config and raw data.

func GetIcon

func GetIcon(config Config, domain string) *Icon

GetIcon scrapes icons from the provided domain and finds the smallest icon taller than TargetHeight or, if there are none, the tallest icon.

Errors that occur are sent to config.Errors; if it is nil, they are logged instead.
