sitemap

v0.3.1
Published: Dec 31, 2022 License: MIT Imports: 8 Imported by: 10

README

go-sitemap


go-sitemap fetches a sitemap.xml (or sitemapindex.xml) and parses it into a Sitemap object.

Installation

go get github.com/yterajima/go-sitemap

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func SetFetch

func SetFetch(f func(URL string, options interface{}) ([]byte, error))

SetFetch replaces the closure used to fetch sitemap data.

func SetInterval

func SetInterval(time time.Duration)

SetInterval changes the time interval used in Index.get.

Types

type Index

type Index struct {
	XMLName xml.Name `xml:"sitemapindex"`
	Sitemap []parts  `xml:"sitemap"`
}

Index represents a <sitemapindex> document.

func ParseIndex

func ParseIndex(data []byte) (Index, error)

ParseIndex creates an Index from raw XML data.
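As a rough illustration of what parsing a sitemap index involves, the sketch below unmarshals a small <sitemapindex> document with encoding/xml. It is not the package's implementation: parseIndex, indexEntry, and sampleIndex are hypothetical names, and the fields of indexEntry are an assumption, since the package's own parts type is unexported and not documented here.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// indexEntry is a hypothetical stand-in for the package's unexported
// parts type; its fields are assumed from the sitemap protocol.
type indexEntry struct {
	Loc     string `xml:"loc"`
	LastMod string `xml:"lastmod"`
}

// Index mirrors the documented struct, with indexEntry in place of parts.
type Index struct {
	XMLName xml.Name     `xml:"sitemapindex"`
	Sitemap []indexEntry `xml:"sitemap"`
}

// parseIndex is a hypothetical helper that decodes raw XML into an Index.
func parseIndex(data []byte) (Index, error) {
	var idx Index
	err := xml.Unmarshal(data, &idx)
	return idx, err
}

// sampleIndex is made-up input used only for this example.
const sampleIndex = `<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-posts.xml</loc>
    <lastmod>2022-12-31</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-pages.xml</loc>
  </sitemap>
</sitemapindex>`

func main() {
	idx, err := parseIndex([]byte(sampleIndex))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	// Each entry points at a child sitemap that would be fetched next.
	for _, entry := range idx.Sitemap {
		fmt.Println(entry.Loc)
	}
}
```

Because XMLName is tagged sitemapindex, encoding/xml rejects a document whose root element has a different name, which is one way a format problem can surface as an error.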

type Sitemap

type Sitemap struct {
	XMLName xml.Name `xml:"urlset"`
	URL     []URL    `xml:"url"`
}

Sitemap represents a sitemap document, whose root element is <urlset>.

func ForceGet added in v0.3.0

func ForceGet(URL string, options interface{}) (Sitemap, error)

ForceGet fetches and parses sitemap.xml/sitemapindex.xml. Unlike the Get function, it ignores some errors.

Errors to Ignore:

・When sitemapindex.xml contains a sitemap.xml URL that cannot be retrieved.
・When sitemapindex.xml contains a sitemap.xml that is empty.
・When sitemapindex.xml contains a sitemap.xml that has format problems.

Errors not to Ignore:

・When sitemap.xml/sitemapindex.xml could not be retrieved.
・When sitemap.xml/sitemapindex.xml is empty.
・When sitemap.xml/sitemapindex.xml has format problems.

If you do not want to ignore these errors, use the Get function.
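The difference between the two behaviors can be sketched with a stubbed fetch function. This is only an illustration of the documented semantics for child sitemaps, not the package's code: collect, stubFetch, and the force flag are hypothetical, and format checking is left out for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// fetchFunc has the same shape as the closure accepted by SetFetch.
type fetchFunc func(URL string, options interface{}) ([]byte, error)

// stubFetch is a hypothetical helper that serves pages from memory
// and fails for any URL it does not know.
func stubFetch(pages map[string][]byte) fetchFunc {
	return func(URL string, options interface{}) ([]byte, error) {
		body, ok := pages[URL]
		if !ok {
			return nil, fmt.Errorf("cannot fetch %s", URL)
		}
		return body, nil
	}
}

// collect is a hypothetical helper that fetches every child sitemap
// listed in an index. With force set, a child that cannot be fetched
// or is empty is skipped; otherwise the first failure aborts the run.
func collect(children []string, fetch fetchFunc, force bool) ([][]byte, error) {
	var bodies [][]byte
	for _, child := range children {
		body, err := fetch(child, nil)
		if err == nil && len(body) == 0 {
			err = errors.New("empty sitemap: " + child)
		}
		if err != nil {
			if force {
				continue // lenient: skip the broken child
			}
			return nil, err // strict: report the first problem
		}
		bodies = append(bodies, body)
	}
	return bodies, nil
}

func main() {
	fetch := stubFetch(map[string][]byte{
		"https://example.com/a.xml": []byte("<urlset></urlset>"),
	})
	children := []string{
		"https://example.com/a.xml",
		"https://example.com/missing.xml",
	}

	_, err := collect(children, fetch, false)
	fmt.Println("strict:", err)

	bodies, err := collect(children, fetch, true)
	fmt.Println("lenient:", len(bodies), err)
}
```

Note that the sketch covers only the child sitemaps of an index. As listed above, a failure on the top-level sitemap.xml/sitemapindex.xml is reported by both functions.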

func Get

func Get(URL string, options interface{}) (Sitemap, error)

Get fetches and parses sitemap.xml/sitemapindex.xml.

Get returns an error if sitemap.xml or sitemapindex.xml has any of the following problems:

・When sitemap.xml/sitemapindex.xml could not be retrieved.
・When sitemap.xml/sitemapindex.xml is empty.
・When sitemap.xml/sitemapindex.xml has format problems.
・When sitemapindex.xml contains a sitemap.xml URL that cannot be retrieved.
・When sitemapindex.xml contains a sitemap.xml that is empty.
・When sitemapindex.xml contains a sitemap.xml that has format problems.

If you want to ignore the errors caused by individual sitemap.xml files listed in a sitemapindex.xml, use the ForceGet function.

Example
smap, err := Get("https://issueoverflow.com/sitemap.xml", nil)
if err != nil {
	fmt.Println(err)
}

for _, URL := range smap.URL {
	fmt.Println(URL.Loc)
}
Output:

Example (ChangeFetch)
SetFetch(func(URL string, options interface{}) ([]byte, error) {
	req, err := http.NewRequest("GET", URL, nil)
	if err != nil {
		return []byte{}, err
	}

	// Set User-Agent
	req.Header.Set("User-Agent", "MyBot")

	// Set timeout
	timeout := 10 * time.Second
	client := http.Client{
		Timeout: timeout,
	}

	// Fetch data
	res, err := client.Do(req)
	if err != nil {
		return []byte{}, err
	}
	defer res.Body.Close()

	body, err := io.ReadAll(res.Body)
	if err != nil {
		return []byte{}, err
	}

	return body, nil
})

smap, err := Get("https://issueoverflow.com/sitemap.xml", nil)
if err != nil {
	fmt.Println(err)
}

for _, URL := range smap.URL {
	fmt.Println(URL.Loc)
}
Output:

func Parse

func Parse(data []byte) (Sitemap, error)

Parse creates a Sitemap from raw XML data.
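To show how the documented struct tags map onto a sitemap document, the sketch below decodes a small <urlset> with encoding/xml. The Sitemap and URL types are copied from this page; parseSitemap and sampleSitemap are hypothetical names, and the helper is a stand-in for illustration rather than the package's own Parse.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// URL and Sitemap mirror the struct definitions documented on this page.
type URL struct {
	Loc        string  `xml:"loc"`
	LastMod    string  `xml:"lastmod"`
	ChangeFreq string  `xml:"changefreq"`
	Priority   float32 `xml:"priority"`
}

type Sitemap struct {
	XMLName xml.Name `xml:"urlset"`
	URL     []URL    `xml:"url"`
}

// parseSitemap is a hypothetical helper that decodes raw XML into a Sitemap.
func parseSitemap(data []byte) (Sitemap, error) {
	var s Sitemap
	err := xml.Unmarshal(data, &s)
	return s, err
}

// sampleSitemap is made-up input used only for this example.
const sampleSitemap = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2022-12-31</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>https://example.com/about</loc>
  </url>
</urlset>`

func main() {
	s, err := parseSitemap([]byte(sampleSitemap))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	// Optional elements that are absent keep their zero values.
	for _, u := range s.URL {
		fmt.Println(u.Loc, u.LastMod, u.ChangeFreq, u.Priority)
	}
}
```

With these struct definitions, an omitted <priority> decodes to 0, so a caller cannot tell a missing value from an explicit 0.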

type URL

type URL struct {
	Loc        string  `xml:"loc"`
	LastMod    string  `xml:"lastmod"`
	ChangeFreq string  `xml:"changefreq"`
	Priority   float32 `xml:"priority"`
}

URL represents a <url> element in a sitemap.
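LastMod is exposed as a plain string, so callers who need a time.Time have to convert it themselves. The sitemap protocol specifies W3C Datetime for <lastmod>; the sketch below tries two common forms of it. parseLastMod and lastModLayouts are hypothetical names, not part of this package, and the layout list is not exhaustive.

```go
package main

import (
	"fmt"
	"time"
)

// lastModLayouts lists two common W3C Datetime forms.
// Other forms allowed by the format are not handled here.
var lastModLayouts = []string{
	time.RFC3339, // e.g. 2022-12-31T10:00:00+09:00
	"2006-01-02", // e.g. 2022-12-31
}

// parseLastMod is a hypothetical helper that converts a LastMod string
// into a time.Time by trying each known layout in turn.
func parseLastMod(s string) (time.Time, error) {
	for _, layout := range lastModLayouts {
		if t, err := time.Parse(layout, s); err == nil {
			return t, nil
		}
	}
	return time.Time{}, fmt.Errorf("unrecognized lastmod value %q", s)
}

func main() {
	for _, v := range []string{"2022-12-31", "2022-12-31T10:00:00+09:00", "yesterday"} {
		t, err := parseLastMod(v)
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		fmt.Println(t.UTC())
	}
}
```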
