parser

package module
v0.1.1 Latest
Warning

This package is not in the latest version of its module.

Published: Aug 13, 2021 License: MIT Imports: 16 Imported by: 0

README

sitemap_parser

Tool for parsing XML sitemap.

  • Creates a list of all site pages. Prints the list to the console by default.
  • Can back up the loaded pages if a backup path is provided. Also creates a zip archive of all backed-up pages.

Expects the site to have a sitemap.xml file, for example, https://alextech18.blogspot.com/sitemap.xml

Prerequisites

  • Go 1.16

Usage

Loading dependencies

go mod tidy

Run

go run parser.go

Settings

Envs

Envs take precedence over command line args.

Command line args

Command line args are analyzed if envs are not present.

  • -site - (Required unless the SITE env is set) URL of the site with sitemap.xml, for example,
go run parser.go -site https://alextech18.blogspot.com
  • -backup - (Optional) path for backing up loaded website pages, for example,
go run parser.go -site https://alextech18.blogspot.com -backup /home/A1esandr/backups
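
The precedence rule described above can be sketched roughly as follows. Only the SITE env and the -site flag come from this README; the helper name resolveSite is hypothetical, and the tool's own settings code may be organized differently.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// resolveSite is a hypothetical helper: it returns the env value when it is
// non-empty and falls back to the flag value otherwise.
func resolveSite(envValue, flagValue string) string {
	if envValue != "" {
		return envValue
	}
	return flagValue
}

func main() {
	siteFlag := flag.String("site", "", "URL of the site with sitemap.xml")
	flag.Parse()

	// The env is checked first, so it wins when both are set.
	site := resolveSite(os.Getenv("SITE"), *siteFlag)
	if site == "" {
		fmt.Fprintln(os.Stderr, "site is required: set SITE or pass -site")
		return
	}
	fmt.Println("site:", site)
}
```

Keeping the decision in a small pure function makes the rule easy to check without touching the process environment.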

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Parser

type Parser struct {
	// contains filtered or unexported fields
}

func New

func New() *Parser

func (*Parser) Get

func (p *Parser) Get(url string) []URL

func (*Parser) Parse

func (p *Parser) Parse()

type Sitemap

type Sitemap struct {
	XMLName xml.Name `xml:"sitemap"`
	Loc     string   `xml:"loc"`
}

type Sitemapindex

type Sitemapindex struct {
	XMLName xml.Name  `xml:"sitemapindex"`
	Sitemap []Sitemap `xml:"sitemap"`
}
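
As a minimal sketch of how these two types fit a sitemap index document, the example below copies the struct definitions shown above and decodes a small sample with encoding/xml. The helper parseIndex and the example.com URLs are assumptions made for illustration; how the package itself uses the types is not documented here.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Local copies of the documented types, so the sketch is self-contained.
type Sitemap struct {
	XMLName xml.Name `xml:"sitemap"`
	Loc     string   `xml:"loc"`
}

type Sitemapindex struct {
	XMLName xml.Name  `xml:"sitemapindex"`
	Sitemap []Sitemap `xml:"sitemap"`
}

// parseIndex is a hypothetical helper: it decodes a sitemap index and
// returns the location of every child sitemap.
func parseIndex(data []byte) ([]string, error) {
	var idx Sitemapindex
	if err := xml.Unmarshal(data, &idx); err != nil {
		return nil, err
	}
	locs := make([]string, 0, len(idx.Sitemap))
	for _, s := range idx.Sitemap {
		locs = append(locs, s.Loc)
	}
	return locs, nil
}

func main() {
	// Sample document, made up for the example.
	sample := []byte(`<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemap-1.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemap-2.xml</loc></sitemap>
</sitemapindex>`)

	locs, err := parseIndex(sample)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	for _, loc := range locs {
		fmt.Println(loc)
	}
}
```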

type URL

type URL struct {
	XMLName xml.Name `xml:"url"`
	Loc     string   `xml:"loc"`
	LastMod string   `xml:"lastmod"`
	Title   string
}

type URLSet

type URLSet struct {
	XMLName xml.Name `xml:"urlset"`
	URL     []URL    `xml:"url"`
}
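
The URL and URLSet types map onto a regular sitemap file in the same way. The sketch below copies the documented definitions and decodes a made-up sample; parseURLSet is a hypothetical helper name. Title has no xml tag and does not appear in sitemap files, so it stays empty after decoding; presumably it is filled in elsewhere, but that is an assumption.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// Local copies of the documented types, so the sketch is self-contained.
type URL struct {
	XMLName xml.Name `xml:"url"`
	Loc     string   `xml:"loc"`
	LastMod string   `xml:"lastmod"`
	Title   string
}

type URLSet struct {
	XMLName xml.Name `xml:"urlset"`
	URL     []URL    `xml:"url"`
}

// parseURLSet is a hypothetical helper: it decodes a sitemap document and
// returns its url entries.
func parseURLSet(data []byte) ([]URL, error) {
	var set URLSet
	if err := xml.Unmarshal(data, &set); err != nil {
		return nil, err
	}
	return set.URL, nil
}

func main() {
	// Sample document, made up for the example.
	sample := []byte(`<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/first-post.html</loc>
    <lastmod>2021-08-01T10:00:00Z</lastmod>
  </url>
  <url>
    <loc>https://example.com/second-post.html</loc>
    <lastmod>2021-08-10T12:30:00Z</lastmod>
  </url>
</urlset>`)

	urls, err := parseURLSet(sample)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	for _, u := range urls {
		fmt.Println(u.Loc, u.LastMod)
	}
}
```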

Directories

Path Synopsis
internal
