tagging

package
v0.1.361
Warning

This package is not in the latest version of its module.

Published: Feb 9, 2024 License: GPL-3.0 Imports: 19 Imported by: 0

Documentation

Overview

Package tagging implements helper functions for attaching ISILs to records. This is a "v2" reimplementation, aimed at reducing friction while adding support for FOLIO.

Example output of span-tag versus span-tagger:

$ taskcat AIIntermediateSchema --date 2020-04-15 | head | ./span-tag -unfreeze $(taskoutput AMSLFilterConfigFreeze) 2> /dev/null | jq -rc '[.["finc.id"], .["x.labels"][]] | @tsv'
ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTI0MS9qb2hva2FucmkuNDkuMTY1	DE-105	DE-14	DE-15	DE-82	DE-Brt1	DE-Ch1	DE-D275	DE-Gla1	DE-Zi4	DE-Zwi2
ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTI0MS9qb2hva2FucmkuNDkuMTc0	DE-105	DE-14	DE-15	DE-82	DE-Brt1	DE-Ch1	DE-D275	DE-Gla1	DE-Zi4	DE-Zwi2
ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTI0MS9qb2hva2FucmkuNDkuMTgy	DE-105	DE-14	DE-15	DE-82	DE-Brt1	DE-Ch1	DE-D275	DE-Gla1	DE-Zi4	DE-Zwi2
ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTI0MS9qb2hva2FucmkuNDkuMTkx	DE-105	DE-14	DE-15	DE-82	DE-Brt1	DE-Ch1	DE-D275	DE-Gla1	DE-Zi4	DE-Zwi2
ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTI0MS9qb2hva2FucmkuNDkuMTk5	DE-105	DE-14	DE-15	DE-82	DE-Brt1	DE-Ch1	DE-D275	DE-Gla1	DE-Zi4	DE-Zwi2
ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTI0MS9qb2hva2FucmkuNDkuMjAy	DE-105	DE-14	DE-15	DE-82	DE-Brt1	DE-Ch1	DE-D275	DE-Gla1	DE-Zi4	DE-Zwi2
ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTI0MS9qb2hva2FucmkuNDkuMjA1	DE-105	DE-14	DE-15	DE-82	DE-Brt1	DE-Ch1	DE-D275	DE-Gla1	DE-Zi4	DE-Zwi2
ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTI0MS9qb2hva2FucmkuNDkuMjA5	DE-105	DE-14	DE-15	DE-82	DE-Brt1	DE-Ch1	DE-D275	DE-Gla1	DE-Zi4	DE-Zwi2
ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTI0MS9qb2hva2FucmkuNDkuMjEz	DE-105	DE-14	DE-15	DE-82	DE-Brt1	DE-Ch1	DE-D275	DE-Gla1	DE-Zi4	DE-Zwi2
ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTI0MS9qb2hva2FucmkuNDkuMjE1	DE-105	DE-14	DE-15	DE-82	DE-Brt1	DE-Ch1	DE-D275	DE-Gla1	DE-Zi4	DE-Zwi2

$ taskcat AIIntermediateSchema --date 2020-04-15 | head | ./span-tagger -debug -db amsl.db 2> /dev/null
ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTI0MS9qb2hva2FucmkuNDkuMTY1	DE-105, DE-14, DE-15, DE-82, DE-Brt1, DE-Ch1, DE-D275, DE-Gla1, DE-Zi4, DE-Zwi2
ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTI0MS9qb2hva2FucmkuNDkuMTc0	DE-105, DE-14, DE-15, DE-82, DE-Brt1, DE-Ch1, DE-D275, DE-Gla1, DE-Zi4, DE-Zwi2
ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTI0MS9qb2hva2FucmkuNDkuMTgy	DE-105, DE-14, DE-15, DE-82, DE-Brt1, DE-Ch1, DE-D275, DE-Gla1, DE-Zi4, DE-Zwi2
ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTI0MS9qb2hva2FucmkuNDkuMTkx	DE-105, DE-14, DE-15, DE-82, DE-Brt1, DE-Ch1, DE-D275, DE-Gla1, DE-Zi4, DE-Zwi2
ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTI0MS9qb2hva2FucmkuNDkuMTk5	DE-105, DE-14, DE-15, DE-82, DE-Brt1, DE-Ch1, DE-D275, DE-Gla1, DE-Zi4, DE-Zwi2
ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTI0MS9qb2hva2FucmkuNDkuMjAy	DE-105, DE-14, DE-15, DE-82, DE-Brt1, DE-Ch1, DE-D275, DE-Gla1, DE-Zi4, DE-Zwi2
ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTI0MS9qb2hva2FucmkuNDkuMjA1	DE-105, DE-14, DE-15, DE-82, DE-Brt1, DE-Ch1, DE-D275, DE-Gla1, DE-Zi4, DE-Zwi2
ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTI0MS9qb2hva2FucmkuNDkuMjA5	DE-105, DE-14, DE-15, DE-82, DE-Brt1, DE-Ch1, DE-D275, DE-Gla1, DE-Zi4, DE-Zwi2
ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTI0MS9qb2hva2FucmkuNDkuMjEz	DE-105, DE-14, DE-15, DE-82, DE-Brt1, DE-Ch1, DE-D275, DE-Gla1, DE-Zi4, DE-Zwi2
ai-49-aHR0cDovL2R4LmRvaS5vcmcvMTAuMTI0MS9qb2hva2FucmkuNDkuMjE1	DE-105, DE-14, DE-15, DE-82, DE-Brt1, DE-Ch1, DE-D275, DE-Gla1, DE-Zi4, DE-Zwi2

Package tagging is a rewrite of span-tag for applying licensing information to intermediate schema data. While span-tag uses a declarative approach (a JSON configuration), this package tries to express things in code: perhaps uglier and less declarative but, in the best case, more flexible.


Constants

const (
	// SLUBEZBKBART link to DE-14 KBART, to be included across all sources.
	SLUBEZBKBART         = "https://dbod.de/SLUB-EZB-KBART.zip"
	DE15FIDISSNWHITELIST = "DE15FIDISSNWHITELIST"
)

Variables

var UBLWISOPROFILE = map[string]struct{}{}/* 198 elements not displayed */

siskin/assets/wiso/645896059854847ce4ccd1416e11ba372e45bfd6.csv - TODO: Load this from file.
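
The variable above has the type map[string]struct{}, a common Go idiom for a set: the empty struct value occupies no memory, and membership is a single map lookup. The following is a minimal sketch of that idiom only; the set name `profile`, its placeholder keys, and the helper `inProfile` are hypothetical and are not part of this package.

```go
package main

import "fmt"

// profile is a hypothetical stand-in for a set such as UBLWISOPROFILE.
// The keys are made-up placeholders, not real entries from the package.
var profile = map[string]struct{}{
	"EXAMPLE-A": {},
	"EXAMPLE-B": {},
}

// inProfile reports whether key is a member of the given set.
func inProfile(set map[string]struct{}, key string) bool {
	_, ok := set[key]
	return ok
}

func main() {
	fmt.Println(inProfile(profile, "EXAMPLE-A"))
	fmt.Println(inProfile(profile, "EXAMPLE-Z"))
}
```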

Functions

This section is empty.

Types

type ConfigRow

type ConfigRow struct {
	ShardLabel                     string
	ISIL                           string
	SourceID                       string
	TechnicalCollectionID          string
	MegaCollection                 string
	HoldingsFileURI                string
	HoldingsFileLabel              string
	LinkToHoldingsFile             string
	EvaluateHoldingsFileForLibrary string
	ContentFileURI                 string
	ContentFileLabel               string
	LinkToContentFile              string
	ExternalLinkToContentFile      string
	ProductISIL                    string
	DokumentURI                    string
	DokumentLabel                  string
}

ConfigRow describes a single entry (e.g. an attachment request) from AMSL.

type Conjunction

type Conjunction int

Conjunction of terms, or holding files.

const (
	And Conjunction = iota
	Or
)

type HFCache

type HFCache struct {
	// contains filtered or unexported fields
}

HFCache wraps access to entries in multiple holding files. Internally, we map an identifier of a holding file (e.g. a URL) to another map from ISSN to the corresponding licensing entries. It uses a cache directory to avoid re-downloading files on every use.
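
The two-level mapping described above can be pictured as a nested Go map. The sketch below is an illustration of that shape only, not the package's unexported implementation: the types `entry` and `holdings`, their fields, and the ISSN and title used in main are all hypothetical. Only the URL is taken from the SLUBEZBKBART constant.

```go
package main

import "fmt"

// entry is a hypothetical, heavily simplified licensing entry.
type entry struct {
	Title string
	Begin string // first covered date, as found in a holdings file
	End   string // last covered date, empty if open-ended
}

// holdings maps a holding file identifier (e.g. a URL) to a map from
// ISSN to the licensing entries recorded for that ISSN.
type holdings map[string]map[string][]entry

// add registers an entry, creating the inner map on first use.
func (h holdings) add(hf, issn string, e entry) {
	if h[hf] == nil {
		h[hf] = make(map[string][]entry)
	}
	h[hf][issn] = append(h[hf][issn], e)
}

// lookup returns all entries for an ISSN in one holding file. Reading
// from a missing inner map is safe in Go and yields nil.
func (h holdings) lookup(hf, issn string) []entry {
	return h[hf][issn]
}

func main() {
	const hf = "https://dbod.de/SLUB-EZB-KBART.zip"
	h := holdings{}
	// The ISSN and title below are placeholders.
	h.add(hf, "1234-5678", entry{Title: "Example Journal", Begin: "2000-01-01"})
	fmt.Println(len(h.lookup(hf, "1234-5678")))
	fmt.Println(len(h.lookup("unknown-file", "1234-5678")))
}
```

Keying the outer map by file identifier means a holdings file shared by several ISILs only has to be parsed once.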

func (*HFCache) Covered

func (c *HFCache) Covered(doc *finc.IntermediateSchema, conj Conjunction, hfs ...string) (ok bool, err error)

Covered returns true if a document is covered by all given KBART files (like the "and" filter in the former filterconfig). TODO: Merge Covered and Covers methods.

func (*HFCache) Covers

func (c *HFCache) Covers(hflink string, doc *finc.IntermediateSchema) (ok bool, err error)

Covers returns true if a holdings file, given by link or filename, covers the document. The cache takes care of downloading the file if necessary.

type Labeler

type Labeler struct {
	// contains filtered or unexported fields
}

Labeler updates an intermediate schema document. We mostly need: ISIL, SourceID, MegaCollection, TechnicalCollectionID, HoldingsFileURI, EvaluateHoldingsFileForLibrary.

func New

func New(dbFile string) (*Labeler, error)

New returns an initialized labeler using a relational AMSL representation in an sqlite3 file. It fails if the database cannot be opened.

func (*Labeler) Labels

func (l *Labeler) Labels(doc *finc.IntermediateSchema) ([]string, error)

Labels returns the list of ISILs that are interested in this document.
