libraries

v0.0.0-...-584605b (Latest)
Warning: This package is not in the latest version of its module.
Published: Mar 15, 2022 License: Apache-2.0 Imports: 11 Imported by: 0

README

Generic library functions

CleanHTML
clean an HTML source by removing attributes and styles, returning the raw content
FileTypeCheck
check if the file type of given source path matches given file type
DateInSlice
check if a given date is in the given slice
DownloadFile
download a file given the source and destination
EnsureDirectory
make the directory if it does not exist
ExtractDomain
extract the main domain from a given source path
ExtractFileName
extract filename from a given source path
FixUrl
convert relative URLs to absolute URLs
HTMLStringToDoc
convert an HTML string to a queryable document
Maximum
return maximum of a positive number slice
Minimum
return minimum of a positive number slice
ObjectIdInSlice
check if a given ObjectId exists in a given slice
ParseCategoriesString
convert a categories string into a slice
ParsePdf
read and extract the content of a PDF at a given source filepath
ProcessNameString
standardize titles to make them URL-compatible by removing error-prone characters
StringContainsAnyInSlice
check if a given string is contained in any string in a given slice
StringInSlice
check if a given string exists in a given slice
StringMatchPercentage
check the similarity percentage of two given strings

Documentation

Constants

const NewPageMarker = "\n*******************\n"

Variables

This section is empty.

Functions

func DateInSlice

func DateInSlice(slice []time.Time, element time.Time) bool

* check if a given date exists in a given date slice
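
The membership check described above can be sketched as follows. This is an illustration, not the package's source: `dateInSlice`, `day` and `sample` are hypothetical names, and comparing with `time.Time.Equal` (same instant, regardless of location) is an assumption about how equality is decided.

```go
package main

import (
	"fmt"
	"time"
)

// day is a hypothetical helper that builds a date at midnight UTC.
func day(year int, month time.Month, d int) time.Time {
	return time.Date(year, month, d, 0, 0, 0, 0, time.UTC)
}

// sample is illustrative data only.
var sample = []time.Time{day(2022, 3, 15), day(2022, 3, 16)}

// dateInSlice reports whether element occurs in slice.
// Assumption: two dates match when they denote the same instant.
func dateInSlice(slice []time.Time, element time.Time) bool {
	for _, t := range slice {
		if t.Equal(element) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(dateInSlice(sample, day(2022, 3, 15))) // true
	fmt.Println(dateInSlice(sample, day(2022, 4, 1)))  // false
}
```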

func DownloadFile

func DownloadFile(filePath string, url string) error

* download the file at url and save it to filePath
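
A download helper with this signature might look roughly like the sketch below, which uses only the standard library. The names `downloadFile` and `demo` are hypothetical, and treating any non-200 status as an error is an assumption; the package's actual behavior is not documented here. The demo serves a fixed body from a local `httptest` server so that no external URL is needed.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"os"
	"path/filepath"
)

// downloadFile fetches url and writes the response body to filePath.
// Assumption: any status other than 200 OK is reported as an error.
func downloadFile(filePath string, url string) error {
	resp, err := http.Get(url)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status: %s", resp.Status)
	}
	out, err := os.Create(filePath)
	if err != nil {
		return err
	}
	if _, err := io.Copy(out, resp.Body); err != nil {
		out.Close()
		return err
	}
	return out.Close()
}

// demo downloads from a local test server and returns the saved content.
func demo() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "hello")
	}))
	defer srv.Close()

	dir, err := os.MkdirTemp("", "download")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	dst := filepath.Join(dir, "out.txt")
	if err := downloadFile(dst, srv.URL); err != nil {
		return "", err
	}
	data, err := os.ReadFile(dst)
	return string(data), err
}

func main() {
	got, err := demo()
	fmt.Println(got, err)
}
```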

func EnsureDirectory

func EnsureDirectory(filePath string) error

* make the directory if it does not exist
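
The README describes this function as making a directory if it does not exist. A minimal sketch of that idea, assuming the argument is the directory itself (not a file inside it) and that missing parents should be created too, can rely on `os.MkdirAll`, which returns nil when the directory already exists. `ensureDirectory` and `demo` are hypothetical names.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// ensureDirectory creates dirPath and any missing parents.
// Assumption: permission bits 0755 are acceptable for new directories.
func ensureDirectory(dirPath string) error {
	return os.MkdirAll(dirPath, 0o755)
}

// demo creates a nested directory twice and reports whether it exists.
func demo() (bool, error) {
	root, err := os.MkdirTemp("", "ensure")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(root)

	dir := filepath.Join(root, "a", "b")
	for i := 0; i < 2; i++ { // the second call finds the directory in place
		if err := ensureDirectory(dir); err != nil {
			return false, err
		}
	}
	info, err := os.Stat(dir)
	if err != nil {
		return false, err
	}
	return info.IsDir(), nil
}

func main() {
	ok, err := demo()
	fmt.Println(ok, err)
}
```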

func ExtractDomain

func ExtractDomain(link string) string

* extract the main domain from a given source path
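
What counts as the "main domain" is not specified. The sketch below assumes it means the host name without the port and without a leading "www."; `extractDomain` is a hypothetical stand-in built on `net/url`, not the package's code.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// extractDomain returns the host of link without port or "www." prefix.
// Assumption: unparsable links, and links without a scheme (which
// url.Parse treats as a bare path), yield an empty string.
func extractDomain(link string) string {
	u, err := url.Parse(link)
	if err != nil {
		return ""
	}
	return strings.TrimPrefix(u.Hostname(), "www.")
}

func main() {
	fmt.Println(extractDomain("https://www.example.com:8080/docs/a.pdf")) // example.com
}
```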

func ExtractFileName

func ExtractFileName(link string) string

* extract filename from a given source path
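
One way to read "filename" is the last path segment of the link, ignoring any query string. The sketch below makes that assumption; `extractFileName` is a hypothetical name and the empty-string result for directory-like links is an illustrative choice.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// extractFileName returns the last path segment of link.
// Assumption: links that end in "/" or have no path carry no filename.
func extractFileName(link string) string {
	u, err := url.Parse(link)
	if err != nil || u.Path == "" || strings.HasSuffix(u.Path, "/") {
		return ""
	}
	return path.Base(u.Path)
}

func main() {
	fmt.Println(extractFileName("https://example.com/docs/report.pdf?download=1")) // report.pdf
}
```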

func FileTypeCheck

func FileTypeCheck(link string, fileType string) bool

* check if the file type of given source path matches given file type
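
A file type check of this kind can be approximated by comparing the extension of the link's path with the requested type. The sketch assumes a case-insensitive comparison and accepts the type with or without a leading dot; neither detail is confirmed by the documentation, and `fileTypeCheck` is a hypothetical name.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// fileTypeCheck reports whether the path of link ends in fileType.
// Assumption: "pdf", ".pdf" and "PDF" are all treated as the same type.
func fileTypeCheck(link string, fileType string) bool {
	u, err := url.Parse(link)
	if err != nil {
		return false
	}
	ext := strings.TrimPrefix(path.Ext(u.Path), ".")
	want := strings.TrimPrefix(fileType, ".")
	return ext != "" && strings.EqualFold(ext, want)
}

func main() {
	fmt.Println(fileTypeCheck("https://example.com/a/Report.PDF", "pdf")) // true
	fmt.Println(fileTypeCheck("https://example.com/a/notes.txt", "pdf"))  // false
}
```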

func FixUrl

func FixUrl(href, base string) string

* convert relative URLs to absolute URLs
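
Resolving a relative reference against a base is what `(*url.URL).ResolveReference` in the standard library does, so a function with this signature could plausibly be a thin wrapper around it. The sketch below is such a wrapper; `fixURL` is a hypothetical name, and returning `href` unchanged on a parse error is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
)

// fixURL resolves href against base following RFC 3986 rules.
// Assumption: if either argument fails to parse, href is returned as-is.
func fixURL(href, base string) string {
	b, err := url.Parse(base)
	if err != nil {
		return href
	}
	h, err := url.Parse(href)
	if err != nil {
		return href
	}
	return b.ResolveReference(h).String()
}

func main() {
	base := "https://example.com/news/index.html"
	fmt.Println(fixURL("/docs/a.pdf", base))             // https://example.com/docs/a.pdf
	fmt.Println(fixURL("b.html", base))                  // https://example.com/news/b.html
	fmt.Println(fixURL("https://other.org/c.html", base)) // https://other.org/c.html
}
```

An href that is already absolute passes through unchanged, so the helper can be applied to every link on a page without checking first.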

func HTMLStringToDoc

func HTMLStringToDoc(resp string) (*goquery.Document, error)

* convert an HTML string to a queryable document

func Maximum

func Maximum(list []int) int

* return the maximum of a slice of positive numbers

func Minimum

func Minimum(list []int) int

* return the minimum of a slice of positive numbers
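
Maximum and Minimum are both single-pass scans. The sketch below shows one way to write them; `maximum` and `minimum` are hypothetical names, and returning 0 for an empty slice is an assumption, since the documentation does not say what happens in that case. Seeding the scan with the first element, not with 0, keeps the result correct even if the input contains negative values.

```go
package main

import "fmt"

// maximum returns the largest value in list.
// Assumption: an empty slice yields 0.
func maximum(list []int) int {
	if len(list) == 0 {
		return 0
	}
	m := list[0]
	for _, v := range list[1:] {
		if v > m {
			m = v
		}
	}
	return m
}

// minimum returns the smallest value in list.
// Assumption: an empty slice yields 0.
func minimum(list []int) int {
	if len(list) == 0 {
		return 0
	}
	m := list[0]
	for _, v := range list[1:] {
		if v < m {
			m = v
		}
	}
	return m
}

func main() {
	values := []int{7, 3, 12, 5}
	fmt.Println(maximum(values), minimum(values)) // 12 3
}
```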

func ObjectIdInSlice

func ObjectIdInSlice(slice []bson.ObjectId, element bson.ObjectId) bool

* check if a given ObjectId exists in a given slice

func ParseCategoriesString

func ParseCategoriesString(categoriesString string) []string

* convert a categories string into a slice

func ParsePdf

func ParsePdf(source string) string

* return the string content of a given PDF file
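
The package exports the constant NewPageMarker, which suggests (this is an assumption, not something the documentation states) that the returned string separates PDF pages with that marker. Under that assumption, the pages could be recovered as sketched below; `splitPages` is a hypothetical helper and the constant is copied here so the example stands alone.

```go
package main

import (
	"fmt"
	"strings"
)

// NewPageMarker mirrors the constant exported by the package.
const NewPageMarker = "\n*******************\n"

// splitPages splits extracted PDF text into per-page strings.
// Assumption: pages in the input are joined by NewPageMarker.
func splitPages(content string) []string {
	return strings.Split(content, NewPageMarker)
}

func main() {
	content := "first page" + NewPageMarker + "second page"
	for i, page := range splitPages(content) {
		fmt.Printf("page %d: %s\n", i+1, page)
	}
}
```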

func ProcessNameString

func ProcessNameString(stringValue string) string

* standardize titles to make them URL-compatible by removing error-prone characters

func StringContainsAnyInSlice

func StringContainsAnyInSlice(slice []string, element string) bool

* check if a given string is contained in any string in a given string slice

func StringInSlice

func StringInSlice(slice []string, element string) bool

* check if a given string exists in a given slice
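
StringInSlice and StringContainsAnyInSlice differ only in the comparison: exact equality versus substring containment. The sketch below contrasts the two, following the documented wording that the element is looked for inside the slice entries. The lowercase function names are hypothetical stand-ins, and case-sensitive matching is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// stringInSlice reports whether element equals any entry of slice.
func stringInSlice(slice []string, element string) bool {
	for _, s := range slice {
		if s == element {
			return true
		}
	}
	return false
}

// stringContainsAnyInSlice reports whether element is a substring
// of at least one entry of slice.
func stringContainsAnyInSlice(slice []string, element string) bool {
	for _, s := range slice {
		if strings.Contains(s, element) {
			return true
		}
	}
	return false
}

func main() {
	titles := []string{"annual report 2021", "press release"}
	fmt.Println(stringInSlice(titles, "report"))            // false
	fmt.Println(stringContainsAnyInSlice(titles, "report")) // true
}
```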

func StringMatchPercentage

func StringMatchPercentage(string1 string, string2 string) int

* match strings using Levenshtein distance; translated from C to Go. Source: https://en.wikipedia.org/wiki/Levenshtein_distance
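
The Levenshtein distance counts the single-character insertions, deletions and substitutions needed to turn one string into another. The sketch below computes it with the usual two-row dynamic programme and then converts it to a percentage. How the package normalizes the distance is not documented, so the formula `100 * (maxLen - distance) / maxLen` is an assumption, as are the names `levenshtein`, `matchPercentage` and `min3`.

```go
package main

import "fmt"

// min3 returns the smallest of three ints.
func min3(a, b, c int) int {
	if b < a {
		a = b
	}
	if c < a {
		a = c
	}
	return a
}

// levenshtein returns the edit distance between a and b, counted in runes.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	prev := make([]int, len(rb)+1)
	curr := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j // distance from the empty prefix of a
	}
	for i := 1; i <= len(ra); i++ {
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev, curr = curr, prev
	}
	return prev[len(rb)]
}

// matchPercentage maps the distance to 0..100 (integer division).
// Assumption: two empty strings count as a 100% match.
func matchPercentage(a, b string) int {
	maxLen := len([]rune(a))
	if n := len([]rune(b)); n > maxLen {
		maxLen = n
	}
	if maxLen == 0 {
		return 100
	}
	return 100 * (maxLen - levenshtein(a, b)) / maxLen
}

func main() {
	fmt.Println(levenshtein("kitten", "sitting"))     // 3
	fmt.Println(matchPercentage("kitten", "sitting")) // 57
}
```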

func StringsMatch

func StringsMatch(string1 string, string2 string, tolerance int) bool

* return a boolean indicating whether two strings match within a given tolerance

Types

This section is empty.
