hashtags

Version: v0.6.3
Published: Apr 5, 2024 License: GPL-3.0 Imports: 11 Imported by: 1

README

HashTags


Purpose

Sometimes one might want to find so-called #hashtags or @mentions in one's texts (in a broader sense) and store them for later retrieval. This package offers that facility. It provides the THashList type, which can be used to parse texts for occurrences of both #hashtags and @mentions and to store the hits in an internal list for later lookup; that list can be written to a file and later read back from it.

Installation

You can use Go to install this package for you:

go get -u github.com/mwat56/hashtags

Usage

In principle, a list of IDs is maintained for each #hashtag or @mention. These IDs can be any (string) data that identifies the text in which the #hashtag or @mention was found, e.g. a filename or some database record reference. The only condition is that each ID is unique as far as the program using this package is concerned.

Note that both #hashtags and @mentions are stored in lower case to allow for case-insensitive searches.
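
The idea of one ID list per lower-cased tag can be pictured with a small stand-alone sketch. It does not use the package itself: the `tagIndex` type and its `add` method are hypothetical names made up for illustration, and the package's internal data structure may well differ.

```go
package main

import (
	"fmt"
	"strings"
)

// tagIndex is a hypothetical stand-in for the package's internal list:
// it maps a lower-cased #hashtag/@mention to the IDs it was found in.
type tagIndex map[string][]string

// add stores aID under the lower-cased aTag, skipping empty
// arguments and IDs that are already present for that tag.
func (ti tagIndex) add(aTag, aID string) {
	if aTag == "" || aID == "" {
		return
	}
	key := strings.ToLower(aTag)
	for _, id := range ti[key] {
		if id == aID {
			return
		}
	}
	ti[key] = append(ti[key], aID)
}

func main() {
	idx := tagIndex{}
	idx.add("#GoLang", "post-1.md")
	idx.add("#golang", "post-2.md")
	idx.add("#GOLANG", "post-1.md") // same tag and ID: ignored

	fmt.Println(idx["#golang"]) // prints "[post-1.md post-2.md]"
}
```

Because every key is lower-cased on the way in, a lookup only has to lower-case the search term once to be case-insensitive.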

To get a THashList instance there's a simple way:

fName := "mytags.lst"
htl, err := hashtags.New(fName)
if nil != err {
    log.Printf("Problem loading file '%s': %v", fName, err)
}
// …
// do something with the list
// …
written, err := htl.Store()
if nil != err {
    log.Printf("Problem writing file '%s': %v", fName, err)
}
// use the byte count so that `written` is not an unused variable
log.Printf("Wrote %d bytes to '%s'", written, fName)

The package provides a boolean configuration variable called UseBinaryStorage, which is true by default. It determines whether the data written by Store() and read by Load() uses a binary data format or plain text (i.e. hashtags.UseBinaryStorage = false). The advantage of the plain text format is that it can be inspected with any text-related tool (e.g. grep or diff). The advantage of the binary format is that it is about three to four times as fast when loading/storing data and uses a few bytes less than the text format. For these reasons it is used by default (i.e. hashtags.UseBinaryStorage == true); while developing your own application with this package, however, you might want to switch to the text format for diagnostic purposes.

For more details please refer to the package documentation.

Libraries

No external libraries were used building HashTags.

Licence

    Copyright © 2019, 2022 M.Watermann, 10247 Berlin, Germany
                    All rights reserved
                EMail : <support@mwat.de>

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.

This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

You should have received a copy of the GNU General Public License along with this program. If not, see the GNU General Public License (http://www.gnu.org/licenses/gpl.html) for details.


Documentation

Overview

Package hashtags implements a simple #hashtag/@mentions handler.

Copyright © 2019, 2024 M.Watermann, 10247 Berlin, Germany
All rights reserved
EMail : <support@mwat.de>

Constants

This section is empty.

Variables

var (
	// UseBinaryStorage determines whether to use binary storage
	// or not (i.e. plain text).
	//
	// Loading/storing binary `THashList` data is about three times
	// as fast as reading and parsing plain text data.
	UseBinaryStorage = true
)

Functions

This section is empty.

Types

type TCountItem

type TCountItem = struct {
	Count int    // number of IDs for this #hashtag/@mention
	Tag   string // name of #hashtag/@mention
}

TCountItem holds a #hashtag/@mention and its number of occurrences.

@see CountedList()

type THashList

type THashList struct {
	// contains filtered or unexported fields
}

THashList is a list of `#hashtags` and `@mentions` pointing to sources (i.e. IDs).

func New

func New(aFilename string) (*THashList, error)

New returns a new `THashList` instance after reading the given file.

If the hash file doesn't exist, that is not considered an error. If there is an error, it will be of type `*PathError`.

`aFilename` is the name of the file to use for reading and storing.

func (*THashList) Checksum

func (hl *THashList) Checksum() uint32

Checksum returns the list's CRC32 checksum.

This method can be used to get a kind of 'footprint'.
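
A minimal sketch of the 'footprint' idea, using only the standard library: the same content always yields the same CRC32 value, so two values can be compared to detect a change. The `footprint` helper and the sample strings are hypothetical, and the choice of the IEEE polynomial is an assumption; this documentation does not say how the package computes its checksum.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// footprint returns the IEEE CRC32 of aData. Which CRC32 polynomial
// the package uses internally is an assumption of this sketch.
func footprint(aData string) uint32 {
	return crc32.ChecksumIEEE([]byte(aData))
}

func main() {
	before := footprint("#golang\npost-1.md\n")
	after := footprint("#golang\npost-1.md\npost-2.md\n")

	// Differing footprints signal that the content has changed.
	fmt.Println("changed:", before != after)
}
```

In an application, one plausible use is to remember the value of `Checksum()` after loading and to call `Store()` only if the value differs later on.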

func (*THashList) Clear

func (hl *THashList) Clear() *THashList

Clear empties the internal data structures: all `#hashtags` and `@mentions` are deleted.

func (*THashList) CountedList

func (hl *THashList) CountedList() (rList []TCountItem)

CountedList returns a list of #hashtags/@mentions with their respective count of associated IDs.
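
What such a counted list looks like can be sketched without the package. The `countItem` type below only mirrors the shape of `TCountItem`, `countedList` is a hypothetical helper, and sorting the result by tag name is an assumption of this sketch rather than documented behaviour.

```go
package main

import (
	"fmt"
	"sort"
)

// countItem mirrors the shape of TCountItem for this sketch.
type countItem struct {
	Count int    // number of IDs for this tag
	Tag   string // name of the #hashtag/@mention
}

// countedList turns a tag -> IDs map into a list of counts,
// sorted by tag name (the ordering is an assumption of this sketch).
func countedList(aIndex map[string][]string) []countItem {
	result := make([]countItem, 0, len(aIndex))
	for tag, ids := range aIndex {
		result = append(result, countItem{Count: len(ids), Tag: tag})
	}
	sort.Slice(result, func(i, j int) bool {
		return result[i].Tag < result[j].Tag
	})
	return result
}

func main() {
	index := map[string][]string{
		"#golang": {"post-1.md", "post-2.md"},
		"@alice":  {"post-2.md"},
	}
	for _, item := range countedList(index) {
		fmt.Printf("%s: %d\n", item.Tag, item.Count)
	}
}
```

A list of this shape is handy for rendering a tag cloud or a "most used tags" overview.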

func (*THashList) Filename

func (hl *THashList) Filename() string

Filename returns the configured filename for reading/storing this list.

func (*THashList) HashAdd

func (hl *THashList) HashAdd(aHash, aID string) *THashList

HashAdd appends `aID` to the list of `aHash`.

If either `aHash` or `aID` is an empty string, the call is silently ignored (i.e. this method does nothing).

`aHash` is the list index to look up.
`aID` is to be added to the hash list.

func (*THashList) HashCount added in v0.5.0

func (hl *THashList) HashCount() int

HashCount returns the number of hashtags in the list.

func (*THashList) HashLen

func (hl *THashList) HashLen(aHash string) int

HashLen returns the number of IDs stored for `aHash`.

`aHash` identifies the ID list to look up.

func (*THashList) HashList

func (hl *THashList) HashList(aHash string) []string

HashList returns a list of IDs associated with `aHash`.

`aHash` identifies the ID list to look up.

func (*THashList) HashRemove

func (hl *THashList) HashRemove(aHash, aID string) *THashList

HashRemove deletes `aID` from the list of `aHash`.

`aHash` identifies the ID list to look up.
`aID` is the source to remove from the list.

func (*THashList) IDlist

func (hl *THashList) IDlist(aID string) (rList []string)

IDlist returns a list of #hashtags and @mentions associated with `aID`.

func (*THashList) IDparse

func (hl *THashList) IDparse(aID string, aText []byte) *THashList

IDparse checks whether `aText` contains strings starting with `#` or `@` and, if found, adds them to the respective list.

`aID` is the ID to add to the list.
`aText` is the text to search.
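
The kind of scan described here can be sketched with a regular expression. The pattern below is deliberately naive and is an assumption: the package's actual expression is not shown in this documentation and is likely stricter. `tagRE` and `findTags` are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// tagRE is a deliberately simple pattern: '#' or '@' followed by
// letters, digits or underscores. It is an assumption of this sketch,
// not the expression used by the package.
var tagRE = regexp.MustCompile(`[#@][\p{L}\p{N}_]+`)

// findTags returns the lower-cased, de-duplicated #hashtags and
// @mentions in aText, in order of first appearance.
func findTags(aText []byte) []string {
	var result []string
	seen := make(map[string]bool)
	for _, match := range tagRE.FindAll(aText, -1) {
		tag := strings.ToLower(string(match))
		if !seen[tag] {
			seen[tag] = true
			result = append(result, tag)
		}
	}
	return result
}

func main() {
	text := []byte("Thanks @Alice for the #GoLang tip; more #golang soon.")
	fmt.Println(findTags(text)) // prints "[@alice #golang]"
}
```

A pattern this simple also matches the `@example` in `user@example.com`, which is one reason a real parser has to be more careful about what precedes and follows a match.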

func (*THashList) IDremove

func (hl *THashList) IDremove(aID string) *THashList

IDremove deletes all #hashtags/@mentions associated with `aID`.

`aID` is to be deleted from all lists.

func (*THashList) IDrename

func (hl *THashList) IDrename(aOldID, aNewID string) *THashList

IDrename replaces all occurrences of `aOldID` by `aNewID`.

This method is intended for rare cases when the ID of a document needs to get changed.

`aOldID` is to be replaced in all lists.
`aNewID` is the replacement in all lists.

func (*THashList) IDupdate

func (hl *THashList) IDupdate(aID string, aText []byte) *THashList

IDupdate checks `aText` removing all #hashtags/@mentions no longer present and adding #hashtags/@mentions new in `aText`.

`aID` is the ID to update.
`aText` is the text to use.
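
The update described above boils down to a set difference between the tags stored for an ID and the tags found in the new text. The following stand-alone sketch shows that step only; `diffTags` is a hypothetical helper, and how the package implements the update internally is not stated here.

```go
package main

import "fmt"

// diffTags compares the tags previously stored for an ID (aOld) with
// the tags found in the new text (aNew) and reports which tags to
// remove and which to add.
func diffTags(aOld, aNew []string) (rRemove, rAdd []string) {
	oldSet := make(map[string]bool, len(aOld))
	for _, tag := range aOld {
		oldSet[tag] = true
	}
	newSet := make(map[string]bool, len(aNew))
	for _, tag := range aNew {
		if newSet[tag] {
			continue // duplicate in aNew: already handled
		}
		newSet[tag] = true
		if !oldSet[tag] {
			rAdd = append(rAdd, tag)
		}
	}
	for _, tag := range aOld {
		if !newSet[tag] {
			rRemove = append(rRemove, tag)
		}
	}
	return
}

func main() {
	stored := []string{"#golang", "#draft", "@alice"}
	current := []string{"#golang", "@alice", "@bob"}

	remove, add := diffTags(stored, current)
	fmt.Println("remove:", remove) // prints "remove: [#draft]"
	fmt.Println("add:", add)       // prints "add: [@bob]"
}
```

Tags present in both lists are left untouched, so an edit that changes no tags results in no changes to the list.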

func (*THashList) Len

func (hl *THashList) Len() int

Len returns the current length of the list, i.e. how many #hashtags and @mentions are currently stored in it.

func (*THashList) LenTotal

func (hl *THashList) LenTotal() (rLen int)

LenTotal returns the length of all #hashtag/@mention lists together.

func (*THashList) Load

func (hl *THashList) Load() (*THashList, error)

Load reads the configured file returning the data structure read from the file and a possible error condition.

If the hash file doesn't exist, that is not considered an error. If there is an error, it will be of type `*PathError`.

func (*THashList) MentionAdd

func (hl *THashList) MentionAdd(aMention, aID string) *THashList

MentionAdd appends `aID` to the list of `aMention`.

If either `aMention` or `aID` is an empty string, the call is silently ignored (i.e. this method does nothing).

`aMention` is the list index to look up.
`aID` is to be added to the mention list.

func (*THashList) MentionCount added in v0.5.0

func (hl *THashList) MentionCount() int

MentionCount returns the number of mentions in the list.

func (*THashList) MentionLen

func (hl *THashList) MentionLen(aMention string) int

MentionLen returns the number of IDs stored for `aMention`.

`aMention` identifies the ID list to look up.

func (*THashList) MentionList

func (hl *THashList) MentionList(aMention string) []string

MentionList returns a list of IDs associated with `aMention`.

`aMention` identifies the ID list to look up.

func (*THashList) MentionRemove

func (hl *THashList) MentionRemove(aMention, aID string) *THashList

MentionRemove deletes `aID` from the list of `aMention`.

`aMention` identifies the ID list to look up.
`aID` is the source to remove from the list.

func (*THashList) SetFilename

func (hl *THashList) SetFilename(aFilename string) *THashList

SetFilename sets the filename (`aFilename`) to be used by this list.

func (*THashList) Store

func (hl *THashList) Store() (int, error)

Store writes the whole list to the configured file returning the number of bytes written and a possible error.

If there is an error, it will be of type `*PathError`.

func (*THashList) String

func (hl *THashList) String() string

String returns the whole list as a linefeed-separated string.

func (*THashList) Walk

func (hl *THashList) Walk(aFunc TWalkFunc)

Walk traverses through all entries in the #hashtag/@mention lists calling `aFunc` for each entry.

If `aFunc` returns `false` when called, the respective ID will be removed from the associated #hashtag/@mention.

`aFunc` is the function called for each ID in all lists.

func (*THashList) Walker

func (hl *THashList) Walker(aWalker THashWalker)

Walker traverses through all entries in the #hashtag/@mention lists calling `aWalker` for each entry.

`aWalker` is an object implementing the `THashWalker` interface.

type THashWalker

type THashWalker interface {
	Walk(aHash, aID string) bool
}

THashWalker is used by `Walker()` when visiting an entry in the #hashtag/@mentions lists.

`aHash` is the #hashtag/@mention of the entry being visited.
`aID` is the ID of the entry being visited.
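
An object-based walker is useful when the visit needs to carry state. The sketch below defines a local `hashWalker` interface with the same method set as `THashWalker` and a hypothetical `idCounter` that tallies how many tags each ID appears under. The `visit` function is only a stand-in for `Walker()`; it ignores the boolean result because this documentation does not say how `Walker()` treats it.

```go
package main

import "fmt"

// hashWalker has the same method set as THashWalker.
type hashWalker interface {
	Walk(aHash, aID string) bool
}

// idCounter is a hypothetical walker that counts under how many
// #hashtags/@mentions each ID is listed.
type idCounter struct {
	counts map[string]int
}

// Walk records one more occurrence of aID.
func (ic *idCounter) Walk(aHash, aID string) bool {
	ic.counts[aID]++
	return true
}

// visit is a stand-in for Walker(): it passes every tag/ID pair
// to aWalker and ignores the returned value.
func visit(aIndex map[string][]string, aWalker hashWalker) {
	for tag, ids := range aIndex {
		for _, id := range ids {
			aWalker.Walk(tag, id)
		}
	}
}

func main() {
	index := map[string][]string{
		"#golang": {"post-1.md", "post-2.md"},
		"@alice":  {"post-2.md"},
	}
	counter := &idCounter{counts: make(map[string]int)}
	visit(index, counter)
	fmt.Println(counter.counts) // prints "map[post-1.md:1 post-2.md:2]"
}
```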

type TWalkFunc

type TWalkFunc func(aHash, aID string) (rValid bool)

TWalkFunc is used by `Walk()` when visiting an entry in the #hashtag/@mention lists.

see `Walk()`
