snip

v0.0.0-...-8add2db
Published: Jul 27, 2023 License: GPL-3.0 Imports: 13 Imported by: 0

README

Snip

A simple personal data tool, backed with SQLite, full-text searchable.

The snip utility stores and retrieves text and binary data. A basic document is plain text, and binary data can be attached to the document. Text in the body of the document is automatically analyzed and pushed to a term-matrix stored in SQLite. This allows for very fast searching. Documents have a UUID that is generated upon creation.

Build / Install

Move to the binary source directory:

cd cmd/snip

build

This will produce a binary named snip in the current directory.

go build

install

This will install the binary to the directory set in $GOBIN. You can add that directory to your PATH, or put the binary wherever you choose after running the build command.

go install

Subcommands / Actions

add

You can add data from standard input or read it from a local file. The data is stored in a local SQLite database named .snip.sqlite3 in the user's home directory.

echo "This is some data I'd like to remember" | snip add
snip add -f my_quick_note.txt

When a new document is added, a new UUID is generated by which the document can be referenced. This id is reported upon creation.

added snip uuid: 26f15658-a648-4e4b-939e-a0500b2b9677

list

You can list all items with either short or full uuids:

sh:~$ snip ls
uuid     name
99bc71c7 Wikipedia - Wren
ca808a9a Interesting files
fff22eb7 Odds of collisions for UUIDs

Add the -l option to display full ids.

99bc71c7-573c-403d-a560-996bde675030 Wikipedia - Wren
ca808a9a-ee52-4d1a-aa63-54673241a41b Interesting files
fff22eb7-4b7a-4914-9c1c-7b7c48fe7c26 Odds of collisions for UUIDs

get

Partial ids are allowed for convenience. For non-formatted text, the fold command is often useful.

sh:~$ snip get 99bc7 | fold -sw 80
uuid: 99bc71c7-573c-403d-a560-996bde675030
name: Wikipedia - Wren
timestamp: 2023-06-30T02:43:28.371895-07:00
----
https://en.wikipedia.org/wiki/Wren

Wrens are a family of brown passerine birds in the predominantly New World
family Troglodytidae. The family includes 88 species divided into 19 genera.
Only the Eurasian wren occurs in the Old World, where, in Anglophone regions,
...
----
attachments:
uuid                                      bytes name
ccd1627f-1e51-45be-980e-f6169cf49337      22276 Cistothorus_palustris_Iona.jpg

attach

Attach binary files to a document.

sh:~$ snip attach add 644d6c1b-c16c-4b85-b245-36b389f87476 Cistothorus_palustris_Iona.jpg
attaching files to snip 644d6c1b-c16c-4b85-b245-36b389f87476 Wikipedia - Wren
attached Cistothorus_palustris_Iona.jpg 22276 bytes
sh:~$ snip attach add ca808a9a-ee52-4d1a-aa63-54673241a41b "Glacier National Park.pdf"
attaching files to snip ca808a9a-ee52-4d1a-aa63-54673241a41b Interesting files
attached Glacier National Park.pdf 165448 bytes

You can list all known attachments.

sh:~$ snip attach ls
uuid                                       size name
ccd1627f-1e51-45be-980e-f6169cf49337      22276 Cistothorus_palustris_Iona.jpg
d0d68511-4f71-4346-9f56-a61fe92e1a9c     165448 Glacier National Park.pdf

You can write an attachment to a local file using the saved name, or a custom name.

sh:~$ snip attach write ccd1627f-1e51-45be-980e-f6169cf49337
Cistothorus_palustris_Iona.jpg written -> Cistothorus_palustris_Iona.jpg 22276 bytes
sh:~$ snip attach write ccd1627f-1e51-45be-980e-f6169cf49337 wren_picture.jpg
Cistothorus_palustris_Iona.jpg written -> wren_picture.jpg 22276 bytes

search

All documents are analyzed, and their stemmed terms are stored in a document term-matrix in SQLite. Search results show each match with its surrounding context, along with per-term counts and the total word count of the document.

sh:~$ snip search bird nature zealand
Wikipedia - Wren
  99bc71c7 (score: 0.347756, words: 148) [bird: 2, zealand: 1]
    [3-15] "are a family of brown passerine birds in the predominantly New World family"
    [58-70] "has been applied to other, unrelated birds, particularly the New Zealand wrens (Acanthisittidae)"
    [62-74] "other, unrelated birds, particularly the New Zealand wrens (Acanthisittidae) and the Australian wrens"

Interesting files
  ca808a9a (score: 0.266667, words: 23) [natur: 1]
    [11-23] "later. This is mostly information about nature, the environment, and other ecological conerns."

Notes

database location

The utility honors the environment variable SNIP_DB for the location of the SQLite file. Set it to store the database file somewhere other than the home directory.
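The lookup described above can be sketched as follows. This is a minimal illustration, not the package's code: resolveDBPath is a hypothetical helper, and the assumption is that an empty SNIP_DB falls back to .snip.sqlite3 in the home directory, as the README describes.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// defaultDBName is the database file name given in the README.
const defaultDBName = ".snip.sqlite3"

// resolveDBPath is a hypothetical helper: it returns override when it is
// non-empty, otherwise the default file name inside home.
func resolveDBPath(override, home string) string {
	if override != "" {
		return override
	}
	return filepath.Join(home, defaultDBName)
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot determine home directory:", err)
		os.Exit(1)
	}
	// SNIP_DB is the variable named in the README.
	fmt.Println(resolveDBPath(os.Getenv("SNIP_DB"), home))
}
```

Keeping the decision in a pure function makes it easy to check both branches without touching the real environment.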

interesting things

sqlite3 -table .snip.sqlite3 "select uuid, term, count, positions from snip_index" | fzf --no-sort --tac --preview "snip get {2} | grep -Ei --color=always '{4}\w*|$' | fold -sw 100"

In fzf, separate search terms with spaces; they may be given in any order.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func CreateNewDatabase

func CreateNewDatabase() error

CreateNewDatabase creates a new sqlite3 database

func CumulativeTermsCount

func CumulativeTermsCount(id uuid.UUID) (int, error)

CumulativeTermsCount returns a total of all occurrences of all known terms in a document's search index

func DownCase

func DownCase(words []string) []string

DownCase returns a slice of strings that have been cased down

func DropIndex

func DropIndex() error

DropIndex drops the search index from the database

func FlattenString

func FlattenString(input string) string

FlattenString returns a string with all newlines, tabs, and spaces squeezed
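One way to get this kind of squeezing with the standard library is shown below. It is a sketch under the assumption that "squeezed" means every run of whitespace becomes a single space; the flatten function is illustrative and not taken from the package.

```go
package main

import (
	"fmt"
	"strings"
)

// flatten collapses every run of whitespace (spaces, tabs, newlines) into a
// single space and trims both ends. Hypothetical stand-in for FlattenString.
func flatten(input string) string {
	return strings.Join(strings.Fields(input), " ")
}

func main() {
	fmt.Printf("%q\n", flatten("line one\n\tline   two\n"))
}
```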

func GetAllSnipIDs

func GetAllSnipIDs() ([]uuid.UUID, error)

GetAllSnipIDs returns a slice of all known snip uuids

func GetAttachmentsAll

func GetAttachmentsAll() ([]uuid.UUID, error)

GetAttachmentsAll returns a slice of uuids for all attachments in the system

func GetAttachmentsUUID

func GetAttachmentsUUID(snipUUID uuid.UUID) ([]uuid.UUID, error)

GetAttachmentsUUID returns a slice of attachment uuids associated with the supplied snip uuid

func GetIndexTermCount

func GetIndexTermCount(term string, id uuid.UUID) (int, error)

GetIndexTermCount returns the indexed count of a term for the document matching id

func InsertSnip

func InsertSnip(s Snip) error

InsertSnip adds a new Snip to the database

func IsWord

func IsWord(word string) bool

IsWord determines if a string is a valid word using unicode functions
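The source does not say which unicode checks IsWord applies. As a hedged sketch, the version below assumes a word is a non-empty string made up entirely of letters; isWord is a hypothetical name and the rule is an assumption.

```go
package main

import (
	"fmt"
	"unicode"
)

// isWord reports whether word is non-empty and consists only of Unicode
// letters. This rule is an assumption, not the package's documented behavior.
func isWord(word string) bool {
	if word == "" {
		return false
	}
	for _, r := range word {
		if !unicode.IsLetter(r) {
			return false
		}
	}
	return true
}

func main() {
	for _, w := range []string{"wren", "Zürich", "1234", "new-world"} {
		fmt.Println(w, isWord(w))
	}
}
```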

func Remove

func Remove(id uuid.UUID) error

Remove removes a snip from the database

func RemoveAttachment

func RemoveAttachment(id uuid.UUID) error

RemoveAttachment deletes an attachment from the database

func ScoreCounts

func ScoreCounts(id uuid.UUID, terms []string, counts []SearchCount) (float64, error)

ScoreCounts returns a floating point score for search result validity

func SearchIndexTerm

func SearchIndexTerm(terms []string, requireAll bool) (map[uuid.UUID][]SearchCount, error)

SearchIndexTerm searches the index and returns results matching the given terms
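The meaning of requireAll is not documented. A plausible reading is that it switches between "any term matches" and "every term matches"; the sketch below illustrates that reading over an in-memory map instead of the SQLite index. The names matchDocs and sampleIndex are hypothetical, and the sample counts are copied from the search output in the README.

```go
package main

import (
	"fmt"
	"sort"
)

// sampleIndex maps a short document id to its stemmed term counts.
var sampleIndex = map[string]map[string]int{
	"99bc71c7": {"bird": 2, "zealand": 1},
	"ca808a9a": {"natur": 1},
}

// matchDocs returns the sorted ids of documents containing at least one of
// terms, or all of them when requireAll is true (assumed semantics).
func matchDocs(index map[string]map[string]int, terms []string, requireAll bool) []string {
	var ids []string
	for id, counts := range index {
		hits := 0
		for _, t := range terms {
			if counts[t] > 0 {
				hits++
			}
		}
		if hits == 0 || (requireAll && hits < len(terms)) {
			continue
		}
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}

func main() {
	terms := []string{"bird", "natur", "zealand"}
	fmt.Println(matchDocs(sampleIndex, terms, false))
	fmt.Println(matchDocs(sampleIndex, terms, true))
}
```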

func ShortenUUID

func ShortenUUID(id uuid.UUID) []string
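ShortenUUID has no doc comment. Its []string return type suggests that it splits the UUID into its hyphen-separated groups, the first of which matches the 8-character ids printed by snip ls. The sketch below shows that interpretation on a plain string; shortenUUID is a hypothetical stand-in and the behavior is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// shortenUUID splits the canonical text form of a UUID into its five
// hyphen-separated groups. Assumed behavior, not the package's code.
func shortenUUID(id string) []string {
	return strings.Split(id, "-")
}

func main() {
	parts := shortenUUID("99bc71c7-573c-403d-a560-996bde675030")
	// parts[0] is the leading group, the short form shown in listings.
	fmt.Println(parts[0], len(parts))
}
```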

func SplitWords

func SplitWords(data string) []string

SplitWords splits words using unicode standard splitting functions
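Which splitting functions the package uses is not stated. One standard-library approach that splits on Unicode whitespace is bufio.ScanWords, sketched below; splitWords is a hypothetical name and the choice of scanner is an assumption.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// splitWords returns the whitespace-separated tokens of data using
// bufio.ScanWords, which treats any Unicode space as a separator.
func splitWords(data string) []string {
	scanner := bufio.NewScanner(strings.NewReader(data))
	scanner.Split(bufio.ScanWords)
	var words []string
	for scanner.Scan() {
		words = append(words, scanner.Text())
	}
	return words
}

func main() {
	fmt.Println(splitWords("New Zealand\twrens\n(Acanthisittidae)"))
}
```

Punctuation stays attached to the tokens at this stage; removing it is a separate step.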

func StripPunctuation

func StripPunctuation(words []string) []string

StripPunctuation strips all commas, periods, etc. from a slice of strings
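A minimal sketch of this normalization step, together with the lowercasing that DownCase describes, might look like the following. It assumes punctuation is only trimmed from the ends of each word and that words left empty are dropped; both choices and both function names are assumptions.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// stripPunctuation trims leading and trailing punctuation from each word and
// drops words that end up empty. Interior punctuation is left untouched.
func stripPunctuation(words []string) []string {
	out := make([]string, 0, len(words))
	for _, w := range words {
		if w = strings.TrimFunc(w, unicode.IsPunct); w != "" {
			out = append(out, w)
		}
	}
	return out
}

// downCase returns a lowercased copy of words.
func downCase(words []string) []string {
	out := make([]string, len(words))
	for i, w := range words {
		out[i] = strings.ToLower(w)
	}
	return out
}

func main() {
	words := []string{"other,", "unrelated", "birds,", "(Acanthisittidae)", "-"}
	fmt.Println(downCase(stripPunctuation(words)))
}
```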

func WriteAttachment

func WriteAttachment(id uuid.UUID, outfile string, forceWrite bool) (int, error)

WriteAttachment writes the attached file to the current working directory
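The forceWrite parameter presumably controls whether an existing file may be overwritten. Assuming that is the case, the sketch below shows one way to get that behavior from os.OpenFile: O_EXCL makes the open fail when the file already exists, and O_TRUNC replaces it. writeFile and demo are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// writeFile writes data to path and returns the number of bytes written.
// Unless force is true, it refuses to replace an existing file.
func writeFile(path string, data []byte, force bool) (int, error) {
	flags := os.O_WRONLY | os.O_CREATE
	if force {
		flags |= os.O_TRUNC
	} else {
		flags |= os.O_EXCL
	}
	f, err := os.OpenFile(path, flags, 0o644)
	if err != nil {
		return 0, err
	}
	n, err := f.Write(data)
	if cerr := f.Close(); err == nil {
		err = cerr
	}
	return n, err
}

// demo writes a file, tries to write it again without force, then with force.
func demo() (first int, refused bool, forced int, err error) {
	dir, err := os.MkdirTemp("", "snip-sketch")
	if err != nil {
		return 0, false, 0, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "wren_picture.jpg")

	first, err = writeFile(path, []byte("abc"), false)
	if err != nil {
		return first, false, 0, err
	}
	_, secondErr := writeFile(path, []byte("abcd"), false)
	refused = errors.Is(secondErr, fs.ErrExist)
	forced, err = writeFile(path, []byte("abcde"), true)
	return first, refused, forced, err
}

func main() {
	fmt.Println(demo())
}
```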

Types

type Attachment

type Attachment struct {
	UUID      uuid.UUID
	Data      []byte
	Size      int
	SnipUUID  uuid.UUID
	Timestamp time.Time
	Name      string
}

Attachment represents data (binary safe) associated with a specific snip

func GetAttachmentFromUUID

func GetAttachmentFromUUID(searchUUID string) (Attachment, error)

func GetAttachmentMetadata

func GetAttachmentMetadata(searchUUID uuid.UUID) (Attachment, error)

GetAttachmentMetadata returns all fields except Data for analysis without large memory use

func GetAttachments

func GetAttachments(searchUUID uuid.UUID) ([]Attachment, error)

GetAttachments returns a slice of Attachment associated with the supplied snip uuid

func NewAttachment

func NewAttachment() Attachment

NewAttachment returns a new attachment struct with current defaults

type SearchCount

type SearchCount struct {
	Term  string
	Stem  string
	Count int
}

SearchCount contains info about a search term frequency from the index

type SearchResult

type SearchResult struct {
	UUID  uuid.UUID
	Terms []SearchCount
}

type SearchScore

type SearchScore struct {
	UUID         uuid.UUID
	Score        float64
	SearchCounts []SearchCount
}

type Snip

type Snip struct {
	Attachments []Attachment
	Data        string
	Timestamp   time.Time
	Name        string
	UUID        uuid.UUID
}

Snip represents a snippet of data with additional metadata

func GetFromUUID

func GetFromUUID(searchUUID string) (Snip, error)

GetFromUUID retrieves a single Snip by its unique identifier

func List

func List(limit int) ([]Snip, error)

List returns a slice of all Snips in the database

func New

func New() Snip

New returns a new snippet and generates a new UUID for it

func SearchDataTerm

func SearchDataTerm(term string) ([]Snip, error)

SearchDataTerm returns a slice of Snips whose data matches the supplied term

func SearchUUID

func SearchUUID(term string) ([]Snip, error)

SearchUUID returns a slice of Snips with uuids matching partial search term
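The README example (snip get 99bc7) uses a leading fragment of the id. Whether the package matches prefixes only or any substring is not stated; the sketch below assumes prefix matching and works on plain strings. matchPrefix and sampleIDs are hypothetical names, and the ids are taken from the README listing.

```go
package main

import (
	"fmt"
	"strings"
)

// sampleIDs are the full uuids shown in the README listing.
var sampleIDs = []string{
	"99bc71c7-573c-403d-a560-996bde675030",
	"ca808a9a-ee52-4d1a-aa63-54673241a41b",
	"fff22eb7-4b7a-4914-9c1c-7b7c48fe7c26",
}

// matchPrefix returns every id that starts with partial, ignoring case.
// An empty partial matches nothing, so a blank argument cannot select all.
func matchPrefix(ids []string, partial string) []string {
	partial = strings.ToLower(strings.TrimSpace(partial))
	if partial == "" {
		return nil
	}
	var out []string
	for _, id := range ids {
		if strings.HasPrefix(strings.ToLower(id), partial) {
			out = append(out, id)
		}
	}
	return out
}

func main() {
	fmt.Println(matchPrefix(sampleIDs, "99bc7"))
}
```

A caller would typically treat more than one match as ambiguous and ask for a longer fragment.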

func (*Snip) Attach

func (s *Snip) Attach(name string, data []byte) error

Attach adds files associated with a snip

func (*Snip) CountWords

func (s *Snip) CountWords() int

CountWords returns an integer estimating the number of words in data

func (*Snip) GatherContext

func (s *Snip) GatherContext(term string, adjacent int) ([]TermContext, error)

GatherContext returns the surrounding words matching the given term
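A context window of this kind can be sketched as below. The termContext type mirrors the fields of the package's TermContext, but the matching rule (case-insensitive prefix, so the stem "bird" finds "birds") and the 0-based, inclusive word positions are assumptions. gatherContext and sampleWords are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// termContext mirrors the fields of TermContext for illustration.
type termContext struct {
	Before      []string
	BeforeStart int
	Term        string
	After       []string
	AfterEnd    int
}

// sampleWords is the opening of the wren text from the README, split on whitespace.
var sampleWords = strings.Fields("Wrens are a family of brown passerine birds in the predominantly New World family Troglodytidae")

// gatherContext finds each word beginning with term and returns up to
// adjacent words on either side, clamped to the bounds of words.
func gatherContext(words []string, term string, adjacent int) []termContext {
	term = strings.ToLower(term)
	if term == "" {
		return nil
	}
	var out []termContext
	for i, w := range words {
		if !strings.HasPrefix(strings.ToLower(w), term) {
			continue
		}
		start := i - adjacent
		if start < 0 {
			start = 0
		}
		end := i + adjacent
		if end > len(words)-1 {
			end = len(words) - 1
		}
		out = append(out, termContext{
			Before:      words[start:i],
			BeforeStart: start,
			Term:        w,
			After:       words[i+1 : end+1],
			AfterEnd:    end,
		})
	}
	return out
}

func main() {
	for _, c := range gatherContext(sampleWords, "bird", 2) {
		fmt.Printf("[%d-%d] %v %s %v\n", c.BeforeStart, c.AfterEnd, c.Before, c.Term, c.After)
	}
}
```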

func (*Snip) GenerateName

func (s *Snip) GenerateName(wordCount int) string

GenerateName returns a clean string derived from processing the data field

func (*Snip) GetPositions

func (s *Snip) GetPositions(term string) (string, error)

GetPositions gets the position indicators for a given term

func (*Snip) Index

func (s *Snip) Index() error

Index stems all data and writes it to a search table
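To illustrate what a per-document term index holds, the sketch below counts each term and records its word positions in memory. It is deliberately simplified: there is no stemming (the package stems terms, so "birds" would be stored as "bird"), nothing is written to SQLite, and termEntry and buildIndex are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// termEntry holds how often a term occurs and at which 0-based word positions.
type termEntry struct {
	Count     int
	Positions []int
}

// buildIndex lowercases each whitespace-separated word, trims surrounding
// punctuation, and accumulates counts and positions per term. No stemming.
func buildIndex(text string) map[string]termEntry {
	index := make(map[string]termEntry)
	for pos, raw := range strings.Fields(text) {
		term := strings.ToLower(strings.TrimFunc(raw, unicode.IsPunct))
		if term == "" {
			continue
		}
		e := index[term]
		e.Count++
		e.Positions = append(e.Positions, pos)
		index[term] = e
	}
	return index
}

func main() {
	index := buildIndex("Wrens are birds. Small birds, brown birds!")
	fmt.Println(index["birds"].Count, index["birds"].Positions)
}
```

Storing positions alongside counts is what lets a search show the words surrounding each match.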

func (*Snip) Rename

func (s *Snip) Rename(newName string) error

Rename updates the name field of a snip

func (*Snip) SetIndexTermCount

func (s *Snip) SetIndexTermCount(term string, count int) error

SetIndexTermCount inserts or updates the count of a term indexed

func (*Snip) SetPositions

func (s *Snip) SetPositions(term string, positions []int) error

SetPositions writes the word positions of a given term

func (*Snip) Update

func (s *Snip) Update() error

Update writes all fields, overwriting existing snip data

type TermContext

type TermContext struct {
	Before      []string
	BeforeStart int
	Term        string
	After       []string
	AfterEnd    int
}

Directories

Path Synopsis
cmd
