atom

Version: v0.0.0-...-4110d32
Published: Aug 12, 2016 License: Apache-2.0 Imports: 3 Imported by: 0

README

Overview

Package atom provides integer codes (also known as atoms) for a fixed set of frequently occurring strings.

An atom is a unique ID for a frequently occurring name. It is cheaper to store than the name itself, and two atoms can be compared in O(1).

hello := atom.New("hello")

Atom values are unique within a process and depend on the order in which the atoms were created.

hello.Value()

The atom name/value mapping can be saved as a snapshot and serialized to disk or a database.

data, cache := atom.Save()

When a new process starts later, or on a remote machine, loading the snapshot restores the atom name/value mapping.

atom.Load(data, cache)
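The following is a minimal, self-contained sketch of the idea behind the save/load round trip. It does not use this package: the `table` type and its `intern`, `save`, and `load` helpers are hypothetical, and it hands out plain sequential IDs rather than the packed values described under Internal. It only illustrates why values depend on creation order and why restoring a snapshot makes them agree again.

```go
package main

import "fmt"

// table is a hypothetical, simplified intern table. It assigns each new
// name the next integer ID, so IDs depend on creation order.
type table struct {
	names []string          // ID -> name
	ids   map[string]uint32 // name -> ID
}

func newTable() *table {
	return &table{ids: make(map[string]uint32)}
}

// intern returns the existing ID for s, or assigns the next free one.
func (t *table) intern(s string) uint32 {
	if id, ok := t.ids[s]; ok {
		return id
	}
	id := uint32(len(t.names))
	t.names = append(t.names, s)
	t.ids[s] = id
	return id
}

// save returns a copy of the names in ID order.
func (t *table) save() []string {
	return append([]string(nil), t.names...)
}

// load rebuilds a table from a snapshot, so every name gets its old ID back.
func load(snapshot []string) *table {
	t := newTable()
	for _, s := range snapshot {
		t.intern(s)
	}
	return t
}

func main() {
	a := newTable()
	hello := a.intern("hello")
	world := a.intern("world")

	// A fresh table that interns in a different order assigns different IDs.
	b := newTable()
	b.intern("world")
	fmt.Println(b.intern("hello") == hello) // false

	// A table restored from the snapshot agrees with the original.
	c := load(a.save())
	fmt.Println(c.intern("hello") == hello, c.intern("world") == world) // true true
}
```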

Install

Get and build package source.

go get -u github.com/flier/atom

Get and install the genatoms command, which generates atoms.

go install github.com/flier/atom/cmd/genatoms

Internal

An atom is a 4-byte uint32 that holds either the offset and length of the atom name in a global buffer, or a short name (less than 4 bytes) embedded directly in the value.

When the highest bit is 0, the atom refers to a long string stored in the global buffer.

+--------+--------------------------+
|0| len  |          offset          |
+--------+--------------------------+
|01234567|01234567|01234567|01234567|
+--------+--------------------------+
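As a rough illustration of the long-atom layout, the sketch below packs a length and an offset into a uint32 and unpacks them again. It assumes that bit 0 in the diagram is the most significant bit, so the length occupies the 7 bits after the flag and the offset occupies the low 24 bits. The names `packLong` and `unpackLong` are hypothetical and not part of the package.

```go
package main

import "fmt"

const (
	lenBits    = 7
	offsetBits = 24
	maxLen     = 1<<lenBits - 1    // 127
	maxOffset  = 1<<offsetBits - 1 // 16777215
)

// packLong builds the value for a name stored at data[offset:offset+length].
// It reports false if either field does not fit in its bit range.
func packLong(offset, length int) (uint32, bool) {
	if length < 0 || length > maxLen || offset < 0 || offset > maxOffset {
		return 0, false
	}
	// The length never exceeds 127, so the highest bit stays 0.
	return uint32(length)<<offsetBits | uint32(offset), true
}

// unpackLong splits a long atom value back into offset and length.
func unpackLong(v uint32) (offset, length int) {
	return int(v & maxOffset), int(v >> offsetBits & maxLen)
}

func main() {
	data := []byte("helloworld")
	v, _ := packLong(5, 5) // "world" starts at offset 5 and is 5 bytes long
	off, n := unpackLong(v)
	fmt.Printf("0x%08x %s\n", v, data[off:off+n]) // 0x05000005 world
}
```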

When the highest bit is 1, the atom embeds a short string in the value. The first byte must be less than 0x80.

+--------+--------------------------+
|1|str[0]| str[1] | str[2] | str[3] |
+--------+--------------------------+
|1|  3   | str[0] | str[1] | str[2] |
+--------+--------------------------+
|1|  2   |        | str[0] | str[1] |
+--------+--------------------------+
|1|  1   |        |        | str[0] |
+--------+--------------------------+
|01234567|01234567|01234567|01234567|
+--------+--------------------------+

Because the length field is 7 bits wide, the maximum length of an atom name is 127 bytes.

Atom.IsEmbedded reports whether the atom embeds its name.
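The embedded layout for names of 1 to 3 bytes can be sketched as follows. This is an illustration based only on the diagram, again assuming bit 0 is the most significant bit. The helpers `isEmbedded`, `embedShort`, and `shortName` are hypothetical, and the 4-byte row, where the first byte holds str[0] instead of a length, is deliberately left out.

```go
package main

import "fmt"

const embeddedFlag = 1 << 31

// isEmbedded reports whether the highest bit of v is set.
func isEmbedded(v uint32) bool { return v&embeddedFlag != 0 }

// embedShort packs a 1 to 3 byte name: the flag bit, the length in the rest
// of the first byte, and the characters right-aligned in the other bytes.
func embedShort(s string) (uint32, bool) {
	n := len(s)
	if n < 1 || n > 3 {
		return 0, false
	}
	v := uint32(embeddedFlag) | uint32(n)<<24
	for i := 0; i < n; i++ {
		v |= uint32(s[i]) << (8 * uint(n-1-i))
	}
	return v, true
}

// shortName recovers the name from a value produced by embedShort.
func shortName(v uint32) string {
	n := int(v >> 24 & 0x7f)
	buf := make([]byte, n)
	for i := range buf {
		buf[i] = byte(v >> (8 * uint(n-1-i)))
	}
	return string(buf)
}

func main() {
	v, _ := embedShort("go")
	fmt.Printf("0x%08x %s %t\n", v, shortName(v), isEmbedded(v)) // 0x8200676f go true
}
```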

Pregenerated Atoms

A built-in atom buffer and cache can be generated with the command:

$ genatoms -i atoms.txt -o atom.go -p atom -test

It scans the input file atoms.txt and extracts all Go identifiers; the atom data and cache are saved to the output file atom.go with the package name atom.

-case-insensitive
      case-insensitive atom (default true)
-format
      format the generated code (default true)
-i string
      read atom from the input file (default STDIN)
-o string
      write atom table to the output file (default STDOUT)
-p string
      generated package name (default "atom")
-test
      generate test table for the atom data

Extracted atoms are case insensitive by default.
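To make the extraction step concrete, here is a sketch of one way to pull identifiers out of an input text. It is not the genatoms implementation. It assumes that case insensitivity can be modelled by lowercasing, it matches ASCII identifiers only, and the `extractNames` helper is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// identRE matches ASCII identifiers. Go itself also allows Unicode letters.
var identRE = regexp.MustCompile(`[A-Za-z_][A-Za-z0-9_]*`)

// extractNames returns the distinct identifiers found in src, sorted.
// When fold is true, names are lowercased before deduplication.
func extractNames(src string, fold bool) []string {
	seen := make(map[string]bool)
	var names []string
	for _, id := range identRE.FindAllString(src, -1) {
		if fold {
			id = strings.ToLower(id)
		}
		if !seen[id] {
			seen[id] = true
			names = append(names, id)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	fmt.Println(extractNames("div Span\nDIV href", true)) // [div href span]
}
```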

Documentation


Constants

const Empty = Atom(0)

Empty is the empty atom, which has no name.

const MaxAtomLen = 127

MaxAtomLen is the maximum length of an atom name.

Variables

This section is empty.

Functions

func Load

func Load(data []byte, cache Cache)

Load loads the atom data and cache.

Types

type Atom

type Atom uint32

Atom is an integer code for a string.

The zero value maps to "".

func Lookup

func Lookup(s string) Atom

Lookup returns the atom whose name is s.

It returns Empty if there is no such atom. The lookup is case sensitive.

func New

func New(s string) Atom

New returns the existing atom whose name is s, or creates it if it does not exist.

It returns Empty if s is longer than MaxAtomLen.

func (Atom) Bytes

func (a Atom) Bytes() []byte

Bytes returns the bytes of the atom name.

func (Atom) Hash

func (a Atom) Hash() uint64

Hash returns the hash value of the atom name.

func (Atom) IsEmbedded

func (a Atom) IsEmbedded() bool

IsEmbedded reports whether the atom contains an embedded string.

func (Atom) IsEmpty

func (a Atom) IsEmpty() bool

IsEmpty reports whether the atom is empty.

func (Atom) Len

func (a Atom) Len() int

Len returns the length of the atom name.

func (Atom) String

func (a Atom) String() string

func (Atom) Value

func (a Atom) Value() uint32

Value returns the atom value.

type Cache

type Cache map[uint64]Atom

Cache is a hash-based atom cache.
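A hash-keyed cache of this shape could be used roughly as below. The `Atom` and `Cache` declarations mirror the ones documented here, but everything else is an assumption: `hashName` uses FNV-1a from the standard library as a stand-in, since the hash the package actually uses is not stated, and `lookup` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

type Atom uint32

type Cache map[uint64]Atom

// hashName is a stand-in hash (FNV-1a), not necessarily the package's own.
func hashName(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s))
	return h.Sum64()
}

// lookup returns the cached atom for s, or the zero Atom when s is absent.
// A real implementation would also compare names to rule out hash collisions.
func lookup(c Cache, s string) Atom {
	return c[hashName(s)]
}

func main() {
	c := Cache{hashName("hello"): Atom(0x05000000)}
	fmt.Println(lookup(c, "hello") != 0, lookup(c, "missing") == 0) // true true
}
```

Keying the map by a 64-bit hash instead of the string keeps the cache compact and easy to serialize next to the data buffer, at the cost of needing a collision check.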

func Save

func Save() ([]byte, Cache)

Save saves the atom data and cache.

Directories

Path Synopsis
cmd
