verbaflow

package module
v0.0.0-...-4d121eb
Published: Feb 22, 2023 License: BSD-2-Clause Imports: 14 Imported by: 1

README

VerbaFlow

Welcome to VerbaFlow, a neural architecture written in Go designed specifically for language modeling tasks. Built on the robust RWKV RNN, this model is optimized for efficient performance on standard CPUs, enabling smooth running of relatively large language models even on consumer hardware.

With the ability to utilize pretrained models on the Pile dataset, VerbaFlow performs comparably to GPT-like Transformer models in predicting the next token, as well as in other tasks such as text summarization, text classification, question answering, and general conversation.

Installation

Requirements:

  • Go (a recent version of the toolchain)

Clone this repo or get the library:

go get -u github.com/nlpodyssey/verbaflow

Usage

To start using VerbaFlow, we recommend the pre-trained model RWKV-4-Pile-1B5-Instruct, available on the Hugging Face Hub. This model was pretrained on the Pile dataset and then fine-tuned on the xP3 dataset, so it is specially suited to understanding and executing human instructions. The original RWKV-4-Pile-1B5-Instruct-test2-20230209 model, from which it is derived, is also available on the Hugging Face Hub.

The library is optimized to run on x86-64 CPUs. If you want to run it on a different architecture, you can set the GOARCH=amd64 environment variable.

The following commands can be used to build and use VerbaFlow:

go build ./cmd/verbaflow

This command builds the Go program and creates an executable named verbaflow.

./verbaflow -model-dir models/nlpodyssey/RWKV-4-Pile-1B5-Instruct download

This command downloads the specified model (in this case, "nlpodyssey/RWKV-4-Pile-1B5-Instruct") into the "models" directory.

./verbaflow -model-dir models/nlpodyssey/RWKV-4-Pile-1B5-Instruct convert

This command converts the downloaded model to the format used by the program.

./verbaflow -log-level trace -model-dir models/nlpodyssey/RWKV-4-Pile-1B5-Instruct inference --address :50051

This command runs the gRPC inference endpoint on the specified model.

Please make sure to have the necessary dependencies installed before running the above commands.

Examples

One of the most interesting features of large language models is their ability to respond differently depending on the prompt.

Run the verbaflow gRPC endpoint with the inference command shown above, then run the prompttester example and enter the following prompts:

Example 1

Prompt:

echo '\nQ: Briefly: The Universe is expanding, its constituent galaxies flying apart like pieces of cosmic shrapnel in the aftermath of the Big Bang. Which section of a newspaper would this article likely appear in?\n\nA:' | go run ./examples/prompttester --dconfig ./examples/prompttester/config.yaml

Expected output:

Science and Technology

Example 2

Prompt:

echo '\nQ: Translate the following text from French to English: Je suis le père le plus heureux du monde\n\nA:' | go run ./examples/prompttester --dconfig ./examples/prompttester/config.yaml

Expected output:

I am the happiest father in the world.

Dependencies

The main dependencies are listed in the go.mod file.

Roadmap

  • Download pretrained models from the Hugging Face models hub
  • Effective "prompts" catalog
  • Better sampling
  • Beam search
  • Better Tokenizer
  • Unit tests
  • Code refactoring
  • Documentation
  • gRPC/HTTP API

Credits

Thanks to PENG Bo for creating the RWKV RNN and all the related resources, including pre-trained models!

Trivia about the project's name

"VerbaFlow" combines "verba", which is the Latin word for words, and "flow", which alludes to the characteristics of recurrent neural networks by evoking the idea of a fluent and continuous flow of words, which is made possible by the network's ability to maintain an internal state and "remember" previous words and context when generating new words.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func BuildPromptFromTemplate

func BuildPromptFromTemplate(input InputPrompt, pt *template.Template) (string, error)

BuildPromptFromTemplate builds a prompt applying the given input to the template.

func BuildPromptFromTemplateFile

func BuildPromptFromTemplateFile(input InputPrompt, filename string) (string, error)

BuildPromptFromTemplateFile builds a prompt applying the given input to the template file.

Types

type InputPrompt

type InputPrompt struct {
	Text           string `json:"text"`
	Question       string `json:"question,omitempty"`
	TargetLanguage string `json:"target_language,omitempty"`
}

InputPrompt is the input for the prompt generation.

type VerbaFlow

type VerbaFlow struct {
	Model     *rwkvlm.Model
	Tokenizer tokenizer.Tokenizer
	// contains filtered or unexported fields
}

VerbaFlow is the core struct of the library.

func Load

func Load(modelDir string) (*VerbaFlow, error)

Load loads a VerbaFlow model from the given directory.

func (*VerbaFlow) Close

func (vf *VerbaFlow) Close() error

Close closes the model resources.

func (*VerbaFlow) Generate

func (vf *VerbaFlow) Generate(ctx context.Context, nt *ag.NodesTracker, prompt string, chGen chan decoder.GeneratedToken, opts decoder.DecodingOptions) error

Generate generates text from the given prompt. The chGen channel is used to stream the generated tokens. The generated text will be at most the configured maximum number of tokens long (in addition to the prompt).

func (*VerbaFlow) TokenByID

func (vf *VerbaFlow) TokenByID(id int) (string, error)

TokenByID returns the token string for the given token ID.

Directories

Path Synopsis
cmd
examples
prompttester Module
Package sliceutils provides types and functions for various operations over slices of different types.
