gollama


GoLlama: Llama.cpp IPC Library

=======================================================================

GoLlama is a lightweight inter-process communication library for building LLM applications with Go and llama.cpp. It provides a simple, intuitive way to interact with any model supported by the llama.cpp runtime over stdin/stdout.


Usage

To use GoLlama, import the gollama package and create an instance of the LLM struct. You can then use the PromptModel method to exchange prompts and responses with any LLM supported by llama.cpp. Here is an example:


package main

import (
	"fmt"

	"github.com/CenturySturgeon/gollama"
)

func main() {
	// Point GoLlama at the model file and the llama.cpp folder, and
	// offload 30 layers to the GPU.
	llm := gollama.LLM{Model: "../llama.cpp/models/llama-2-13b-chat.ggmlv3.q4_0.bin", Llamacpp: "../llama.cpp", Ngl: 30}

	outputs, err := llm.PromptModel([]string{"Hi, how are you?"})
	if err != nil {
		fmt.Println("Error occurred on prompt:", err)
		return
	}
	fmt.Println(outputs)
}

This example demonstrates how to interact with a llama-2-13b-chat model running on top of llama.cpp using GoLlama's PromptModel method. It prompts the LLM with "Hi, how are you?", reads the response into the outputs variable, and prints it.

Cloning the repo

GoLlama requires a local copy of llama.cpp in order to communicate with any LLM. Run the following command to clone the repository alongside the llama.cpp submodule:

git clone --recursive https://github.com/CenturySturgeon/gollama.git

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type LLM

type LLM struct {
	Model            string   // Path to the model .bin file
	Llamacpp         string   // Path to the llama.cpp folder
	CudaDevices      []int    // Indices of the CUDA devices to use
	CtxSize          int      // Size of the prompt context
	Temp             float32  // Temperature
	TopK             int      // Top-k sampling
	RepeatPenalty    float32  // Penalty for repeated sequences of tokens
	Ngl              int      // Number of layers to store in VRAM
	CpuCores         int      // Number of physical CPU cores
	MaxTokens        int      // Max number of tokens in the model response
	Stop             []string // Strings that stop generation when encountered
	InstructionBlock string   // Instructions used to format the model response
}
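
For reference, a more fully configured LLM value might look like the sketch below. The field values are illustrative assumptions, not recommended defaults; tune them for your model and hardware.

llm := gollama.LLM{
	Model:         "../llama.cpp/models/llama-2-13b-chat.ggmlv3.q4_0.bin",
	Llamacpp:      "../llama.cpp",
	CudaDevices:   []int{0},          // use the first CUDA device
	CtxSize:       2048,              // prompt context window
	Temp:          0.8,               // sampling temperature
	TopK:          40,                // top-k sampling
	RepeatPenalty: 1.1,               // discourage repeated token sequences
	Ngl:           30,                // layers offloaded to VRAM
	CpuCores:      8,                 // physical CPU cores to use
	MaxTokens:     256,               // cap the response length
	Stop:          []string{"User:"}, // stop generation at these strings
}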

func (*LLM) BufferPromptModel

func (llm *LLM) BufferPromptModel(prompt string, outputChan chan<- string)

BufferPromptModel prompts the model and streams its output in real time, allowing you to consume the response as it is being generated. It sends the LLM's response tokens as strings to the provided channel.
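
As a sketch of how this might be consumed (the prompt and paths are illustrative, and the loop assumes the library closes the channel once generation finishes):

package main

import (
	"fmt"

	"github.com/CenturySturgeon/gollama"
)

func main() {
	llm := gollama.LLM{Model: "../llama.cpp/models/llama-2-13b-chat.ggmlv3.q4_0.bin", Llamacpp: "../llama.cpp", Ngl: 30}

	tokens := make(chan string)

	// Run the prompt in a separate goroutine so tokens can be consumed
	// from the channel as they arrive.
	go llm.BufferPromptModel("Tell me a short story.", tokens)

	// Print each token as soon as it is generated. This assumes the
	// channel is closed when the response is complete.
	for token := range tokens {
		fmt.Print(token)
	}
	fmt.Println()
}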

func (*LLM) GetLLMProps

func (llm *LLM) GetLLMProps()

GetLLMProps reads the properties currently set on the LLM struct.

func (*LLM) PromptModel

func (llm *LLM) PromptModel(prompts []string) ([]string, error)

PromptModel prompts the LLM with the provided prompts in order, engaging in a conversation-like exchange. It returns an array of the LLM's responses, where each response matches the index of its prompt.
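
A multi-prompt exchange might look like the following sketch (the prompts and paths are illustrative):

package main

import (
	"fmt"

	"github.com/CenturySturgeon/gollama"
)

func main() {
	llm := gollama.LLM{Model: "../llama.cpp/models/llama-2-13b-chat.ggmlv3.q4_0.bin", Llamacpp: "../llama.cpp", Ngl: 30}

	prompts := []string{
		"Give me three names for a pet turtle.",
		"Which of those names is the shortest?",
	}

	outputs, err := llm.PromptModel(prompts)
	if err != nil {
		fmt.Println("Error occurred on prompt:", err)
		return
	}

	// Each response shares the index of the prompt that produced it.
	for i, output := range outputs {
		fmt.Printf("Prompt:   %s\nResponse: %s\n", prompts[i], output)
	}
}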

