llama

Published: May 8, 2024 License: MIT


LLM Platform

Open Source LLM Platform to build and deploy applications at scale


Integrations & Configuration

LLM Providers
OpenAI Platform

https://platform.openai.com/docs/api-reference

providers:
  - type: openai
    token: sk-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    models:
      gpt-3.5-turbo:
        id: gpt-3.5-turbo-1106

      gpt-4:
        id: gpt-4-1106-preview
        
      text-embedding-ada-002:
        id: text-embedding-ada-002
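With a provider configured, the model aliases above (e.g. `gpt-3.5-turbo`) become addressable through the platform's HTTP API. As a rough sketch, assuming the platform exposes an OpenAI-compatible `/v1/chat/completions` endpoint on the same `http://localhost:8080` base URL used by the index endpoints later in this README (both the path and port are assumptions here):

```python
import json
import urllib.request

# Hypothetical chat request against the platform's OpenAI-compatible API.
# "gpt-3.5-turbo" is the alias from the provider config above, which the
# platform would resolve to the configured id (gpt-3.5-turbo-1106).
payload = {
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}],
}

req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Once the platform is running, send it with:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the aliases are provider-independent, the same request shape works unchanged when the alias is backed by Azure OpenAI, Ollama, or llama.cpp instead.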
Azure OpenAI Service

https://azure.microsoft.com/en-us/products/ai-services/openai-service

providers:
  - type: openai
    url: https://xxxxxxxx.openai.azure.com
    token: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

    models:
      gpt-3.5-turbo:
        id: gpt-35-turbo-16k

      gpt-4:
        id: gpt-4-32k
        
      text-embedding-ada-002:
        id: text-embedding-ada-002
Anthropic

https://www.anthropic.com/api

providers:
  - type: anthropic
    token: sk-ant-apixx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    models:
      claude-3-opus:
        id: claude-3-opus-20240229
Ollama

https://ollama.ai

$ ollama start
$ ollama run mistral
providers:
  - type: ollama
    url: http://localhost:11434

    models:
      mistral-7b-instruct:
        id: mistral
LLAMA.CPP

https://github.com/ggerganov/llama.cpp/tree/master/examples/server

# using taskfile.dev
$ task llama:server

# LLAMA.CPP Server
$ llama-server --port 9081 --log-disable --model ./models/mistral-7b-instruct-v0.2.Q4_K_M.gguf

# LLAMA.CPP Server (Multimodal Model)
$ llama-server --port 9081 --log-disable --model ./models/llava-v1.5-7b-Q4_K.gguf --mmproj ./models/llava-v1.5-7b-mmproj-Q4_0.gguf

# using Docker (might be slow)
$ docker run -it --rm -p 9081:9081 -v ./models/:/models/ ghcr.io/ggerganov/llama.cpp:server --host 0.0.0.0 --port 9081 --model /models/mistral-7b-instruct-v0.2.Q4_K_M.gguf
providers:
  - type: llama
    url: http://localhost:9081

    models:
      mistral-7b-instruct:
        id: /models/mistral-7b-instruct-v0.2.Q4_K_M.gguf
WHISPER.CPP

https://github.com/ggerganov/whisper.cpp/tree/master/examples/server

# using taskfile.dev
$ task whisper:server

# WHISPER.CPP Server
$ whisper-server --port 9083 --convert --model ./models/whisper-ggml-medium.bin
providers:
  - type: whisper
    url: http://localhost:9083

    models:
      whisper:
        id: whisper
Hugging Face

https://huggingface.co/

providers:
  - type: huggingface
    url: https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.1

    models:
      mistral-7B-instruct:
        id: tgi
Mimic 3
mkdir -p models/mimic3
chmod 777 models/mimic3
docker run -it -p 59125:59125 -v $(pwd)/models/mimic3:/home/mimic3/.local/share/mycroft/mimic3 mycroftai/mimic3
providers:
  - type: mimic
    url: http://localhost:59125

    models:
      tts-1:
        id: mimic-3
LangChain / LangServe

https://python.langchain.com/docs/langserve

providers:
  - type: langchain
    url: http://your-langchain-server:8000

    models:
      langchain:
        id: default
Vector Databases / Indexes
Chroma

https://www.trychroma.com

# using Docker
$ docker run -it --rm -p 9083:8000 -v chroma-data:/chroma/chroma ghcr.io/chroma-core/chroma
indexes:
  docs:
    type: chroma
    url: http://localhost:9083
    namespace: docs
    embedding: text-embedding-ada-002
Weaviate

https://weaviate.io

# using Docker
$ docker run -it --rm -p 9084:8080 -v weaviate-data:/var/lib/weaviate -e AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED=true -e PERSISTENCE_DATA_PATH=/var/lib/weaviate semitechnologies/weaviate
indexes:
  docs:
    type: weaviate
    url: http://localhost:9084
    namespace: Document
    embedding: text-embedding-ada-002
In-Memory
indexes:
  docs:
    type: memory   
    embedding: text-embedding-ada-002
OpenSearch / Elasticsearch
# using Docker
docker run -it --rm -p 9200:9200 -v opensearch-data:/usr/share/opensearch/data -e "discovery.type=single-node" -e DISABLE_SECURITY_PLUGIN=true opensearchproject/opensearch:latest
indexes:
  docs:
    type: elasticsearch
    url: http://localhost:9200
    namespace: docs
Extractors
Text
extractors:
  text:
    type: text
Code

Supported Languages:

  • C#
  • C++
  • Go
  • Java
  • Kotlin
  • JavaScript
  • TypeScript
  • Python
  • Ruby
  • Rust
  • Scala
  • Swift
extractors:
  code:
    type: code
Tesseract

https://tesseract-ocr.github.io

# using Docker
docker run -it --rm -p 9086:8884 hertzg/tesseract-server:latest
extractors:
  tesseract:
    type: tesseract
    url: http://localhost:9086
Unstructured

https://unstructured.io

# using Docker
docker run -it --rm -p 9085:8000 quay.io/unstructured-io/unstructured-api:0.0.64 --port 8000 --host 0.0.0.0
extractors:
  unstructured:
    type: unstructured
    url: http://localhost:9085
Classifications
LLM Classifier
classifiers:
  {classifier-id}:
    type: llm
    model: mistral-7b-instruct
    classes:
      class-1: "...Description when to use Class 1..."
      class-2: "...Description when to use Class 2..."

Use Cases

Retrieval Augmented Generation (RAG)
Configuration
chains:
  qa:
    type: rag
    index: docs
    model: mistral-7b-instruct

    # limit: 10
    # distance: 1

    # filters:
    #  {metadata-key}:
    #    classifier: {classifier-id}
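A configured chain is presumably addressable like any other model, using its id (`qa` above) as the model name. This is a sketch under that assumption, reusing the same hypothetical OpenAI-compatible chat endpoint:

```python
import json
import urllib.request

# Query the RAG chain by its id. The platform would embed the question,
# retrieve matching documents from the "docs" index, and pass them as
# context to mistral-7b-instruct (assumed behavior of the rag chain type).
question = {
    "model": "qa",  # the chain id from the configuration above
    "messages": [
        {"role": "user", "content": "What do the docs say about configuration?"}
    ],
}

req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(question).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# with urllib.request.urlopen(req) as resp:
#     answer = json.load(resp)["choices"][0]["message"]["content"]
```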
Index Documents

Using Extractor

POST http://localhost:8080/v1/index/{index-name}/{extractor}
Content-Type: application/pdf
Content-Disposition: attachment; filename="filename.pdf"
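The same extractor upload from Python might look as follows; the index name `docs` and extractor name `unstructured` match the configuration examples in this README, while the file contents are a placeholder standing in for a real PDF read from disk:

```python
import urllib.request

# Placeholder body; in practice: data = open("manual.pdf", "rb").read()
data = b"%PDF-1.7 placeholder"

req = urllib.request.Request(
    "http://localhost:8080/v1/index/docs/unstructured",
    data=data,
    headers={
        "Content-Type": "application/pdf",
        "Content-Disposition": 'attachment; filename="manual.pdf"',
    },
)

# urllib.request.urlopen(req)  # run once the server is up
```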

Using Documents

POST http://localhost:8080/v1/index/{index-name}
[
    {
        "id": "id1",
        "content": "content of document...",
        "metadata": {
          "key1": "value1",
          "key2": "value2"
        }
    },
    {
        "id": "id2",
        "content": "content of document...",
        "metadata": {
          "key1": "value1",
          "key2": "value2"
        }
    }
]
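The same documents payload can be posted from Python; this mirrors the JSON body above against the documented endpoint (the `docs` index name is from the earlier index configuration examples):

```python
import json
import urllib.request

# Documents with ids, content, and free-form metadata, as in the body above.
documents = [
    {"id": "id1", "content": "content of document...",
     "metadata": {"key1": "value1", "key2": "value2"}},
    {"id": "id2", "content": "content of document...",
     "metadata": {"key1": "value1", "key2": "value2"}},
]

req = urllib.request.Request(
    "http://localhost:8080/v1/index/docs",
    data=json.dumps(documents).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# urllib.request.urlopen(req)  # run once the server is up
```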
Function Calling
Hermes Function Calling
providers:
  - type: llama
    url: http://localhost:9081

    models:
      hermes-2-pro:
        id: /models/Hermes-2-Pro-Mistral-7B.Q4_K_M.gguf
        adapter: hermesfn
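A tool-calling request against this model could then use the OpenAI-style `tools` wire format, on the assumption that the `hermesfn` adapter translates it into Hermes 2 Pro's function-calling prompt. The `get_weather` tool below is a made-up example, as are the endpoint URL and port:

```python
import json
import urllib.request

# Hypothetical OpenAI-style tool definition; the adapter is assumed to
# surface the model's structured function calls as standard tool_calls.
payload = {
    "model": "hermes-2-pro",
    "messages": [{"role": "user", "content": "What's the weather in Berlin?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}

req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
```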
