chunk

package
v0.0.0-...-2e32b89 Latest
Warning: This package is not in the latest version of its module.
Published: Jun 27, 2023 License: MIT Imports: 18 Imported by: 0

Documentation

Constants

const EmbeddingModel = "text-embedding-ada-002"
const MaxTokensPerChunk = 1500

MaxTokensPerChunk is the maximum number of tokens allowed in a single chunk for OpenAI embeddings.

Variables

This section is empty.

Functions

func ExtractFilesFromZip

func ExtractFilesFromZip(f multipart.File) ([]string, []string, error)

ExtractFilesFromZip extracts the text from the files within a zip file.
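
The documentation does not say what the two returned slices hold or how the archive is read. As a rough illustration only, the sketch below assumes they are file names and file contents, and reads them with the standard library's archive/zip. The names readZipText, buildZip, and demo are hypothetical and are not part of this package. A multipart.File satisfies io.ReaderAt, so it could be passed to readZipText together with the upload size.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// readZipText is a hypothetical helper, not the package's implementation.
// It returns the name and the contents of every non-directory entry.
func readZipText(r io.ReaderAt, size int64) ([]string, []string, error) {
	zr, err := zip.NewReader(r, size)
	if err != nil {
		return nil, nil, err
	}
	var names, texts []string
	for _, zf := range zr.File {
		if zf.FileInfo().IsDir() {
			continue // directory entries carry no text
		}
		rc, err := zf.Open()
		if err != nil {
			return nil, nil, err
		}
		data, err := io.ReadAll(rc)
		rc.Close()
		if err != nil {
			return nil, nil, err
		}
		names = append(names, zf.Name)
		texts = append(texts, string(data))
	}
	return names, texts, nil
}

// buildZip creates a small in-memory archive so the sketch can run
// without touching the file system.
func buildZip(names, bodies []string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for i, name := range names {
		w, err := zw.Create(name)
		if err != nil {
			return nil, err
		}
		if _, err := w.Write([]byte(bodies[i])); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// demo builds a two-file archive and reads it back.
func demo() ([]string, []string, error) {
	data, err := buildZip([]string{"a.txt", "b.txt"}, []string{"hello", "world"})
	if err != nil {
		return nil, nil, err
	}
	return readZipText(bytes.NewReader(data), int64(len(data)))
}

func main() {
	names, texts, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i := range names {
		fmt.Printf("%s: %q\n", names[i], texts[i])
	}
}
```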

func ExtractTextFromPDF

func ExtractTextFromPDF(f multipart.File, fileSize int64) (string, error)

ExtractTextFromPDF extracts human-readable text from the given PDF, preserving spaces and other whitespace.

func GetTextFromFile

func GetTextFromFile(f multipart.File) (string, error)

Types

type Chunk

type Chunk struct {
	Start int
	End   int
	Title string
	Text  string
}

func CreateChunks

func CreateChunks(fileContent string, title string) ([]Chunk, error)
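
CreateChunks has no description, so its splitting strategy and the meaning of Start and End are not known from this page. The sketch below shows one possible way to fill a Chunk slice under a size limit. It approximates tokens with whitespace-separated words and treats Start and End as word indices; both are assumptions, and splitByWords is a hypothetical name. A real implementation for text-embedding-ada-002 would count tokens with the model's tokenizer and compare against MaxTokensPerChunk.

```go
package main

import (
	"fmt"
	"strings"
)

// Chunk mirrors the struct documented above.
type Chunk struct {
	Start int
	End   int
	Title string
	Text  string
}

// splitByWords is a hypothetical stand-in for CreateChunks. It groups
// whitespace-separated words into chunks of at most maxWords words.
// Start and End are word indices here (End is exclusive); that meaning
// is an assumption, not something the package documents.
func splitByWords(content, title string, maxWords int) ([]Chunk, error) {
	if maxWords <= 0 {
		return nil, fmt.Errorf("maxWords must be positive, got %d", maxWords)
	}
	words := strings.Fields(content)
	var chunks []Chunk
	for start := 0; start < len(words); start += maxWords {
		end := start + maxWords
		if end > len(words) {
			end = len(words)
		}
		chunks = append(chunks, Chunk{
			Start: start,
			End:   end,
			Title: title,
			Text:  strings.Join(words[start:end], " "),
		})
	}
	return chunks, nil
}

func main() {
	chunks, err := splitByWords("one two three four five", "demo.txt", 2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range chunks {
		fmt.Printf("%s [%d,%d) %q\n", c.Title, c.Start, c.End, c.Text)
	}
}
```

Word counts only loosely track token counts, so a limit chosen this way should leave headroom below the model's real token budget.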
