cld3

package
v0.0.0-...-cc40e88
Warning

This package is not in the latest version of its module.

Published: Oct 31, 2021 | License: Apache-2.0 | Imports: 3 | Imported by: 2

Documentation

Overview

Package cld3 implements language detection using the Compact Language Detector v3.

This package includes the relevant sources from the CLD3 project, so it requires no external dependencies. For more information on CLD3, see https://github.com/google/cld3/.

Constants

const UnknownLang = "und"

UnknownLang is the value of Result.Language returned if FindLanguage can't determine what language the text was written in.

Variables

var (
	ErrMaxLessThanOrEqToZero  = errors.New("cld3: maxNumBytes passed to NewLanguageIdentifier must be greater than 0")
	ErrMinLessThanZero        = errors.New("cld3: minNumBytes passed to NewLanguageIdentifier must be greater than or equal to 0")
	ErrMaxSmallerOrEqualToMin = errors.New("cld3: maxNumBytes passed to NewLanguageIdentifier must be larger than minNumBytes")
)

Functions

func FreeLanguageIdentifier

func FreeLanguageIdentifier(li LanguageIdentifier)

Types

type LanguageIdentifier

type LanguageIdentifier struct {
	// contains filtered or unexported fields
}

func NewLanguageIdentifier

func NewLanguageIdentifier(minNumBytes, maxNumBytes int) (LanguageIdentifier, error)

NewLanguageIdentifier returns a LanguageIdentifier. minNumBytes is the minimum number of bytes of the text to consider before making a decision, and maxNumBytes is the maximum. Chromium uses 0 and 512, respectively, for its i18n work. A LanguageIdentifier must be deallocated explicitly with FreeLanguageIdentifier.
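The three error variables listed under Variables imply constraints on minNumBytes and maxNumBytes. The following is a minimal, standalone sketch of those constraints, not the package's source: validateBounds is a hypothetical helper, the lowercase error values are local copies of the documented messages, and the order in which the checks run is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Local copies of the documented error messages (not the package's own values).
var (
	errMaxLessThanOrEqToZero  = errors.New("cld3: maxNumBytes passed to NewLanguageIdentifier must be greater than 0")
	errMinLessThanZero        = errors.New("cld3: minNumBytes passed to NewLanguageIdentifier must be greater than or equal to 0")
	errMaxSmallerOrEqualToMin = errors.New("cld3: maxNumBytes passed to NewLanguageIdentifier must be larger than minNumBytes")
)

// validateBounds is a hypothetical helper that applies the constraints the
// error messages describe. The order of the checks is an assumption.
func validateBounds(minNumBytes, maxNumBytes int) error {
	if maxNumBytes <= 0 {
		return errMaxLessThanOrEqToZero
	}
	if minNumBytes < 0 {
		return errMinLessThanZero
	}
	if maxNumBytes <= minNumBytes {
		return errMaxSmallerOrEqualToMin
	}
	return nil
}

func main() {
	// The first pair is the Chromium setting mentioned in the documentation.
	pairs := [][2]int{{0, 512}, {0, 0}, {-1, 10}, {10, 5}}
	for _, p := range pairs {
		fmt.Printf("min=%d max=%d -> %v\n", p[0], p[1], validateBounds(p[0], p[1]))
	}
}
```

Checking the arguments up front like this lets a caller report a bad configuration before any identifier is allocated, so there is nothing to free on the error path.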

func (LanguageIdentifier) FindLanguage

func (li LanguageIdentifier) FindLanguage(text string) Result

FindLanguage detects the language of the given text. If the language cannot be determined, the Result's Language is set to the constant UnknownLang.

type Result

type Result struct {
	Language string

	// Probability is the probability from 0 to 1 of the text being in the
	// returned Language.
	Probability float32

	// IsReliable is true when the prediction is reliable.
	IsReliable bool

	// Proportion is the proportion of bytes associated with the language.
	// FindLanguage always sets it to 1.
	Proportion float32
}
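One way a caller might act on a Result is to combine its fields into a single accept-or-reject decision. The sketch below is standalone and illustrative only: result and unknownLang are local stand-ins that mirror the documented Result type and UnknownLang constant, accept is a hypothetical helper, and the 0.7 threshold is an arbitrary example value rather than a recommendation from the package.

```go
package main

import "fmt"

// unknownLang mirrors the documented UnknownLang constant.
const unknownLang = "und"

// result is a local stand-in with the same fields as the documented Result.
type result struct {
	Language    string
	Probability float32
	IsReliable  bool
	Proportion  float32
}

// accept is a hypothetical helper. It reports whether a detection names a
// known language, is flagged reliable, and meets the caller's minimum
// probability.
func accept(r result, minProb float32) bool {
	return r.Language != unknownLang && r.IsReliable && r.Probability >= minProb
}

func main() {
	confident := result{Language: "en", Probability: 0.98, IsReliable: true, Proportion: 1}
	unknown := result{Language: unknownLang}

	fmt.Println(accept(confident, 0.7))
	fmt.Println(accept(unknown, 0.7))
}
```

Checking Language against the unknown value first keeps the fallback case explicit, so an undetermined result is never accepted regardless of the threshold.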
