dupedetection

package
Version: v0.0.0-...-9b9b30a
Warning

This package is not in the latest version of its module.

Published: Aug 28, 2021 License: MIT Imports: 21 Imported by: 0

Documentation

Overview

Package dupedetection provides functions to compute dupe detection fingerprints for a specific image.

Constants

const (
	// DefaultInputDir is the directory that dd-service monitors for new files to generate fingerprints from
	DefaultInputDir = "input"
	// DefaultOutputDir is the directory where dd-service writes new fingerprints
	DefaultOutputDir = "output"
	// DefaultDataFile is the path to the SQLite database used by dd-service
	DefaultDataFile = "dupe_detection_image_fingerprint_database.sqlite"
)

Variables

var InterfaceTypeError = errors.Errorf("Calculation function returned value of unexpected type.")

InterfaceTypeError indicates that an invoked correlation calculation function returned a value of an unexpected type.
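
An error like this usually comes from a failed type assertion on a value that was stored or returned as `interface{}`. The following is a minimal sketch of that pattern, not the package's code: `errUnexpectedType` and `asFloat64` are hypothetical names, and the standard library `errors` package is used here in place of whatever `errors` package the module imports.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnexpectedType is a hypothetical stand-in for InterfaceTypeError.
var errUnexpectedType = errors.New("calculation function returned value of unexpected type")

// asFloat64 converts an untyped result to float64 and reports
// errUnexpectedType when the dynamic type is anything else.
func asFloat64(v interface{}) (float64, error) {
	f, ok := v.(float64)
	if !ok {
		return 0, errUnexpectedType
	}
	return f, nil
}

func main() {
	fmt.Println(asFloat64(0.75))
	fmt.Println(asFloat64("0.75"))
}
```

Callers can compare the returned error against the sentinel value to distinguish a type mismatch from a calculation failure.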

Functions

func Blomqvist

func Blomqvist(data1, data2 []float64) (float64, error)

Blomqvist calculates the Blomqvist beta correlation between two arrays of input data
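
For orientation, Blomqvist's beta (the medial correlation coefficient) compares how many points fall in the same quadrant relative to the two medians against how many fall in opposite quadrants. The sketch below is an illustrative textbook version, not the package's implementation: `blomqvistSketch` and `median` are hypothetical helpers, and ignoring points that lie exactly on a median line is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// median returns the median of data without modifying the caller's slice.
func median(data []float64) float64 {
	s := append([]float64(nil), data...)
	sort.Float64s(s)
	n := len(s)
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// blomqvistSketch is a hypothetical helper, not the package's Blomqvist.
// It returns (same - opposite) / (same + opposite), where "same" counts the
// points whose x and y deviations from the medians share a sign.
func blomqvistSketch(x, y []float64) (float64, error) {
	if len(x) != len(y) || len(x) == 0 {
		return 0, errors.New("inputs must be non-empty and of equal length")
	}
	mx, my := median(x), median(y)
	var same, opposite int
	for i := range x {
		switch p := (x[i] - mx) * (y[i] - my); {
		case p > 0:
			same++
		case p < 0:
			opposite++
		}
	}
	if same+opposite == 0 {
		return 0, errors.New("all points lie on a median line")
	}
	return float64(same-opposite) / float64(same+opposite), nil
}

func main() {
	beta, err := blomqvistSketch([]float64{1, 2, 3, 4}, []float64{1, 4, 2, 3})
	fmt.Println(beta, err)
}
```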

func ComputeRandomizedDependence

func ComputeRandomizedDependence(x, y []float64) float64

ComputeRandomizedDependence computes the Randomized Dependence Coefficient (RDC) between the input arrays of data

func FromFloat32To64

func FromFloat32To64(input []float32) []float64

FromFloat32To64 converts an array of float32 values to an array of float64 values
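
The conversion matters because candidate fingerprints arrive as `[]float32` while the correlation functions take `[]float64`. A minimal sketch of such a conversion, assuming a plain element-wise widening (`toFloat64` is a hypothetical stand-in, not the package's function):

```go
package main

import "fmt"

// toFloat64 is a hypothetical stand-in for FromFloat32To64: it widens each
// float32 element to float64 and returns a new slice of the same length.
func toFloat64(input []float32) []float64 {
	out := make([]float64, len(input))
	for i, v := range input {
		out[i] = float64(v)
	}
	return out
}

func main() {
	fmt.Println(toFloat64([]float32{0.5, 1.25, -2}))
}
```

Widening never loses information, but values that are not exactly representable in float32 keep their float32 rounding after conversion.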

func GetMemoizer

func GetMemoizer() *memoize.Memoizer

GetMemoizer returns the memoizer used to cache the results of correlation calculations

func HSIC

func HSIC(data1, data2 []float64) (float64, error)

HSIC calculates the Hilbert-Schmidt Independence Criterion between arrays of input data

func HoeffdingD

func HoeffdingD(data1, data2 []float64) (float64, error)

HoeffdingD calculates Hoeffding's D correlation between arrays of input data

func Kendall

func Kendall(data1, data2 []float64) (float64, error)

Kendall calculates Kendall Tau correlation between arrays of input data
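
As a reference point, Kendall's tau counts concordant and discordant pairs. The sketch below computes the simple tau-a variant in O(n²) time; it is an assumption-laden illustration, not the package's implementation, which may use a different variant (for example tau-b with tie correction) or a faster algorithm. `kendallTauA` is a hypothetical name.

```go
package main

import (
	"errors"
	"fmt"
)

// kendallTauA is a hypothetical helper that returns
// (concordant - discordant) / (n * (n - 1) / 2). Tied pairs count as neither.
func kendallTauA(x, y []float64) (float64, error) {
	n := len(x)
	if n != len(y) || n < 2 {
		return 0, errors.New("inputs must have equal length of at least 2")
	}
	var concordant, discordant int
	for i := 0; i < n; i++ {
		for j := i + 1; j < n; j++ {
			s := (x[i] - x[j]) * (y[i] - y[j])
			if s > 0 {
				concordant++
			} else if s < 0 {
				discordant++
			}
		}
	}
	pairs := n * (n - 1) / 2
	return float64(concordant-discordant) / float64(pairs), nil
}

func main() {
	// Five of the six pairs are concordant and one is discordant.
	tau, err := kendallTauA([]float64{1, 2, 3, 4}, []float64{1, 3, 2, 4})
	fmt.Println(tau, err)
}
```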

func MI

func MI(data1, data2 []float64) (float64, error)

MI calculates Mutual Information correlation between arrays of input data

func MeasureImageSimilarity

func MeasureImageSimilarity(candidateImageFingerprint []float32, fingerprintsArrayToCompareWith [][]float64, memoizationData MemoizationImageData, config ComputeConfig) (int, error)

MeasureImageSimilarity calculates similarity between candidateImageFingerprint and each value in fingerprintsArrayToCompareWith

func Pearson

func Pearson(data1, data2 []float64) (float64, error)

Pearson calculates Pearson R correlation between arrays of input data
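
For readers unfamiliar with the measure, Pearson's r is the covariance of the two series divided by the product of their standard deviations. The following is a minimal sketch of that formula, not the package's implementation; `pearsonSketch` is a hypothetical name, and the error conditions are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// pearsonSketch is a hypothetical helper computing
// r = sum(dx*dy) / sqrt(sum(dx^2) * sum(dy^2)) with deviations from the means.
func pearsonSketch(x, y []float64) (float64, error) {
	if len(x) != len(y) || len(x) < 2 {
		return 0, errors.New("inputs must have equal length of at least 2")
	}
	n := float64(len(x))
	var sumX, sumY float64
	for i := range x {
		sumX += x[i]
		sumY += y[i]
	}
	meanX, meanY := sumX/n, sumY/n

	var cov, varX, varY float64
	for i := range x {
		dx, dy := x[i]-meanX, y[i]-meanY
		cov += dx * dy
		varX += dx * dx
		varY += dy * dy
	}
	if varX == 0 || varY == 0 {
		return 0, errors.New("correlation is undefined for constant input")
	}
	return cov / math.Sqrt(varX*varY), nil
}

func main() {
	r, err := pearsonSketch([]float64{1, 2, 3, 4}, []float64{2, 4, 6, 8})
	fmt.Println(r, err)
}
```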

func Spearman

func Spearman(data1, data2 []float64) (float64, error)

Spearman calculates Spearman Rho correlation between arrays of input data
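
Spearman's rho is Pearson's r applied to the ranks of the data, which makes it sensitive to any monotonic relationship rather than only linear ones. The sketch below illustrates this with average ranks for ties; it is not the package's implementation, and `ranks`, `pearsonOfRanks` and `spearmanSketch` are hypothetical helpers that assume equal-length, non-constant inputs.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// ranks returns 1-based ranks; tied values share the mean of their positions.
func ranks(data []float64) []float64 {
	idx := make([]int, len(data))
	for i := range idx {
		idx[i] = i
	}
	sort.Slice(idx, func(a, b int) bool { return data[idx[a]] < data[idx[b]] })
	out := make([]float64, len(data))
	for i := 0; i < len(idx); {
		j := i
		for j+1 < len(idx) && data[idx[j+1]] == data[idx[i]] {
			j++
		}
		avg := float64(i+j)/2 + 1 // mean of positions i..j, shifted to 1-based
		for k := i; k <= j; k++ {
			out[idx[k]] = avg
		}
		i = j + 1
	}
	return out
}

// pearsonOfRanks computes Pearson's r; it returns NaN for constant input.
func pearsonOfRanks(x, y []float64) float64 {
	n := float64(len(x))
	var sx, sy float64
	for i := range x {
		sx += x[i]
		sy += y[i]
	}
	mx, my := sx/n, sy/n
	var cov, vx, vy float64
	for i := range x {
		cov += (x[i] - mx) * (y[i] - my)
		vx += (x[i] - mx) * (x[i] - mx)
		vy += (y[i] - my) * (y[i] - my)
	}
	return cov / math.Sqrt(vx*vy)
}

// spearmanSketch ranks both inputs and correlates the ranks.
func spearmanSketch(x, y []float64) float64 {
	return pearsonOfRanks(ranks(x), ranks(y))
}

func main() {
	// y grows as the cube of x: not linear, but perfectly monotonic.
	fmt.Println(spearmanSketch([]float64{10, 20, 30, 40}, []float64{1, 8, 27, 64}))
}
```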

Types

type ComputeConfig

type ComputeConfig struct {
	CorrelationMethodNameArray        []string
	StableOrderOfCorrelationMethods   []string
	UnstableOrderOfCorrelationMethods []string
	CorrelationMethodsOrder           string
	PearsonDupeThreshold              float64
	SpearmanDupeThreshold             float64
	KendallDupeThreshold              float64
	RandomizedDependenceDupeThreshold float64
	BlomqvistDupeThreshold            float64
	HoeffdingDupeThreshold            float64
	HoeffdingRound1DupeThreshold      float64
	HoeffdingRound2DupeThreshold      float64
	MIThreshold                       float64

	RootDir                  string
	NumberOfImagesToValidate int
	TrimByPercentile         bool

	FileStorage storage.FileStorage
}

ComputeConfig contains configurable parameters to calculate the AUPRC of image similarity measurement

func NewComputeConfig

func NewComputeConfig() ComputeConfig

NewComputeConfig returns a new ComputeConfig with default values

type Config

type Config struct {
	// input directory monitored for new files to generate fingerprints from
	InputDir string `mapstructure:"input_dir" json:"input_dir,omitempty"`
	// output directory where new fingerprints are written
	OutputDir string `mapstructure:"output_dir" json:"output_dir,omitempty"`
	// path to the data file (SQLite database) used for dupe detection
	DataFile string `mapstructure:"data_file" json:"data_file,omitempty"`
}

Config contains settings of the dupe detection service
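
The `json` struct tags determine how the settings map to JSON keys, and `omitempty` drops unset fields on output. The sketch below shows that behavior with a local copy of the struct so it runs without importing the package; `serviceConfig`, `decodeConfig` and `encodeConfig` are hypothetical names, and the `mapstructure` tags are left out because they need a third-party decoder.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// serviceConfig mirrors the documented Config fields and json tags.
// It is a local copy for illustration only.
type serviceConfig struct {
	InputDir  string `json:"input_dir,omitempty"`
	OutputDir string `json:"output_dir,omitempty"`
	DataFile  string `json:"data_file,omitempty"`
}

// decodeConfig parses JSON settings into a serviceConfig.
func decodeConfig(raw []byte) (serviceConfig, error) {
	var cfg serviceConfig
	err := json.Unmarshal(raw, &cfg)
	return cfg, err
}

// encodeConfig serializes the settings; empty fields are omitted.
func encodeConfig(cfg serviceConfig) (string, error) {
	b, err := json.Marshal(cfg)
	return string(b), err
}

func main() {
	cfg, err := decodeConfig([]byte(`{"input_dir":"in","data_file":"dd.sqlite"}`))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", cfg)

	out, err := encodeConfig(cfg)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // output_dir is absent because OutputDir is empty
}
```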

func NewConfig

func NewConfig() *Config

NewConfig returns a new Config instance.

func (*Config) SetWorkDir

func (config *Config) SetWorkDir(workDir string)

SetWorkDir prefixes DataFile, OutputDir and InputDir with `workDir` when the respective path is not absolute.
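
The behavior described above can be sketched with `filepath.IsAbs` and `filepath.Join`. This is an assumption about how such a method might be written, not the package's source: `config` and `setWorkDir` are hypothetical local names, and the example paths assume a Unix-like system.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// config is a local stand-in for the documented Config struct.
type config struct {
	InputDir  string
	OutputDir string
	DataFile  string
}

// setWorkDir joins workDir onto every path that is not already absolute
// and leaves absolute paths untouched.
func (c *config) setWorkDir(workDir string) {
	for _, p := range []*string{&c.InputDir, &c.OutputDir, &c.DataFile} {
		if !filepath.IsAbs(*p) {
			*p = filepath.Join(workDir, *p)
		}
	}
}

func main() {
	c := &config{
		InputDir:  "input",          // relative: gets the workDir prefix
		OutputDir: "/var/dd/output", // absolute: left as-is
		DataFile:  "dupe_detection_image_fingerprint_database.sqlite",
	}
	c.setWorkDir("/home/dd")
	fmt.Println(c.InputDir)
	fmt.Println(c.OutputDir)
	fmt.Println(c.DataFile)
}
```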

type MemoizationImageData

type MemoizationImageData struct {
	SHA256HashOfFetchedImages []string
	SHA256HashOfCurrentImage  string
}

MemoizationImageData provides the image data (SHA-256 hashes) used to memoize the results of correlation calculation methods
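
To illustrate why image hashes are useful for memoization: a correlation between two fingerprints only needs to be computed once per pair of images, so the pair of hashes can serve as a cache key. The sketch below uses a plain map guarded by a mutex. It is not the package's mechanism (the package exposes a `*memoize.Memoizer`), and the `correlationCache` type and its key layout are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// correlationCache is a hypothetical in-memory cache of correlation results.
type correlationCache struct {
	mu      sync.Mutex
	results map[string]float64
}

func newCorrelationCache() *correlationCache {
	return &correlationCache{results: make(map[string]float64)}
}

// get returns the cached value for (method, currentHash, fetchedHash) or
// computes and stores it. The lock is held while computing, which keeps the
// sketch simple at the cost of serializing calculations.
func (c *correlationCache) get(method, currentHash, fetchedHash string, compute func() float64) float64 {
	key := method + ":" + currentHash + ":" + fetchedHash
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.results[key]; ok {
		return v
	}
	v := compute()
	c.results[key] = v
	return v
}

func main() {
	cache := newCorrelationCache()
	calls := 0
	compute := func() float64 { calls++; return 0.97 }

	a := cache.get("pearson", "hash-current", "hash-fetched-1", compute)
	b := cache.get("pearson", "hash-current", "hash-fetched-1", compute)
	fmt.Println(a, b, calls) // the second lookup is served from the cache
}
```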
