probe

package module
v0.0.0-...-9b9b30a
Warning

This package is not in the latest version of its module.

Published: Aug 28, 2021 License: MIT Imports: 10 Imported by: 0

README

dupe-detection

The ML models used by this package can be downloaded as described under "Download ML models and images" below.

Hardware requirements:

  • at least 8GB of RAM

Ubuntu

Install swig

sudo apt-get install -y swig

Install the TensorFlow C library (CPU build, version 2.4.0):

wget https://storage.googleapis.com/tensorflow/libtensorflow/libtensorflow-cpu-linux-x86_64-2.4.0.tar.gz
sudo tar -C /usr/local -xzf ./libtensorflow-cpu-linux-x86_64-2.4.0.tar.gz
sudo /sbin/ldconfig -v

Download ML models and images:

pip install gdown
~/.local/bin/gdown https://drive.google.com/uc?id=1U6tpIpZBxqxIyFej2EeQ-SbLcO_lVNfu
unzip ./SavedMLModels.zip -d ./

wget https://www.dropbox.com/s/6ohzgvz418rhl4l/Animecoin_All_Finished_Works.zip
unzip Animecoin_All_Finished_Works.zip -d ./allRegisteredWorks

wget https://www.dropbox.com/s/4uajzyh09bc0rp3/dupe_detector_test_images.zip
unzip dupe_detector_test_images.zip -d ./dupes

wget https://www.dropbox.com/s/yjqsxsz97msai4e/non_duplicate_test_images.zip
unzip non_duplicate_test_images.zip -d ./originals

Download the latest test corpus of images:

~/.local/bin/gdown https://drive.google.com/uc?id=1BslINgdqs8ik7PiDjKKL1wlrRfQanVjQ
unzip test_corpus_opens_1.zip -d ./test_corpus

Goptuna Optimizer

The optimizer application is configured with the CMA-ES sampler (cmaes) to maximize AUPRC, the area under the precision-recall curve.
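For orientation, the objective can be approximated as average precision: rank the images by similarity score and average the precision at each true duplicate. The sketch below is an illustration only. The function name averagePrecision and its inputs are hypothetical, and the optimizer's actual AUPRC computation is not shown in this README and may differ (for example in how ties are handled).

```go
package main

import (
	"fmt"
	"sort"
)

// averagePrecision approximates the area under the precision-recall curve.
// scores[i] is a similarity score for image i and isDupe[i] is its true label.
// Hypothetical helper, not part of this package.
func averagePrecision(scores []float64, isDupe []bool) float64 {
	if len(scores) != len(isDupe) {
		return 0
	}
	// Rank image indices by descending score.
	idx := make([]int, len(scores))
	for i := range idx {
		idx[i] = i
	}
	sort.SliceStable(idx, func(a, b int) bool {
		return scores[idx[a]] > scores[idx[b]]
	})
	// Sum the precision observed at every true duplicate in the ranking.
	var truePositives, sum float64
	for rank, i := range idx {
		if isDupe[i] {
			truePositives++
			sum += truePositives / float64(rank+1)
		}
	}
	if truePositives == 0 {
		return 0
	}
	return sum / truePositives
}

func main() {
	scores := []float64{0.9, 0.8, 0.7, 0.6}
	labels := []bool{true, false, true, false}
	fmt.Printf("average precision: %.4f\n", averagePrecision(scores, labels))
}
```

A perfect ranking (all duplicates scored above all originals) yields 1, so a sampler that maximizes this value is pushed toward parameters that separate the two groups.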

Pre-requisites

Download the goptuna CLI executable from the project's GitHub Releases page.

Workflow

Goptuna study results are saved to a MySQL database.

The default setup works with the Docker MySQL image:

docker pull mysql:8.0

export GOPTUNA_CONTAINER=$(docker run \
  -d \
  --rm \
  -p 3306:3306 \
  -e MYSQL_USER=goptuna \
  -e MYSQL_DATABASE=goptuna \
  -e MYSQL_PASSWORD=password \
  -e MYSQL_ALLOW_EMPTY_PASSWORD=yes \
  --name goptuna-mysql \
  mysql:8.0)

Create the initial goptuna database structure and an empty study:

./goptuna create-study --storage mysql://goptuna:password@localhost:3306/goptuna --study dupe-detection-aurpc

Run the goptuna dashboard to observe study results:

./goptuna dashboard --storage mysql://goptuna:password@127.0.0.1:3306/goptuna

Back up the goptuna database before the Docker container terminates. The container is started with --rm, so its data is lost when it stops:

docker exec $GOPTUNA_CONTAINER sh -c 'exec mysqldump --no-tablespaces --databases goptuna -ugoptuna -ppassword' > backup.sql

Restore goptuna database data from backup:

docker exec -i $GOPTUNA_CONTAINER sh -c 'exec mysql -ugoptuna -ppassword' < ./backup.sql

Run the optimizer with the following flags:

  • imageCount limits the number of images analyzed per trial
  • runCount defines the number of trials per run
  • studyName defines the name of the Goptuna study
  • rootDir defines the directory from which the image corpus is loaded and in which, during the first run, the SQLite database with fingerprints is generated

go run ./cmd/optimizer/ -rootDir "./test_corpus" -imageCount 30 -runCount 100 -studyName "dupe-detection-aurpc"
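As an illustration of how flags with these names can be declared using Go's standard flag package, here is a minimal sketch. The options struct, the parseOptions function, the default values, and the help strings are all hypothetical; the real definitions live in cmd/optimizer and are not reproduced on this page.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// options holds the four flags described above (hypothetical struct).
type options struct {
	rootDir    string
	imageCount int
	runCount   int
	studyName  string
}

// parseOptions parses command-line arguments into options.
// The defaults below are assumptions chosen for the sketch.
func parseOptions(args []string) (options, error) {
	var o options
	fs := flag.NewFlagSet("optimizer", flag.ContinueOnError)
	fs.StringVar(&o.rootDir, "rootDir", "", "directory with the image corpus and fingerprint database")
	fs.IntVar(&o.imageCount, "imageCount", 0, "number of images analyzed per trial")
	fs.IntVar(&o.runCount, "runCount", 1, "number of trials per run")
	fs.StringVar(&o.studyName, "studyName", "", "name of the Goptuna study")
	if err := fs.Parse(args); err != nil {
		return options{}, err
	}
	return o, nil
}

func main() {
	o, err := parseOptions(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("%+v\n", o)
}
```

The standard flag package accepts both -rootDir and --rootDir, so the single-dash form used in the command above parses the same way.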

Regenerate swig wrapper files

cd wdm/wrapper
swig -go -cgo -c++ -intgosize 64 wrapper.i

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Fingerprint

type Fingerprint []float32

Fingerprint represents an image fingerprint.

func FingerprintFromBytes

func FingerprintFromBytes(data []byte) Fingerprint

FingerprintFromBytes returns a new Fingerprint instance by converting the given bytes to its content.

func FingerprintFromTruncatedLSB

func FingerprintFromTruncatedLSB(data []byte) Fingerprint

FingerprintFromTruncatedLSB returns a new Fingerprint instance built from bytes in which the least significant byte (LSB) of each float32 value has been truncated.

func (Fingerprint) Bytes

func (fg Fingerprint) Bytes() []byte

Bytes converts content to bytes.

func (Fingerprint) LSBTruncatedBytes

func (fg Fingerprint) LSBTruncatedBytes() []byte

LSBTruncatedBytes converts content to bytes and truncates the least significant byte (LSB) from each float32 value in the array.
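To make the LSB truncation concrete, the sketch below stores each float32 in 3 bytes instead of 4 and restores it with the dropped byte set to zero. It is an illustration under stated assumptions, not the package's implementation: the helper names toBytes, toTruncatedBytes, and fromTruncatedBytes are hypothetical, and little-endian byte order is assumed because the page does not specify one.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// Fingerprint mirrors the documented type.
type Fingerprint []float32

// toBytes encodes each float32 as 4 bytes (little-endian is an assumption).
func toBytes(fg Fingerprint) []byte {
	out := make([]byte, 4*len(fg))
	for i, v := range fg {
		binary.LittleEndian.PutUint32(out[4*i:], math.Float32bits(v))
	}
	return out
}

// toTruncatedBytes drops the least significant byte of each value,
// keeping 3 bytes per float32.
func toTruncatedBytes(fg Fingerprint) []byte {
	out := make([]byte, 0, 3*len(fg))
	for _, v := range fg {
		var buf [4]byte
		binary.LittleEndian.PutUint32(buf[:], math.Float32bits(v))
		out = append(out, buf[1:]...) // buf[0] is the LSB in little-endian order
	}
	return out
}

// fromTruncatedBytes restores float32 values with the dropped byte set to zero.
func fromTruncatedBytes(data []byte) Fingerprint {
	fg := make(Fingerprint, 0, len(data)/3)
	for i := 0; i+3 <= len(data); i += 3 {
		buf := [4]byte{0, data[i], data[i+1], data[i+2]}
		fg = append(fg, math.Float32frombits(binary.LittleEndian.Uint32(buf[:])))
	}
	return fg
}

func main() {
	fg := Fingerprint{0.5, -1.25, 3.1415927}
	short := toTruncatedBytes(fg)
	fmt.Println("full bytes:", len(toBytes(fg)), "truncated bytes:", len(short))
	fmt.Println("restored:", fromTruncatedBytes(short))
}
```

The dropped byte holds the lowest 8 bits of the 23-bit mantissa, so the encoding saves 25% of the space at the cost of a small relative error in each value.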

type Fingerprints

type Fingerprints []Fingerprint

Fingerprints is a collection of Fingerprint values.

func (Fingerprints) Single

func (fgs Fingerprints) Single() Fingerprint

Single combines all fingerprints into a single Fingerprint.
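The page does not say how the fingerprints are combined. One plausible reading is end-to-end concatenation of the per-model vectors, sketched below. The function name concat is hypothetical, and the real Single method may combine values differently.

```go
package main

import "fmt"

// Local mirrors of the documented types.
type Fingerprint []float32
type Fingerprints []Fingerprint

// concat joins all fingerprints end to end into one vector.
// Concatenation is an assumption about what "combine" means here.
func concat(fgs Fingerprints) Fingerprint {
	total := 0
	for _, fg := range fgs {
		total += len(fg)
	}
	out := make(Fingerprint, 0, total)
	for _, fg := range fgs {
		out = append(out, fg...)
	}
	return out
}

func main() {
	perModel := Fingerprints{{0.1, 0.2}, {0.3}, {0.4, 0.5, 0.6}}
	single := concat(perModel)
	fmt.Println(len(single), single)
}
```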

type Tensor

type Tensor interface {
	// Fingerprints computes and returns fingerprints for the given image by models.
	Fingerprints(ctx context.Context, img image.Image) (Fingerprints, error)

	// LoadModels loads all models.
	LoadModels(ctx context.Context) error
}

Tensor represents image analysis based on machine learning methods.

func NewTensor

func NewTensor(baseDir string, tfModelConfigs []tfmodel.Config) Tensor

NewTensor returns a new Tensor interface implementation.

Directories

Path Synopsis
cmd
pkg
dupedetection
Package dupedetection provides functions to compute dupe detection fingerprints for specific image
wdm
