dogsvscats

package
v0.9.1 Latest
Warning

This package is not in the latest version of its module.

Published: Apr 20, 2024 License: Apache-2.0 Imports: 27 Imported by: 0

README

Kaggle's Dogs vs Cats Competition Dataset

https://www.kaggle.com/c/dogs-vs-cats

Downloaded from:

https://download.microsoft.com/download/3/E/1/3E1C3F21-ECDB-4869-8368-6DEBA77B919F/kagglecatsanddogs_5340.zip

This includes a library to download, unzip, create datasets and augment images for Kaggle's Dogs vs Cats competition.

Finally it includes two different demos:

  1. a Jupyter Notebook (using GoNB) with a demo of how to use the library.
  2. a small demo stand-alone program.

Documentation

Index

Constants

const (
	DownloadURL   = "https://download.microsoft.com/download/3/E/1/3E1C3F21-ECDB-4869-8368-6DEBA77B919F/kagglecatsanddogs_5340.zip"
	LocalZipFile  = "kagglecatsanddogs_5340.zip"
	LocalZipDir   = "PetImages"
	InvalidSubDir = "invalid"

	DownloadChecksum = "b7974bd00a84a99921f36ee4403f089853777b5ae8d151c76a86e64900334af9"
)
const (
	PreGeneratedTrainFileName      = "train_data.bin"
	PreGeneratedTrainPairFileName  = "train_pair_data.bin"
	PreGeneratedTrainEvalFileName  = "train_eval_data.bin"
	PreGeneratedValidationFileName = "validation_eval_data.bin"
)

Variables

var (
	ImgSubDirs   = []string{"Dog", "Cat"}
	BadDogImages = map[int]bool{11233: true, 11702: true, 11912: true, 2317: true, 9500: true}
	BadCatImages = map[int]bool{10404: true, 11095: true, 12080: true, 5370: true, 6435: true, 666: true}
	BadImages    = [2]map[int]bool{BadDogImages, BadCatImages}

	MaxCount = 12500
	NumDogs  = MaxCount - len(BadDogImages)
	NumCats  = MaxCount - len(BadCatImages)
	NumValid = [2]int{NumDogs, NumCats}
)
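The counts above follow from simple arithmetic: each class has MaxCount images, minus the ones listed as bad. A minimal stand-alone sketch that reproduces the numbers, with the values copied from the variables above into lower-case local names (the helper `numValid` is hypothetical, not part of the package):

```go
package main

import "fmt"

// Values copied from the package's exported variables.
var (
	badDogImages = map[int]bool{11233: true, 11702: true, 11912: true, 2317: true, 9500: true}
	badCatImages = map[int]bool{10404: true, 11095: true, 12080: true, 5370: true, 6435: true, 666: true}
	maxCount     = 12500
)

// numValid returns how many images of a class remain after removing
// the known-bad ones. Hypothetical helper, for illustration only.
func numValid(bad map[int]bool) int {
	return maxCount - len(bad)
}

func main() {
	fmt.Println("dogs:", numValid(badDogImages)) // dogs: 12495
	fmt.Println("cats:", numValid(badCatImages)) // cats: 12494
}
```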
var (
	DefaultConfig = &Configuration{
		DType:           shapes.Float32,
		BatchSize:       16,
		EvalBatchSize:   100,
		ModelImageSize:  inceptionv3.MinimumImageSize,
		NumFolds:        5,
		TrainFolds:      []int{0, 1, 2, 3},
		ValidationFolds: []int{4},
		FoldsSeed:       0,
		UseParallelism:  true,
		BufferSize:      32,
		NumSamples:      -1,
	} // DType used for model.

)

Functions

func AssertNoError

func AssertNoError(err error)

AssertNoError calls log.Fatal if err is not nil.

func BytesToTensor

func BytesToTensor[T shapes.NumberNotComplex](buffer []byte, numImages, width, height int) (t *tensor.Local)

BytesToTensor converts a batch of saved images as bytes to a tensor.Local with 4 channels: R,G,B and A. It assumes all images have the exact same size. There should be one byte with the label before each image.
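The buffer layout described above (one label byte, then the image bytes, repeated) can be sketched with the standard library only. This is not the package's implementation: the function name `splitLabeledImages`, the strict length check, and the assumption that each image occupies exactly width*height*4 bytes are all illustrative.

```go
package main

import (
	"errors"
	"fmt"
)

const channels = 4 // R, G, B and A

// splitLabeledImages walks a buffer laid out as
// [label, pixels..., label, pixels..., ...] and separates labels from
// pixel data. Hypothetical helper: the real BytesToTensor builds a tensor.
func splitLabeledImages(buffer []byte, numImages, width, height int) (labels []byte, images [][]byte, err error) {
	imageBytes := width * height * channels
	recordBytes := 1 + imageBytes // one label byte precedes each image
	if len(buffer) != numImages*recordBytes {
		return nil, nil, errors.New("buffer size does not match numImages*(1+width*height*4)")
	}
	for i := 0; i < numImages; i++ {
		record := buffer[i*recordBytes : (i+1)*recordBytes]
		labels = append(labels, record[0])
		images = append(images, record[1:])
	}
	return labels, images, nil
}

func main() {
	// Two 1x1 images: label 0 followed by RGBA, then label 1 followed by RGBA.
	buffer := []byte{0, 10, 20, 30, 255, 1, 40, 50, 60, 255}
	labels, images, err := splitLabeledImages(buffer, 2, 1, 1)
	fmt.Println(labels, images, err)
}
```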

func CreateDatasets

func CreateDatasets(config *Configuration) (trainDS, trainEvalDS, validationEvalDS train.Dataset)

CreateDatasets creates the datasets used for training and evaluation. If the pre-generated files with augmented/scaled images exist, it uses them; otherwise it generates the images dynamically. Dynamic generation is typically much slower than a training step, so it slows training down considerably.

func Download

func Download(baseDir string) error

Download downloads the Dogs vs Cats Dataset to baseDir, unzips it, and checks for malformed files (there are a few).
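The DownloadChecksum constant is 64 hexadecimal characters long, which is consistent with a SHA-256 digest. Assuming that is what it holds (the documentation does not say so), a downloaded file could be checked along these lines. The helpers `sha256Hex` and `checksumMatches` are hypothetical and are not the package's API.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// sha256Hex streams r through SHA-256 and returns the digest as lower-case hex.
// Streaming avoids loading a large zip file into memory.
func sha256Hex(r io.Reader) (string, error) {
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// checksumMatches reports whether content hashes to wantHex.
// It takes a string only to keep the example small; for a real file,
// pass the *os.File to sha256Hex instead.
func checksumMatches(content, wantHex string) bool {
	got, err := sha256Hex(strings.NewReader(content))
	return err == nil && strings.EqualFold(got, wantHex)
}

func main() {
	// SHA-256 of the ASCII string "hello".
	const want = "2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824"
	fmt.Println(checksumMatches("hello", want))  // true
	fmt.Println(checksumMatches("hello!", want)) // false
}
```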

func FilterValidImages

func FilterValidImages(baseDir string) error

FilterValidImages tries to open every image and moves the invalid ones (those that are not readable) to a separate directory. Use it only if the set of images changes; otherwise use PrefilterValidImages, which relies on the static list of invalid images.
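The idea of "try to decode, and set aside whatever fails" can be sketched in memory with the standard image package. This is an illustration under assumptions, not the package's code: the real function works on files under baseDir and moves them, while the hypothetical `partitionByReadability` below only sorts names into two lists.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/color"
	_ "image/jpeg" // registers the JPEG decoder with image.Decode
	"image/png"    // registers the PNG decoder and provides Encode
	"sort"
)

// isReadableImage reports whether data decodes as a supported image format.
func isReadableImage(data []byte) bool {
	_, _, err := image.Decode(bytes.NewReader(data))
	return err == nil
}

// tinyPNG builds a valid 1x1 PNG so the example needs no files on disk.
func tinyPNG() []byte {
	img := image.NewRGBA(image.Rect(0, 0, 1, 1))
	img.Set(0, 0, color.RGBA{R: 255, A: 255})
	var buf bytes.Buffer
	if err := png.Encode(&buf, img); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

// partitionByReadability splits names into decodable and non-decodable ones.
// Results are sorted because map iteration order is random.
func partitionByReadability(files map[string][]byte) (valid, invalid []string) {
	for name, data := range files {
		if isReadableImage(data) {
			valid = append(valid, name)
		} else {
			invalid = append(invalid, name)
		}
	}
	sort.Strings(valid)
	sort.Strings(invalid)
	return valid, invalid
}

func main() {
	files := map[string][]byte{
		"good.png":   tinyPNG(),
		"broken.jpg": []byte("not an image"),
	}
	valid, invalid := partitionByReadability(files)
	fmt.Println(valid, invalid) // [good.png] [broken.jpg]
}
```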

func GetImageFromFilePath

func GetImageFromFilePath(imagePath string) (image.Image, error)

func PreGenerate

func PreGenerate(config *Configuration, numEpochsForTraining int, force bool)

PreGenerate creates datasets that read the original images and then saves the scaled-down, augmented training images in binary format, for faster consumption later.

It will only run if files don't already exist.

func PrefilterValidImages

func PrefilterValidImages(baseDir string) error

PrefilterValidImages is like FilterValidImages, but uses a pre-generated list of images known to be invalid.

func ResizeWithPadding

func ResizeWithPadding(img image.Image, width, height int) image.Image

ResizeWithPadding will

Types

type Configuration

type Configuration struct {
	// DataDir, where downloaded and generated data is stored.
	DataDir string

	// DType of the images when converted to Tensor.
	DType shapes.DType

	// BatchSize for training and evaluation batches.
	BatchSize, EvalBatchSize int

	// ModelImageSize is used for the height and width of the generated images.
	ModelImageSize int

	// YieldImagePairs indicates whether to yield an extra input with the paired image: same image, different random augmentation.
	// Only applies for Train dataset.
	YieldImagePairs bool

	// NumFolds for cross-validation.
	NumFolds int

	// Folds to use for train and validation.
	TrainFolds, ValidationFolds []int

	// FoldsSeed used when randomizing the folds assignment, so it can be done deterministically.
	FoldsSeed int32

	// AngleStdDev for angle perturbation of the image. Only active if > 0.
	AngleStdDev float64

	// FlipRandomly the image, for data augmentation. Only active if true.
	FlipRandomly bool

	// ForceOriginal will make CreateDatasets not use the pre-generated augmented datasets, even if
	// they are present.
	ForceOriginal bool

	// UseParallelism when using Dataset.
	UseParallelism bool

	// BufferSize used for data.ParallelDataset, to cache intermediary batches. This value is used
	// for each dataset.
	BufferSize int

	// NumSamples is the maximum number of samples the model is allowed to see. If set to -1
	// model can see all samples.
	NumSamples int
}

Configuration of the many pre-designed tasks.
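DefaultConfig uses 5 folds, with folds 0 to 3 for training and fold 4 for validation. How the package assigns examples to folds is not documented here, so the following is only one plausible, deterministic scheme: shuffle the example indices with a seeded generator and deal them out round-robin. The names `assignFolds` and `selectFolds` are hypothetical, and the seed is an int64 (rather than the int32 FoldsSeed) because that is what `rand.NewSource` takes.

```go
package main

import (
	"fmt"
	"math/rand"
)

// assignFolds returns, for each of n examples, a fold number in [0, numFolds).
// The same seed always produces the same assignment.
func assignFolds(n, numFolds int, seed int64) []int {
	rng := rand.New(rand.NewSource(seed))
	perm := rng.Perm(n) // random order of the example indices
	folds := make([]int, n)
	for position, example := range perm {
		folds[example] = position % numFolds // deal out round-robin
	}
	return folds
}

// selectFolds returns the indices of the examples whose fold is in wanted.
func selectFolds(folds []int, wanted []int) []int {
	keep := make(map[int]bool, len(wanted))
	for _, f := range wanted {
		keep[f] = true
	}
	var selected []int
	for example, f := range folds {
		if keep[f] {
			selected = append(selected, example)
		}
	}
	return selected
}

func main() {
	folds := assignFolds(10, 5, 0)
	train := selectFolds(folds, []int{0, 1, 2, 3})
	validation := selectFolds(folds, []int{4})
	fmt.Println(len(train), len(validation)) // 8 2
}
```

With a fixed seed, the train and validation splits stay disjoint and reproducible across runs, which is the property the FoldsSeed field is described as providing.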

type Dataset

type Dataset struct {
	BaseDir string
	// contains filtered or unexported fields
}

Dataset implements train.Dataset so it can be used by a train.Loop object to train/evaluate, and offers some additional functionality for sampling images (as opposed to tensors).

var (
	AssertDatasetIsTrainDataset *Dataset
)

func NewDataset

func NewDataset(name, baseDir string, batchSize int, infinite bool, shuffle *rand.Rand,
	numFolds int, folds []int, foldsSeed int32,
	width, height int, angleStdDev float64, flipRandomly bool, dtype shapes.DType) *Dataset

NewDataset creates a train.Dataset that yields images from Dogs vs Cats Dataset.

It takes the following arguments:

  • batchSize: how many images are returned by each Yield call.
  • infinite: if set, the dataset keeps looping and never ends. Typically used for training with `train.Loop.RunSteps()`. Set this to false for evaluation datasets, or if training with `train.Loop.RunEpochs()`.
  • shuffle: if set (not nil), this `*rand.Rand` object is used to shuffle. If infinite is also set, it samples with replacement.
  • numFolds, folds and foldsSeed: split the whole data into numFolds folds (using foldsSeed) and take only `folds` in this Dataset. Can be used to split train/test/validation datasets or run a cross-validation scheme.
  • width, height: resize images to this size. The aspect ratio is not distorted; the extra padding is filled with 0s, including on the alpha channel (so padding is transparent).
  • angleStdDev and flipRandomly: if angleStdDev is greater than 0 or flipRandomly is true, the images are randomly transformed, augmenting the Dataset. This serves as a type of regularization. Set angleStdDev to 0 and flipRandomly to false to disable augmentation.
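The width/height behaviour described above (scale without distortion, pad the rest) comes down to a small piece of arithmetic. The sketch below computes the scaled size and the padding offsets. Centering the image inside the padding is an assumption, as the documentation does not say where the padding goes, and `fitWithPadding` is a hypothetical name.

```go
package main

import "fmt"

// fitWithPadding returns the size (w, h) of an srcW x srcH image scaled to
// fit inside dstW x dstH without changing its aspect ratio, plus the offsets
// at which it would sit if centered (an assumption of this sketch).
func fitWithPadding(srcW, srcH, dstW, dstH int) (w, h, offsetX, offsetY int) {
	// Compare aspect ratios by cross-multiplication to stay in integers.
	if srcW*dstH >= srcH*dstW {
		// Source is relatively wider: width is the limiting dimension.
		w = dstW
		h = srcH * dstW / srcW
	} else {
		// Source is relatively taller: height is the limiting dimension.
		h = dstH
		w = srcW * dstH / srcH
	}
	if w < 1 {
		w = 1
	}
	if h < 1 {
		h = 1
	}
	return w, h, (dstW - w) / 2, (dstH - h) / 2
}

func main() {
	fmt.Println(fitWithPadding(400, 200, 100, 100)) // 100 50 0 25
	fmt.Println(fitWithPadding(200, 400, 100, 100)) // 50 100 25 0
}
```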

func (*Dataset) Augment

func (ds *Dataset) Augment(img image.Image) image.Image

Augment image according to specification of the Dataset.

func (*Dataset) Name

func (ds *Dataset) Name() string

Name implements train.Dataset.

func (*Dataset) Reset

func (ds *Dataset) Reset()

Reset restarts the Dataset from the beginning. Can be called after io.EOF is reached, for instance when running another evaluation on a test Dataset.

func (*Dataset) Save

func (ds *Dataset) Save(numEpochs int, verbose bool, writers ...io.Writer) error

Save generates numEpochs of the dataset, with the configured augmentations and resizing, and saves the result to the given writer(s).

If more than one writer is given, the same image is saved to each one, but with a different augmentation.

It fails if the dataset is set to infinite.

If verbose is set to true, it will output a progress bar.

func (*Dataset) WithImagePairs added in v0.8.0

func (ds *Dataset) WithImagePairs(yieldPairs bool) *Dataset

WithImagePairs configures the dataset to yield image pairs: the same image with different augmentation. Used for BYOL training.

Returns itself, to allow chaining of method calls.

func (*Dataset) Yield

func (ds *Dataset) Yield() (spec any, inputs, labels []tensor.Tensor, err error)

Yield implements `train.Dataset`. It returns:

  • spec: not used, left as nil.
  • inputs: two tensors, the first is the images batch (shaped `[batch_size, height, width, depth==4]`) and the second holds the indices of the images as int (I64), shaped `[batch_size]`.

func (*Dataset) YieldImages

func (ds *Dataset) YieldImages() (images []image.Image, labels []DorOrCat, indices []int, err error)

YieldImages yields a batch of images, their labels (Dog or Cat) and their indices. These are the raw images that can be used for displaying. See Yield to get tensors that can be used for training.

If WithImagePairs is set to true, it will return double the number of images: the second half is a repeat of the first, just with a different augmentation.
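When pairs are enabled, a caller receives a batch of twice the usual length and typically wants to line up each image with its re-augmented twin. A minimal generic sketch of that split, using a hypothetical helper `splitPairs` that is not part of the package:

```go
package main

import (
	"errors"
	"fmt"
)

// splitPairs divides a doubled batch into its two halves, so that
// first[i] and second[i] refer to the same underlying image.
func splitPairs[T any](batch []T) (first, second []T, err error) {
	if len(batch)%2 != 0 {
		return nil, nil, errors.New("paired batch must have an even length")
	}
	half := len(batch) / 2
	return batch[:half], batch[half:], nil
}

func main() {
	// Strings stand in for image.Image values to keep the example small.
	batch := []string{"img0/augA", "img1/augA", "img0/augB", "img1/augB"}
	first, second, err := splitPairs(batch)
	fmt.Println(first, second, err)
}
```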

type DorOrCat

type DorOrCat int8
const (
	Dog DorOrCat = iota
	Cat
)

func (DorOrCat) String

func (t DorOrCat) String() string

type PreGeneratedDataset

type PreGeneratedDataset struct {
	// contains filtered or unexported fields
}

PreGeneratedDataset implements train.Dataset by reading the images from the pre-generated (scaled and optionally augmented) images data. See Dataset.Save for saving these pre-generated files.

func NewPreGeneratedDataset

func NewPreGeneratedDataset(name, filePath string, batchSize int, infinite bool, width, height int, dtype shapes.DType) *PreGeneratedDataset

NewPreGeneratedDataset creates a PreGeneratedDataset that yields dogsvscats images and labels.

func (*PreGeneratedDataset) Name

func (pds *PreGeneratedDataset) Name() string

Name implements train.Dataset.

func (*PreGeneratedDataset) Reset

func (pds *PreGeneratedDataset) Reset()

Reset implements train.Dataset.

func (*PreGeneratedDataset) WithImagePairs added in v0.8.0

func (pds *PreGeneratedDataset) WithImagePairs(pairFilePath string) *PreGeneratedDataset

WithImagePairs configures the dataset to yield image pairs (the same image with a different augmentation).

It takes a second file path `pairFilePath` that points to the pair images. If `pairFilePath` is empty, it disables yielding image pairs.

func (*PreGeneratedDataset) WithMaxSteps added in v0.3.1

func (pds *PreGeneratedDataset) WithMaxSteps(numSteps int) *PreGeneratedDataset

WithMaxSteps configures the dataset to be exhausted after that many steps, returning `io.EOF`.

This is useful for testing.
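The "exhaust after N steps, then return io.EOF" contract, together with Reset, can be illustrated with a stand-in type. The `limitedSource` type and the `drain` function below are hypothetical; they only mimic the control flow a consumer of a step-limited dataset would see.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// limitedSource is a stand-in for a dataset limited to maxSteps yields.
type limitedSource struct {
	step, maxSteps int
}

// Next returns the 1-based step number, or io.EOF once maxSteps is reached.
func (s *limitedSource) Next() (int, error) {
	if s.step >= s.maxSteps {
		return 0, io.EOF
	}
	s.step++
	return s.step, nil
}

// Reset restarts the source from the beginning.
func (s *limitedSource) Reset() { s.step = 0 }

// drain consumes the source until io.EOF and returns how many steps it saw.
func drain(s *limitedSource) (int, error) {
	count := 0
	for {
		_, err := s.Next()
		if errors.Is(err, io.EOF) {
			return count, nil // normal end of the data
		}
		if err != nil {
			return count, err
		}
		count++
	}
}

func main() {
	src := &limitedSource{maxSteps: 3}
	n, err := drain(src)
	fmt.Println(n, err) // 3 <nil>
	src.Reset()
	n, err = drain(src)
	fmt.Println(n, err) // 3 <nil>
}
```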

func (*PreGeneratedDataset) Yield

func (pds *PreGeneratedDataset) Yield() (spec any, inputs, labels []tensor.Tensor, err error)

Yield implements train.Dataset.

Directories

Path Synopsis
demo for Dogs vs Cats library: you can run this program in 3 different ways: