coco

package module
v0.0.0-...-bc882ea
Published: Dec 30, 2022 License: BSD-2-Clause Imports: 3 Imported by: 0

README

coco

COCO dataset API for Go

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func IoUBB

func IoUBB(dt, gt BB, iscrowd []byte) (out []float64)

IoUBB computes intersection over union between bounding boxes. void bbIou( BB dt, BB gt, siz m, siz n, byte *iscrowd, double *o );

func IoURLE

func IoURLE(dt, gt *RLE, iscrowd []byte) (out []float64)

IoURLE computes intersection over union between masks. void rleIou( RLE *dt, RLE *gt, siz m, siz n, byte *iscrowd, double *o );

func NonMaxSupBB

func NonMaxSupBB(dt BB, thresh float64) (keep []bool)

NonMaxSupBB computes non-maximum suppression between bounding boxes. void bbNms( BB dt, siz n, uint *keep, double thr );
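As a pure-Go sketch of the suppression rule (assuming detections are pre-sorted by descending score, as the COCO tooling expects; the iou helper here is illustrative, not part of this package):

```go
package main

import (
	"fmt"
	"math"
)

// nonMaxSupBB walks the detections in order and drops any box whose IoU
// with an earlier kept box exceeds thresh.
func nonMaxSupBB(dt [][4]float64, thresh float64) []bool {
	keep := make([]bool, len(dt))
	for i := range keep {
		keep[i] = true
	}
	for i := range dt {
		if !keep[i] {
			continue
		}
		for j := i + 1; j < len(dt); j++ {
			if keep[j] && iou(dt[i], dt[j]) > thresh {
				keep[j] = false
			}
		}
	}
	return keep
}

// iou computes plain IoU for [x, y, w, h] boxes.
func iou(a, b [4]float64) float64 {
	w := math.Min(a[0]+a[2], b[0]+b[2]) - math.Max(a[0], b[0])
	h := math.Min(a[1]+a[3], b[1]+b[3]) - math.Max(a[1], b[1])
	if w <= 0 || h <= 0 {
		return 0
	}
	inter := w * h
	return inter / (a[2]*a[3] + b[2]*b[3] - inter)
}

func main() {
	// The second box overlaps the first heavily and is suppressed.
	boxes := [][4]float64{{0, 0, 10, 10}, {1, 1, 10, 10}, {50, 50, 10, 10}}
	fmt.Println(nonMaxSupBB(boxes, 0.5))
}
```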

Types

type BB

type BB []float64

BB bounding box

func (BB) ToRLE

func (b BB) ToRLE(h, w, n uint32) *RLE

ToRLE converts bounding boxes to encoded masks. void rleFrBbox( RLE *R, const BB bb, siz h, siz w, siz n );

type Char

type Char struct {
	Cc unsafe.Pointer
}

Char contains a pointer to a C char string

func (*Char) ToRLE

func (c *Char) ToRLE(h, w uint32) *RLE

ToRLE converts from the compressed string representation of an encoded mask. void rleFrString( RLE *R, char *s, siz h, siz w );

type Edge

type Edge [2]uint32

Edge describes a two-point edge, probably [x, y]; I haven't tested it yet

type ICAnnotation

type ICAnnotation struct {
	ID      int    `json:"id,omitempty"`
	ImageID int    `json:"image_id,omitempty"`
	Caption string `json:"caption,omitempty"`
}

ICAnnotation is used for image caption annotations

type Image

type Image struct {
	ID           int    `json:"id,omitempty"`
	Width        int    `json:"width,omitempty"`
	Height       int    `json:"height,omitempty"`
	FileName     string `json:"file_name,omitempty"`
	License      int    `json:"license,omitempty"`
	FlickrURL    string `json:"flickr_url,omitempty"`
	CocoURL      string `json:"coco_url,omitempty"`
	DateCaptured string `json:"date_captured,omitempty"`
}

Image is the image information and is shared between all the data formats

type ImageCaption

type ImageCaption struct {
	Info        Information    `json:"info,omitempty"`
	Images      []Image        `json:"images,omitempty"`
	Annotations []ICAnnotation `json:"annotations,omitempty"`
	Licenses    []License      `json:"licenses,omitempty"`
}

ImageCaption is used for the JSON annotation for image captioning. From cocodataset.org documentation: These annotations are used to store image captions. Each caption describes the specified image and each image has at least 5 captions (some images have more). See also the captioning task.

type ImageCaptioning

type ImageCaptioning struct {
	ImageID int    `json:"image_id"`
	Caption string `json:"caption"`
}

ImageCaptioning is the form for results for Image Captioning

type Information

type Information struct {
	Year        int    `json:"year,omitempty"`
	Version     string `json:"version,omitempty"`
	Description string `json:"description,omitempty"`
	Contributor string `json:"contributor,omitempty"`
	URL         string `json:"url,omitempty"`
	DateCreated string `json:"date_created,omitempty"`
}

Information is basic image and COCO information and is shared between all the data formats

type KPAnnotation

type KPAnnotation struct {
	Keypoints    []float32  `json:"keypoints,omitempty"`
	NumKeypoints int        `json:"num_keypoints,omitempty"`
	ID           int        `json:"id,omitempty"`
	ImageID      int        `json:"image_id,omitempty"`
	CategoryID   int        `json:"category_id,omitempty"`
	Segmentation Segment    `json:"segmentation,omitempty"`
	Area         float32    `json:"area,omitempty"`
	Bbox         [4]float32 `json:"bbox,omitempty"`
	Iscrowd      byte       `json:"iscrowd,omitempty"`
}

KPAnnotation contains the keypoint annotation information

type KPCategories

type KPCategories struct {
	Keypoints     []string `json:"keypoints,omitempty"`
	Skeleton      []Edge   `json:"skeleton,omitempty"`
	ID            int      `json:"id,omitempty"`
	Name          string   `json:"name,omitempty"`
	Supercategory string   `json:"supercategory,omitempty"`
}

KPCategories are the categories for keypoint detection

type KeyPoint

type KeyPoint struct {
	ImageID    int       `json:"image_id"`
	CategoryID int       `json:"category_id"`
	Keypoints  []float32 `json:"keypoints"`
	Score      float32   `json:"score"`
}

KeyPoint is the form for results of Keypoint Detection. From cocodataset.org documentation: Note: keypoint coordinates are floats measured from the top left image corner (and are 0-indexed). We recommend rounding coordinates to the nearest pixel to reduce file size. Note also that the visibility flags vi are not currently used (except for controlling visualization), we recommend simply setting vi=1.

type KeypointDetection

type KeypointDetection struct {
	Info        Information    `json:"info,omitempty"`
	Images      []Image        `json:"images,omitempty"`
	Annotations []KPAnnotation `json:"annotations,omitempty"`
	Licenses    []License      `json:"licenses,omitempty"`
	Categories  []KPAnnotation `json:"categories,omitempty"`
}

KeypointDetection holds the Keypoint Detection data types in the JSON format. From cocodataset.org Documentation:

A keypoint annotation contains all the data of the object annotation (including id, bbox, etc.) and two additional fields. First, "keypoints" is a length 3k array where k is the total number of keypoints defined for the category. Each keypoint has a 0-indexed location x,y and a visibility flag v defined as v=0: not labeled (in which case x=y=0), v=1: labeled but not visible, and v=2: labeled and visible. A keypoint is considered visible if it falls inside the object segment. "num_keypoints" indicates the number of labeled keypoints (v>0) for a given object (many objects, e.g. crowds and small objects, will have num_keypoints=0). Finally, for each category, the categories struct has two additional fields: "keypoints," which is a length k array of keypoint names, and "skeleton", which defines connectivity via a list of keypoint edge pairs and is used for visualization. Currently keypoints are only labeled for the person category (for most medium/large non-crowd person instances). See also the keypoint task.
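The 3k layout can be unpacked into (x, y, v) triples; countLabeled below is a hypothetical helper that mirrors what num_keypoints reports:

```go
package main

import "fmt"

// countLabeled walks a length-3k keypoints array in (x, y, v) triples and
// counts the labeled keypoints (v > 0), i.e. what num_keypoints reports.
func countLabeled(keypoints []float32) int {
	n := 0
	for i := 0; i+2 < len(keypoints); i += 3 {
		// keypoints[i], keypoints[i+1] are x, y; keypoints[i+2] is the
		// visibility flag: 0 = not labeled (x=y=0), 1 = labeled but not
		// visible, 2 = labeled and visible.
		if keypoints[i+2] > 0 {
			n++
		}
	}
	return n
}

func main() {
	// Three keypoints: unlabeled (v=0), occluded (v=1), visible (v=2).
	kp := []float32{0, 0, 0, 120, 45, 1, 130, 50, 2}
	fmt.Println(countLabeled(kp)) // 2
}
```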

type License

type License struct {
	ID   int    `json:"id,omitempty"`
	Name string `json:"name,omitempty"`
	URL  string `json:"url,omitempty"`
}

License is the license information and is shared between all the formats

type ODAnnotation

type ODAnnotation struct {
	ID           int        `json:"id,omitempty"`
	ImageID      int        `json:"image_id,omitempty"`
	CategoryID   int        `json:"category_id,omitempty"`
	Segmentation Segment    `json:"segmentation,omitempty"`
	Area         float32    `json:"area,omitempty"`
	Bbox         [4]float32 `json:"bbox,omitempty"`
	Iscrowd      byte       `json:"iscrowd,omitempty"`
}

ODAnnotation is the object detection annotation

type ODCategories

type ODCategories struct {
	ID            int    `json:"id,omitempty"`
	Name          string `json:"name,omitempty"`
	Supercategory string `json:"supercategory,omitempty"`
}

ODCategories is the object detection categories.

type ObjectBB

type ObjectBB struct {
	ImageID    int        `json:"image_id"`
	CategoryID int        `json:"category_id"`
	BBox       [4]float32 `json:"bbox"` // x, y, width, height
	Score      float32    `json:"score"`
}

ObjectBB is the form for results for bounding-box object detection. From cocodataset.org documentation: Note: box coordinates are floats measured from the top left image corner (and are 0-indexed). We recommend rounding coordinates to the nearest tenth of a pixel to reduce resulting JSON file size.

type ObjectDetection

type ObjectDetection struct {
	Info        Information    `json:"info,omitempty"`
	Images      []Image        `json:"images,omitempty"`
	Annotations []ODAnnotation `json:"annotations,omitempty"`
	Licenses    []License      `json:"licenses,omitempty"`
	Categories  []ODCategories `json:"categories,omitempty"`
}

ObjectDetection is used for the object detection JSON format. From cocodataset.org documentation: Each object instance annotation contains a series of fields, including the category id and segmentation mask of the object. The segmentation format depends on whether the instance represents a single object (iscrowd=0 in which case polygons are used) or a collection of objects (iscrowd=1 in which case RLE is used). Note that a single object (iscrowd=0) may require multiple polygons, for example if occluded. Crowd annotations (iscrowd=1) are used to label large groups of objects (e.g. a crowd of people). In addition, an enclosing bounding box is provided for each object (box coordinates are measured from the top left image corner and are 0-indexed). Finally, the categories field of the annotation structure stores the mapping of category id to category and supercategory names. See also the detection task.

type ObjectSeg

type ObjectSeg struct {
	ImageID      int     `json:"image_id"`
	CategoryID   int     `json:"category_id"`
	Segmentation RLE     `json:"segmentation"`
	Score        float32 `json:"score"`
}

ObjectSeg is the form for results for segmentation object detection. From cocodataset.org documentation: Note: a binary mask containing an object segment should be encoded to RLE using the MaskApi function encode().

type PSAnnotation

type PSAnnotation struct {
	ImageID      int             `json:"image_id,omitempty"`
	FileName     string          `json:"file_name,omitempty"`
	SegmentsInfo []PSSegmentInfo `json:"segments_info,omitempty"`
}

PSAnnotation is the annotation for Panoptic Segmentation

type PSCategories

type PSCategories struct {
	ID            int       `json:"id,omitempty"`
	Name          string    `json:"name,omitempty"`
	Supercategory string    `json:"supercategory,omitempty"`
	Isthing       byte      `json:"isthing,omitempty"`
	Color         [3]uint32 `json:"color,omitempty"`
}

PSCategories contains category information for the PanopticSegmentation json file

type PSSegmentInfo

type PSSegmentInfo struct {
	ID         int        `json:"id,omitempty"`
	CategoryID int        `json:"category_id,omitempty"`
	Area       int        `json:"area,omitempty"`
	Bbox       [4]float32 `json:"bbox,omitempty"`
	Iscrowd    byte       `json:"iscrowd,omitempty"`
}

PSSegmentInfo contains segment info for the annotation

type PanopticSeg

type PanopticSeg struct {
	ImageID      int                 `json:"image_id"`
	FileName     string              `json:"file_name"`
	SegmentsInfo []SegmentInfoResult `json:"segments_info"`
}

PanopticSeg is the form for results for Panoptic Segmentation. From cocodataset.org documentation: For the panoptic task, each per-image annotation should have two parts: (1) a PNG that stores the class-agnostic image segmentation (2) a JSON struct that stores the semantic information for each image segment. The PNGs should be located in the folder annotations/name/*, where annotations/name.json is the JSON file. For details see the ground truth format for panoptic segmentation. Results for evaluation should contain both the JSON and the PNGs.

type PanopticSegmentation

type PanopticSegmentation struct {
	Info        Information    `json:"info,omitempty"`
	Images      []Image        `json:"images,omitempty"`
	Annotations []PSAnnotation `json:"annotations,omitempty"`
	Licenses    []License      `json:"licenses,omitempty"`
	Categories  []PSCategories `json:"categories,omitempty"`
}

PanopticSegmentation is used for the Panoptic Segmentation task. From cocodataset.org Documentation:

For the panoptic task, each annotation struct is a per-image annotation rather than a per-object annotation. Each per-image annotation has two parts: (1) a PNG that stores the class-agnostic image segmentation (2) a JSON struct that stores the semantic information for each image segment. In more detail:

1. To match an annotation with an image, use the image_id field (that is annotation.image_id==image.id).
2. For each annotation, per-pixel segment ids are stored as a single PNG at annotation.file_name. The PNGs are in a folder with the same name as the JSON, i.e., annotations/name/ for annotations/name.json. Each segment (whether it's a stuff or thing segment) is assigned a unique id. Unlabeled pixels (void) are assigned a value of 0. Note that when you load the PNG as an RGB image, you will need to compute the ids via ids=R+G*256+B*256^2.
3. For each annotation, per-segment info is stored in annotation.segments_info. segment_info.id stores the unique id of the segment and is used to retrieve the corresponding mask from the PNG (ids==segment_info.id). category_id gives the semantic category and iscrowd indicates the segment encompasses a group of objects (relevant for thing categories only). The bbox and area fields provide additional info about the segment.
4. The COCO panoptic task has the same thing categories as the detection task, whereas the stuff categories differ from those in the stuff task (for details see the panoptic evaluation page). Finally, each category struct has two additional fields: isthing that distinguishes stuff and thing categories and color that is useful for consistent visualization.
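The per-pixel id computation above (ids=R+G*256+B*256^2) can be sketched directly in Go:

```go
package main

import "fmt"

// segmentID recovers a panoptic segment id from one RGB pixel of the
// per-image PNG via ids = R + G*256 + B*256^2.
func segmentID(r, g, b uint32) uint32 {
	return r + g*256 + b*256*256
}

func main() {
	// 26 + 2*256 + 0*65536 = 538
	fmt.Println(segmentID(26, 2, 0)) // 538
}
```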

type Polygon

type Polygon [][]float32

Polygon, from what I've seen, looks to be in the form [1][x0, x1, x2, x3, ...], but the first dimension could be larger.

type RLE

type RLE struct {
	// contains filtered or unexported fields
}

RLE contains a pointer to an array of C.RLE

func CompressRLE

func CompressRLE(cnts []uint32, h, w uint32) *RLE

CompressRLE compresses cnts using RLE. void rleInit( RLE *R, siz h, siz w, siz m, uint *cnts );

func EncodeRLE

func EncodeRLE(mask []byte, h, w, n uint32) *RLE

EncodeRLE encodes binary masks using RLE. void rleEncode( RLE *R, const byte *mask, siz h, siz w, siz n );
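The C routine isn't reproduced here, but the counting scheme can be sketched in pure Go, assuming the mask buffer is in column-major (Fortran) order as COCO's C code expects, with counts alternating runs of 0s and 1s starting with a 0-run:

```go
package main

import "fmt"

// encodeRLE sketches the counting scheme rleEncode uses: counts alternate
// runs of 0s and 1s, always starting with a 0-run, so a mask beginning
// with a 1 gets a leading count of 0.
func encodeRLE(mask []byte) []uint32 {
	var counts []uint32
	prev := byte(0)
	var run uint32
	for _, v := range mask {
		if v != prev {
			counts = append(counts, run)
			run, prev = 0, v
		}
		run++
	}
	return append(counts, run)
}

func main() {
	// Two zeros, three ones, one zero.
	fmt.Println(encodeRLE([]byte{0, 0, 1, 1, 1, 0})) // [2 3 1]
}
```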

func InitRLEs

func InitRLEs(size uint32) *RLE

InitRLEs creates an array of *RLE which holds a pointer to an array of C.RLEs

func RLEFromPoly

func RLEFromPoly(poly *float64, k, h, w uint32) *RLE

RLEFromPoly converts a polygon to an encoded mask. void rleFrPoly( RLE *R, const double *xy, siz k, siz h, siz w );

func (*RLE) AreaRLE

func (r *RLE) AreaRLE() []uint32

AreaRLE - Compute area of encoded masks. void rleArea( const RLE *R, siz n, uint *a );

func (*RLE) Decode

func (r *RLE) Decode() (mask []byte)

Decode decodes binary masks encoded via RLE. void rleDecode( const RLE *R, byte *mask, siz n );

func (*RLE) MergeFrom

func (r *RLE) MergeFrom(m *RLE, intersect bool)

MergeFrom - Compute union or intersection of encoded masks; merges m into r.

func (*RLE) NonMaxSup

func (r *RLE) NonMaxSup(thresh float64) (keep []bool)

NonMaxSup - Compute non-maximum suppression between encoded masks. void rleNms( RLE *dt, siz n, uint *keep, double thr );

func (*RLE) ToBB

func (r *RLE) ToBB() (bb BB)

ToBB gets the bounding boxes surrounding encoded masks. void rleToBbox( const RLE *R, BB bb, siz n );

func (*RLE) ToChar

func (r *RLE) ToChar() *Char

ToChar gets the compressed string representation of an encoded mask. char* rleToString( const RLE *R );

type RLEgo

type RLEgo struct {
	Counts []uint32
	Size   []uint32
}

RLEgo is a struct that holds the RLE information decoded from JSON

type SGAnnotation

type SGAnnotation struct {
	ID           int        `json:"id,omitempty"`
	ImageID      int        `json:"image_id,omitempty"`
	CategoryID   int        `json:"category_id,omitempty"`
	Segmentation Segment    `json:"segmentation,omitempty"`
	Area         float32    `json:"area,omitempty"`
	Bbox         [4]float32 `json:"bbox,omitempty"`
}

SGAnnotation is the stuff segmentation annotation

type Segment

type Segment interface {
}

Segment is a placeholder interface for segmentation data structures. It can be either an RLE or a Polygon.

type SegmentInfoResult

type SegmentInfoResult struct {
	ID         int `json:"id"`
	CategoryID int `json:"category_id"`
}

SegmentInfoResult is used in PanopticSeg

type SegmentationHelper

type SegmentationHelper struct {
	Poly Polygon
	Rle  RLEgo
}

SegmentationHelper is used for segmentation

type StuffSeg

type StuffSeg struct {
	ImageID      int `json:"image_id"`
	CategoryID   int `json:"category_id"`
	Segmentation RLE `json:"segmentation"`
}

StuffSeg is the form for results of Stuff Segmentation. From cocodataset.org documentation: The stuff segmentation format is identical to the object segmentation format except the score field is not necessary. Note: We recommend encoding each label that occurs in an image with a single binary mask. Binary masks should be encoded via RLE using the MaskApi function encode().

type StuffSegmentation

type StuffSegmentation struct {
	Info        Information    `json:"info,omitempty"`
	Images      []Image        `json:"images,omitempty"`
	Annotations []SGAnnotation `json:"annotations,omitempty"`
	Licenses    []License      `json:"licenses,omitempty"`
	Categories  []ODCategories `json:"categories,omitempty"`
}

StuffSegmentation is a lot like object detection but the annotations don't include iscrowd. From cocodataset.org Documentation:

The stuff annotation format is identical and fully compatible to the object detection format above (except iscrowd is unnecessary and set to 0 by default). We provide annotations in both JSON and png format for easier access, as well as conversion scripts between the two formats. In the JSON format, each category present in an image is encoded with a single RLE annotation. The category_id represents the id of the current stuff category. For more details on stuff categories and supercategories see the stuff evaluation page. See also the stuff task.
