coco

package module
v0.0.0-...-bc882ea
Published: Dec 30, 2022 License: BSD-2-Clause Imports: 3 Imported by: 0

README

coco

COCO dataset API for Go

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func IoUBB

func IoUBB(dt, gt BB, iscrowd []byte) (out []float64)

IoUBB computes intersection over union between bounding boxes. void bbIou( BB dt, BB gt, siz m, siz n, byte *iscrowd, double *o );

func IoURLE

func IoURLE(dt, gt *RLE, iscrowd []byte) (out []float64)

IoURLE computes intersection over union between masks. void rleIou( RLE *dt, RLE *gt, siz m, siz n, byte *iscrowd, double *o );

func NonMaxSupBB

func NonMaxSupBB(dt BB, thresh float64) (keep []bool)

NonMaxSupBB computes non-maximum suppression between bounding boxes. void bbNms( BB dt, siz n, uint *keep, double thr );
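As a pure-Go sketch of the suppression rule (assuming detections are pre-sorted by descending score, as the COCO tooling expects; the iou helper here is illustrative, not part of this package):

```go
package main

import (
	"fmt"
	"math"
)

// nonMaxSupBB walks the detections in order and drops any box whose IoU
// with an earlier kept box exceeds thresh.
func nonMaxSupBB(dt [][4]float64, thresh float64) []bool {
	keep := make([]bool, len(dt))
	for i := range keep {
		keep[i] = true
	}
	for i := range dt {
		if !keep[i] {
			continue
		}
		for j := i + 1; j < len(dt); j++ {
			if keep[j] && iou(dt[i], dt[j]) > thresh {
				keep[j] = false
			}
		}
	}
	return keep
}

// iou computes plain IoU for [x, y, w, h] boxes.
func iou(a, b [4]float64) float64 {
	w := math.Min(a[0]+a[2], b[0]+b[2]) - math.Max(a[0], b[0])
	h := math.Min(a[1]+a[3], b[1]+b[3]) - math.Max(a[1], b[1])
	if w <= 0 || h <= 0 {
		return 0
	}
	inter := w * h
	return inter / (a[2]*a[3] + b[2]*b[3] - inter)
}

func main() {
	// The second box overlaps the first heavily and is suppressed.
	boxes := [][4]float64{{0, 0, 10, 10}, {1, 1, 10, 10}, {50, 50, 10, 10}}
	fmt.Println(nonMaxSupBB(boxes, 0.5))
}
```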

Types

type BB

type BB []float64

BB bounding box

func (BB) ToRLE

func (b BB) ToRLE(h, w, n uint32) *RLE

ToRLE converts bounding boxes to encoded masks. void rleFrBbox( RLE *R, const BB bb, siz h, siz w, siz n );

type Char

type Char struct {
	Cc unsafe.Pointer
}

Char contains a pointer to a C char string

func (*Char) ToRLE

func (c *Char) ToRLE(h, w uint32) *RLE

ToRLE converts from the compressed string representation of an encoded mask. void rleFrString( RLE *R, char *s, siz h, siz w );

type Edge

type Edge [2]uint32

Edge describes a two-point edge, probably [x, y]; I haven't tested it yet

type ICAnnotation

type ICAnnotation struct {
	ID      int    `json:"id,omitempty"`
	ImageID int    `json:"image_id,omitempty"`
	Caption string `json:"caption,omitempty"`
}

ICAnnotation is used for image caption annotations

type Image

type Image struct {
	ID           int    `json:"id,omitempty"`
	Width        int    `json:"width,omitempty"`
	Height       int    `json:"height,omitempty"`
	FileName     string `json:"file_name,omitempty"`
	License      int    `json:"license,omitempty"`
	FlickrURL    string `json:"flickr_url,omitempty"`
	CocoURL      string `json:"coco_url,omitempty"`
	DateCaptured string `json:"date_captured,omitempty"`
}

Image is the image information and is shared between all the data formats

type ImageCaption

type ImageCaption struct {
	Info        Information    `json:"info,omitempty"`
	Images      []Image        `json:"images,omitempty"`
	Annotations []ICAnnotation `json:"annotations,omitempty"`
	Licenses    []License      `json:"licenses,omitempty"`
}

ImageCaption is used for the JSON annotation for image captioning. From cocodataset.org documentation: These annotations are used to store image captions. Each caption describes the specified image and each image has at least 5 captions (some images have more). See also the captioning task.

type ImageCaptioning

type ImageCaptioning struct {
	ImageID int    `json:"image_id"`
	Caption string `json:"caption"`
}

ImageCaptioning is the form for results for Image Captioning

type Information

type Information struct {
	Year        int    `json:"year,omitempty"`
	Version     string `json:"version,omitempty"`
	Description string `json:"description,omitempty"`
	Contributor string `json:"contributor,omitempty"`
	URL         string `json:"url,omitempty"`
	DateCreated string `json:"date_created,omitempty"`
}

Information is basic image and COCO information and is shared between all the data formats

type KPAnnotation

type KPAnnotation struct {
	Keypoints    []float32  `json:"keypoints,omitempty"`
	NumKeypoints int        `json:"num_keypoints,omitempty"`
	ID           int        `json:"id,omitempty"`
	ImageID      int        `json:"image_id,omitempty"`
	CategoryID   int        `json:"category_id,omitempty"`
	Segmentation Segment    `json:"segmentation,omitempty"`
	Area         float32    `json:"area,omitempty"`
	Bbox         [4]float32 `json:"bbox,omitempty"`
	Iscrowd      byte       `json:"iscrowd,omitempty"`
}

KPAnnotation contains the keypoint annotation information

type KPCategories

type KPCategories struct {
	Keypoints     []string `json:"keypoints,omitempty"`
	Skeleton      []Edge   `json:"skeleton,omitempty"`
	ID            int      `json:"id,omitempty"`
	Name          string   `json:"name,omitempty"`
	Supercategory string   `json:"supercategory,omitempty"`
}

KPCategories are the categories for keypoint detection

type KeyPoint

type KeyPoint struct {
	ImageID    int       `json:"image_id"`
	CategoryID int       `json:"category_id"`
	Keypoints  []float32 `json:"keypoints"`
	Score      float32   `json:"score"`
}

KeyPoint is the form for results of Keypoint Detection. From cocodataset.org documentation: Note: keypoint coordinates are floats measured from the top left image corner (and are 0-indexed). We recommend rounding coordinates to the nearest pixel to reduce file size. Note also that the visibility flags vi are not currently used (except for controlling visualization), we recommend simply setting vi=1.

type KeypointDetection

type KeypointDetection struct {
	Info        Information    `json:"info,omitempty"`
	Images      []Image        `json:"images,omitempty"`
	Annotations []KPAnnotation `json:"annotations,omitempty"`
	Licenses    []License      `json:"licenses,omitempty"`
	Categories  []KPAnnotation `json:"categories,omitempty"`
}

KeypointDetection holds the Keypoint Detection data types in the JSON format. From cocodataset.org Documentation:

A keypoint annotation contains all the data of the object annotation (including id, bbox, etc.) and two additional fields. First, "keypoints" is a length 3k array where k is the total number of keypoints defined for the category. Each keypoint has a 0-indexed location x,y and a visibility flag v defined as v=0: not labeled (in which case x=y=0), v=1: labeled but not visible, and v=2: labeled and visible. A keypoint is considered visible if it falls inside the object segment. "num_keypoints" indicates the number of labeled keypoints (v>0) for a given object (many objects, e.g. crowds and small objects, will have num_keypoints=0). Finally, for each category, the categories struct has two additional fields: "keypoints," which is a length k array of keypoint names, and "skeleton", which defines connectivity via a list of keypoint edge pairs and is used for visualization. Currently keypoints are only labeled for the person category (for most medium/large non-crowd person instances). See also the keypoint task.
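The 3k layout can be unpacked into (x, y, v) triples; countLabeled below is a hypothetical helper that mirrors what num_keypoints reports:

```go
package main

import "fmt"

// countLabeled walks a length-3k keypoints array in (x, y, v) triples and
// counts the labeled keypoints (v > 0), i.e. what num_keypoints reports.
func countLabeled(keypoints []float32) int {
	n := 0
	for i := 0; i+2 < len(keypoints); i += 3 {
		// keypoints[i], keypoints[i+1] are x, y; keypoints[i+2] is the
		// visibility flag: 0 = not labeled (x=y=0), 1 = labeled but not
		// visible, 2 = labeled and visible.
		if keypoints[i+2] > 0 {
			n++
		}
	}
	return n
}

func main() {
	// Three keypoints: unlabeled (v=0), occluded (v=1), visible (v=2).
	kp := []float32{0, 0, 0, 120, 45, 1, 130, 50, 2}
	fmt.Println(countLabeled(kp)) // 2
}
```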

type License

type License struct {
	ID   int    `json:"id,omitempty"`
	Name string `json:"name,omitempty"`
	URL  string `json:"url,omitempty"`
}

License is the license information and is shared between all the formats

type ODAnnotation

type ODAnnotation struct {
	ID           int        `json:"id,omitempty"`
	ImageID      int        `json:"image_id,omitempty"`
	CategoryID   int        `json:"category_id,omitempty"`
	Segmentation Segment    `json:"segmentation,omitempty"`
	Area         float32    `json:"area,omitempty"`
	Bbox         [4]float32 `json:"bbox,omitempty"`
	Iscrowd      byte       `json:"iscrowd,omitempty"`
}

ODAnnotation is the object detection annotation

type ODCategories

type ODCategories struct {
	ID            int    `json:"id,omitempty"`
	Name          string `json:"name,omitempty"`
	Supercategory string `json:"supercategory,omitempty"`
}

ODCategories is the object detection categories.

type ObjectBB

type ObjectBB struct {
	ImageID    int        `json:"image_id"`
	CategoryID int        `json:"category_id"`
	BBox       [4]float32 `json:"bbox"` // x, y, width, height
	Score      float32    `json:"score"`
}

ObjectBB is the form for results for bounding-box object detection. From cocodataset.org documentation: Note: box coordinates are floats measured from the top left image corner (and are 0-indexed). We recommend rounding coordinates to the nearest tenth of a pixel to reduce resulting JSON file size.

type ObjectDetection

type ObjectDetection struct {
	Info        Information    `json:"info,omitempty"`
	Images      []Image        `json:"images,omitempty"`
	Annotations []ODAnnotation `json:"annotations,omitempty"`
	Licenses    []License      `json:"licenses,omitempty"`
	Categories  []ODCategories `json:"categories,omitempty"`
}

ObjectDetection is used for the object detection JSON format. From cocodataset.org documentation: Each object instance annotation contains a series of fields, including the category id and segmentation mask of the object. The segmentation format depends on whether the instance represents a single object (iscrowd=0 in which case polygons are used) or a collection of objects (iscrowd=1 in which case RLE is used). Note that a single object (iscrowd=0) may require multiple polygons, for example if occluded. Crowd annotations (iscrowd=1) are used to label large groups of objects (e.g. a crowd of people). In addition, an enclosing bounding box is provided for each object (box coordinates are measured from the top left image corner and are 0-indexed). Finally, the categories field of the annotation structure stores the mapping of category id to category and supercategory names. See also the detection task.

type ObjectSeg

type ObjectSeg struct {
	ImageID      int     `json:"image_id"`
	CategoryID   int     `json:"category_id"`
	Segmentation RLE     `json:"segmentation"`
	Score        float32 `json:"score"`
}

ObjectSeg is the form for results for segmentation object detection. From cocodataset.org documentation: Note: a binary mask containing an object segment should be encoded to RLE using the MaskApi function encode().

type PSAnnotation

type PSAnnotation struct {
	ImageID      int             `json:"image_id,omitempty"`
	FileName     string          `json:"file_name,omitempty"`
	SegmentsInfo []PSSegmentInfo `json:"segments_info,omitempty"`
}

PSAnnotation is the annotation for Panoptic Segmentation

type PSCategories

type PSCategories struct {
	ID            int       `json:"id,omitempty"`
	Name          string    `json:"name,omitempty"`
	Supercategory string    `json:"supercategory,omitempty"`
	Isthing       byte      `json:"isthing,omitempty"`
	Color         [3]uint32 `json:"color,omitempty"`
}

PSCategories contains category information for the PanopticSegmentation json file

type PSSegmentInfo

type PSSegmentInfo struct {
	ID         int        `json:"id,omitempty"`
	CategoryID int        `json:"category_id,omitempty"`
	Area       int        `json:"area,omitempty"`
	Bbox       [4]float32 `json:"bbox,omitempty"`
	Iscrowd    byte       `json:"iscrowd,omitempty"`
}

PSSegmentInfo contains segment info for the annotation

type PanopticSeg

type PanopticSeg struct {
	ImageID      int                 `json:"image_id"`
	FileName     string              `json:"file_name"`
	SegmentsInfo []SegmentInfoResult `json:"segments_info"`
}

PanopticSeg is the form for results for Panoptic Segmentation. From cocodataset.org documentation: For the panoptic task, each per-image annotation should have two parts: (1) a PNG that stores the class-agnostic image segmentation (2) a JSON struct that stores the semantic information for each image segment. The PNGs should be located in the folder annotations/name/*, where annotations/name.json is the JSON file. For details see the ground truth format for panoptic segmentation. Results for evaluation should contain both the JSON and the PNGs.

type PanopticSegmentation

type PanopticSegmentation struct {
	Info        Information    `json:"info,omitempty"`
	Images      []Image        `json:"images,omitempty"`
	Annotations []PSAnnotation `json:"annotations,omitempty"`
	Licenses    []License      `json:"licenses,omitempty"`
	Categories  []PSCategories `json:"categories,omitempty"`
}

PanopticSegmentation is used for the Panoptic Segmentation task. From cocodataset.org Documentation:

For the panoptic task, each annotation struct is a per-image annotation rather than a per-object annotation. Each per-image annotation has two parts: (1) a PNG that stores the class-agnostic image segmentation (2) a JSON struct that stores the semantic information for each image segment. In more detail:

1. To match an annotation with an image, use the image_id field (that is annotation.image_id==image.id).
2. For each annotation, per-pixel segment ids are stored as a single PNG at annotation.file_name. The PNGs are in a folder with the same name as the JSON, i.e., annotations/name/ for annotations/name.json. Each segment (whether it's a stuff or thing segment) is assigned a unique id. Unlabeled pixels (void) are assigned a value of 0. Note that when you load the PNG as an RGB image, you will need to compute the ids via ids=R+G*256+B*256^2.
3. For each annotation, per-segment info is stored in annotation.segments_info. segment_info.id stores the unique id of the segment and is used to retrieve the corresponding mask from the PNG (ids==segment_info.id). category_id gives the semantic category and iscrowd indicates the segment encompasses a group of objects (relevant for thing categories only). The bbox and area fields provide additional info about the segment.
4. The COCO panoptic task has the same thing categories as the detection task, whereas the stuff categories differ from those in the stuff task (for details see the panoptic evaluation page). Finally, each category struct has two additional fields: isthing that distinguishes stuff and thing categories and color that is useful for consistent visualization.
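The per-pixel id computation above (ids=R+G*256+B*256^2) can be sketched directly in Go:

```go
package main

import "fmt"

// segmentID recovers a panoptic segment id from one RGB pixel of the
// per-image PNG via ids = R + G*256 + B*256^2.
func segmentID(r, g, b uint32) uint32 {
	return r + g*256 + b*256*256
}

func main() {
	// 26 + 2*256 + 0*65536 = 538
	fmt.Println(segmentID(26, 2, 0)) // 538
}
```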

type Polygon

type Polygon [][]float32

Polygon, from what I've seen, looks to be in the form [1][x0, x1, x2, x3, ...], but the first dimension could be larger.

type RLE

type RLE struct {
	// contains filtered or unexported fields
}

RLE contains a pointer to an array of C.RLE

func CompressRLE

func CompressRLE(cnts []uint32, h, w uint32) *RLE

CompressRLE compresses cnts using RLE. void rleInit( RLE *R, siz h, siz w, siz m, uint *cnts );

func EncodeRLE

func EncodeRLE(mask []byte, h, w, n uint32) *RLE

EncodeRLE encodes binary masks using RLE. void rleEncode( RLE *R, const byte *mask, siz h, siz w, siz n );
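The C routine isn't reproduced here, but the counting scheme can be sketched in pure Go, assuming the mask buffer is in column-major (Fortran) order as COCO's C code expects, with counts alternating runs of 0s and 1s starting with a 0-run:

```go
package main

import "fmt"

// encodeRLE sketches the counting scheme rleEncode uses: counts alternate
// runs of 0s and 1s, always starting with a 0-run, so a mask beginning
// with a 1 gets a leading count of 0.
func encodeRLE(mask []byte) []uint32 {
	var counts []uint32
	prev := byte(0)
	var run uint32
	for _, v := range mask {
		if v != prev {
			counts = append(counts, run)
			run, prev = 0, v
		}
		run++
	}
	return append(counts, run)
}

func main() {
	// Two zeros, three ones, one zero.
	fmt.Println(encodeRLE([]byte{0, 0, 1, 1, 1, 0})) // [2 3 1]
}
```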

func InitRLEs

func InitRLEs(size uint32) *RLE

InitRLEs creates an array of *RLE which holds a pointer to an array of C.RLEs

func RLEFromPoly

func RLEFromPoly(poly *float64, k, h, w uint32) *RLE

RLEFromPoly converts a polygon to an encoded mask. void rleFrPoly( RLE *R, const double *xy, siz k, siz h, siz w );

func (*RLE) AreaRLE

func (r *RLE) AreaRLE() []uint32

AreaRLE - Compute area of encoded masks. void rleArea( const RLE *R, siz n, uint *a );

func (*RLE) Decode

func (r *RLE) Decode() (mask []byte)

Decode decodes binary masks encoded via RLE. void rleDecode( const RLE *R, byte *mask, siz n );

func (*RLE) MergeFrom

func (r *RLE) MergeFrom(m *RLE, intersect bool)

MergeFrom - Compute union or intersection of encoded masks; merges m into r.

func (*RLE) NonMaxSup

func (r *RLE) NonMaxSup(thresh float64) (keep []bool)

NonMaxSup - Compute non-maximum suppression between encoded masks. void rleNms( RLE *dt, siz n, uint *keep, double thr );

func (*RLE) ToBB

func (r *RLE) ToBB() (bb BB)

ToBB gets the bounding boxes surrounding encoded masks. void rleToBbox( const RLE *R, BB bb, siz n );

func (*RLE) ToChar

func (r *RLE) ToChar() *Char

ToChar gets the compressed string representation of an encoded mask. char* rleToString( const RLE *R );

type RLEgo

type RLEgo struct {
	Counts []uint32
	Size   []uint32
}

RLEgo is a struct that holds the RLE information decoded from JSON

type SGAnnotation

type SGAnnotation struct {
	ID           int        `json:"id,omitempty"`
	ImageID      int        `json:"image_id,omitempty"`
	CategoryID   int        `json:"category_id,omitempty"`
	Segmentation Segment    `json:"segmentation,omitempty"`
	Area         float32    `json:"area,omitempty"`
	Bbox         [4]float32 `json:"bbox,omitempty"`
}

SGAnnotation is the stuff segmentation annotation

type Segment

type Segment interface {
}

Segment is a placeholder interface for segmentation data structures. It can be either an RLE or a Polygon.

type SegmentInfoResult

type SegmentInfoResult struct {
	ID         int `json:"id"`
	CategoryID int `json:"category_id"`
}

SegmentInfoResult is used in PanopticSeg

type SegmentationHelper

type SegmentationHelper struct {
	Poly Polygon
	Rle  RLEgo
}

SegmentationHelper is used for segmentation

type StuffSeg

type StuffSeg struct {
	ImageID      int `json:"image_id"`
	CategoryID   int `json:"category_id"`
	Segmentation RLE `json:"segmentation"`
}

StuffSeg is the form for results of Stuff Segmentation. From cocodataset.org documentation: The stuff segmentation format is identical to the object segmentation format except the score field is not necessary. Note: We recommend encoding each label that occurs in an image with a single binary mask. Binary masks should be encoded via RLE using the MaskApi function encode().

type StuffSegmentation

type StuffSegmentation struct {
	Info        Information    `json:"info,omitempty"`
	Images      []Image        `json:"images,omitempty"`
	Annotations []SGAnnotation `json:"annotations,omitempty"`
	Licenses    []License      `json:"licenses,omitempty"`
	Categories  []ODCategories `json:"categories,omitempty"`
}

StuffSegmentation is a lot like object detection but the annotations don't include iscrowd. From cocodataset.org Documentation:

The stuff annotation format is identical and fully compatible to the object detection format above (except iscrowd is unnecessary and set to 0 by default). We provide annotations in both JSON and png format for easier access, as well as conversion scripts between the two formats. In the JSON format, each category present in an image is encoded with a single RLE annotation. The category_id represents the id of the current stuff category. For more details on stuff categories and supercategories see the stuff evaluation page. See also the stuff task.
