cluster

package module
v1.0.1 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 27, 2019 License: MIT Imports: 3 Imported by: 6

README

Point Clustering

Build Status codecov GoDoc FOSSA Status

(Lat, lon) points fast clustering using DBScan algorithm in Go.

Given set of geo points, this library can find clusters according to specified params. There are several optimizations applied:

  • distance calculation is using "fast" implementations of sine/cosine, with sqrt being removed
  • to find points within eps distance k-d tree is being used
  • edge case handling of identical points being present in the set

Usage

Build list of points:

    points := cluster.PointList{{30.258387, 59.951557}, {30.434124, 60.029499}, ...}

Pick settings for DBScan algorithm:

  • eps is clustering radius (in kilometers)
  • minPoints is number of points in eps-radius of base point to consider it being part of the cluster

eps and minPoints together define minimum density of the cluster.

Run DBScan:

    clusters, noise := cluster.DBScan(points, 0.8, 10) // eps is 800m, 10 points minimum in eps-neighborhood

DBScan function returns list of clusters (each Cluster being reference to the list of source points) and list of point indexes which don't fit into any cluster (noise).

License

FOSSA Status

Documentation

Overview

Package cluster implements DBScan clustering on (lat, lon) using K-D Tree

Index

Constants

View Source
const (
	// DegreeRad is coefficient to translate from degrees to radians
	DegreeRad = math.Pi / 180.0
	// EarthR is earth radius in km
	EarthR = 6371.0
)

Variables

This section is empty.

Functions

func DistanceSpherical

func DistanceSpherical(p1, p2 *Point) float64

DistanceSpherical is a spherical (optimized) distance between two points

Result is distance in kilometers

func DistanceSphericalFast

func DistanceSphericalFast(p1, p2 *Point) float64

DistanceSphericalFast calculates spherical distance with fast cosine without sqrt and normalization to Earth radius/radians

To get real distance in km, take sqrt and multiply result by EarthR*DegreeRad

In this library eps (distance) is adjusted so that we don't need to do sqrt and multiplication

func FastCos

func FastCos(x float64) float64

FastCos calculates cosinus from sinus

func FastSine

func FastSine(x float64) float64

FastSine caclulates sinus approximated to parabola

Taken from: http://forum.devmaster.net/t/fast-and-accurate-sine-cosine/9648

func Inside

func Inside(innerMin, innerMax, outerMin, outerMax *Point) bool

Inside checks if (innerMin, innerMax) rectangle is inside (outerMin, outMax) rectangle

func RegionQuery

func RegionQuery(points PointList, P *Point, eps float64) []int

RegionQuery is simple way O(N) to find points in neighbourhood

It is roughly equivalent to kdTree.InRange(points[i], eps, nil)

Types

type Cluster

type Cluster struct {
	C      int
	Points []int
}

Cluster is a result of DBScan work

func DBScan

func DBScan(points PointList, eps float64, minPoints int) (clusters []Cluster, noise []int)

DBScan clusters incoming points into clusters with params (eps, minPoints)

eps is clustering radius in km minPoints in minimum number of points in eps-neighbourhood (density)

func (*Cluster) CentroidAndBounds

func (c *Cluster) CentroidAndBounds(points PointList) (center, min, max Point)

CentroidAndBounds calculates center and cluster bounds

type EpsFunction

type EpsFunction func(pt Point) float64

EpsFunction is a function that returns eps based on point pt

type KDTree

type KDTree struct {
	Points PointList
	Root   *T
}

KDTree is implementation of K-D Tree, with Points separated from nodes.

Nodes (T) hold only indices into Points slice

func NewKDTree

func NewKDTree(points PointList) *KDTree

NewKDTree returns a new K-D tree built using the given nodes.

func (*KDTree) Height

func (tree *KDTree) Height() int

Height returns the height of the K-D tree.

func (*KDTree) InRange

func (tree *KDTree) InRange(pt Point, dist float64, nodes []int) []int

InRange appends all nodes in the K-D tree that are within a given distance from the given point to the given slice, which may be nil. To avoid allocation, the slice can be pre-allocated with a larger capacity and re-used across multiple calls to InRange.

func (*KDTree) Insert

func (tree *KDTree) Insert(point Point)

Insert returns a new K-D tree with the given node inserted. Inserting a node that is already a member of a K-D tree invalidates that tree.

type Point

type Point [2]float64

Point is longitue, latittude

func (*Point) GreaterEq

func (a *Point) GreaterEq(b *Point) bool

GreaterEq - a >= b

func (*Point) LessEq

func (a *Point) LessEq(b *Point) bool

LessEq - a <= b

type PointList

type PointList []Point

PointList is a slice of Points

type T

type T struct {
	// Point is the K-dimensional point associated with the
	// data of this node.
	PointID  int
	EqualIDs []int
	// contains filtered or unexported fields
}

A T is a the node of a K-D tree. A *T is the root of a K-D tree, and nil is an empty K-D tree.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL