parser

package
v1.1.0
Published: Aug 30, 2017 License: Apache-2.0 Imports: 8 Imported by: 0

Documentation

Overview

Copyright 2017 Applatix, Inc.

Constants

const ParserVersion = 1

ParserVersion is the version of this parser library, which we record in the event that billing data needs to be reingested in later versions of this software. Until we support upgrades, this value can remain at 1. When upgrade is supported, we will need to increment this value every time we make incompatible changes to the parser.
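The reingestion check described above could look roughly like the following. This is a minimal sketch, not the package's own logic: `needsReingest` is a hypothetical helper, and it assumes the parser version is stored next to the ingested data as a plain integer.

```go
package main

import "fmt"

// ParserVersion mirrors the package constant documented above.
const ParserVersion = 1

// needsReingest is a hypothetical helper (not part of the package). It
// reports whether data recorded with storedVersion was produced by an older
// parser and would therefore need to be reingested.
func needsReingest(storedVersion int) bool {
	return storedVersion < ParserVersion
}

func main() {
	// Data written by the current parser does not need reingestion.
	fmt.Println(needsReingest(ParserVersion))
	// Data written by an earlier parser version does.
	fmt.Println(needsReingest(ParserVersion - 1))
}
```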

Variables

var (
	// Fields
	ColumnUnblendedCost = Column{"lineItem/UnblendedCost", "", "", asFloatField} // 1.04  (will be zero for ReservedInstance)
	ColumnBlendedCost   = Column{"lineItem/BlendedCost", "", "", asFloatField}   // 1.04
	ColumnUsageAmount   = Column{"lineItem/UsageAmount", "", "", asFloatField}   // 20
	ColumnUnblendedRate = Column{"lineItem/UnblendedRate", "", "", asFloatField} // 0.052
	ColumnBlendedRate   = Column{"lineItem/BlendedRate", "", "", asFloatField}   // 0.052
	ColumnLineItemID    = Column{"identity/LineItemId", "", "", asStringField}   // qwhyoiu7gg3wow4eskqclrsuzfthtct4pwqnjkismbrsvqkepmxq
	ColumnResourceID    = Column{"lineItem/ResourceId", "", "", asStringField}   // i-abcd1234, vol-abcd1234, my-billing-bucket
	ColumnUsageType     = Column{"lineItem/UsageType", "", "", usageTypeParser}  // USW2-BoxUsage:t1.micro

	// Tags
	ColumnPayerAccountID = Column{"bill/PayerAccountId", "", "", asTag}                                // 012345678910
	ColumnUsageAccountID = Column{"lineItem/UsageAccountId", "accounts", "Accounts", asTag}            // 246810121416
	ColumnProductCode    = Column{"lineItem/ProductCode", "products", "Products", asTag}               // AmazonEC2, a6vjvrelz10rgvvemklxv2dow, awskms, AWSCloudTrail
	ColumnOperation      = Column{"lineItem/Operation", "operations", "Operations", asTag}             // RunInstances, Hourly, GetObject, NatGateway, Send, Unknown
	ColumnProductFamily  = Column{"product/productFamily", "productfamilies", "Product Family", asTag} // * Compute Instance, Storage, Storage Snapshot, NAT Gateway
	ColumnPricingUnit    = Column{"pricing/unit", "", "", asTag}                                       // * Hrs, Queries, Requests, GB, GB-Mo, Events, IOs, Keys, Count, ReadCapacityUnit-Hrs, WriteCapacityUnit-Hrs

	// Meta
	ColumnPricingTerm            = Column{"pricing/term", "", "", asMeta}                 // * OnDemand, Reserved (empty if lineItem/UnblendedCost is 0.0)
	ColumnBillingPeriodStartDate = Column{"bill/BillingPeriodStartDate", "", "", asMeta}  // 2016-12-01T00:00:00Z
	ColumnBillingPeriodEndDate   = Column{"bill/BillingPeriodEndDate", "", "", asMeta}    // 2017-01-01T00:00:00Z
	ColumnAvailabilityZone       = Column{"lineItem/AvailabilityZone", "", "", asMeta}    // us-east-1d
	ColumnProductLocation        = Column{"product/location", "", "", asMeta}             // US East (N. Virginia)
	ColumnDescription            = Column{"lineItem/LineItemDescription", "", "", asMeta} // m4.large Linux/UNIX Spot Instance-hour in US East (Virginia) in VPC Zone #1

	// Claudia specific DB columns
	ColumnBillingPeriod      = Column{"claudia/BillingPeriod", "", "", nil}                                    // 20161201-20170101
	ColumnBillingBucket      = Column{"claudia/BillingBucket", "", "", nil}                                    // my-billing-bucket
	ColumnBillingReportPath  = Column{"claudia/BillingReportPath", "", "", nil}                                // report/path
	ColumnService            = Column{"claudia/Service", "services", "Services", nil}                          // * AWS EC2 Instance
	ColumnS3Bucket           = Column{"claudia/S3Bucket", "s3buckets", "Buckets", nil}                         // * my-billing-bucket
	ColumnUsageFamily        = Column{"claudia/UsageFamily", "usagefamilies", "Usage Family", nil}             // * Requests-Tier1, AWS-Out-Bytes, NatGateway-Bytes
	ColumnRegion             = Column{"claudia/Region", "regions", "Regions", nil}                             // * us-east-1, us-east-2
	ColumnEC2InstancePricing = Column{"claudia/EC2Pricing", "instancepricing", "Instance Pricing", nil}        // * OnDemand, Reserved, Spot
	ColumnEC2InstanceFamily  = Column{"claudia/EC2InstanceFamily", "instancefamilies", "Instance Family", nil} // * m3
	ColumnEC2InstanceType    = Column{"claudia/EC2InstanceType", "instancetypes", "Instance Type", nil}        // * m3.large
	ColumnDataTransferSource = Column{"claudia/DataTransferSource", "txsource", "Data Transfer Source", nil}   // * External, us-west-1
	ColumnDataTransferDest   = Column{"claudia/DataTransferDest", "txdest", "Data Transfer Dest", nil}         // * External, us-west-1
)

For more details of AWS columns: http://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/detailed-billing-reports.html

var (
	ResourceTagMatcher = regexp.MustCompile("^resourceTags/(user|aws):.*")

	DataTransferFamilyMatcher = regexp.MustCompile("^" + dataTransferFamilyStr + "$")
)

Other regular expressions used during parsing.
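To illustrate what ResourceTagMatcher accepts, the sketch below applies a local copy of the same pattern to a few column names. The sample column names and the `isResourceTag` helper are illustrative assumptions. DataTransferFamilyMatcher is left out because it depends on an unexported string that is not shown here.

```go
package main

import (
	"fmt"
	"regexp"
)

// resourceTagMatcher is a local copy of the pattern shown above.
var resourceTagMatcher = regexp.MustCompile("^resourceTags/(user|aws):.*")

// isResourceTag is a hypothetical helper that reports whether a report
// column name looks like a user-defined or AWS-generated resource tag.
func isResourceTag(columnName string) bool {
	return resourceTagMatcher.MatchString(columnName)
}

func main() {
	// Sample column names chosen for illustration only.
	for _, name := range []string{
		"resourceTags/user:Name",
		"resourceTags/aws:createdBy",
		"lineItem/UsageType",
	} {
		fmt.Printf("%-28s %v\n", name, isResourceTag(name))
	}
}
```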

var RegionMapping map[string]AWSRegion

RegionMapping is a mapping of region names to AWSRegion (e.g. us-west-2 -> AWSRegion{"us-west-2", "USW2", "US West (Oregon)"})
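A lookup against a map of this shape might be used as follows. This sketch uses a local stand-in type with only the exported fields and contains just the single entry given in the description above; `lookupRegion` is a hypothetical helper, and the real map is populated by the package.

```go
package main

import "fmt"

// awsRegion is a local stand-in for the exported fields of AWSRegion.
type awsRegion struct {
	Name        string
	Code        string
	DisplayName string
}

// regionMapping holds only the example entry from the documentation.
var regionMapping = map[string]awsRegion{
	"us-west-2": {Name: "us-west-2", Code: "USW2", DisplayName: "US West (Oregon)"},
}

// lookupRegion is a hypothetical helper that returns the region for a name
// and whether the name was found.
func lookupRegion(name string) (awsRegion, bool) {
	r, ok := regionMapping[name]
	return r, ok
}

func main() {
	if r, ok := lookupRegion("us-west-2"); ok {
		fmt.Println(r.Code, r.DisplayName)
	}
	if _, ok := lookupRegion("not-a-region"); !ok {
		fmt.Println("unknown region")
	}
}
```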

Functions

func APINameToColumnName

func APINameToColumnName(apiname string) *string

APINameToColumnName returns the database column name given an API name
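Because the function returns a *string, callers should check the result before dereferencing it. The sketch below uses a hypothetical stand-in with the same signature, backed by a few API name and column name pairs copied from the variable block above. It assumes that an unknown API name yields nil, which the documentation does not state.

```go
package main

import "fmt"

// apiToColumn holds a few API name to column name pairs taken from the
// Column variables listed above.
var apiToColumn = map[string]string{
	"accounts": "lineItem/UsageAccountId",
	"products": "lineItem/ProductCode",
	"regions":  "claudia/Region",
}

// apiNameToColumnName is a hypothetical stand-in with the documented
// signature. Assumption: an unknown API name returns nil.
func apiNameToColumnName(apiname string) *string {
	name, ok := apiToColumn[apiname]
	if !ok {
		return nil
	}
	return &name
}

func main() {
	for _, api := range []string{"regions", "nope"} {
		if col := apiNameToColumnName(api); col != nil {
			fmt.Println(api, "->", *col)
		} else {
			fmt.Println(api, "-> (no column)")
		}
	}
}
```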

Types

type AWSRegion

type AWSRegion struct {
	Name        string `json:"name"`
	Code        string `json:"code"`
	DisplayName string `json:"display_name"`
	// contains filtered or unexported fields
}

AWSRegion contains the name, code (as it appears in billing reports), and display name of an AWS region (e.g. us-west-1, USW1, US West (N. California)). Some regions have an alternative display name that appears in the description. This is contained in the unexported 'alias' field.

type Column

type Column struct {
	ColumnName  string
	APIName     string
	DisplayName string
	Parser      ColumnParser
}

The Column struct represents:
* the column name of an AWS Cost & Usage report line item (e.g. lineItem/ProductCode)
* the database tag/field name (e.g. claudia/Region)
* the shorthand API name (e.g. "regions")

It also holds the parser that determines how the column's value is parsed.

func APINameToColumn

func APINameToColumn(apiname string) *Column

APINameToColumn returns the database Column given an API name

func GetColumnByName

func GetColumnByName(columnName string) *Column

GetColumnByName returns the database Column given a column name

type ColumnParser

type ColumnParser func(string, string) (*parsedValues, error)

ColumnParser returns a parsedValues structure, which consists of fields, tags, and metadata.

Fields are the units by which we want to measure. They can be numbers or strings.

Tags are string-based column names which are indexed and can be filtered/grouped. The number of tags should be limited in order to limit database cardinality, while remaining sufficient to provide the desired querying capabilities. NOTE: InfluxDB treats points with the same measurement, tag set, and timestamp as duplicates; we circumvent this by adding a nanosecond sequence number to each timestamp. https://docs.influxdata.com/influxdb/v1.1/troubleshooting/frequently-asked-questions/#how-does-influxdb-handle-duplicate-points

Metadata is a temporary holding area for values that are needed while processing line items.
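A parser with this signature might be written as in the sketch below. Several things here are assumptions: the real parsedValues type is unexported and its layout is not shown, so the struct is a hypothetical guess. The two string arguments are assumed to be the column name and the raw CSV value. `floatField` and `tagValue` are illustrative names, not the package's asFloatField and asTag.

```go
package main

import (
	"fmt"
	"strconv"
)

// parsedValues is a hypothetical layout; the real type is unexported.
type parsedValues struct {
	fields map[string]interface{}
	tags   map[string]string
	meta   map[string]string
}

// columnParser mirrors the documented ColumnParser signature. Assumption:
// the arguments are the column name and the raw value from the CSV line.
type columnParser func(string, string) (*parsedValues, error)

// floatField parses value as a float64 and records it as a field.
func floatField(columnName, value string) (*parsedValues, error) {
	f, err := strconv.ParseFloat(value, 64)
	if err != nil {
		return nil, fmt.Errorf("column %s: %w", columnName, err)
	}
	return &parsedValues{fields: map[string]interface{}{columnName: f}}, nil
}

// tagValue records value unchanged as an indexed tag.
func tagValue(columnName, value string) (*parsedValues, error) {
	return &parsedValues{tags: map[string]string{columnName: value}}, nil
}

func main() {
	parsers := map[string]columnParser{
		"lineItem/UnblendedCost": floatField,
		"lineItem/ProductCode":   tagValue,
	}
	row := map[string]string{
		"lineItem/UnblendedCost": "1.04",
		"lineItem/ProductCode":   "AmazonEC2",
	}
	for name, parse := range parsers {
		pv, err := parse(name, row[name])
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		fmt.Println(name, pv.fields, pv.tags, pv.meta)
	}
}
```

Keeping one small parser per column kind lets each Column declare how its value is interpreted without the line-level code needing to know the column's type.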

type LineItem

type LineItem struct {
	Timestamp time.Time
	Tags      map[string]string
	Fields    map[string]interface{}
}

LineItem is the parsed form of a single line in an AWS Cost & Usage billing report. It indicates which columns should be stored (indexed) as tags vs. fields when stored to the database.
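The nanosecond sequence numbers mentioned under ColumnParser apply to the Timestamp field. The sketch below shows one way two line items with the same billing hour and the same tags could be kept distinct. `withSequence`, `billingHour`, and the local `lineItem` type are hypothetical, and the package's actual sequencing scheme is not shown in this documentation.

```go
package main

import (
	"fmt"
	"time"
)

// lineItem mirrors the exported LineItem struct.
type lineItem struct {
	Timestamp time.Time
	Tags      map[string]string
	Fields    map[string]interface{}
}

// billingHour is an example usage hour, chosen for illustration.
var billingHour = time.Date(2016, time.December, 1, 0, 0, 0, 0, time.UTC)

// withSequence is a hypothetical helper that shifts ts by seq nanoseconds so
// that points sharing a tag set and hour do not collide.
func withSequence(ts time.Time, seq int) time.Time {
	return ts.Add(time.Duration(seq) * time.Nanosecond)
}

func main() {
	tags := map[string]string{"lineItem/ProductCode": "AmazonEC2"}
	var items []lineItem
	for seq, cost := range []float64{1.04, 0.52} {
		items = append(items, lineItem{
			Timestamp: withSequence(billingHour, seq),
			Tags:      tags,
			Fields:    map[string]interface{}{"lineItem/UnblendedCost": cost},
		})
	}
	for _, it := range items {
		fmt.Println(it.Timestamp.Format(time.RFC3339Nano), it.Fields)
	}
}
```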

func ParseLine

func ParseLine(columnNames []string, line []string) (*LineItem, error)

ParseLine parses a CSV line and returns it as a LineItem, ready to be written as an InfluxDB point
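The signature suggests that the report's header row is read once and passed alongside each subsequent record. The sketch below shows that calling pattern with the standard encoding/csv reader. `pairColumns` is a hypothetical stand-in for ParseLine that only pairs names with values; the real function would additionally apply each Column's parser. Returning an error on a length mismatch is also an assumption, and the sample report contents are made up for illustration.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// pairColumns is a hypothetical stand-in for ParseLine. It pairs each header
// name with the value at the same position in the record.
func pairColumns(columnNames, line []string) (map[string]string, error) {
	if len(columnNames) != len(line) {
		return nil, fmt.Errorf("expected %d values, got %d", len(columnNames), len(line))
	}
	row := make(map[string]string, len(line))
	for i, name := range columnNames {
		row[name] = line[i]
	}
	return row, nil
}

// report is a tiny illustrative billing report: one header row, one record.
const report = "lineItem/ProductCode,lineItem/UsageAmount,lineItem/UnblendedCost\n" +
	"AmazonEC2,20,1.04\n"

func main() {
	r := csv.NewReader(strings.NewReader(report))
	columnNames, err := r.Read() // the first record is the header row
	if err != nil {
		panic(err)
	}
	for {
		line, err := r.Read()
		if err == io.EOF {
			break
		}
		if err != nil {
			panic(err)
		}
		row, err := pairColumns(columnNames, line)
		if err != nil {
			panic(err)
		}
		fmt.Println(row["lineItem/ProductCode"], row["lineItem/UnblendedCost"])
	}
}
```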
