rndrec

Version: v0.0.0-...-12cf073 | Published: Nov 25, 2018 | License: MIT | Imports: 9 | Imported by: 0

Repository

github.com/jung-kurt/rndrec

README ¶

rndrec

Package rndrec is used to randomly select records from a pool based on their relative weight. For example, if the relative weight of one record is 50, on average it will be selected five times more often than a record with a relative weight of 10. This is useful for generating plausible data sets for testing purposes, for example names based on frequency or regions based on population.

Example

Given a file named "continent_population.csv" with the following contents,

Africa|1,030,400,000
Antarctica|0
Asia|4,157,300,000
Australia|36,700,000
Europe|738,600,000
North America|461,114,000
South America|390,700,000

the following call will create a weighted record sample source:

var r *SrcType
var err error

r, err = NewRandomRecordSourceFromFile("continent_population.csv", 1, '|', 0)

The integer argument following the filename is the zero-based column that contains the relative weights in numeric form. Note that the commas in these values are disregarded. The rune argument following the weight column specifies the field separator. All input records are assumed to be delimited with newlines. The final argument is the seed value for the instance's random number source. This can be used to generate repeatable sequences. time.Now().Unix() can be used if repeatable sequences are not desired.
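The effect of the seed argument can be illustrated with the standard library alone. The sketch below assumes the instance's random number source behaves like a math/rand generator (the package's examples use math/rand, but its internals are not shown here); draw is a hypothetical helper, not part of rndrec.

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// draw returns n pseudo-random integers in [0, limit) from a generator
// seeded with seed. Hypothetical helper for illustration only.
func draw(seed int64, n, limit int) []int {
	rnd := rand.New(rand.NewSource(seed))
	out := make([]int, n)
	for i := range out {
		out[i] = rnd.Intn(limit)
	}
	return out
}

func main() {
	// A fixed seed yields the same sequence on every call.
	fmt.Println(draw(0, 5, 100))
	fmt.Println(draw(0, 5, 100))
	// A time-based seed yields a sequence that varies between runs.
	fmt.Println(draw(time.Now().Unix(), 5, 100))
}
```

The first two printed lines are identical; the third generally differs from run to run.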

Call r.Record() to randomly retrieve weighted records:

for row := 0; row < 8; row++ {
	for col := 0; col < 8; col++ {
		if col > 0 {
			fmt.Printf(" | ")
		}
		rec := r.Record()
		fmt.Printf("%s", rec[0])
	}
	fmt.Println("")
}

This will generate the following output:

South America | Asia | Asia | Africa | Asia | Asia | Asia | Asia
North America | Asia | Asia | North America | Europe | Asia | Asia | Asia
Europe | Africa | Europe | Europe | Asia | Asia | Asia | Asia
Asia | Asia | Asia | Asia | Asia | Asia | Africa | Asia
Asia | Asia | Asia | Asia | Asia | Asia | Asia | Africa
Asia | Africa | Asia | Asia | Europe | Africa | North America | North America
Asia | Europe | Africa | Europe | Asia | South America | Africa | Europe
Asia | Europe | Africa | Asia | Asia | Asia | Asia | Africa

Installation

To install the package on your system, run

go get github.com/jung-kurt/rndrec

License

rndrec is released under the MIT License.

Documentation ¶

Index ¶

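type SrcType
- func NewRandomRecordSource(recs [][]string, weightColPos int, seed int64) (src *SrcType, err error)
- func NewRandomRecordSourceFromFile(fileStr string, weightColPos int, fieldSep rune, seed int64) (src *SrcType, err error)
- func NewRandomRecordSourceFromReader(r io.Reader, weightColPos int, fieldSep rune, seed int64) (src *SrcType, err error)
- func (r *SrcType) Record() []string
- func (r SrcType) String() string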

Types ¶

type SrcType ¶

type SrcType struct {
	// contains filtered or unexported fields
}

SrcType is used to generate plausible random records based on a list of weighted records.

Example (EqualWeight) ¶

Demonstrate implicit equal weight of records

var list = [][]string{
	{"red"},
	{"green"},
	{"blue"},
}
report(list, -1)

Output:

blue: 0.33
green: 0.33
red: 0.33

Example (File) ¶

Demonstrate selection of records from a file

var r *SrcType
var err error

r, err = NewRandomRecordSourceFromFile("data/continent_population.csv", 1, '|', 0)
if err == nil {
	srcReport(r, 1)
} else {
	fmt.Printf("%s\n", err)
}

Output:

Africa: 0.15
Asia: 0.61
Australia: 0.01
Europe: 0.11
North America: 0.07
South America: 0.06

Example (Names) ¶

Generate dummy names based on 1990 US census data

const (
	cnLast = iota
	cnFemale
	cnMale
	cnCount
)
var filenameList = [cnCount]string{
	"data/us/name_last.csv",
	"data/us/name_first_female.csv",
	"data/us/name_first_male.csv",
}
var srcList [cnCount]*SrcType
var err error
var rnd *rand.Rand
var first, mid, last []string
var j, k int

for j = 0; j < cnCount && err == nil; j++ {
	srcList[j], err = NewRandomRecordSourceFromFile(filenameList[j], 1, '|', 0)
}
if err == nil {
	rnd = rand.New(rand.NewSource(0))
	for j = 0; j < 16; j++ {
		if rnd.Intn(5) < 4 {
			k = cnFemale
		} else {
			k = cnMale
		}
		first = srcList[k].Record()
		mid = srcList[k].Record()
		last = srcList[cnLast].Record()
		fmt.Printf("%s %s %s\n", first[0], mid[0][0:1], last[0])
	}
}
if err != nil {
	fmt.Printf("%s\n", err)
}

Output:

Kendall T Creel
Earl J Cox
Jasmin A Stein
Yolanda L Brown
Evelyn M Perkins
Sharon Y Foster
Lea S Carter
Martha A Potts
Jeannie V Ayres
Veronica B Wright
Harriet M Simmons
Janie L Colburn
Anthony P Pulliam
Teresa D Coleman
Florence C Sweeney
Sarah B Ramirez

Example (Population) ¶

Demonstrate selection of records from a structured data source

var list = [][]string{
	{"Africa", "1,030,400,000"},
	{"Antarctica", "0"},
	{"Asia", "4,157,300,000"},
	{"Australia", "36,700,000"},
	{"Europe", "738,600,000"},
	{"North America", "461,114,000"},
	{"South America", "390,700,000"},
}
report(list, 1)

Output:

Africa: 0.15
Asia: 0.61
Australia: 0.01
Europe: 0.11
North America: 0.07
South America: 0.06

Example (Readme) ¶

Simple demonstration for readme file

var r *SrcType
var err error
var rec []string

r, err = NewRandomRecordSourceFromFile("data/continent_population.csv", 1, '|', 0)
if err == nil {
	for row := 0; row < 8; row++ {
		for col := 0; col < 8; col++ {
			if col > 0 {
				fmt.Printf(" | ")
			}
			rec = r.Record()
			fmt.Printf("%s", rec[0])
		}
		fmt.Println("")
	}
} else {
	fmt.Printf("%s\n", err)
}

Output:

South America | Asia | Asia | Africa | Asia | Asia | Asia | Asia
North America | Asia | Asia | North America | Europe | Asia | Asia | Asia
Europe | Africa | Europe | Europe | Asia | Asia | Asia | Asia
Asia | Asia | Asia | Asia | Asia | Asia | Africa | Asia
Asia | Asia | Asia | Asia | Asia | Asia | Asia | Africa
Asia | Africa | Asia | Asia | Europe | Africa | North America | North America
Asia | Europe | Africa | Europe | Asia | South America | Africa | Europe
Asia | Europe | Africa | Asia | Asia | Asia | Asia | Africa

Example (Simple) ¶

Simple example of selection from weighted records.

var list = [][]string{
	{"20%", "20"},
	{"30%", "30"},
	{"10%", "10"},
	{"40%", "40"},
}
report(list, 1)

Output:

10%: 0.10
20%: 0.20
30%: 0.30
40%: 0.40

func NewRandomRecordSource ¶

func NewRandomRecordSource(recs [][]string, weightColPos int, seed int64) (src *SrcType, err error)

NewRandomRecordSource processes a list of multi-field records in which each field is a string. Unless all records are weighted equally, one column, specified by weightColPos, must hold an integer weight. Each occurrence of an underscore or comma in this column is removed and the remaining string is parsed as an integer. The values in this column are relative weights; that is, a record with twice the weight of another will be selected by Record() twice as often on average. The weights do not have to sum to any particular value. When all records are weighted equally, weightColPos can be set to -1 and the records do not need a weight column. Records returned by the Record() method depend on a local pseudo-random number generator; seed is used to seed this generator. If any value in the column specified by weightColPos cannot be parsed as an integer, or the cumulative value of weights is zero, or the number of records is zero, an error is returned. Otherwise, err is nil and src may be used to retrieve records that are distributed according to their relative weights.
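The weight parsing and error conditions described here can be sketched with the standard library. This is an illustration under stated assumptions, not the package's implementation: parseWeight and weights are hypothetical names, the error messages are invented, and the out-of-range column check is an added safeguard that the documentation does not mention.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// parseWeight removes commas and underscores from s and parses the rest
// as a base-10 integer. Hypothetical helper; not part of the rndrec API.
func parseWeight(s string) (int64, error) {
	clean := strings.NewReplacer(",", "", "_", "").Replace(s)
	return strconv.ParseInt(clean, 10, 64)
}

// weights returns one weight per record. With col == -1 every record gets
// weight 1. It reports an error for an empty list, an unparsable weight,
// or a zero cumulative weight.
func weights(recs [][]string, col int) ([]int64, error) {
	if len(recs) == 0 {
		return nil, errors.New("no records")
	}
	out := make([]int64, len(recs))
	var total int64
	for i, rec := range recs {
		w := int64(1)
		if col >= 0 {
			if col >= len(rec) { // assumption: short records are rejected
				return nil, fmt.Errorf("record %d has no column %d", i, col)
			}
			var err error
			if w, err = parseWeight(rec[col]); err != nil {
				return nil, fmt.Errorf("record %d: %w", i, err)
			}
		}
		out[i] = w
		total += w
	}
	if total == 0 {
		return nil, errors.New("cumulative weight is zero")
	}
	return out, nil
}

func main() {
	ws, err := weights([][]string{{"Africa", "1,030,400,000"}, {"Antarctica", "0"}}, 1)
	fmt.Println(ws, err)
	ws, err = weights([][]string{{"red"}, {"green"}, {"blue"}}, -1)
	fmt.Println(ws, err)
}
```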

func NewRandomRecordSourceFromFile ¶

func NewRandomRecordSourceFromFile(fileStr string, weightColPos int, fieldSep rune, seed int64) (src *SrcType, err error)

NewRandomRecordSourceFromFile processes a list of multi-field records in the form of a delimiter-separated-value file with the filename specified by fileStr. Each record must be separated by a newline. Each field is separated by the value specified by fieldSep. For more information on the return value and the other arguments, see NewRandomRecordSource().

func NewRandomRecordSourceFromReader ¶

func NewRandomRecordSourceFromReader(r io.Reader, weightColPos int, fieldSep rune, seed int64) (src *SrcType, err error)

NewRandomRecordSourceFromReader processes a list of multi-field records in the form of a delimiter-separated-value buffer that can be read with the io.Reader r. Each record must be separated by a newline. Each field is separated by the value specified by fieldSep. For more information on the return value and the other arguments, see NewRandomRecordSource().

func (*SrcType) Record ¶

func (r *SrcType) Record() []string

Record returns a random record based on its relative weight. For example, a record with a relative weight of 40 will be returned, on average, four times as often as a record with the relative weight of 10. The returned record will be in the form of a slice of strings taken directly from the original list used to initialize the SrcType instance.
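One common way to implement this kind of weighted selection is to keep running totals of the weights and binary-search them with a uniformly drawn value. The sketch below shows that technique; it is not claimed to be how rndrec works internally, and picker, newPicker, and pick are hypothetical names.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// picker holds records alongside running totals of their weights.
type picker struct {
	recs [][]string
	cum  []int64 // cum[i] is the sum of weights 0..i
	rnd  *rand.Rand
}

// newPicker builds the running totals. The caller must make sure the
// weights are non-negative and sum to more than zero.
func newPicker(recs [][]string, weights []int64, seed int64) *picker {
	p := &picker{
		recs: recs,
		cum:  make([]int64, len(weights)),
		rnd:  rand.New(rand.NewSource(seed)),
	}
	var total int64
	for i, w := range weights {
		total += w
		p.cum[i] = total
	}
	return p
}

// pick draws v uniformly from [0, total) and returns the first record whose
// running total exceeds v, so zero-weight records are never chosen.
func (p *picker) pick() []string {
	v := p.rnd.Int63n(p.cum[len(p.cum)-1])
	i := sort.Search(len(p.cum), func(i int) bool { return p.cum[i] > v })
	return p.recs[i]
}

func main() {
	p := newPicker([][]string{{"forty"}, {"ten"}}, []int64{40, 10}, 0)
	counts := map[string]int{}
	for i := 0; i < 100000; i++ {
		counts[p.pick()[0]]++
	}
	// The ratio should be close to 4, matching the 40:10 weights.
	fmt.Printf("ratio: %.2f\n", float64(counts["forty"])/float64(counts["ten"]))
}
```

Each draw costs a binary search over the running totals, and the returned slice is the original record rather than a copy, in line with the description above.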

func (SrcType) String ¶

func (r SrcType) String() string

String implements the fmt.Stringer interface.

Directories ¶

Path	Synopsis
names	This command reads various United States census files and generates files that are compatible with the rndrec package.