go-rosalind: github.com/charlesreid1/go-rosalind/rosalind

package rosalind

import "github.com/charlesreid1/go-rosalind/rosalind"

Index

Package Files

rosalind_ba1.go rosalind_ba2.go rosalind_ba3.go rosalind_datastructures.go rosalind_stronghold.go utils.go

func Binomial

func Binomial(n, k int) int

Returns the value of the binomial coefficient Binom(n, k).
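The package source for Binomial is not shown here, so the following is only a minimal sketch of one common way to compute Binom(n, k): the multiplicative formula, which keeps every intermediate value an integer. The name binomial is illustrative and is not part of the package.

```go
package main

import "fmt"

// binomial returns C(n, k) using the multiplicative formula.
// Illustrative helper; not the package's Binomial.
func binomial(n, k int) int {
	if k < 0 || k > n {
		return 0
	}
	if k > n-k {
		k = n - k // symmetry: C(n, k) == C(n, n-k)
	}
	result := 1
	for i := 1; i <= k; i++ {
		// Before this step result == C(n-k+i-1, i-1), so the product
		// below is always exactly divisible by i.
		result = result * (n - k + i) / i
	}
	return result
}

func main() {
	fmt.Println(binomial(5, 2))  // 10
	fmt.Println(binomial(10, 3)) // 120
}
```

Like the NOTE on NumberToPattern, this sketch ignores integer overflow, which can occur for large n.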

func Bitmasks2DNA

func Bitmasks2DNA(bitmasks map[string][]bool) (string, error)

Convert four bitmasks (one each for ATGC) into a DNA string.

func CheckIsDNA

func CheckIsDNA(input string) bool

Given an alleged DNA input string, iterate through it character by character to ensure that it only contains ATGC. Returns true if this is DNA (ATGC only), false otherwise.

func Complement

func Complement(input string) (string, error)

Given a DNA input string, find the complement. The complement swaps Gs and Cs, and As and Ts.

func CountHammingNeighbors

func CountHammingNeighbors(n, d, c int) (int, error)

Given an input string of DNA of length n, a maximum Hamming distance of d, and a number of codons c, determine the number of Hamming neighbors of distance less than or equal to d using a combinatorics formula.
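The formula itself is not spelled out above. A plausible reading, assuming c is the number of distinct symbols allowed at each position (4 for DNA), is to sum over i = 0..d the number of ways to pick i positions, Binom(n, i), times the number of ways to change each one, (c-1)^i. The sketch below follows that assumption; countNeighbors is a hypothetical name and may not match the package's implementation.

```go
package main

import "fmt"

// countNeighbors sums C(n, i) * (c-1)^i for i = 0..d.
// Assumption: c is the alphabet size (4 for DNA).
func countNeighbors(n, d, c int) int {
	total := 0
	choose := 1 // C(n, 0)
	power := 1  // (c-1)^0
	for i := 0; i <= d && i <= n; i++ {
		total += choose * power
		choose = choose * (n - i) / (i + 1) // becomes C(n, i+1)
		power *= c - 1
	}
	return total
}

func main() {
	fmt.Println(countNeighbors(3, 1, 4)) // 10: the string itself plus 3 positions * 3 substitutions
	fmt.Println(countNeighbors(2, 2, 4)) // 16: every 2-mer over a 4-letter alphabet
}
```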

func CountKmersMismatches

func CountKmersMismatches(input string, k, d int) (int, error)

Count the number of times a given kmer and any Hamming neighbors (distance d or less) occur in the input string.

func CountNucleotides

func CountNucleotides(dna string) (map[string]int, error)

Count the number of each type of nucleotide ACGT.

func CountNucleotidesArray

func CountNucleotidesArray(dna string) ([]int, error)

Count the number of each type of nucleotide ACGT and return as an array in order A, C, G, T.

func DNA2Bitmasks

func DNA2Bitmasks(input string) (map[string][]bool, error)

Convert a DNA string into four bitmasks: one each for ATGC. That is, for the DNA string AATCCGCT, it would become:

    bitmask[A] = 11000000
    bitmask[T] = 00100001
    bitmask[C] = 00011010
    bitmask[G] = 00000100
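To make the conversion concrete, here is a minimal sketch that reproduces the AATCCGCT example. The helpers dnaToBitmasks and maskString are hypothetical stand-ins written for illustration; they are not the package's own code.

```go
package main

import "fmt"

// dnaToBitmasks returns one []bool per nucleotide, set to true
// wherever that nucleotide occurs. Hypothetical stand-in for DNA2Bitmasks.
func dnaToBitmasks(dna string) (map[string][]bool, error) {
	masks := map[string][]bool{}
	for _, base := range []string{"A", "T", "G", "C"} {
		masks[base] = make([]bool, len(dna))
	}
	for i := 0; i < len(dna); i++ {
		mask, ok := masks[dna[i:i+1]]
		if !ok {
			return nil, fmt.Errorf("position %d: %q is not one of ATGC", i, dna[i])
		}
		mask[i] = true
	}
	return masks, nil
}

// maskString renders a mask as a string of 0s and 1s.
func maskString(mask []bool) string {
	out := make([]byte, len(mask))
	for i, set := range mask {
		if set {
			out[i] = '1'
		} else {
			out[i] = '0'
		}
	}
	return string(out)
}

func main() {
	masks, err := dnaToBitmasks("AATCCGCT")
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, base := range []string{"A", "T", "C", "G"} {
		fmt.Printf("bitmask[%s] = %s\n", base, maskString(masks[base]))
	}
}
```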

func EqualBoolSlices

func EqualBoolSlices(a, b []bool) bool

Utility function: check if two boolean arrays/array slices are equal. This is necessary because Go slices cannot be compared with ==, and an array (for example, of type [1]bool) and a slice (of type []bool) are distinct types.

func EqualIntSlices

func EqualIntSlices(a, b []int) bool

Check if two int arrays/array slices are equal.

func EqualStringSlices

func EqualStringSlices(a, b []string) bool

Utility function: check if two string arrays/array slices are equal. This is necessary because Go slices cannot be compared with ==, and an array (for example, of type [1]string) and a slice (of type []string) are distinct types.

func Factorial

func Factorial(n int) int

Compute the factorial of an integer.

func FindApproximateOccurrences

func FindApproximateOccurrences(pattern, text string, d int) ([]int, error)

Given a large string (text) and a string (pattern), find the zero-based indices where we have an occurrence of pattern or a string with Hamming distance d or less from pattern.
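The search described above can be sketched as a sliding window: compare pattern against every window of the same length in text and keep the start indices whose Hamming distance is at most d. The helpers hamming and approxOccurrences below are illustrative names, not the package's functions, and they skip the error handling the real signatures imply.

```go
package main

import "fmt"

// hamming counts the positions at which p and q differ.
// Assumes len(p) == len(q).
func hamming(p, q string) int {
	dist := 0
	for i := 0; i < len(p); i++ {
		if p[i] != q[i] {
			dist++
		}
	}
	return dist
}

// approxOccurrences returns the zero-based start indices of every
// window of text within Hamming distance d of pattern.
func approxOccurrences(pattern, text string, d int) []int {
	var hits []int
	k := len(pattern)
	for i := 0; i+k <= len(text); i++ {
		if hamming(pattern, text[i:i+k]) <= d {
			hits = append(hits, i)
		}
	}
	return hits
}

func main() {
	// Windows ACG (distance 0), TCG (distance 1) and ACT (distance 1) match.
	fmt.Println(approxOccurrences("ACG", "ACGTCGACT", 1)) // [0 3 6]
}
```

Setting d to 0 reduces this to exact matching, which is the behavior described for FindOccurrences.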

func FindClumps

func FindClumps(genome string, k, L, t int) ([]string, error)

Find k-mers (patterns) of length k occurring at least t times over an interval of length L in a genome.
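As a rough illustration of the clump search, the sketch below takes the naive route: it rebuilds a k-mer count for every window of length L and records any k-mer that reaches t occurrences inside a single window. The name findClumps and the sorted return order are assumptions made for this example; the package may use a faster incremental approach.

```go
package main

import (
	"fmt"
	"sort"
)

// findClumps returns, in sorted order, every k-mer that occurs at least
// t times within some window of length L. Naive sketch: each window is
// recounted from scratch.
func findClumps(genome string, k, L, t int) []string {
	found := map[string]bool{}
	for start := 0; start+L <= len(genome); start++ {
		window := genome[start : start+L]
		counts := map[string]int{}
		for i := 0; i+k <= len(window); i++ {
			kmer := window[i : i+k]
			counts[kmer]++
			if counts[kmer] >= t {
				found[kmer] = true
			}
		}
	}
	result := make([]string, 0, len(found))
	for kmer := range found {
		result = append(result, kmer)
	}
	sort.Strings(result)
	return result
}

func main() {
	// ACG appears twice in the window ACGACG, TTT twice in CGTTTT.
	fmt.Println(findClumps("ACGACGTTTT", 3, 6, 2)) // [ACG TTT]
}
```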

func FindMotifs

func FindMotifs(dna []string, k, d int) ([]string, error)

Given a collection of strings dna and integers k and d, find all (k,d)-motifs. A k-mer is a (k,d)-motif if it appears in every string in dna with at most d mismatches.

func FindOccurrences

func FindOccurrences(pattern, genome string) ([]int, error)

Given a large string (genome) and a string (pattern), find the zero-based indices where pattern occurs in genome.

func FrequencyArray

func FrequencyArray(input string, k int) ([]int, error)

Generate and return the frequency array for an input string for all kmers of a given length k.

To do this, we assemble the kmer histogram map, then convert that into the frequency array.

func HammingDistance

func HammingDistance(p, q string) (int, error)

Compute the Hamming distance between two strings. The Hamming distance is defined as the number of positions at which the corresponding characters of the two strings differ.

func KeySetIntersection

func KeySetIntersection(input []map[string]int) ([]string, error)

Find the intersection of the key sets for a slice of string to integer maps.

func KmerComposition

func KmerComposition(input string, k int) ([]string, error)

Given an input DNA string, generate a set of all k-mers of length k in the input string.

func KmerHistogram

func KmerHistogram(input string, k int) (map[string]int, error)

Return the histogram of kmers of length k found in the given input string.

func KmerHistogramMismatches

func KmerHistogramMismatches(input string, k, d int) (map[string]int, error)

Return the histogram of all kmers of length k that are in the input, or whose Hamming neighbors within distance d are in the input.

func MedianString

func MedianString(dna []string, k int) ([]string, error)

func MinKmerDistance

func MinKmerDistance(pattern, text string) (int, error)

Given a k-mer pattern and a longer string text, find the minimum distance from k-mer pattern to any possible k-mer in text.

func MinKmerDistances

func MinKmerDistances(pattern string, inputs []string) (int, error)

Given a k-mer pattern and a set of strings, find the sum (L1 norm) of the shortest distances from k-mer pattern to each input string.

func MinSkewPositions

func MinSkewPositions(genome string) ([]int, error)

The skew of a genome is the difference between the number of G and C nucleotides that have occurred cumulatively in a given strand of DNA. This function computes the positions in the genome at which the cumulative skew is minimized.
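A single pass is enough to find the minimum-skew positions: track the running count of G minus C and remember where it is lowest. In the sketch below, position i means "after the first i characters", so position 0 has skew 0; that indexing convention and the name minSkewPositions are assumptions for illustration and may differ from the package.

```go
package main

import "fmt"

// minSkewPositions returns every prefix length at which the running
// skew (#G - #C) reaches its minimum. Position 0 is the empty prefix.
func minSkewPositions(genome string) []int {
	skew, minSkew := 0, 0
	positions := []int{0}
	for i := 0; i < len(genome); i++ {
		switch genome[i] {
		case 'G':
			skew++
		case 'C':
			skew--
		}
		switch {
		case skew < minSkew:
			minSkew = skew
			positions = []int{i + 1}
		case skew == minSkew:
			positions = append(positions, i+1)
		}
	}
	return positions
}

func main() {
	// Running skew for TCGCAG: 0 0 -1 0 -1 -1 0
	fmt.Println(minSkewPositions("TCGCAG")) // [2 4 5]
}
```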

func MoreFrequentThanNKmers

func MoreFrequentThanNKmers(input string, k, N int) ([]string, error)

Find the kmer(s) in the kmer histogram whose count exceeds N, and return them as a slice of strings.

func MostFrequentKmers

func MostFrequentKmers(input string, k int) ([]string, error)

Find the most frequent kmer(s) in the kmer histogram, and return them as a slice of strings.
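The two steps, building the histogram and then picking its maxima, can be sketched as follows. The helper names kmerHistogram and mostFrequent are illustrative and assume k is at least 1; sorting the result is a choice made here so the output is deterministic, since Go map iteration order is not.

```go
package main

import (
	"fmt"
	"sort"
)

// kmerHistogram counts every k-mer (overlapping windows) in input.
func kmerHistogram(input string, k int) map[string]int {
	hist := map[string]int{}
	for i := 0; i+k <= len(input); i++ {
		hist[input[i:i+k]]++
	}
	return hist
}

// mostFrequent returns the k-mers with the highest count, sorted.
func mostFrequent(hist map[string]int) []string {
	best := 0
	var kmers []string
	for kmer, count := range hist {
		switch {
		case count > best:
			best = count
			kmers = []string{kmer}
		case count == best:
			kmers = append(kmers, kmer)
		}
	}
	sort.Strings(kmers)
	return kmers
}

func main() {
	hist := kmerHistogram("ACGTTGCATGTCGCATGATGCATGAGAGCT", 4)
	// CATG and GCAT each occur three times.
	fmt.Println(mostFrequent(hist)) // [CATG GCAT]
}
```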

func MostFrequentKmersMismatches

func MostFrequentKmersMismatches(input string, k, d int) ([]string, error)

Find the most frequent kmer(s) of length k in the given input string. Include mismatches of Hamming distance <= d.

func MostFrequentKmersMismatchesRevComp

func MostFrequentKmersMismatchesRevComp(input string, k, d int) ([]string, error)

Find the most frequent kmer(s) of length k in the given input string and its reverse complement. Include mismatches of Hamming distance <= d.

func NumberToPattern

func NumberToPattern(n, k int) (string, error)

NumberToPattern converts an integer n and a kmer length k into the corresponding kmer string.

NOTE: We should be a little more careful about integer overflow, as that can easily happen for large k.

func PatternCount

func PatternCount(input string, pattern string) int

Count occurrences of a substring pattern in a string input.

func PatternToNumber

func PatternToNumber(input string) (int, error)

PatternToNumber transforms a kmer of a given length into a corresponding integer indicating its lexicographic ordering among all kmers of length k.

    A = 0
    C = 1
    G = 2
    T = 3

Example for k = 3:

    C   G   T
    |   |   |
    |   |   T - - > 3 * 4^{k-3}
    |   G - - - > 2 * 4^{k-2}
    C - - - - > 1 * 4^{k-1}

This amounts to reading the kmer as a base-4 number (DNA) and converting it to an ordinary base-10 integer.
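The base-4 view can be sketched in a few lines. The helpers patternToNumber and numberToPattern below are illustrative, not the package's code; numberToPattern assumes n is non-negative and, as the NOTE on NumberToPattern warns, neither guards against overflow for large k.

```go
package main

import (
	"fmt"
	"strings"
)

// bases lists the nucleotides in digit order: A=0, C=1, G=2, T=3.
const bases = "ACGT"

// patternToNumber reads a kmer as a base-4 number, most significant digit first.
func patternToNumber(pattern string) (int, error) {
	n := 0
	for i := 0; i < len(pattern); i++ {
		digit := strings.IndexByte(bases, pattern[i])
		if digit < 0 {
			return 0, fmt.Errorf("invalid nucleotide %q", pattern[i])
		}
		n = n*4 + digit
	}
	return n, nil
}

// numberToPattern writes n as a k-digit base-4 number using ACGT digits.
func numberToPattern(n, k int) string {
	buf := make([]byte, k)
	for i := k - 1; i >= 0; i-- {
		buf[i] = bases[n%4]
		n /= 4
	}
	return string(buf)
}

func main() {
	n, _ := patternToNumber("CGT")
	fmt.Println(n)                     // 27 = 1*16 + 2*4 + 3
	fmt.Println(numberToPattern(n, 3)) // CGT
}
```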

func ReadLines

func ReadLines(path string) ([]string, error)

ReadLines reads a whole file into memory and returns a slice of its lines.

func ReconstructGenomeFromPath

func ReconstructGenomeFromPath(contigs []string) (string, error)

Given a genome path, i.e., a set of k-mers that overlap by some unknown number (up to k-1) of characters each, assemble the paths into a single string containing the genome.
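One way to handle the unknown overlap is to try the longest possible overlap first and shrink it until the end of the assembled string matches the start of the next k-mer. The sketch below does exactly that; this greedy longest-overlap strategy and the name reconstruct are assumptions for illustration, and the package may resolve overlaps differently.

```go
package main

import (
	"fmt"
	"strings"
)

// reconstruct joins consecutive contigs, merging each one with the
// longest overlap (at most len(contig)-1) it shares with the genome so far.
func reconstruct(contigs []string) string {
	if len(contigs) == 0 {
		return ""
	}
	genome := contigs[0]
	for _, contig := range contigs[1:] {
		if contig == "" {
			continue
		}
		overlap := len(contig) - 1
		if overlap > len(genome) {
			overlap = len(genome)
		}
		// Shrink the overlap until the genome's suffix matches the contig's prefix.
		for overlap > 0 && !strings.HasSuffix(genome, contig[:overlap]) {
			overlap--
		}
		genome += contig[overlap:]
	}
	return genome
}

func main() {
	path := []string{"ACCGA", "CCGAA", "CGAAG", "GAAGC", "AAGCT"}
	fmt.Println(reconstruct(path)) // ACCGAAGCT
}
```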

func ReverseComplement

func ReverseComplement(input string) (string, error)

Given a DNA input string, find the reverse complement. The complement swaps Gs and Cs, and As and Ts. The reverse complement is the complement read in reverse order.
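Both steps can be done in one pass by writing each complemented base into the mirrored position of the output. The function reverseComplement below is an illustrative sketch, not the package's implementation.

```go
package main

import "fmt"

// reverseComplement complements each base and writes it to the
// mirrored index, producing the reverse complement in one pass.
func reverseComplement(dna string) (string, error) {
	out := make([]byte, len(dna))
	for i := 0; i < len(dna); i++ {
		var c byte
		switch dna[i] {
		case 'A':
			c = 'T'
		case 'T':
			c = 'A'
		case 'G':
			c = 'C'
		case 'C':
			c = 'G'
		default:
			return "", fmt.Errorf("position %d: %q is not one of ATGC", i, dna[i])
		}
		out[len(dna)-1-i] = c
	}
	return string(out), nil
}

func main() {
	rc, err := reverseComplement("AAAACCCGGT")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(rc) // ACCGGGTTTT
}
```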

func ReverseString

func ReverseString(s string) string

ReverseString returns its argument string reversed rune-wise left to right. See https://github.com/golang/example/blob/master/stringutil/reverse.go

func VisitHammingNeighbors

func VisitHammingNeighbors(input string, d int) ([]string, error)

Given an input string of DNA, generate variations of said string that are a Hamming distance of less than or equal to d.
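One standard way to enumerate such neighbors is recursion on the first character: either keep it and spend the whole budget d on the rest, or substitute it and spend d-1 on the rest. The sketch below uses that approach with hypothetical names (visit, hammingNeighbors); the package's own traversal may be organized differently.

```go
package main

import (
	"fmt"
	"sort"
)

// visit returns every string within Hamming distance d of pattern.
// The two branches produce disjoint sets, so no duplicates arise.
func visit(pattern string, d int) []string {
	if d == 0 || len(pattern) == 0 {
		return []string{pattern}
	}
	first, rest := pattern[0], pattern[1:]
	var out []string
	// Keep the first character: the suffix may still differ in up to d places.
	for _, suffix := range visit(rest, d) {
		out = append(out, pattern[:1]+suffix)
	}
	// Change the first character: the suffix may differ in up to d-1 places.
	for _, suffix := range visit(rest, d-1) {
		for _, base := range []byte("ACGT") {
			if base != first {
				out = append(out, string(base)+suffix)
			}
		}
	}
	return out
}

// hammingNeighbors returns the neighbors in sorted order.
func hammingNeighbors(pattern string, d int) []string {
	out := visit(pattern, d)
	sort.Strings(out)
	return out
}

func main() {
	fmt.Println(len(hammingNeighbors("ACG", 1))) // 10
	fmt.Println(hammingNeighbors("AC", 1))       // [AA AC AG AT CC GC TC]
}
```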

func WriteLines

func WriteLines(lines []string, path string) error

WriteLines writes the lines to the given file.

type DirGraph

type DirGraph struct {
    // contains filtered or unexported fields
}

Directed graph type

func OverlapGraph

func OverlapGraph(patterns []string) (DirGraph, error)

Given a set of k-mers, construct an overlap graph where each k-mer is represented by a node, and each directed edge represents a pair of k-mers such that the suffix (k-1 chars) of the k-mer at the source of the edge overlaps with the prefix (k-1 chars) of the k-mer at the head of the edge.
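Because DirGraph and Node keep their fields unexported, the sketch below does not use them. It only illustrates the edge rule with plain strings: an edge runs from src to dst when src without its first character equals dst without its last. The name overlapEdges, the "src -> dst" output format, and the decision to skip edges from a k-mer to itself are all assumptions for this example.

```go
package main

import (
	"fmt"
	"sort"
)

// overlapEdges returns a sorted list of "src -> dst" edges where the
// (k-1)-suffix of src equals the (k-1)-prefix of dst.
func overlapEdges(patterns []string) []string {
	var edges []string
	for i, src := range patterns {
		for j, dst := range patterns {
			// Assumption: a k-mer is not linked to itself.
			if i == j || len(src) == 0 || len(dst) == 0 {
				continue
			}
			if src[1:] == dst[:len(dst)-1] {
				edges = append(edges, src+" -> "+dst)
			}
		}
	}
	sort.Strings(edges)
	return edges
}

func main() {
	kmers := []string{"ATGCG", "GCATG", "CATGC", "AGGCA", "GGCAT"}
	for _, edge := range overlapEdges(kmers) {
		fmt.Println(edge)
	}
}
```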

func (*DirGraph) AddEdge

func (g *DirGraph) AddEdge(n1, n2 *Node)

Add a directed edge

func (*DirGraph) AddNode

func (g *DirGraph) AddNode(n *Node)

Add a node to the directed graph

func (*DirGraph) EdgeCount

func (g *DirGraph) EdgeCount() int

Get a total count of edges in the graph

func (*DirGraph) GetNode

func (g *DirGraph) GetNode(label string) *Node

Get a node, given a label

func (*DirGraph) String

func (g *DirGraph) String() string

Return a sorted edge list representation of the graph

type Node

type Node struct {
    // contains filtered or unexported fields
}

Graph node

func (*Node) String

func (n *Node) String() string

Convert a node to a string

Package rosalind imports 8 packages and is imported by 4 packages. Updated 2019-01-17.