godropbox: github.com/dropbox/godropbox/hash2 Index | Files

package hash2

import "github.com/dropbox/godropbox/hash2"

This package implements a set of utility hash functions.

Index

Package Files

checksum.go consistent_hash.go doc.go

func ComputeMd5Checksum Uses

func ComputeMd5Checksum(data []byte) []byte

This returns the data's MD5 checksum.

WARNING: Do NOT Use MD5 in security contexts (defending against intentional manipulations of data from untrusted sources); use only for checking data integrity against machine errors.

func ConsistentHash Uses

func ConsistentHash(key uint64, numShards uint16) uint16

ConsistentHash is a space efficient permutation-based consistent hashing function. This implementation supports up to a maximum of (1 << 16 - 1), 65535, number of shards.

Implementation details:

Unlike the standard ring-based algorithm (e.g., as described in dynamo db), this algorithm relays on shard permutations to determine the key's shard mapping. The idea is as follow:

1. Assume there exist a set of shard ids, S, which contains every possible
   shard ids in the universe (in this case 0 .. 65535).
2. Now suppose, A (a subset of S), is the set of available shard ids, and we
   want to find the shard mapping for key, K
3. Use K as the pseudorandom generator's seed, and generate a random
   permutation of S using variable-base permutation encoding (see
   http://stackoverflow.com/questions/1506078/fast-permutation-number-permutation-mapping-algorithms
   for additional details)
4. Ignore all shard ids in the permutation that are not in set A
5. Finally, use the first shard id as K's shard mapping.

NOTE: Because each key generates a different permutation, the data distribution is generally more uniform than the standard algorithm (The standard algorithm works around this issue by adding more points to the ring, which unfortunately uses even more memory).

Complexity: this algorithm is O(1) in theory (because the max shard id is known), but O(n) in practice.

Example:

1. Assume S = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, and A = {0, 1, 2, 3, 4}.
2. Now suppose K = 31415 and perm(S, K) = (3, 1, 9, 4, 7, 5, 8, 2, 0, 6).
3. After ignoring S - A, the remaining ids are (3, 1, 4, 2, 0)
4. Therefore, the key belongs to shard 3.

func ValidateMd5Checksum Uses

func ValidateMd5Checksum(data []byte, sum []byte) bool

This returns true iff the data matches the provided checksum.

Package hash2 imports 2 packages (graph). Updated 2018-02-28. Refresh now. Tools for package owners.