`import "github.com/dropbox/godropbox/hash2"`

This package implements a set of utility hash functions.

- func ComputeMd5Checksum(data []byte) []byte
- func ConsistentHash(key uint64, numShards uint16) uint16
- func ValidateMd5Checksum(data []byte, sum []byte) bool

checksum.go consistent_hash.go doc.go

This returns the data's MD5 checksum.

WARNING: Do NOT Use MD5 in security contexts (defending against intentional manipulations of data from untrusted sources); use only for checking data integrity against machine errors.

ConsistentHash is a space efficient permutation-based consistent hashing function. This implementation supports up to a maximum of (1 << 16 - 1), 65535, number of shards.

Implementation details:

Unlike the standard ring-based algorithm (e.g., as described in dynamo db), this algorithm relays on shard permutations to determine the key's shard mapping. The idea is as follow:

1. Assume there exist a set of shard ids, S, which contains every possible shard ids in the universe (in this case 0 .. 65535). 2. Now suppose, A (a subset of S), is the set of available shard ids, and we want to find the shard mapping for key, K 3. Use K as the pseudorandom generator's seed, and generate a random permutation of S using variable-base permutation encoding (see http://stackoverflow.com/questions/1506078/fast-permutation-number-permutation-mapping-algorithms for additional details) 4. Ignore all shard ids in the permutation that are not in set A 5. Finally, use the first shard id as K's shard mapping.

NOTE: Because each key generates a different permutation, the data distribution is generally more uniform than the standard algorithm (The standard algorithm works around this issue by adding more points to the ring, which unfortunately uses even more memory).

Complexity: this algorithm is O(1) in theory (because the max shard id is known), but O(n) in practice.

Example:

1. Assume S = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, and A = {0, 1, 2, 3, 4}. 2. Now suppose K = 31415 and perm(S, K) = (3, 1, 9, 4, 7, 5, 8, 2, 0, 6). 3. After ignoring S - A, the remaining ids are (3, 1, 4, 2, 0) 4. Therefore, the key belongs to shard 3.

This returns true iff the data matches the provided checksum.

Package hash2 imports 2 packages (graph). Updated 2018-02-28. Refresh now. Tools for package owners.