This library provides a consistent hashing function which simultaneously achieves both uniformity and consistency.
For detailed information about the concept, you should take a look at the following resources:
In this package's context, keys are distributed among partitions, and partitions are distributed among members.
When you create a new consistent instance or call Add/Remove, the average load cannot be exceeded. So if all members are already loaded to the maximum while you try to add a new member, it panics.
When you want to locate a key by calling LocateKey, the partition is computed as MOD(hash result, partition count). Since the partitions are already distributed among members, LocateKey returns the partition's owner immediately. Apart from hashing, no memory is allocated by consistent when you locate a key.
Note that the number of partitions cannot be changed after creation.
With a correctly configured Go environment:
```
go get github.com/buraksezer/consistent
```
You will find some useful usage samples in the _examples folder.
```go
type Config struct {
	// Hasher is responsible for generating an unsigned, 64-bit hash of the
	// provided byte slice.
	Hasher Hasher

	// Keys are distributed among partitions. Prime numbers are good to
	// distribute keys uniformly. Select a big PartitionCount if you have
	// too many keys.
	PartitionCount int

	// Members are replicated on the consistent hash ring. This number
	// controls how many times each member is replicated on the ring.
	ReplicationFactor int

	// Load is used to calculate average load. See the code, the paper and
	// Google's blog post to learn about it.
	Load float64
}
```
Any hash algorithm that implements the Hasher interface can be used. Please take a look at the Sample section for an example.
The LocateKey function finds a member in the cluster for your key:
```go
// With a properly configured and initialized consistent instance
key := []byte("my-key")
member := c.LocateKey(key)
```
It returns a thread-safe copy of the member you added before.
The second most frequently used function is GetClosestN:
```go
// With a properly configured and initialized consistent instance
key := []byte("my-key")
members, err := c.GetClosestN(key, 2)
```
This may be useful to find backup nodes to store your key.
On an early 2015 Macbook:
```
BenchmarkAddRemove-4     100000     22006 ns/op
BenchmarkLocateKey-4    5000000       252 ns/op
BenchmarkGetClosestN-4   500000      2974 ns/op
```
The most basic use of the consistent package looks like this. For a detailed list of functions, visit godoc.org. More sample code can be found under _examples.
```go
package main

import (
	"fmt"

	"github.com/buraksezer/consistent"
	"github.com/cespare/xxhash"
)

// In your code, you probably have a custom data type
// for your cluster members. Just add a String function to implement
// the consistent.Member interface.
type myMember string

func (m myMember) String() string {
	return string(m)
}

// The consistent package doesn't provide a default hashing function.
// You should provide a proper one to distribute keys/members uniformly.
type hasher struct{}

func (h hasher) Sum64(data []byte) uint64 {
	// You should use a proper hash function for uniformity.
	return xxhash.Sum64(data)
}

func main() {
	// Create a new consistent instance.
	cfg := consistent.Config{
		PartitionCount:    7,
		ReplicationFactor: 20,
		Load:              1.25,
		Hasher:            hasher{},
	}
	c := consistent.New(nil, cfg)

	// Add some members to the consistent hash table.
	// The Add function calculates average load and distributes partitions over members.
	node1 := myMember("node1.olric.com")
	c.Add(node1)

	node2 := myMember("node2.olric.com")
	c.Add(node2)

	key := []byte("my-key")
	// Calculates the partition ID for the given key:
	// partID := hash(key) % partitionCount
	// The partitions are already distributed among members by the Add function.
	owner := c.LocateKey(key)
	fmt.Println(owner.String())
	// Prints node2.olric.com
}
```
Another useful example is _examples/relocation_percentage.go. It creates a consistent object with 8 members and distributes partitions among them, then adds a 9th member. Here is the result with a proper configuration and hash function:
```
bloom:consistent burak$ go run _examples/relocation_percentage.go
partID: 218 moved to node2.olric from node0.olric
partID: 173 moved to node9.olric from node3.olric
partID: 225 moved to node7.olric from node0.olric
partID: 85 moved to node9.olric from node7.olric
partID: 220 moved to node5.olric from node0.olric
partID: 33 moved to node9.olric from node5.olric
partID: 254 moved to node9.olric from node4.olric
partID: 71 moved to node9.olric from node3.olric
partID: 236 moved to node9.olric from node2.olric
partID: 118 moved to node9.olric from node3.olric
partID: 233 moved to node3.olric from node0.olric
partID: 50 moved to node9.olric from node4.olric
partID: 252 moved to node9.olric from node2.olric
partID: 121 moved to node9.olric from node2.olric
partID: 259 moved to node9.olric from node4.olric
partID: 92 moved to node9.olric from node7.olric
partID: 152 moved to node9.olric from node3.olric
partID: 105 moved to node9.olric from node2.olric

6% of the partitions are relocated
```
The number of relocated partitions is highly dependent on your configuration and the quality of the hash function. You should tune the configuration to find an optimal set of parameters for your system.
_examples/load_distribution.go is also useful for understanding load distribution. It creates a consistent object with 8 members and locates 1M keys. It also calculates the average load, which cannot be exceeded by any member. Here is the result:
```
Maximum key count for a member should be around this: 147602
member: node2.olric, key count: 100362
member: node5.olric, key count: 99448
member: node0.olric, key count: 147735
member: node3.olric, key count: 103455
member: node6.olric, key count: 147069
member: node1.olric, key count: 121566
member: node4.olric, key count: 147932
member: node7.olric, key count: 132433
```
Average load can be calculated by using the following formula:
```go
load := (consistent.AverageLoad() * float64(keyCount)) / float64(config.PartitionCount)
```
Please don't hesitate to fork the project and send a pull request or just e-mail me to ask questions and share ideas.
MIT License - see the LICENSE file for more details.