Overview
ScaleCube provides a fault-tolerant, decentralized, peer-to-peer cluster membership service with no single point of failure and no single bottleneck. It does this using a gossip protocol and a scalable, efficient failure detection algorithm.
Basic Concepts
Cluster
A set of nodes (servers) joined together through the membership protocol.
Member
A member is a single node (server) that is part of your cluster. There can be multiple members on a single physical machine, each running its transport on a different port and identified by its own member id.
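For example, a minimal sketch (using only the join API shown in the Getting Started section below) that starts two members on the same machine; each member gets its own transport address, i.e. the same host but a different port:

// Start two members on the same machine
ICluster memberA = Cluster.joinAwait();
ICluster memberB = Cluster.joinAwait(memberA.address());

// Each member is reachable on its own transport port
System.out.println("Member A address: " + memberA.address());
System.out.println("Member B address: " + memberB.address());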
Transport
The Transport component is responsible for maintaining point-to-point connections between the members. It serves as the backbone for asynchronous messaging between the members of the cluster.
Cluster Membership
Distributed peer-to-peer applications require weakly-consistent knowledge of cluster membership information at all participating members. ScaleCube provides a scalable and efficient implementation of a cluster membership algorithm.
The Cluster Membership component is responsible for managing information about the existing members of the cluster. It runs a Java implementation of the SWIM protocol for distributed group membership, with a few minor adaptations. It uses a suspicion mechanism over failure detector events, and it disseminates membership updates through a separate gossip protocol component rather than piggybacking them on failure detector messages as in plain SWIM. This is done in order to reuse the gossip component for other platform events and to gain finer-grained control over the time intervals used for gossiping and for failure detection pinging. New members join the cluster via well-known server addresses provided in the configuration (seed members). The protocol also extends SWIM with periodic SYNC messages in order to improve recovery from network partitions and message losses.
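As a minimal sketch, built only from calls shown in the Getting Started section below: a new member joins through a seed member's well-known address, and the seed observes the resulting membership update disseminated by the membership protocol:

// Start a seed member
ICluster seedNode = Cluster.joinAwait();

// Observe membership updates disseminated by the membership protocol
seedNode.listenMembership()
    .subscribe(event -> System.out.println("Seed observed: " + event));

// A new member joins via the seed's well-known address
ICluster newMember = Cluster.joinAwait(seedNode.address());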
Seed Members
The seed members are a set of configured contact points for new members joining the cluster. When a new member starts, it sends a SYNC message to all seed nodes in order to synchronize membership tables, until one of them answers; if none answer, the member decides it is the first member of the cluster. Seed members are also used to restore the cluster after a network partition: cluster members continue to SYNC with them even if they are marked as failed. Any node can be configured as a seed member in order to join the cluster, but it is recommended to configure the nodes with the best expected up-time, so that the cluster can be restored from a network partition through SYNC messages with the seed members (e.g. API Gateways can serve as seed members of the cluster). The seed nodes can be started in any order, and it is not necessary to have all (or any) seed nodes running; however, when initially starting a cluster, the cluster is initialized only once at least one seed node is up, otherwise no other node can join and each node can only see itself. Note that you can only join an existing cluster member, which means that for bootstrapping some node must join itself, and the following nodes can then join it to make up a cluster.
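For illustration, a hedged sketch of joining through several configured seed addresses; it assumes that Address.from("host:port") and a joinAwait overload accepting multiple seed addresses exist in your version (check the Javadoc), and the host:port values are placeholders:

// Join the cluster via two configured seed members (placeholder addresses)
ICluster member = Cluster.joinAwait(
    Address.from("10.0.0.1:4801"),
    Address.from("10.0.0.2:4801"));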
Failure Detector
The Failure Detector component is responsible for monitoring the availability of other members in the cluster and for detecting if a node is unreachable from the rest of the cluster. The nodes in the cluster monitor each other using a random-probing failure detection algorithm with a suspicion mechanism. When any node detects another node as unreachable, that information is spread to the rest of the cluster through the gossip protocol. In other words, only one node needs to mark a node unreachable to have the rest of the cluster mark that node unreachable. A node marked as unreachable is not removed until the configured timeout expires; the failure detector keeps monitoring it and detects if the node becomes reachable again.
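A hedged sketch of observing failure detection in practice, using the membership event listener shown in the Getting Started section below; the shutdown() call is an assumption about this version's API (check the Javadoc):

ICluster seedNode = Cluster.joinAwait();
ICluster member = Cluster.joinAwait(seedNode.address());

// The surviving member observes membership events driven by the failure detector
member.listenMembership()
    .subscribe(event -> System.out.println("Membership event: " + event));

// Stop the seed; after failure detection and the configured timeout,
// the member should observe the seed being removed from the membership
seedNode.shutdown(); // assumption: shutdown() is available on ICluster in this version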
Gossip Protocol
The Gossip Protocol component is responsible for spreading information (gossips) across the cluster members using an infection-style dissemination algorithm. It provides reliable cross-cluster broadcast.
Getting Started
Maven Repository
ScaleCube Cluster is hosted on Maven Central. In order to use it, just add a dependency on the latest version to your pom.xml file:
<dependency>
  <groupId>io.scalecube</groupId>
  <artifactId>scalecube-cluster</artifactId>
  <version>0.9.1-SNAPSHOT</version>
</dependency>
Join Cluster
Set up a cluster consisting of two member nodes, A and B, both running on the same machine but on different ports. Member A is a seed node of the cluster. Member B uses Member A as a seed to join the cluster:
// Start seed node
ICluster seedNode = Cluster.joinAwait();

// Join another member B to the cluster
ICluster clusterB = Cluster.joinAwait(seedNode.address());
Spread Gossip
Member A listens for the gossips disseminated through the cluster, and Member B spreads a gossip message:
// Listen for gossips on member A
seedNode.listenGossips().subscribe(gossip ->
    System.out.println("Heard gossip: " + gossip.data()));

// Spread gossip from another member
clusterB.gossip().spread(
    Message.builder()
        .data(new Greetings("Greetings from ClusterMember B"))
        .build());
Send Messages between Cluster Members
Member A listens for Greetings messages using a filter:
// Listen for greetings messages on member A
seedNode.listen().filter(message -> {
    return message.data() instanceof Greetings;
}).subscribe(message -> {
    System.out.println(message.data());
});
Member B sends a Greetings message to all other members of the cluster:
// Send greetings message to other members
Message greetingsMessage = Message.builder()
    .data(new Greetings("Greetings from ClusterMember B"))
    .build();

clusterB.otherMembers().forEach(member ->
    clusterB.send(member, greetingsMessage));
A member gets notified when other members join or leave the cluster:
Cluster.joinAwait().listenMembership()
    .subscribe(event -> System.out.println("Alice received membership: " + event));
Bugs and Feedback
For bugs, questions and discussions please use the GitHub Issues.