Overview
GMS stands for GroupMembershipService, the core concept of the Virtual Synchrony model developed
by Ken Birman. Under this model, members sharing a common interface can become part of a group.
Within the group, any member can send a message to all of the group members (multicast) with total
reliability: either every group member receives the message, or none does. In addition, every member
observes the same sequence of messages (same order); as a result, as long as they share a common
initial state, they evolve to the same final state.
This model is also at the core of the Fault Tolerance service specified for CORBA. There are
quite a few implementations of this model, some of them quite mature and exhibiting good
performance. The links section points to some of these implementations.
Why, then, develop yet another implementation? The single most important reason was to be
able to support both CORBA and JavaRMI without any kind of wrapper. We started with Ensemble,
which is itself written in OCaml and offers C, C++ and Java interfaces (the C++ and Java ones built on
top of the C interface). But the communications, which were done in CORBA, had to be marshalled
into C structs, sent through the Ensemble channels, and then demarshalled again into CORBA calls,
bypassing the ORB completely. We found it more interesting to implement our own system, built directly
on top of the ORB and making use of the CORBA facilities, than to implement and test all the
wrapping mechanisms (which would very probably have required the same amount of time).
What does SenseiGMS offer? A CORBA interface and a JavaRMI interface, both sharing the same functionality,
which an application can use directly. Using SenseiGMS, an application can be made fault tolerant
by running several instances on different hosts. Each time an instance is about to change its state
due to a client interaction, it must send a multicast message to the group of instances: every
instance therefore observes the same sequence of changes and evolves to the same final
state. If any of the instances fails, any of the other replicas can still serve client requests,
achieving the required fault tolerance.
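The replication argument can be illustrated in plain Java, without any group communication at all. The sketch below is only an illustration: the names ReplicaSketch, apply and deliverToAll are hypothetical and are not part of SenseiGMS. A simple loop stands in for the totally ordered multicast, and the state change is deliberately order-sensitive to show why the common order matters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: not part of the SenseiGMS API.
class ReplicaSketch {
    private long state;

    ReplicaSketch(long initialState) {
        this.state = initialState;
    }

    // One state change: "multiply by 3, then add delta".
    // The result depends on the order in which changes are applied.
    void apply(int delta) {
        state = state * 3 + delta;
    }

    long state() {
        return state;
    }

    // Delivers the same ordered sequence of changes to every replica,
    // standing in for what a totally ordered multicast would guarantee.
    static List<Long> deliverToAll(int replicas, long initialState, int[] changes) {
        List<ReplicaSketch> group = new ArrayList<>();
        for (int i = 0; i < replicas; i++) {
            group.add(new ReplicaSketch(initialState));
        }
        for (int change : changes) {
            for (ReplicaSketch replica : group) {
                replica.apply(change);
            }
        }
        List<Long> finalStates = new ArrayList<>();
        for (ReplicaSketch replica : group) {
            finalStates.add(replica.state());
        }
        return finalStates;
    }

    public static void main(String[] args) {
        // Same initial state, same changes, same order: prints [84, 84, 84]
        System.out.println(deliverToAll(3, 1, new int[] {4, 5, 6}));
        // A replica that saw the changes in another order would diverge: prints [100]
        System.out.println(deliverToAll(1, 1, new int[] {6, 5, 4}));
    }
}
```

Because every replica starts from the same state and applies the same changes in the same order, any of them can take over if another one fails.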
The SenseiGMS interface is shown below; it supports:
- Sending reliable multicast messages to the group of replicas. It is guaranteed that all the
replicas observe the same messages in the same order.
- Sending point-to-point messages to another replica in the group. The order with respect to the
multicast messages is also guaranteed.
- Delivering events to the group members; these events are the group messages, but also the
changes in the group composition.
Requirements
There are three main requirements on Sensei's design:
- To support only basic functionality, to facilitate its implementation. The objective of this project
is not the development of another group communication system, but the implementation of our own theory
on state transfer protocols and the development of a framework and a set of tools to simplify the
implementation of replicated applications.
- To limit its functionality, which must be strictly compatible with the virtual synchrony model.
This requirement is imposed so that this system can be replaced with any of the existing group communication systems.
- To present an object-oriented interface, supporting typed messages, to facilitate the development
of the other layers of the project, built on top of SenseiGMS.
Algorithms
SenseiGMS achieves reliable group communication and total ordering by means of a logical ring, formed
by the group replicas, around which a token is passed from replica to replica. Only the replica holding the
token is able to communicate with the group; the exception to this rule is when a replica suspects that
the token is lost, and must therefore initiate talks with the other members in order to recreate the token.
This algorithm is quite simple; its only difficulty is ensuring that the token is neither lost nor duplicated.
The algorithm has been implemented carefully to obtain good performance in group
communications.
The idea of the token and the ring is not new, as it has already been used in other group communication systems.
The main difference of our implementation is that it is built on top of point-to-point communications
instead of unreliable multicast (UDP).
A complete description of the algorithm can be found in chapter 8 of my thesis
(only in Spanish).
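The way a circulating token yields a total order can be sketched with a small, single-threaded simulation. This is a simplified, failure-free illustration and not the SenseiGMS implementation: the names TokenRingSketch, Member, run and demo are hypothetical, the rule of one message per token visit is an assumption made for brevity, and token loss and recovery are not modelled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Simplified, failure-free sketch of token-based total ordering.
// Hypothetical names; this is not the SenseiGMS implementation.
class TokenRingSketch {
    static class Member {
        final int id;
        final Deque<String> pending = new ArrayDeque<>();  // messages waiting for the token
        final List<String> delivered = new ArrayList<>();  // messages in delivery order

        Member(int id) {
            this.id = id;
        }
    }

    // Circulates the token until a full lap finds nothing left to send.
    // Only the token holder sends, so every member delivers in the same order.
    static List<Member> run(List<Member> ring) {
        int holder = 0;
        int idleHops = 0;  // consecutive token holders with nothing to send
        while (idleHops < ring.size()) {
            Member sender = ring.get(holder);
            if (sender.pending.isEmpty()) {
                idleHops++;
            } else {
                idleHops = 0;
                // Assumption: one message per token visit.
                String msg = sender.id + ":" + sender.pending.poll();
                // Deliver to each member in turn, as point-to-point sends would.
                for (Member m : ring) {
                    m.delivered.add(msg);
                }
            }
            holder = (holder + 1) % ring.size();  // pass the token to the next member
        }
        return ring;
    }

    static List<Member> demo() {
        List<Member> ring = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            ring.add(new Member(i));
        }
        ring.get(0).pending.add("a");
        ring.get(2).pending.add("b");
        ring.get(2).pending.add("c");
        ring.get(1).pending.add("d");
        return run(ring);
    }

    public static void main(String[] args) {
        // Every member prints the same list: [0:a, 1:d, 2:b, 2:c]
        for (Member m : demo()) {
            System.out.println("member " + m.id + " delivered " + m.delivered);
        }
    }
}
```

The order in which messages were queued does not matter; the order is fixed by the path of the token, which is the same for everybody.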
Public interface
There are five important types in the interface. The three basic ones are:
- Member identities: each member has a unique identity inside the group, defined with the type GroupMemberId.
- Messages: the unit of communication between members. It is defined with the type Message; the
application is usually responsible for defining its own specific messages by inheriting from this type.
- Views: a view is the composition of the group at a given moment. Basically, it includes
the list of members (of member identities) and an id. Whenever the group's composition changes,
every member receives an event with the new view, and it is guaranteed that every member receives
the same sequence of views.
In OMG/IDL, these types are:
typedef long GroupMemberId;
typedef sequence <GroupMemberId> GroupMemberIdList;
struct View
{
    long viewId;
    GroupMemberIdList members;
    GroupMemberIdList newMembers;
    GroupMemberIdList expulsedMembers;
};
valuetype Message {};
In JavaRMI, GroupMemberId is defined as an int, and the other two types are:
public final class View implements java.io.Serializable
{
    public int viewId;
    public int[] members;
    public int[] newMembers;
    public int[] expulsedMembers;
}
public abstract class Message implements java.io.Serializable { }
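As an illustration of how an application might define its own message by inheriting from Message, the sketch below declares a hypothetical AddMessage and passes it through Java serialization, which is roughly what happens when a message crosses an RMI call. The Message class is reproduced from the listing above (without the public modifier, so that everything fits in one file); AddMessage, MessageSketch and roundTrip are assumptions made for this example only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical helper, not part of SenseiGMS.
class MessageSketch {
    // Serializes a message to bytes and reads it back.
    static Message roundTrip(Message msg) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(msg);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Message) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Message received = roundTrip(new AddMessage(5));
        // The receiver recovers the concrete type, so messages stay typed.
        if (received instanceof AddMessage) {
            System.out.println("amount = " + ((AddMessage) received).amount);  // amount = 5
        }
    }
}

// Reproduced from the listing above so that the sketch compiles on its own.
abstract class Message implements Serializable { }

// Hypothetical application-specific message.
class AddMessage extends Message {
    private static final long serialVersionUID = 1L;
    final int amount;

    AddMessage(int amount) {
        this.amount = amount;
    }
}
```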
An application must create an object that inherits from GroupMember to be able to join
a group. This interface is used by SenseiGMS to send events to the application, and is defined in IDL as:
interface GroupMember
{
    void processPTPMessage(in GroupMemberId sender, in Message msg);
    void processCastMessage(in GroupMemberId sender, in Message msg);
    void memberAccepted(in GroupMemberId identity, in GroupHandler handler, in View theView);
    void changingView();
    void installView(in View theView);
    void excludedFromGroup();
};
When a member requests to join a group, it eventually receives a memberAccepted event,
which contains its member identity, the group's composition, and an object of type
GroupHandler, the member's gateway for communicating with the group:
interface GroupHandler
{
    boolean castMessage(in Message msg);
    boolean sendMessage(in GroupMemberId target, in Message msg);
    boolean leaveGroup();
    GroupMemberId getGroupMemberId();
    boolean isValidGroup();
};
In JavaRMI, these interfaces are completely equivalent:
import java.rmi.RemoteException;

public interface GroupHandler extends java.rmi.Remote
{
    public boolean castMessage(Message message)
        throws RemoteException;
    public boolean sendMessage(int target, Message message)
        throws RemoteException;
    public boolean leaveGroup()
        throws RemoteException;
    public int getGroupMemberId()
        throws RemoteException;
    public boolean isValidGroup()
        throws RemoteException;
}
public interface GroupMember extends java.rmi.Remote
{
    public void processPTPMessage(int sender, Message message)
        throws RemoteException;
    public void processCastMessage(int sender, Message message)
        throws RemoteException;
    public void memberAccepted(int identity, GroupHandler handler, View view)
        throws RemoteException;
    public void changingView()
        throws RemoteException;
    public void installView(View view)
        throws RemoteException;
    public void excludedFromGroup()
        throws RemoteException;
}
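How an application might implement GroupMember and use its GroupHandler can be sketched as follows. The interface and type declarations are reproduced from the listings above (without the public modifier, so that one file suffices). Everything else is an assumption made for illustration: AddMessage and CounterMember are a hypothetical application, and LoopbackHandler is an in-process stand-in for the handler that SenseiGMS would supply, modelling a one-member group. The boolean results are assumed to indicate whether the request was accepted; the source does not specify their meaning.

```java
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;

// Entry point of the sketch; every name not in the listings above is hypothetical.
class GroupMemberSketch {
    static int demo() {
        try {
            CounterMember member = new CounterMember();
            View view = new View();
            view.viewId = 1;
            view.members = new int[] {0};
            member.memberAccepted(0, new LoopbackHandler(member), view);
            member.add(2);
            member.add(3);
            return member.total();
        } catch (RemoteException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("total = " + demo());  // total = 5
    }
}

// Types reproduced from the listings above.
abstract class Message implements Serializable { }

final class View implements Serializable {
    int viewId;
    int[] members;
    int[] newMembers;
    int[] expulsedMembers;
}

interface GroupHandler extends Remote {
    boolean castMessage(Message message) throws RemoteException;
    boolean sendMessage(int target, Message message) throws RemoteException;
    boolean leaveGroup() throws RemoteException;
    int getGroupMemberId() throws RemoteException;
    boolean isValidGroup() throws RemoteException;
}

interface GroupMember extends Remote {
    void processPTPMessage(int sender, Message message) throws RemoteException;
    void processCastMessage(int sender, Message message) throws RemoteException;
    void memberAccepted(int identity, GroupHandler handler, View view) throws RemoteException;
    void changingView() throws RemoteException;
    void installView(View view) throws RemoteException;
    void excludedFromGroup() throws RemoteException;
}

// Hypothetical application message carrying one state change.
class AddMessage extends Message {
    final int amount;
    AddMessage(int amount) { this.amount = amount; }
}

// Hypothetical replica: its state changes only when a cast message is delivered.
class CounterMember implements GroupMember {
    private GroupHandler handler;
    private int total;

    public void memberAccepted(int identity, GroupHandler handler, View view) { this.handler = handler; }
    public void processCastMessage(int sender, Message message) {
        if (message instanceof AddMessage) total += ((AddMessage) message).amount;
    }
    public void processPTPMessage(int sender, Message message) { }
    public void changingView() { }
    public void installView(View view) { }
    public void excludedFromGroup() { handler = null; }

    // Client-facing operation: the change is multicast rather than applied directly.
    boolean add(int amount) throws RemoteException {
        return handler != null && handler.castMessage(new AddMessage(amount));
    }
    int total() { return total; }
}

// Hypothetical stand-in for the real handler: a one-member group that
// hands each cast message straight back to its only member.
class LoopbackHandler implements GroupHandler {
    private final GroupMember member;
    private boolean valid = true;
    LoopbackHandler(GroupMember member) { this.member = member; }

    public boolean castMessage(Message message) throws RemoteException {
        if (!valid) return false;
        member.processCastMessage(0, message);
        return true;
    }
    public boolean sendMessage(int target, Message message) { return false; }  // no peers here
    public boolean leaveGroup() { valid = false; return true; }
    public int getGroupMemberId() { return 0; }
    public boolean isValidGroup() { return valid; }
}
```

The point of the design is that add never touches the counter itself: the state changes only in processCastMessage, so with a real handler every replica would apply the same changes in the same order.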
A commented listing of the interfaces is shown here (opens a different window).
Note that these interfaces do not show how a member can create or join a group.
This functionality is implemented in SenseiGMNS (GMNS includes both a directory service
and the basic functionality to create or expand groups). Nothing is being hidden here:
SenseiGMS has a public interface, already shown, and a private interface, used to pass
the token, recover it, send messages and so on. This private interface defines operations
to join the group, but we have opted to make this operation public only at the GMNS level.
Using SenseiGMS
There is currently no programming guide, and the only way to learn to use SenseiGMS is by
looking at the examples. The simplest example is RandomCounter, which contains
the following files:
- RandomCounter.idl (in CORBA) or RandomCounterMessage.java and
RandomCounterServer.java (in JavaRMI): define the internal messages in the group
and the specification of the server.
- Client.java: the implementation of the client, which accesses the GMNS server to obtain a
reference to the server.
- Server.java: the implementation of the server, which must use the group communications
supported in Sensei.
- Main.java: common main method for both the client and the server; it performs some common
initialization.
- SetNumbers.java: helper class; it implements the real logic of the server (always returning
a different random number), but without any replication or group communication.