Load-Balancing

The principal goal of a load-balancer is to allocate a portion of the overall workload to each server according to that server’s capabilities. ScaleOut StateServer’s load-balancer uses the number of objects in the store as its measure of the total workload, and it seeks to balance the number of objects stored on each server. It then assigns each data object to a server using a statistical function that maps the object’s 256-bit key to the host’s IP address. The load-balancer runs in a fully distributed manner on all hosts to avoid performance bottlenecks or single points of failure.

Note

The load-balancer is designed to efficiently balance thousands of stored objects across a server farm. The storage load becomes more evenly balanced as objects are added to the distributed store.

Using the *store_weight* Parameter

Each host’s store_weight parameter provides a user-defined measure of the relative amount of memory available on that host. By default, this parameter is set to 100, which corresponds to an abstract measure of the amount of memory available on the host to store data objects. This parameter is intended to be used as an approximate and relative (not an absolute) measure of memory capacity on each host as an input parameter to the dynamic load-balancer. ScaleOut StateServer does not stop accepting access requests to store objects when this value is reached, and the load-balancer’s actual dynamic behavior may not closely match the input parameter.

If all hosts use the default parameter, the load-balancer will allocate approximately the same number of objects to each host. Alternatively, if you change the value of this parameter on a host, this host will be assigned relatively more or fewer data objects. For example, the following table shows the percentage of data objects assigned to each host in a four host farm after changing the values for store_weight:

Host

store_weight

% objects

1

100

25

2

200

50

3

100

25

4

0

0

Note that setting the parameter to zero causes all objects (but not replicas) to be removed from the host. When a new host joins the store, the load percentages are recalculated by incorporating the new host’s parameter value into the overall total.

Several factors cause the amount of memory actually used by each host to dynamically vary from the relative weighting specified by the store_weight parameter. Variations in object size modify the amount of memory consumed by the load-balancer’s weighted distribution of objects. Also, the number and placement of replicas for each data object consumes additional memory that affects the relative memory consumption by the hosts. Finally, transient membership changes, including recovery from failures, impacts memory usage in a highly dynamic manner.

The Load-Balancing Algorithm

ScaleOut StateServer’s load-balancing algorithm is designed to avoid unnecessary movement of data between hosts to maintain load-balance as objects are stored and removed. This would consume network bandwidth and degrade overall performance. The load-balancer evaluates whether to move data objects once every load-balancing interval of three seconds, and it only attempts to correct a portion of the load-imbalance in any rebalancing action. All hosts simultaneously participate in adjusting the load-balance. It may take several seconds or minutes to fully rebalance the storage load after changes are made to the host membership or their store_weight parameters.

The load-balancer coordinates its actions with data replication; replicas are distributed across hosts so as to maintain the overall load-balance. More details on data replication can be found in the following section.