GeoServer Option

The ScaleOut GeoServer option’s push access mode replicates object updates to a maximum of eight remote SOSS stores. Object updates include:

newly created objects,
updates to existing objects, and
requests to remove existing objects.

Objects created using the APIs can optionally be tagged so that they are not replicated. ASP.NET session objects are always replicated to remote stores.

Objects are automatically created on the remote site if necessary. Requests to remove nonexistent objects are silently ignored. All object attributes, including timeouts, LRU behavior, and dependency relationships, are replicated to the remote site.

Object replication is unidirectional. The GeoServer option is configured to send objects to a remote SOSS store. In order to receive updates from the remote farm, the GeoServer option must be configured on the remote store to send object updates to the local store.

To avoid adding latency to local access requests, object replication completes asynchronously with respect to local updates. If a client application updates an object on the local store and then attempts to retrieve the updated object at the remote store, it cannot be assured that the retrieved object contains the latest update. (This race condition is inherent to asynchronous replication.) As such, client applications must provide their own means of synchronizing updates when simultaneously accessing multiple sites. This situation can occur in web applications that use a global load balancer to direct requests to web server farms at multiple sites.
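The race described above can be reproduced with a small Python model. This is only an illustrative sketch, not the ScaleOut API: the Store and AsyncReplicator classes, the drain method, and the "cart:42" key are all hypothetical, and the background replication work is simulated by an explicit call.

```python
from collections import deque


class Store:
    """Toy in-memory store (hypothetical; stands in for one site)."""

    def __init__(self):
        self.data = {}


class AsyncReplicator:
    """Applies updates locally at once and queues them for the remote store."""

    def __init__(self, local, remote):
        self.local = local
        self.remote = remote
        self.pending = deque()

    def update(self, key, value):
        # The local write completes immediately; replication is only enqueued.
        self.local.data[key] = value
        self.pending.append((key, value))

    def drain(self):
        # Stands in for the asynchronous replication work.
        while self.pending:
            key, value = self.pending.popleft()
            self.remote.data[key] = value


local, remote = Store(), Store()
repl = AsyncReplicator(local, remote)

repl.update("cart:42", "v1")
repl.drain()                      # first update reaches the remote store

repl.update("cart:42", "v2")      # second update is still queued
stale = remote.data["cart:42"]    # remote read before replication catches up
repl.drain()
fresh = remote.data["cart:42"]    # remote read after replication completes
```

Between the second update and the second drain, the remote read sees the older value, which is the window that an application must close with its own synchronization.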

The GeoServer option does not support cascading data replication, for example replicating objects from site A to B and then forwarding site A’s objects to site C. Objects sent to a remote site are tagged to disallow further replication upon their arrival at the remote site. To replicate objects from site A to sites B and C, configure site A to replicate objects directly to both sites B and C. Details on configuring the GeoServer option can be found in the section Configuring the GeoServer Option.
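The effect of the no-further-replication tag can be sketched with a toy model. The Site class, its targets list, and the replicated flag below are hypothetical names chosen for illustration; they do not correspond to GeoServer configuration settings.

```python
class Site:
    """Toy model of one site that pushes its own updates to target sites."""

    def __init__(self, name):
        self.name = name
        self.objects = {}
        self.targets = []  # sites this site is configured to push to

    def put(self, key, value, replicated=False):
        self.objects[key] = value
        if replicated:
            # The object arrived via replication, so it is tagged and
            # is never forwarded to this site's own targets.
            return
        for target in self.targets:
            target.put(key, value, replicated=True)


# Chain A -> B -> C: site C never receives site A's objects.
a, b, c = Site("A"), Site("B"), Site("C")
a.targets = [b]
b.targets = [c]
a.put("k", 1)
chain_result = ("k" in b.objects, "k" in c.objects)

# Fan-out A -> B and A -> C: both remote sites receive the object.
a2, b2, c2 = Site("A"), Site("B"), Site("C")
a2.targets = [b2, c2]
a2.put("k", 1)
fanout_result = ("k" in b2.objects, "k" in c2.objects)
```

In the chained layout only site B holds the object, while in the fan-out layout both B and C hold it, which matches the recommended configuration.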

Scalable, Highly Available Replication

Unlike most replication solutions, GeoServer uses scalable and highly available connections that take full advantage of all servers in each farm. To maximize performance and availability, all servers within both the local and remote StateServer farms participate in data replication, as shown in the following diagram:

As servers are added or removed at each farm, GeoServer automatically reconfigures its network connections to maintain the best possible replication performance without the need for manual intervention.

When replication to a remote store is started, all local hosts are notified to begin replication. Each local host first establishes a TCP connection to one of the gateway addresses stored in the client parameters file for the remote store (see Configuration Parameters). By storing multiple gateway addresses in the parameters file, you can ensure that the initial connection is successful, even if one or more remote hosts are offline. The gateway addresses should be configured to use a secure communications channel.
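The benefit of listing several gateway addresses can be illustrated with a short Python sketch that tries each address in turn. This is an assumption about the general pattern, not GeoServer's actual connection logic; the function name connect_to_first_gateway and its parameters are hypothetical, and the sketch ignores the secure channel.

```python
import socket


def connect_to_first_gateway(gateways, connect=socket.create_connection, timeout=2.0):
    """Return a connection to the first reachable (host, port) in gateways.

    The connect callable is injectable so the logic can be exercised
    without a network; by default it opens a real TCP connection.
    """
    errors = []
    for host, port in gateways:
        try:
            return connect((host, port), timeout)
        except OSError as exc:
            # This gateway is offline or unreachable; try the next one.
            errors.append((host, port, str(exc)))
    raise ConnectionError(f"no gateway reachable: {errors}")
```

With two or more addresses in the list, the initial connection succeeds as long as any one of the listed remote hosts is online.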

Once a connection is established, the local SOSS hosts automatically download a list of the gateway addresses for all hosts in the active store. In addition, they download load-balancing information that is used to direct object updates most efficiently to individual hosts in the remote store. This information is automatically updated whenever a membership or load-balancing change occurs at the remote store.

All local SOSS hosts next establish TCP connections to every host in the remote store and then begin forwarding object updates to the remote store. Since all local hosts participate in handling local object updates, they can concurrently replicate these updates to the remote store. This maximizes both the throughput and scalability of data replication. Likewise, it helps ensure that data replication is not interrupted if a host in either the local or remote store should fail.

If the communications link to a remote store fails or if all hosts in the remote store fail, object replication pauses, and the local store signals an Unknown status. Each host in the local store repeatedly attempts to reconnect to all known gateway addresses for the remote store. Object replication automatically resumes once the TCP connections are re-established.

Global Data Access

ScaleOut GeoServer® Pro gives applications global access to objects across multiple ScaleOut StateServer stores when using pull access mode. It is designed to support two usage models:

Mostly read access with flexible coherency policies to remote objects that host slowly changing data, such as product descriptions.
Synchronized, read/write access to objects shared across multiple datacenters, such as shopping carts, project plans, and financial data.

To maximize performance and minimize use of the wide area network (WAN) connecting multiple data centers, ScaleOut GeoServer Pro maintains copies (called proxies) of objects accessed from remote data centers. By locally caching data, proxies eliminate the need to read the data stored in remote objects. For mostly read access, applications can select a polling coherency policy with a specified refresh interval to periodically update the contents of a proxy object from its remote primary copy. Applications can also directly access the primary copy of an object across the WAN when needed.
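The polling coherency policy can be pictured as a cached copy that re-reads its primary at most once per refresh interval. The following Python sketch shows that idea only; PollingProxy, fetch_primary, refresh_interval, and wan_reads are hypothetical names, not part of the GeoServer Pro API.

```python
import time


class PollingProxy:
    """Local cached copy that re-reads its remote primary at most once per interval."""

    def __init__(self, fetch_primary, refresh_interval, clock=time.monotonic):
        self.fetch_primary = fetch_primary      # callable that reads across the WAN
        self.refresh_interval = refresh_interval
        self.clock = clock                      # injectable for testing
        self.value = None
        self.fetched_at = None
        self.wan_reads = 0

    def read(self):
        now = self.clock()
        expired = (
            self.fetched_at is None
            or now - self.fetched_at >= self.refresh_interval
        )
        if expired:
            # Refresh the proxy from the primary copy.
            self.value = self.fetch_primary()
            self.fetched_at = now
            self.wan_reads += 1
        return self.value
```

Reads inside the interval are served locally and may return slightly stale data, which is the trade-off that makes this policy suitable for slowly changing objects.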

Applications can update only the primary copy of an object. Hence, the primary copy must be local to the store in which the update is made; otherwise, inconsistent changes to the contents of an object could occur. Attempts to update a proxy object result in an exception. However, to maintain consistency with existing applications, which often need to remove objects outside of a lock, a delete operation on a proxy object will be replicated to the primary copy.

Applications that need to maintain synchronized access to objects across datacenters can use standard object locking APIs to acquire primaryship of an object, optionally update it, and then release the lock. This ensures that the application always sees the latest updates to an object at any location. When an object is locked, its primaryship is transferred to the local ScaleOut StateServer store where the request is made. The remote store that previously held the primary copy converts its copy to a proxy, since the primary copy can only be held in one store at a time.
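The lock, update, and unlock sequence can be modeled in a few lines of Python. This is a conceptual sketch under simplifying assumptions (one object, store names as plain strings, no WAN delays); GlobalObject and its methods are hypothetical and are not the ScaleOut locking APIs.

```python
class GlobalObject:
    """Tracks which store holds the primary copy of a single object."""

    def __init__(self, primary_store, value):
        self.primary_store = primary_store
        self.value = value
        self.locked_by = None

    def role(self, store):
        return "primary" if store == self.primary_store else "proxy"

    def lock(self, store):
        if self.locked_by is not None:
            raise RuntimeError("object is already locked")
        self.locked_by = store
        # Locking moves primaryship to the requesting store; the
        # previous primary store now holds only a proxy.
        self.primary_store = store

    def update(self, store, value):
        if self.role(store) != "primary":
            raise PermissionError("updates are only allowed on the primary copy")
        self.value = value

    def unlock(self, store):
        if self.locked_by != store:
            raise RuntimeError("lock is not held by this store")
        self.locked_by = None
```

For example, if the primary starts in a store named "east", an update from "west" fails until "west" locks the object, after which "west" holds the primary and "east" holds a proxy.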

Since transferring primaryship of an object and its contents requires use of the WAN and incurs the associated delays, it should only be used for objects that require synchronized access. ScaleOut GeoServer Pro minimizes data transfers by tracking the version of objects accessed from remote stores so that data transfers can be avoided when the contents have not changed.

Combining Global Data Access and Replication

Mission-critical applications often need to coherently access and update objects at multiple datacenters while maintaining copies at both datacenters to protect against WAN or datacenter failures. For example, an ecommerce site that distributes its workload across two datacenters using a global load balancer needs to store shopping carts that can be coherently accessed at either datacenter. At the same time, a copy of each shopping cart needs to be maintained at both datacenters.

ScaleOut GeoServer Pro provides a solution for this challenge with the notify coherency policy. Available for bi-directional use across two datacenters, this coherency policy replicates all changes to an object to the companion datacenter while ensuring coherent access using standard object locking (as described above). This overcomes the limitations of asynchronous data replication and guarantees that applications always see the latest updates when object locking is employed.

When sharing data across a WAN between datacenters, failures of the WAN or of a remote datacenter can occur. This is called a split-brain condition. To ensure continuous operations when this occurs, ScaleOut GeoServer Pro allows both datacenters to operate independently. It automatically converts proxy objects using the notify coherency policy to primary copies if the remote datacenter cannot be reached.

A WAN failure can cause duplicate primary copies of an object to be created. When the WAN failure has been corrected, the two data centers will automatically resume shared access to objects. As updates are performed to objects with the notify coherency policy, they are replicated across the WAN to the remote store. In so doing, duplicate primary copies are automatically identified, and the copy with the older update time is removed. This enables both stores to resume shared, synchronized use of a single primary copy for all objects.

Note that resolution of a split-brain condition can cause some updates to be lost. (This is true for all distributed systems that cannot easily merge updates and need to select one of multiple copies when recovering from split brain.) For example, when using the heuristic that selects the object with the latest update time, updates made during the WAN outage solely to the companion object with an older update time will be lost.
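The latest-update-time heuristic and the resulting lost update can be shown with a minimal Python sketch. The Copy dataclass and resolve_duplicate_primaries function are hypothetical names, the timestamps are arbitrary numbers, and the handling of equal timestamps is an assumption that the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    store: str
    value: str
    update_time: float


def resolve_duplicate_primaries(a, b):
    """Return (kept, removed): the copy with the later update time is kept.

    Ties keep the first argument; that tie-break is an assumption of
    this sketch, not documented behavior.
    """
    return (a, b) if a.update_time >= b.update_time else (b, a)


# During a WAN outage both datacenters updated their own primary copy.
east = Copy(store="east", value="cart with item X", update_time=100.0)
west = Copy(store="west", value="cart with item Y", update_time=105.0)

kept, removed = resolve_duplicate_primaries(east, west)
```

Here the west copy is kept, so the change made only to the east copy during the outage (adding item X) is lost once shared access resumes.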

Memory Usage

For each object update to be replicated, a descriptor of approximately 320 bytes (not including object metadata) is enqueued in memory within the local StateServer service. (For efficiency, object data is not copied to the replication request queues.) If the rate of object replication is lower than the rate of local object updates or if the remote store cannot be reached, an increasing amount of memory will be consumed within the local SOSS store to buffer these outbound descriptors.
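As a rough worked example, the memory consumed by queued descriptors can be estimated from the gap between the two rates. The update rate, replication rate, and outage length below are hypothetical figures chosen for illustration; only the 320-byte descriptor size comes from the text, and it excludes object metadata.

```python
DESCRIPTOR_BYTES = 320  # approximate size per descriptor, excluding object metadata


def backlog_bytes(update_rate, replication_rate, seconds):
    """Estimate descriptor memory queued after the given number of seconds.

    Rates are in object updates per second; the backlog only grows when
    local updates outpace replication.
    """
    queued = max(0, update_rate - replication_rate) * seconds
    return queued * DESCRIPTOR_BYTES


# Remote store unreachable for 10 minutes at 5,000 local updates per second.
outage = backlog_bytes(update_rate=5_000, replication_rate=0, seconds=600)

# Replication keeps up with 4,000 of 5,000 updates per second for one hour.
slow_link = backlog_bytes(update_rate=5_000, replication_rate=4_000, seconds=3_600)
```

Under these assumed rates, the ten-minute outage queues 3,000,000 descriptors, or about 960 MB, and the slow link accumulates about 1.15 GB over an hour, spread across the hosts in the local store.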

The local SOSS store automatically stops enqueuing replication requests and flushes all enqueued descriptors if the memory consumed for all stored objects and descriptors reaches a specified threshold on any host. This threshold can be set using the repl_threshold configuration parameter (see Configuration Parameters). In this situation, some replication requests will be permanently lost, and the remote store will not receive all objects from the local store. It is important to use a highly reliable and fast communication link to the remote store to avoid interrupting replication and losing object updates.
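The flush behavior can be modeled for a single host as follows. This is a simplified sketch: ReplicationQueue and its fields are hypothetical, the threshold is expressed in bytes as an assumption (the units of repl_threshold are not stated here), and the sketch does not model how or when replication resumes.

```python
from collections import deque

DESCRIPTOR_BYTES = 320  # approximate descriptor size from the text


class ReplicationQueue:
    """Toy per-host queue that flushes once memory use reaches a threshold."""

    def __init__(self, threshold_bytes):
        self.threshold_bytes = threshold_bytes
        self.descriptors = deque()
        self.enabled = True
        self.dropped = 0

    def enqueue(self, descriptor, stored_object_bytes):
        """Queue one descriptor; return False if it was (or became) lost."""
        if not self.enabled:
            self.dropped += 1
            return False
        self.descriptors.append(descriptor)
        used = stored_object_bytes + len(self.descriptors) * DESCRIPTOR_BYTES
        if used >= self.threshold_bytes:
            # Threshold reached: flush everything and stop enqueuing.
            self.dropped += len(self.descriptors)
            self.descriptors.clear()
            self.enabled = False
        return self.enabled
```

Every descriptor counted in dropped represents an update that the remote store never receives, which is why the threshold should be sized with the expected backlog in mind.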