Performance Considerations

By design, ScaleOut StateServer allows simultaneous access to its distributed store from all hosts. A symmetric, fully distributed algorithm is used to implement storage access. This avoids bottlenecks as the server farm grows, and it allows access performance to scale linearly with the number of hosts until network bandwidth is saturated. The software is designed to provide uniform, scalable access and excellent load-balancing on server farms with up to sixty-four (or more) hosts. Since each host provides an access rate that matches or exceeds that for a database server running on the same host, the aggregate access rate can be scaled to surpass that of very large database servers.
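ScaleOut StateServer’s partitioning algorithm is proprietary and is not reproduced here. The following minimal Java sketch only illustrates the general idea behind symmetric, fully distributed access: every client can compute which host is responsible for an object directly from the object’s key, so no central directory is consulted and no single host becomes a bottleneck. The host names and the simple modulo mapping are assumptions made for the example.

    // Illustrative sketch only: SOSS's actual partitioning algorithm is proprietary.
    // Every client maps a key to a host with the same pure function, so there is
    // no central coordinator to become a bottleneck as the farm grows.
    import java.util.List;

    public class KeyToHostSketch {
        // Hypothetical host list; in a real farm this would come from membership data.
        private final List<String> hosts;

        public KeyToHostSketch(List<String> hosts) {
            this.hosts = hosts;
        }

        // Map an object key to the host responsible for its primary copy.
        public String hostForKey(String objectKey) {
            int hash = objectKey.hashCode();
            int index = Math.floorMod(hash, hosts.size());
            return hosts.get(index);
        }

        public static void main(String[] args) {
            KeyToHostSketch map = new KeyToHostSketch(List.of("hostA", "hostB", "hostC"));
            System.out.println("session-42 is stored on " + map.hostForKey("session-42"));
        }
    }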

Scalable Throughput

ScaleOut StateServer demonstrates scalable throughput. This means that as you add hosts to the server farm and proportionally increase the aggregate load offered to the farm, ScaleOut StateServer linearly increases its throughput (and its memory capacity) to handle the additional load. This keeps response times from growing as the load increases. In contrast, storage solutions with fixed throughput, such as database servers, can experience rapidly increasing response times as their maximum throughput is reached.
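A toy queueing model (not a SOSS measurement) helps make the contrast concrete. In the sketch below, a fixed-throughput store must absorb the entire offered load on one server, so its response time climbs sharply as the load approaches its capacity, while a scaled-out store that adds a host for each increment of load keeps per-host utilization, and therefore response time, roughly constant. The per-host service rate and the load added per host are arbitrary assumed values.

    // A toy M/M/1 queueing model, not a SOSS measurement: response time is
    // 1 / (service rate - arrival rate) while the server keeps up, and unbounded
    // once the offered load reaches the server's capacity.
    public class ScalableThroughputModel {
        static double responseTimeSeconds(double arrivalRate, double serviceRate) {
            return arrivalRate < serviceRate ? 1.0 / (serviceRate - arrivalRate)
                                             : Double.POSITIVE_INFINITY;
        }

        public static void main(String[] args) {
            double perHostServiceRate = 1000.0; // requests/sec one host can serve (assumed)
            double loadAddedPerHost = 200.0;    // offered load added with each new host (assumed)

            for (int hosts = 1; hosts <= 6; hosts++) {
                double totalLoad = hosts * loadAddedPerHost;

                // Fixed-throughput store: a single server absorbs the entire load.
                double fixed = responseTimeSeconds(totalLoad, perHostServiceRate);

                // Scaled-out store: the load is spread evenly across all hosts.
                double scaled = responseTimeSeconds(totalLoad / hosts, perHostServiceRate);

                System.out.printf("hosts=%d  fixed-capacity=%.4f s  scaled-out=%.4f s%n",
                        hosts, fixed, scaled);
            }
        }
    }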

For example, ScaleOut StateServer’s scalable throughput was measured on a server farm in which each host held 20,000 1 KB objects and continuously performed read/update requests sequentially across all of its objects. The objects represented concurrent users accessing the server farm and maintaining Web sessions. The average time to complete a read and update was recorded. Hosts were added to the farm until 28 servers holding 560K objects were in use. As each server was added, the memory and access load increased proportionately. The response time was also measured for the same read/update operations on a database server in which the objects were also stored.
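The benchmark client itself is not published here; the sketch below only shows the shape of the read/update measurement loop described above. The StateStoreClient interface and its in-memory stub are hypothetical stand-ins for whichever store is being measured and are not the SOSS API.

    // Hypothetical read/update benchmark loop; StateStoreClient is a stand-in,
    // not the SOSS API. Each pass reads, modifies, and writes back every object.
    import java.util.HashMap;
    import java.util.Map;

    public class ReadUpdateBenchmark {
        interface StateStoreClient {
            byte[] read(String key);
            void update(String key, byte[] value);
        }

        // Trivial in-memory stub so the sketch runs on its own.
        static class InMemoryClient implements StateStoreClient {
            private final Map<String, byte[]> store = new HashMap<>();
            public byte[] read(String key) { return store.get(key); }
            public void update(String key, byte[] value) { store.put(key, value); }
        }

        public static void main(String[] args) {
            StateStoreClient client = new InMemoryClient();
            int objectCount = 20_000;                       // 20,000 objects per host, as above
            for (int i = 0; i < objectCount; i++) {
                client.update("obj-" + i, new byte[1024]);  // 1 KB objects
            }

            long start = System.nanoTime();
            for (int i = 0; i < objectCount; i++) {         // one sequential read/update pass
                String key = "obj-" + i;
                byte[] value = client.read(key);
                value[0]++;                                 // simulate modifying session state
                client.update(key, value);
            }
            double avgMs = (System.nanoTime() - start) / 1_000_000.0 / objectCount;
            System.out.printf("average read+update time: %.4f ms%n", avgMs);
        }
    }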

The following chart shows the measured response times for both ScaleOut StateServer and for the database server:

images/diagrams/scalable_resp_time.png

The chart shows that SOSS’s response time (shown as the lavender bars) remained flat as objects and hosts were added to the farm, demonstrating throughput that scaled linearly as the server farm grew to 28 hosts. In contrast, the database server exhibited a non-linear increase in response times (the maroon bars) as the object count and access load increased.

When measuring scalability with ScaleOut StateServer, remember that server farms with one or two hosts have unique performance characteristics because they maintain fewer than three replicas of each data object. As a result, they incur lower network overhead for creating replicas and exhibit unusually high throughput. Scalable throughput is observed for server farms with three or more hosts.

Fast Response Time

ScaleOut StateServer incorporates internal caches to accelerate access requests and minimize response times. These caches enable ScaleOut StateServer to achieve read access times that are very close to those of an "in process" store and often significantly better than the read access times of other "out of process" stores. For example, the following diagram shows SOSS’s average read access time (in milliseconds) for an ASP.NET session object with a 100 KB dataset in comparison to other storage alternatives:

images/diagrams/read_resp_time.png

To achieve these fast response times, SOSS employs multiple internal caches:

  • API cache: a cache of references to recently accessed, deserialized data objects maintained by the SOSS API client libraries (the cached objects are held in the memory of the user’s application between access requests),
  • session cache: a cache of recently accessed, deserialized session objects maintained by SOSS’s ASP.NET client libraries, and
  • server cache: a cache of recently accessed, serialized objects maintained by the StateServer service on each host.

These caches are depicted in the following diagram:

images/diagrams/client_cache_diagram.png
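As a conceptual illustration of the read path (not SOSS internals), the sketch below shows how the layered caches work together: a hit in the in-process API cache returns an already-deserialized object with no I/O, a hit in the local server cache avoids a network hop but still requires deserialization, and only a miss in both caches requires fetching the object from the host that stores it. The class and method names are invented for the example.

    // Conceptual sketch of a layered read path; not SOSS internals.
    import java.util.HashMap;
    import java.util.Map;

    public class LayeredCacheSketch {
        private final Map<String, Object> apiCache = new HashMap<>();    // deserialized objects
        private final Map<String, byte[]> serverCache = new HashMap<>(); // serialized objects

        public Object read(String key) {
            Object cached = apiCache.get(key);
            if (cached != null) {
                return cached;                         // fastest path: no I/O, no deserialization
            }
            byte[] serialized = serverCache.get(key);
            if (serialized == null) {
                serialized = fetchFromOwningHost(key); // slowest path: network round trip
                serverCache.put(key, serialized);
            }
            Object value = deserialize(serialized);
            apiCache.put(key, value);
            return value;
        }

        // Placeholders standing in for the network and serialization layers.
        private byte[] fetchFromOwningHost(String key) { return new byte[0]; }
        private Object deserialize(byte[] data) { return new Object(); }

        public static void main(String[] args) {
            LayeredCacheSketch sketch = new LayeredCacheSketch();
            sketch.read("session-42");   // first read: miss in both caches
            sketch.read("session-42");   // second read: served from the API cache
        }
    }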

Performance Optimizations

Beyond providing scalable throughput for growing server farms, ScaleOut StateServer incorporates several optimizations to provide the fastest possible access to stored objects. These optimizations minimize access times as well as CPU and network usage.

As described in the previous section, the StateServer service and the client libraries on each host cache recently accessed data objects for that host’s local clients. These caches avoid the need to send an object across the network from another SOSS host unless the object has been updated since it was last accessed. The cache sizes on each host are configurable using the max_cache and max_client_cache Configuration Parameters. To save cache space, a serialized object is never held in the server cache on an SOSS host that already stores one of the object’s replicas.
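The coherency protocol that keeps these caches up to date is not documented here. The sketch below illustrates one common way to avoid resending an object that has not changed: the requesting host presents the version number of the copy it already caches, and the host that owns the object returns new data only if its stored version differs. The VersionedObject type and the version numbers are assumptions made for the example.

    // Illustrative version check; the actual SOSS coherency protocol is not shown.
    import java.util.Optional;

    public class CachedReadSketch {
        static class VersionedObject {
            final long version;
            final byte[] data;
            VersionedObject(long version, byte[] data) { this.version = version; this.data = data; }
        }

        // Runs on the host that owns the object's primary copy.
        static Optional<VersionedObject> readIfChanged(VersionedObject current, long cachedVersion) {
            if (current.version == cachedVersion) {
                return Optional.empty();               // caller's cached copy is still valid
            }
            return Optional.of(current);               // object changed: send the new bytes
        }

        public static void main(String[] args) {
            VersionedObject stored = new VersionedObject(7, new byte[1024]);
            System.out.println("cache up to date: " + readIfChanged(stored, 7).isEmpty());
            System.out.println("needs refresh:    " + readIfChanged(stored, 6).isPresent());
        }
    }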

ScaleOut StateServer uses a proprietary, lightweight transport protocol running over the standard TCP protocol to minimize network overhead and latency for data retrieval and replication while maintaining high availability. By storing data objects in server memory, the software eliminates the disk accesses that database servers require to commit updates and so provides much faster response times. Its simpler access methods also dramatically reduce the complexity and overhead usually associated with database servers.
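The SOSS wire format is proprietary, so the sketch below only shows the general shape of a lightweight, length-prefixed request frame of the kind that can be carried over a persistent TCP connection with very little per-request overhead. The opcodes and field layout are invented for illustration.

    // Hypothetical request framing; the real SOSS wire protocol is proprietary.
    // Frame layout: [opcode:1][key length:4][key bytes][payload length:4][payload bytes]
    import java.io.ByteArrayOutputStream;
    import java.io.DataOutputStream;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;

    public class FrameSketch {
        static final byte OP_READ = 1;
        static final byte OP_UPDATE = 2;

        static byte[] frame(byte opcode, String key, byte[] payload) throws IOException {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
            out.writeByte(opcode);
            out.writeInt(keyBytes.length);
            out.write(keyBytes);
            out.writeInt(payload.length);
            out.write(payload);
            return buffer.toByteArray();   // ready to be written to a TCP socket's output stream
        }

        public static void main(String[] args) throws IOException {
            byte[] update = frame(OP_UPDATE, "session-42", new byte[1024]);
            System.out.println("update frame size: " + update.length + " bytes");
        }
    }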

To minimize overhead, the software avoids unnecessary data copies when communicating between the user’s application and the StateServer service and when communicating between hosts. Object copying is also avoided when marshaling data for replication, on both the sending and receiving hosts. To avoid these copies, objects larger than a threshold size (set to approximately 500 bytes) are stored in separate memory buffers. These optimizations trade a small amount of incremental memory usage for substantial performance gains during object replication.
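The sketch below illustrates the copy-avoidance idea rather than SOSS source code: payloads at or below a threshold (roughly the 500 bytes mentioned above) are copied inline into a combined message buffer, while larger payloads are wrapped and passed to a scatter/gather network write by reference so their bytes are never copied.

    // Illustration of threshold-based copy avoidance during message assembly.
    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    public class ReplicationMessageSketch {
        static final int INLINE_THRESHOLD = 500;   // approximate value cited above

        // Gather the buffers for one replication message without copying large
        // payloads; a scatter/gather socket write can send the list directly.
        static List<ByteBuffer> buildMessage(byte[] header, byte[] objectData) {
            List<ByteBuffer> buffers = new ArrayList<>();
            if (objectData.length <= INLINE_THRESHOLD) {
                // Small object: one copy into a combined buffer keeps the message compact.
                ByteBuffer combined = ByteBuffer.allocate(header.length + objectData.length);
                combined.put(header).put(objectData).flip();
                buffers.add(combined);
            } else {
                // Large object: wrap the existing array so no copy is made.
                buffers.add(ByteBuffer.wrap(header));
                buffers.add(ByteBuffer.wrap(objectData));
            }
            return buffers;
        }

        public static void main(String[] args) {
            System.out.println("small object -> " + buildMessage(new byte[16], new byte[100]).size() + " buffer(s)");
            System.out.println("large object -> " + buildMessage(new byte[16], new byte[4096]).size() + " buffer(s)");
        }
    }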

ScaleOut StateServer stores objects within local memory on each host. Its internal hash tables have been configured to provide high performance for storing approximately 1M objects per host. For very large object populations, access and load-balancing performance can be increased by adjusting the hash_tbl_factor configuration parameter.
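How hash_tbl_factor is applied internally is not described above, so the back-of-the-envelope sketch below simply treats it as a multiplier on the bucket count to show why a larger table keeps the average chain length, and therefore the cost of each lookup, low when the object population grows well beyond the default sizing. The baseline bucket count and the object counts are assumed values.

    // Back-of-the-envelope hash table sizing; the factor's internal meaning in
    // SOSS is assumed here to be a simple multiplier on the bucket count.
    public class HashTableSizingSketch {
        static double averageChainLength(long objects, long baseBuckets, int hashTblFactor) {
            long buckets = baseBuckets * hashTblFactor;
            return (double) objects / buckets;   // expected probes per lookup in a chained table
        }

        public static void main(String[] args) {
            long baseBuckets = 1_000_000;        // assumed default sized for ~1M objects per host
            long objects = 10_000_000;           // a "very large" population on one host

            for (int factor : new int[] {1, 4, 16}) {
                System.out.printf("factor=%2d  avg chain length=%.2f%n",
                        factor, averageChainLength(objects, baseBuckets, factor));
            }
        }
    }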