Introduction

Welcome to the ScaleOut Product Suite, Version 5

Thank you for selecting ScaleOut Software’s product suite for in-memory data grid and in-memory computing. This software product runs on every server within a Web or application server farm to store mission-critical, workload data. ScaleOut’s in-memory data grid with integrated, in-memory distributed caching provides extremely fast access to your critical but fast-changing data, and its performance and capacity grow as you add servers to the farm. The software automatically replicates stored data between servers so that critical data are not lost if a server fails, and it maintains scalable, highly available access that a standalone database server (or even a failover database cluster) cannot duplicate. You can also use the Windows version to transparently save and retrieve ASP.NET session-state. Altogether, ScaleOut StateServer delivers a highly effective middleware storage tier for e-commerce session-state and other mission-critical, workload data combined with powerful tools for in-memory computing, including stateful stream processing and real-time, parallel data analysis.

The following help topics explain how to use ScaleOut’s product suite. Step-by-step instructions for installation and configuration help you to get your in-memory data grid up and running quickly and easily. Additional topics help you to troubleshoot problems and obtain technical support if necessary.

We want your feedback! Please send your comments on the product, documentation, or Web site to feedback@scaleoutsoftware.com. Thank you.

What Is an In-Memory Data Grid?

To capture the evolution of distributed caching and its integration with other advanced technologies, ScaleOut Software uses the term in-memory data grid to describe its scalable, distributed in-memory data storage. Also called distributed data grids, in-memory data grids combine distributed, in-memory caching with powerful analysis and management tools to give you a complete solution for managing fast-changing data in a server farm or compute grid. Distributed, in-memory data grids now have become an essential component of scalable, mission-critical applications and are increasingly relied upon for data-parallel analysis and computation.

What Is In-Memory Computing?

Beyond just serving as a fast, scalable repository for live data, in-memory data grids (IMDGs) provide the foundation for stateful stream processing and real-time analytics on in-memory data. By harnessing the scalable computing power of the server clusters on which they run, IMDGs enable large, in-memory data sets to be analyzed in parallel, delivering immediate results and important feedback to live systems. While they can serve distributed queries to select data of interest for client applications, their real power lies in the ability to host data-parallel computations within the grid, moving computing to where the data lives, to deliver blazing performance and eliminate bottlenecks to scalability.
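
As a rough illustration of moving computation to the data, the following Java sketch treats each inner list as the data held by one grid server, runs the same analysis on every partition in parallel, and merges the small partial results. It uses only the JDK; the class and method names are hypothetical and are not part of any ScaleOut API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: each inner list stands in for the data held by one grid server.
final class DataParallelSketch {

    // Local analysis step: runs where the partition's data lives.
    static long localSum(List<Integer> partition) {
        long sum = 0;
        for (int value : partition) {
            sum += value;
        }
        return sum;
    }

    // Runs the local step on all partitions in parallel, then merges the partial sums.
    static long parallelSum(List<List<Integer>> partitions) {
        return partitions.parallelStream()
                .mapToLong(DataParallelSketch::localSum) // per-partition computation
                .sum();                                  // merge step
    }

    public static void main(String[] args) {
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 2, 3),
                Arrays.asList(4, 5),
                Arrays.asList(6));
        System.out.println(parallelSum(partitions)); // prints 21
    }
}
```

Only the per-partition results cross the merge boundary, which is the property that lets this style of computation scale with the number of servers.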

ScaleOut Product Suite

The ScaleOut Software product suite comprises:

  • ScaleOut StateServer®: scalable, in-memory data grid for Windows and Linux with integrated parallel query
  • ScaleOut ComputeServer®: in-memory computing for real-time, data-parallel analysis; includes all the features of ScaleOut StateServer
  • ScaleOut StreamServer™: in-memory computing for stateful stream processing and real-time, data-parallel analysis; includes all the features of ScaleOut StateServer and ScaleOut ComputeServer
  • ScaleOut hServer®: in-memory computing for source code-compatible Apache Hadoop MapReduce on in-memory data
  • ScaleOut GeoServer®: optional extension to ScaleOut StateServer for WAN-based data replication and global data access
  • ScaleOut SessionServer™: subset of ScaleOut StateServer for scalable, highly available ASP.NET session-state storage

What’s New in Version 5.8

Version 5.8 adds two exciting new features to the ScaleOut Product Suite: the ScaleOut Client Library for .NET (beta release) and the ScaleOut Digital Twin Builder™ software toolkit (alpha release). This release also contains performance enhancements and bug fixes.

The ScaleOut Client Library for .NET represents the next generation in APIs for ScaleOut StateServer, offering full support for asynchronous .NET APIs and compatibility with .NET Core 2.0. As a planned replacement for the current Named Cache APIs, this library adds new flexibility for configuring client applications and support for more than two parent objects when defining dependency relationships.

The ScaleOut Digital Twin Builder enables developers to build stateful stream-processing applications with the digital twin model for execution using ScaleOut StreamServer. This model associates state information and message history with data sources so that message processing can take advantage of enhanced context for deeper introspection and more effective real-time feedback. In addition, this toolkit includes libraries for connecting ScaleOut StreamServer to Microsoft’s Azure IoT Hub, as well as simplified access to Kafka message pipelines. Messages also can be sent to digital twins using a REST web service. The ScaleOut Digital Twin Builder was designed to make creation and deployment of both Java and C# digital twin models for stream processing as simple as possible. We invite your feedback as you experiment with the toolkit’s powerful capabilities.
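
To make the digital twin model more concrete, here is a minimal Java sketch of a twin that keeps state and a bounded message history for one data source and uses that context when a new message arrives. All names (ThermostatTwin, process) and the alert rule are hypothetical illustrations; they do not come from the ScaleOut Digital Twin Builder APIs.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a digital twin for a single data source (a thermostat).
final class ThermostatTwin {
    private static final int HISTORY_LIMIT = 3;          // keep only the most recent readings
    private final Deque<Double> history = new ArrayDeque<>();
    private int consecutiveHigh = 0;                      // state carried across messages

    // Processes one temperature reading and returns true when an alert should fire.
    // The rule (three consecutive readings above the limit) is an illustrative assumption.
    boolean process(double temperature, double limit) {
        history.addLast(temperature);
        if (history.size() > HISTORY_LIMIT) {
            history.removeFirst();
        }
        consecutiveHigh = temperature > limit ? consecutiveHigh + 1 : 0;
        return consecutiveHigh >= 3;
    }

    Deque<Double> history() {
        return history;
    }

    public static void main(String[] args) {
        ThermostatTwin twin = new ThermostatTwin();
        double[] readings = {71.0, 82.5, 83.0, 84.1};
        for (double reading : readings) {
            System.out.println(reading + " -> alert=" + twin.process(reading, 80.0));
        }
    }
}
```

Because the twin remembers earlier readings, a single high value does not trigger an alert; only the fourth message in the example does.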

What’s New in Version 5.7

Version 5.7 contains numerous performance enhancements that accelerate server-to-server communications and event processing under heavy load and eliminate bottlenecks. In addition, the execution path for single method invocations has been streamlined to reduce overhead and latency. New performance counters report rates for event posting, queries, single method invocations, and parallel method invocations. Object metadata and the ScaleOut Object Browser now report object creation and last update times.

With this release, the open source time windowing libraries for managing streaming events in ScaleOut StreamServer have been upgraded to a full production version. The ASP.NET Core 2.0 NuGet package also has been upgraded with support for remote client (public) gateways. Multicast discovery now can be configured dynamically and globally without the need for a server restart.

Version 5.7 also introduces a preview of the ScaleOut Web Console, which lets users manage ScaleOut’s in-memory data grid from a web browser running on a local network. The web console offers all of the management capabilities of the ScaleOut Windows Management Console and, in its upcoming production release, will replace the original ScaleOut PHP-based web console on Linux systems.

What’s New in Version 5.6

Version 5.6 introduces ScaleOut StreamServer™, a new software platform for stateful stream processing. This platform offers important new capabilities for analyzing streaming data by enabling applications to model and track the behavior of data sources instead of just analyzing the telemetry they emit. This allows applications to implement deeper introspection and more effective alerting on streaming data across a wide range of applications, including medical device monitoring, financial services, manufacturing and logistics, and the Internet of Things (IoT). ScaleOut StreamServer includes all of the features and capabilities of ScaleOut ComputeServer.

ScaleOut StreamServer includes open source time windowing libraries for managing streaming events; these libraries are available for Java and .NET on GitHub. It also includes support for posting events using the ReactiveX APIs and for integrating with Kafka message queues.
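
The idea behind time windowing can be sketched with plain JDK collections: events are grouped into fixed-length (tumbling) windows keyed by each window's start time. This is a generic illustration and not the API of ScaleOut's open source libraries; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of tumbling (fixed-length, non-overlapping) time windows.
final class TumblingWindowSketch {

    // Groups event timestamps (in milliseconds) into windows of the given length.
    // Each key is a window's start time; each value lists the timestamps in that window.
    static Map<Long, List<Long>> window(List<Long> timestamps, long windowMillis) {
        Map<Long, List<Long>> windows = new TreeMap<>();
        for (long ts : timestamps) {
            long start = Math.floorDiv(ts, windowMillis) * windowMillis;
            windows.computeIfAbsent(start, k -> new ArrayList<>()).add(ts);
        }
        return windows;
    }

    public static void main(String[] args) {
        List<Long> events = Arrays.asList(0L, 400L, 1000L, 1999L, 2500L);
        // prints {0=[0, 400], 1000=[1000, 1999], 2000=[2500]}
        System.out.println(window(events, 1000L));
    }
}
```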

Version 5.6 also adds support for distributed caching to Microsoft’s ASP.NET Core 2.0 platform that lets application developers transparently take advantage of ScaleOut StateServer’s in-memory data grid technology. This distributed caching library is made available as a NuGet package.

Other features in version 5.6 include support for Docker containers, support for OpenSSL 1.1, and enhancements to the ScaleOut Object Browser.

What’s New in Version 5.5

Version 5.5 introduces a new .NET API called Distributed ForEach for data-parallel programming in ScaleOut ComputeServer. Modeled after .NET’s Parallel.ForEach, this operator lets developers easily structure data-parallel computations that span all (or a queried subset of) objects within a name space in the in-memory data grid. This enables applications to handle much larger workloads than would be possible on a single server, deliver scalable throughput by adding servers, and maintain fast execution times. In addition, this operator streamlines garbage collection during parallel execution to deliver the best possible performance.

Version 5.5 also adds asynchronous .NET APIs for grid access and query. These APIs let grid operations integrate fully into .NET applications that use the async/await asynchronous programming model. Other new features for Windows include new PowerShell cmdlets, which enable IT administrators to use PowerShell scripts to deploy and manage the in-memory data grid, and support for ISO 19770-2 software tagging that helps system administrators identify software assets.

This version adds important new optimizations that reduce memory usage for stored objects and accelerate performance. By default, all objects are now allocated on the heap instead of using pre-allocated memory buffers, and query indexes are no longer allocated unless in use. Also, memory overhead for string keys has been reduced and integrated into object allocation for higher efficiency.

Version 5.5 introduces a preview of distributed, push-based notifications for C# and Java. This new feature adds operators compatible with the popular ReactiveX library and lets applications scale the throughput of real-time event processing by transparently distributing notifications across the in-memory data grid and its integrated compute engine.

What’s New in Version 5.4

Version 5.4 incorporates numerous performance enhancements designed to take full advantage of large, multicore systems. All aspects of ScaleOut’s internal implementation have been redesigned to distribute the workload across all available cores and extract maximum performance. In addition, the Object Browser has been enhanced to handle very large numbers of objects with faster performance and lower memory usage.

The Windows version of ScaleOut StateServer now includes the Windows Server AppFabric Caching Compatibility Library. This API provides complete, source-level compatibility with the Windows AppFabric Caching API, including support for regions, tag-based query, and event notifications. In most cases, applications previously designed to use Windows AppFabric Caching as a distributed cache can easily migrate to ScaleOut StateServer with only a recompile. In addition, these applications can access ScaleOut StateServer’s extended functionality, such as fully distributed LINQ query, by calling native APIs side-by-side with AppFabric Caching API.

Version 5.4 adds support for Windows PowerShell cmdlets to manage the in-memory data grid. To assist former AppFabric Caching users, it also includes aliases for the corresponding AppFabric Caching cmdlets where applicable. Because of its highly scalable, peer-to-peer design, ScaleOut StateServer makes administration of an AppFabric Caching-compatible distributed cache much easier than ever before.

What’s New in Version 5.3

Version 5.3 adds new APIs for operational intelligence across ScaleOut’s product offerings. This release introduces ScaleOut ComputeServer™, which integrates a scalable, in-memory compute engine within ScaleOut’s in-memory data grid and lets applications perform fast, data-parallel computations on memory-based data; this product replaces ScaleOut Analytics Server®. Version 5.3 also adds several optimizations which enhance the performance of parallel method invocations within the .NET client libraries to take better advantage of large multicore systems.

Complementing support for executing standard MapReduce applications in ScaleOut hServer®, new SimpleMR APIs simplify and streamline MapReduce applications by avoiding the need for standard Hadoop libraries. These APIs are integrated directly into ScaleOut’s in-memory compute engine and in-memory data grid to deliver extremely fast execution times. SimpleMR eliminates the need to install and reference Hadoop MapReduce libraries from standard distributions in order to run in-memory, data-parallel computations with MapReduce semantics, further reducing execution times. These APIs are available for both Java and C#, and now .NET applications can run in-memory MapReduce.

ScaleOut StateServer® extends its property-based query APIs for in-memory data grids with the introduction of InvokeFilter methods for both Java and C#. This new feature allows applications to run data-parallel methods which can analyze properties when selecting objects within a parallel query. Applications now can eliminate the restrictions imposed by standard query techniques and harness the power of data-parallel computation to implement much deeper query analysis. In C#, invoke filters are integrated into Microsoft LINQ as extension methods which simplify program structure.

What’s New in Version 5.2

Version 5.2 increases ease of use and deepens support for operational intelligence across ScaleOut’s product offerings, including ScaleOut Analytics Server® and ScaleOut hServer®. Version 5.2 introduces REST APIs, support for Apache Hive and YARN, and .NET APIs for small object storage.

The new REST API service allows customer applications to easily access objects in ScaleOut’s in-memory storage using HTTP with built-in SSL for security. Objects now can be remotely accessed from applications written in almost any programming language. This new web service can be deployed either using its own built-in, high-performance embedded web server or as a FastCGI module behind an existing web server.

Version 5.2 includes support for Hadoop YARN, enabling Hadoop MapReduce applications to take advantage of ScaleOut hServer’s fast in-memory execution engine and integrated, in-memory data storage. This new capability lets ScaleOut hServer function as an alternative MapReduce execution framework within a Hadoop YARN cluster, running MapReduce applications with significantly lower execution times and zero code changes. ScaleOut hServer lets MapReduce applications analyze live, operational data and has demonstrated more than 40x faster execution than Apache MapReduce in benchmark testing.

Version 5.2 also allows ScaleOut hServer users to run Apache Hive queries using hServer’s fast, in-memory execution engine, thereby accelerating execution and enabling query of in-memory data. Standard Hive distributions can run queries without changes using ScaleOut hServer as a MapReduce framework under YARN. Most popular Hive distributions, including those from Cloudera and Hortonworks, are compatible with ScaleOut hServer.

Complementing ScaleOut’s existing Java support for large numbers of small objects, version 5.2 adds full small-object support for .NET users through a new “NamedMap” API in the Soss.Client.Concurrent namespace. This new storage model streamlines in-memory storage and accelerates parallel analysis when using ScaleOut’s Parallel Method Invocation engine. To maximize ease of use, APIs for this storage model follow the standard semantics of Java’s ConcurrentMap and .NET’s ConcurrentDictionary, adding methods to those familiar interfaces in order to support parallel query and data-parallel analysis.
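
For readers unfamiliar with the standard concurrent-map semantics referred to here, the short Java example below uses the JDK's java.util.concurrent.ConcurrentHashMap as a stand-in. It shows atomic insert-if-absent and atomic per-key update. It is not the ScaleOut NamedMap API, and the class name and keys are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrates JDK concurrent-map semantics only; not the ScaleOut NamedMap API.
final class MapSemanticsSketch {

    static ConcurrentMap<String, Integer> demo() {
        ConcurrentMap<String, Integer> stock = new ConcurrentHashMap<>();
        stock.putIfAbsent("widget", 10);         // inserted: key was absent
        stock.putIfAbsent("widget", 99);         // ignored: key already present
        stock.merge("widget", 5, Integer::sum);  // atomic read-modify-write: 10 + 5
        stock.merge("gadget", 1, Integer::sum);  // absent key: stores the given value
        return stock;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, Integer> stock = demo();
        System.out.println(stock.get("widget")); // prints 15
        System.out.println(stock.get("gadget")); // prints 1
    }
}
```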

Version 5.2 includes an Apache-licensed, open source Java HTTP session-state provider which implements the Filter interface to store HTTP session objects within a specified named cache in the in-memory data grid.

What’s New in Version 5.1

Version 5.1 further expands the features introduced in version 5.0 and makes ScaleOut StateServer even faster, more secure, more adaptive, and more versatile.

Starting with version 5.1, ScaleOut StateServer uses a new server-unit licensing (SUL) model, where a server unit is defined as 8 logical processors. Previous versions of ScaleOut StateServer were licensed under a per-host model. Existing customers upgrading to version 5.1 should contact their sales representative with any licensing questions.

Version 5.1 adds support for ScaleOut hServer to Windows. ScaleOut hServer extends StateServer’s analytics capability to the Hadoop market, integrating its in-memory data grid and computation engine with Hadoop technologies, which enables Hadoop MapReduce code to execute in parallel and in-memory without necessitating a Hadoop cluster. Alternatively, ScaleOut hServer can be used as an HDFS cache in an existing Hadoop environment, greatly accelerating data access for repeated HDFS operations.

Version 5.1 welcomes the addition of a native, open source C++ API to the existing Java, .NET, and C APIs. The first release of this new API brings the Named Cache to C++ applications, including support for many advanced Named Cache features such as parallel query, backing store integration, and an in-process deserialized client cache.

ScaleOut StateServer now optionally uses secure connections between clients and hosts and between remote sites, encrypted with industry-standard SSL technology. This enables SOSS to be deployed in environments where plain-text transmission of serialized object data is a security concern, such as over untrusted WAN links.

The ScaleOut GeoServer option has been extended to support cloud-hosted environments, such as Amazon EC2 and Microsoft Azure, with minimal configuration, enabling your application data to be replicated to and accessible from the cloud. This feature enables cloud-hosted stores to benefit from GeoServer’s redundancy and protection against complete datacenter outages without data loss.

Support for deploying ScaleOut StateServer in public cloud environments has been extended with additional configuration options. Amazon Web Services deployments can now launch instances into a Virtual Private Cloud (VPC), and Microsoft Azure deployments can now select either Windows Server 2012 or Windows Server 2012 R2, in addition to Windows Server 2008 R2, as the base operating system.

ScaleOut StateServer’s internal network protocols have been significantly improved. The transport protocol used during internal load balancing has been optimized to deliver up to 5X higher performance, resulting in faster load balancing during membership changes and recovery due to a host failure. The heart-beating protocol used to determine overall store health has been enhanced with new adaptive heuristics, reducing the chance of heartbeat failure in heavily congested networks, especially in virtual server environments.

What’s New in Version 5.0

Version 5.0 introduces several exciting new features that dramatically extend ScaleOut StateServer’s capabilities. These features make version 5.0 significantly faster, more scalable, and cloud-ready.

ScaleOut StateServer’s membership architecture has been redesigned to enable the in-memory data grid to easily scale well beyond 100 hosts. This enables SOSS to take full advantage of the elastic computing resources quickly becoming available in public and private clouds. The new membership mechanism also lets hosts join and leave the in-memory data grid significantly more quickly. To accommodate cloud and other enterprise infrastructures, the use of UDP multicast can now be disabled and SOSS can be restricted to using only TCP for all network communication by manually configuring host groups. Please see the Introduction to Management section for more details.

The ScaleOut GeoServer option has been extended with a new replication model that enables applications to transparently access stored objects from remote SOSS stores as they are needed. Called pull-based replication, this new capability allows objects to be shared by a geographically diverse network of in-memory data grids, with policies on each object dictating how frequently remote datacenters should update their copies of the object. Furthermore, the authoritative “master” copy of an object can migrate from datacenter to datacenter as demand dictates.

Version 5.0 adds a major new method for querying the in-memory data grid. Prior to version 5.0, the grid was queried using metadata-based index values assigned to stored objects. Now the C# or Java properties of stored objects can be directly queried. C# applications can make use of .NET’s Language Integrated Query (LINQ) to structure queries using SQL-like semantics. Java applications can make use of “filter methods” to compose queries with logical and comparison operators. For high performance, SOSS’s client libraries automatically extract selected properties and store them as deserialized data during object updates. In addition, to ensure fast parallel queries, index tables are transparently used to accelerate each host’s portion of a parallel query.
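
Composing a property-based query from logical and comparison operators can be illustrated with standard java.util.function.Predicate composition over a local list. This sketch does not use ScaleOut's filter or LINQ APIs; the Order type, its fields, and the method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical sketch: querying objects by their properties with composed predicates.
final class PropertyQuerySketch {

    static final class Order {
        final String region;
        final double total;

        Order(String region, double total) {
            this.region = region;
            this.total = total;
        }
    }

    // Returns the orders that satisfy the given filter.
    static List<Order> query(List<Order> orders, Predicate<Order> filter) {
        return orders.stream().filter(filter).collect(Collectors.toList());
    }

    // Combines a comparison on each property with a logical AND.
    static List<Order> largeWesternOrders(List<Order> orders) {
        Predicate<Order> inWest = o -> o.region.equals("west");
        Predicate<Order> isLarge = o -> o.total > 100.0;
        return query(orders, inWest.and(isLarge));
    }

    public static void main(String[] args) {
        List<Order> orders = Arrays.asList(
                new Order("west", 250.0),
                new Order("west", 40.0),
                new Order("east", 300.0));
        System.out.println(largeWesternOrders(orders).size()); // prints 1
    }
}
```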

ScaleOut ComputeServer™ extends ScaleOut StateServer’s capabilities for parallel computation on stored data by adding support for invocation grids to prestage application code on grid servers for use in Parallel Method Invocation (PMI). This new capability dramatically simplifies code deployment for parallel data analysis.

In addition, a new feature called Single Method Invocation (SMI) lets C# and Java applications invoke a method on a specified object, supply parameters to the invocation, and receive the method’s result value. Because of its highly optimized implementation, which avoids all unnecessary network copies, SMI can be used to efficiently analyze a targeted set of stored objects as an alternative to the map/reduce model provided by PMI. In addition, applications can use SMI to efficiently update stored objects without replacing their full contents in a manner similar to the use of stored procedures in database systems.
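
The notion of invoking a method on one stored object, rather than fetching and replacing it, can be sketched locally as follows. The store, the Cart type, and the invoke method are hypothetical stand-ins built on the JDK's ConcurrentHashMap.computeIfPresent; they are not ScaleOut's SMI API and involve no network.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BiFunction;

// Hypothetical local sketch of "invoke a method on a stored object by key".
final class SingleInvokeSketch {

    // A mutable stored object.
    static final class Cart {
        int itemCount;
        double total;
    }

    private final ConcurrentHashMap<String, Cart> store = new ConcurrentHashMap<>();

    void put(String key, Cart cart) {
        store.put(key, cart);
    }

    // Runs 'method' on the object stored under 'key', passing 'param', and returns
    // the method's result. Returns null when no object is stored under the key.
    <P, R> R invoke(String key, P param, BiFunction<Cart, P, R> method) {
        AtomicReference<R> result = new AtomicReference<>();
        // computeIfPresent runs the function atomically for this key.
        store.computeIfPresent(key, (k, cart) -> {
            result.set(method.apply(cart, param));
            return cart; // the same object stays in the store, updated in place
        });
        return result.get();
    }

    // Adds one item to a stored cart and returns the cart's new total.
    static double demo() {
        SingleInvokeSketch grid = new SingleInvokeSketch();
        grid.put("cart-1", new Cart());
        Double total = grid.invoke("cart-1", 19.5, (cart, price) -> {
            cart.itemCount++;
            cart.total += price;
            return cart.total;
        });
        return total;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 19.5
    }
}
```

The caller sends only the parameter and receives only the result, which mirrors the stored-procedure analogy in the text.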

With version 5.0, ScaleOut StateServer introduces two mechanisms for authorizing access to named caches within the in-memory data grid. (In this release, these mechanisms are intended for use within a secure datacenter by a single organization and do not secure the in-memory data grid from malicious attack.) The default login mechanism checks the application’s current login name against a list of authorized login names that have been associated with the named cache using the soss command-line program. This management tool also can be used to authorize either read/write access or read-only access. The user also can implement an extensible authorization policy. When the application logs in to a named cache, the SOSS client passes the application’s encoded credentials to the user’s authorization provider, which is associated with SOSS using the soss management tool. This provider validates the credentials using a user-defined mechanism and then returns an authorization ticket back to SOSS along with read/write or read-only authorization.
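
Conceptually, the default login check described above is a lookup of the caller's login name in a per-cache list of authorized names, each with an access level. The Java sketch below illustrates that idea only; the class, enum, and method names are hypothetical, and it is not the implementation of the soss tool or of any ScaleOut API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a per-cache list of authorized login names.
final class CacheAuthorizationSketch {

    enum Access { NONE, READ_ONLY, READ_WRITE }

    // cache name -> (login name -> granted access)
    private final Map<String, Map<String, Access>> grants = new HashMap<>();

    // Associates a login name with a named cache and an access level.
    void authorize(String cache, String login, Access access) {
        grants.computeIfAbsent(cache, c -> new HashMap<>()).put(login, access);
    }

    // Returns the access granted to the login for the cache, or NONE if not listed.
    Access check(String cache, String login) {
        return grants
                .getOrDefault(cache, Collections.<String, Access>emptyMap())
                .getOrDefault(login, Access.NONE);
    }

    public static void main(String[] args) {
        CacheAuthorizationSketch auth = new CacheAuthorizationSketch();
        auth.authorize("orders", "alice", Access.READ_WRITE);
        auth.authorize("orders", "bob", Access.READ_ONLY);
        System.out.println(auth.check("orders", "alice")); // prints READ_WRITE
        System.out.println(auth.check("orders", "carol")); // prints NONE
    }
}
```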

Document Version 5.8.0

©2018 by ScaleOut Software, Inc.

ScaleOut StateServer, ScaleOut ComputeServer, ScaleOut hServer, and ScaleOut GeoServer are registered trademarks and ScaleOut StreamServer, ScaleOut SessionServer, and ScaleOut Management Pack are trademarks of ScaleOut Software, Inc. Windows is a registered trademark of Microsoft Corporation; Windows Azure and Microsoft Azure are trademarks of Microsoft Corporation in the United States and other countries. Amazon Web Services, AWS, Virtual Private Cloud, VPC, Elastic Compute Cloud, and EC2 are either trademarks or registered trademarks of Amazon Web Services, LLC or its affiliated companies. Hadoop is a registered trademark of the Apache Software Foundation.

Notice: While ScaleOut Software strives to ensure the accuracy of the information contained herein, product requirements, specifications, and limitations are subject to change without notice.