Advanced Filtering with InvokeFilter

ScaleOut Software NamedCache API

ScaleOut's InvokeFilter LINQ operator allows you to perform arbitrarily complex filter operations on objects stored in the ScaleOut in-memory data grid. If a class contains a collection whose elements need to be evaluated, or if an advanced calculation needs to be performed on a property that goes beyond what's possible with the supported LINQ operators, InvokeFilter can be used to deploy and run your custom filter method in parallel across all of your ScaleOut hosts.

Using InvokeFilter

Traditionally, when a server cannot filter objects on a given criterion, the client must retrieve a superset of the desired objects and then perform the additional, advanced filtering in its own process. ScaleOut's InvokeFilter method uses ScaleOut's Invocation Grid feature to automatically deploy a custom .NET filter method to all of your ScaleOut hosts. Instead of a single client performing the filter, all of the ScaleOut hosts run the filter simultaneously against their locally stored objects, using all available cores in the farm.

For example, consider an airline class that contains a collection of flight delays:

Airline class containing a collection
[SossIndex, Serializable]
public class Airline
{
    [SossIndex]
    public string Name { get; set; }

    // SossIndex cannot be applied to collections
    public List<int> DelaysInMinutes { get; set; }

    [SossIndex]
    public int FleetSize { get; set; }
}

Because ScaleOut hosts can only index single, simple values using the [SossIndex] attribute (such as individual primitives, strings, DateTimes, etc.), the airline's DelaysInMinutes property cannot be used in a LINQ where clause. However, the InvokeFilter method makes it possible to filter on the collection:

Using InvokeFilter
NamedCache cache = CacheFactory.GetCache("Airline cache");
TimeSpan operationTimeout = TimeSpan.FromSeconds(30);

// Find count of airlines whose average delay is longer than 30 minutes:
var longDelayAirlineCount = (
            from airline in cache.QueryObjects<Airline>()
            select airline)
            .InvokeFilter(operationTimeout, 30, (obj, delayThreshold) =>
            {
                // true keeps the airline in the result set; false excludes it.
                return obj.DelaysInMinutes.Average() > delayThreshold;
            }).Count();

Performing an InvokeFilter operation requires passing your filter logic to the InvokeFilter method's predicate parameter. Your method will be executed against every object in the named cache that the query selects. Your method must return a boolean: true indicates that the object being evaluated should be included in the filtered result set, and false indicates that the object does not match your criteria. InvokeFilter also accepts an arbitrary parameter object that will be passed to your filter method (the 30-minute "delayThreshold" parameter in the example above).
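
Because the predicate is ordinary .NET code, it can be written as a named method and exercised locally before it is deployed to the grid. The following is a minimal sketch under stated assumptions: AirlineFilters and HasLongAverageDelay are hypothetical names (not part of the ScaleOut API), the Airline class is a trimmed copy without the [SossIndex] attributes so that the sample compiles without the ScaleOut libraries, and LINQ to Objects stands in for the grid. It also guards against empty collections, since Enumerable.Average throws an InvalidOperationException on an empty sequence of int.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed stand-in for the Airline class shown earlier; the [SossIndex]
// attributes are omitted so this sketch has no ScaleOut dependency.
[Serializable]
public class Airline
{
    public string Name { get; set; }
    public List<int> DelaysInMinutes { get; set; }
    public int FleetSize { get; set; }
}

// Hypothetical helper class, not part of the ScaleOut API.
public static class AirlineFilters
{
    // Same shape as the (obj, delayThreshold) lambda passed to InvokeFilter:
    // returns true when the airline should be kept in the result set.
    public static bool HasLongAverageDelay(Airline obj, int delayThreshold)
    {
        // Treat a missing or empty delay list as a non-match instead of
        // letting Average() throw on an empty sequence.
        if (obj?.DelaysInMinutes == null || obj.DelaysInMinutes.Count == 0)
            return false;

        return obj.DelaysInMinutes.Average() > delayThreshold;
    }
}

public static class FilterDemo
{
    public static void Main()
    {
        var airlines = new List<Airline>
        {
            new Airline { Name = "A", FleetSize = 120, DelaysInMinutes = new List<int> { 10, 20, 90 } }, // average 40
            new Airline { Name = "B", FleetSize = 80,  DelaysInMinutes = new List<int> { 5, 10 } },      // average 7.5
            new Airline { Name = "C", FleetSize = 200, DelaysInMinutes = new List<int>() },              // no delays recorded
            new Airline { Name = "D", FleetSize = 150, DelaysInMinutes = new List<int> { 45, 60 } },     // average 52.5
        };

        // Local stand-in for the grid: apply the predicate with LINQ to Objects.
        int count = airlines.Count(a => AirlineFilters.HasLongAverageDelay(a, 30));
        Console.WriteLine(count); // prints 2 (airlines A and D)
    }
}
```

In a real query, the same method body could presumably be used as the InvokeFilter predicate in place of the inline lambda, provided the assembly that defines it is available to the ScaleOut hosts.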

The InvokeFilter method can be used in conjunction with LINQ filtering on basic properties. Specify filter criteria for properties annotated with [SossIndex] in a LINQ "where" clause. You can then perform further filtering on the result in parallel using InvokeFilter.

Using InvokeFilter in conjunction with a LINQ where clause
// Count of large airlines (100+ planes) with average delays > 30 minutes:
var longDelayLargeAirlineCount = (
            from airline in cache.QueryObjects<Airline>()
            where airline.FleetSize >= 100
            select airline)
            .InvokeFilter(operationTimeout, 30, (obj, delayThreshold) =>
            {
                // true keeps the airline in the result set; false excludes it.
                return obj.DelaysInMinutes.Average() > delayThreshold;
            }).Count();

Configuring an Invocation Grid

The InvokeFilter method uses the same in-memory compute engine that ScaleOut ComputeServer uses to deploy and invoke custom filter logic on your hosts. ScaleOut's Invocation Grid feature manages the deployment and hosting of user code. If you do not explicitly configure your NamedCache to use an invocation grid, one is started automatically across the hosts in your ScaleOut farm. An automatically started invocation grid shuts itself down after 5 minutes of inactivity. For finer-grained control over the execution environment, you can instead configure and start an invocation grid manually in your application.

Use the InvocationGridBuilder class to manually launch a grid of worker processes across your farm. The InvocationGridBuilder.AddDependency method specifies which of your custom assemblies the worker instances should deploy and load; be sure to add the assembly that contains any methods involved in your filter logic. The InvocationGridBuilder.Load method starts the workers across the farm. Once the grid is loaded and running, associate it with a named cache using the NamedCache.InvocationGrid property.
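
The manual setup described above might look roughly like the following sketch. Only the class, method, and property names mentioned in the text come from this page. The Soss.Client namespace, the constructor argument (a grid name), the AddDependency overload that takes an Assembly, and the InvocationGrid return type of Load are assumptions that should be checked against the NamedCache API reference. GridSetup, StartGridFor, and the grid name "AirlineFilterGrid" are illustrative only.

```csharp
using System.Reflection;
using Soss.Client; // assumed namespace for NamedCache and InvocationGridBuilder

// Hypothetical helper class for illustration.
public static class GridSetup
{
    public static InvocationGrid StartGridFor(NamedCache cache)
    {
        // Assumption: the builder is constructed with a name for the grid.
        InvocationGridBuilder builder = new InvocationGridBuilder("AirlineFilterGrid");

        // Deploy the assembly that defines Airline and any methods used by
        // the filter logic (assumes an overload that accepts an Assembly).
        Assembly filterAssembly = typeof(Airline).Assembly;
        builder.AddDependency(filterAssembly);

        // Start the worker processes across the farm.
        InvocationGrid grid = builder.Load();

        // Associate the running grid with the named cache so that
        // InvokeFilter uses it instead of an automatically started grid.
        cache.InvocationGrid = grid;
        return grid;
    }
}
```

With the grid associated in this way, later InvokeFilter calls on the same NamedCache should run on the manually started workers, and the grid's lifetime is under the application's control instead of being subject to the automatic 5-minute inactivity shutdown.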