Using LINQ

ScaleOut Software NamedCache API
The LINQ Programming Pattern

The Microsoft .NET Framework supports two sets of LINQ standard query operators: one that operates on objects of type IEnumerable<T> and another that operates on objects of type IQueryable<T>. These operators are defined as extension methods in Enumerable and Queryable, respectively, so they can be called using either static method syntax or instance method syntax. Alternatively, you can use the integrated language support for LINQ, which provides a simpler syntax for calling these same operators.
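As a minimal sketch of the three calling styles described above, the following uses plain LINQ to Objects over an in-memory array. The class name CallSyntaxDemo and the sample data are illustrative only and are not part of the ScaleOut API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CallSyntaxDemo
{
    // Hypothetical sample data; any IEnumerable<string> would behave the same way.
    private static readonly string[] Tickers = { "GOOG", "MSFT", "ORCL", "IBM" };

    // Static method syntax: call the operator directly on the Enumerable class.
    public static List<string> StaticMethodSyntax()
    {
        return Enumerable.Where(Tickers, t => t.Length == 4).ToList();
    }

    // Instance method syntax: the same operator, called as an extension method.
    public static List<string> InstanceMethodSyntax()
    {
        return Tickers.Where(t => t.Length == 4).ToList();
    }

    // Query syntax: the compiler rewrites this into the same Where call.
    public static List<string> QuerySyntax()
    {
        return (from t in Tickers where t.Length == 4 select t).ToList();
    }

    public static void Main()
    {
        // All three lines print the same three tickers.
        Console.WriteLine(string.Join(", ", StaticMethodSyntax()));
        Console.WriteLine(string.Join(", ", InstanceMethodSyntax()));
        Console.WriteLine(string.Join(", ", QuerySyntax()));
    }
}
```

All three methods return the same list, which is why the choice between them is a matter of readability rather than behavior.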

The methods from Queryable that operate on IQueryable<T> data sources do not directly implement any querying behavior. Instead, they build an expression tree that represents the query to be performed. The query processing is handled by the source IQueryable<T> object. When a sequence of results must be enumerated or counted, the LINQ provider translates the expression tree into a query that the target data source can interpret and enumerates the results.
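To make the expression-tree idea concrete, here is a small sketch that uses the in-memory IQueryable<T> provider built into .NET (obtained with AsQueryable) rather than the ScaleOut provider. The class name ExpressionTreeDemo is hypothetical. It shows that calling Where on an IQueryable<T> only records a method call in the tree, which the provider interprets later.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class ExpressionTreeDemo
{
    // Builds a query against an in-memory IQueryable<int>; nothing is filtered yet.
    public static IQueryable<int> BuildQuery()
    {
        IQueryable<int> source = new[] { 1, 5, 10, 50 }.AsQueryable();
        return source.Where(n => n > 4);
    }

    public static void Main()
    {
        IQueryable<int> query = BuildQuery();

        // The outermost node of the tree is a call to Queryable.Where.
        var call = (MethodCallExpression)query.Expression;
        Console.WriteLine(call.Method.DeclaringType.Name + "." + call.Method.Name);

        // Dump the tree as text; a provider would walk these nodes to translate them.
        Console.WriteLine(query.Expression);

        // Counting forces the provider to interpret the tree and run the query.
        Console.WriteLine(query.Count());
    }
}
```

A provider such as the one described here would inspect the same kind of tree and translate the parts it understands into its own filter language.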

In the case of the ScaleOut StateServer LINQ provider, LINQ queries are translated into StateServer filter expressions that work with StateServer's property index. A translated LINQ query is executed in parallel across multiple StateServers in a server farm. The results are merged and returned to the requesting client, where further transformations or calculations may be performed.

The LINQ Where Clause

Suppose you have instances of these classes written to a NamedCache on StateServer:

Example classes
[Serializable]
class Stock
{
    [SossIndex]
    public string Ticker { get; set; }
    [SossIndex]
    public DateTime DelistingDate { get; set; }

    public decimal TotalShares { get; set; }
    public decimal Price { get; set; }
}

[Serializable]
class SmallCapStock : Stock
{
}

Then the following code will retrieve stocks with a Ticker value of "GOOG", "MSFT", or "ORCL" and print the ticker and price of each instance found:

var q = from s in cache.QueryObjects<Stock>()
        where s.Ticker == "GOOG" || s.Ticker == "MSFT" || s.Ticker == "ORCL"
        select s;

foreach (var result in q) {
    Console.WriteLine("Ticker: {0}; Value: {1}", result.Ticker, result.Price);
}

In this case, the query is executed when iteration begins in the foreach loop, at which point the "where clause" is translated from a LINQ expression into a StateServer filter and is sent to the StateServers in the farm. The keys of the objects that satisfy the filter expression are returned to the client.

As a more complex example, consider this query:

NamedCache cache = CacheFactory.GetCache("Stocks");

var q = from s in cache.QueryObjects<Stock>()
        where
                (String.Compare(s.Ticker, "A") >= 0 && String.Compare(s.Ticker, "B") < 0)
             || ("Z".CompareTo(s.Ticker) >= 0)
        orderby s.Ticker
        select new { StockTicker = s.Ticker, Value = s.Price * s.TotalShares };

foreach (var result in q) {
    Console.WriteLine("Ticker: {0}; Value: {1}", result.StockTicker, result.Value);
}

The where clause is executed in a distributed manner on the StateServer instances present in the system. A list of StateServer keys is assembled on the client machine, and the objects corresponding to those keys are read from StateServer. The client then performs the orderby processing and creates the projection to an anonymous type in the select clause.

In addition to evaluating where clause predicates in the StateServer service, predicates provided to LINQ's count operator can be evaluated on the server. For example:

// get count of "penny" stocks whose price is under $1
var count = cache.QueryObjects<Stock>().Count(s => s.Price < 1.0M);

Tip

In this StateServer release, only filter operations (where clauses) and count operations are executed on the server. Any remaining LINQ operators in a LINQ expression are executed on the requesting client against the filtered list of objects returned from the StateServer(s).

In future StateServer releases, more functionality is expected to migrate to the server. The ScaleOut LINQ provider is responsible for determining how to partition the various actions within the query between code running on the client and code running on the server, so no application-level code changes will be required to take advantage of that new functionality.

StateServer Where Clause Expressions

In their simplest form, the predicates used in where clauses and count operations evaluated on StateServer consist of one of the following:

  • a comparison between a value in the property index and a value from the query itself
  • a comparison between two property values in the property index
  • a comparison between the result of the CompareTo or Compare methods and zero, for those types that don't support the standard inequality operators (e.g. String and Guid)
  • a test for containment of one string within another via the String.Contains method
  • a test of a Boolean property value from the property index
  • an evaluation of a ScaleOut Tag test method: HasAllTags, HasAnyTag, or HasTags

Note

The supported comparison operators are ==, !=, <, <=, >, and >= (or equivalent in your programming language).

These simple expressions may then be combined with parentheses and AND, OR, or NOT operators to build up much more complex expressions.

LINQ where clause expressions are parsed when you compile your application. Syntax errors in your expressions will be identified by the compiler at compile time. However, the compiler has no knowledge of the capabilities of the eventual execution target for the expression and allows some expressions at compile time that cannot be executed at runtime.

For example, consider the following:

NamedCache cache = CacheFactory.GetCache("Stocks");

var q = from s in cache.QueryObjects<Stock>()
        where s.TotalShares > 100000000
        select s;

// Fails with a NotSupportedException at runtime
Console.WriteLine("{0} Stocks found", q.Count());
Since the property "TotalShares" is not annotated with SossIndexAttribute, it is not in the StateServer property index. As a result, the query cannot be executed on StateServer. The code above will compile with no errors but will fail at runtime at the point the query is executed. The runtime exception raised is:
System.NotSupportedException: The member 'TotalShares' has not been annotated with SossIndexAttribute and is not in the property index

The LINQ subsystem delays query evaluation until elements from the query are actually enumerated. In the case above, query evaluation is delayed until the Count() method is executed. Consequently, the NotSupportedException is raised when Count() executes, not (as you might expect) when you assign the variable q.
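Deferred execution can be observed without StateServer at all. The sketch below uses LINQ to Objects and a hypothetical counter, PredicateCalls, to show that defining a query does no work and that the predicate only runs once Count() enumerates the query. The same timing explains why a provider error surfaces at the Count() call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Hypothetical counter that records how many times the predicate has run.
    public static int PredicateCalls;

    // Defines the query; the lambda is stored but not yet invoked.
    public static IEnumerable<int> BuildQuery(int[] data)
    {
        return data.Where(n => { PredicateCalls++; return n > 2; });
    }

    public static void Main()
    {
        PredicateCalls = 0;
        IEnumerable<int> q = BuildQuery(new[] { 1, 2, 3, 4 });

        // Still zero: assigning q did not evaluate anything.
        Console.WriteLine("After definition: {0}", PredicateCalls);

        // Count() enumerates the source and invokes the predicate per element.
        int matches = q.Count();
        Console.WriteLine("Matches: {0}; predicate calls: {1}", matches, PredicateCalls);
    }
}
```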

Similarly, this query compiles with no errors:

NamedCache cache = CacheFactory.GetCache("Stocks");

var q = from s in cache.QueryObjects<Stock>()
        where s.Ticker.ToLower() == "goog"
        select s;

// Fails with a NotSupportedException at runtime
Console.WriteLine("{0} Stocks found", q.Count());
However, it fails at runtime because StateServer does not currently have the ability to perform calculations or transformations on property index values. In particular, it has no equivalent of a ToLower function. As in the previous example, the query fails when the Count() method runs. The exception raised is:
System.NotSupportedException: The method 'System.String.ToLower' is not supported

On the other hand, arbitrary expressions may be used within where clauses if the expression can be evaluated on the client before issuing the query to StateServer. For example, the following query succeeds at runtime:

NamedCache cache = CacheFactory.GetCache("Stocks");

var q = from s in cache.QueryObjects<Stock>()
        where s.DelistingDate < new DateTime(DateTime.Now.Year, 1, 1)
        select s;

Console.WriteLine("{0} Stocks found", q.Count());
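
One common way a LINQ provider can support this is to evaluate any subexpression that does not depend on the queried object and substitute the resulting constant before translation. The sketch below illustrates that general technique with standard expression-tree APIs. It is not ScaleOut's actual implementation, and the nested Stock class and the EvaluateLocally helper are hypothetical stand-ins.

```csharp
using System;
using System.Linq.Expressions;

public static class ClientSideEvaluationDemo
{
    // Hypothetical stand-in for the Stock class used in this topic.
    public class Stock
    {
        public DateTime DelistingDate { get; set; }
    }

    // Compiles and runs a subexpression that has no dependency on the
    // lambda parameter, returning its value as a plain object.
    public static object EvaluateLocally(Expression subexpression)
    {
        return Expression.Lambda(subexpression).Compile().DynamicInvoke();
    }

    // Extracts the right-hand side of the comparison and reduces it to a constant.
    public static DateTime ExtractCutoff()
    {
        Expression<Func<Stock, bool>> predicate =
            s => s.DelistingDate < new DateTime(DateTime.Now.Year, 1, 1);

        var comparison = (BinaryExpression)predicate.Body;
        return (DateTime)EvaluateLocally(comparison.Right);
    }

    public static void Main()
    {
        // Prints January 1 of the current year, computed on the client.
        Console.WriteLine(ExtractCutoff());
    }
}
```

After this step, the remaining comparison is between an indexed property and a constant, which is one of the simple predicate forms listed earlier.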

Note that you can cause filtering to be performed on the client rather than the server by explicitly forcing the query results to be enumerated on the client and then filtering them there. Recall that in the example above where we filtered on TotalShares, the query failed at runtime because the TotalShares property was not in the property index. Setting aside the possible performance implications of transferring all objects of type Stock to the client, you could modify that query as follows to cause the where clause to execute on the client:

NamedCache cache = CacheFactory.GetCache("Stocks");

var q = from s in cache.QueryObjects<Stock>().AsEnumerable()
        where s.TotalShares > 100000000
        select s;

// Succeeds, since filtering done on client
Console.WriteLine("{0} Stocks found", q.Count());

The call to the standard AsEnumerable method ends the portion of the query that is sent to StateServer: every operator that follows it is bound to LINQ to Objects and runs on the client. The where clause is therefore evaluated on the client, which fetches each Stock object returned from the cache.QueryObjects<Stock>() method call and keeps only those objects whose TotalShares property is greater than the given limit. Because the where clause is executed on the client with deserialized objects, it is irrelevant that the TotalShares property is not in the server-side property index.
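
The effect of AsEnumerable on operator binding can be seen with any IQueryable<T> source. This sketch uses the in-memory provider returned by AsQueryable as a stand-in for cache.QueryObjects<Stock>(); the class and method names are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AsEnumerableDemo
{
    // In-memory IQueryable<int>, standing in for a provider-backed source.
    private static IQueryable<int> Source()
    {
        return new[] { 1, 2, 3, 4 }.AsQueryable();
    }

    // Where binds to Queryable.Where, so the provider receives the predicate.
    public static IEnumerable<int> ProviderSideFilter()
    {
        return Source().Where(n => n > 2);
    }

    // After AsEnumerable, Where binds to Enumerable.Where and runs in-process.
    public static IEnumerable<int> ClientSideFilter()
    {
        return Source().AsEnumerable().Where(n => n > 2);
    }

    public static void Main()
    {
        Console.WriteLine(ProviderSideFilter() is IQueryable); // True
        Console.WriteLine(ClientSideFilter() is IQueryable);   // False
        Console.WriteLine(ClientSideFilter().Count());         // 2
    }
}
```

Both filters return the same elements; the difference is only in which component evaluates the predicate.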

Tag Expressions

StateServer can also process expressions using the TagExtensions methods. For example, consider the class:

[Serializable]
class BlogEntry : ITaggable
{
    public string Title { get; set; }
    public string Content { get; set; }

    SparseBitmap ITaggable.TagHolder { get; set; }
    NamedCache ITaggable.NamedCache
    {
        get { return CacheFactory.GetCache("BlogData"); }
    }
}
BlogEntry implements the ITaggable interface using just a default implementation of the TagHolder property. In this case, BlogEntry implements the interface explicitly, although we could also have implemented it with public properties. The TagHolder property has the SossIndexAttribute set in the interface definition itself. Consequently, the TagHolder implementation within BlogEntry is implicitly treated as if it carried the SossIndexAttribute.

The NamedCache property simply looks up the NamedCache in the CacheFactory given the application name "BlogData". Alternatively, the NamedCache instance could simply be cached in a static field and the property implementation could return the cached instance.

Next, we create a BlogEntry instance and add some tags. Note that the tags are held in the object itself and are written along with the object when the object is stored in the cache. If these tag names are not already in use within this NamedCache, they will be automatically distributed around the system to all other StateServers as a side effect of adding the tags to this object.

NamedCache cache = CacheFactory.GetCache("BlogData");
BlogEntry blogEntry = new BlogEntry()
{
    Title = "Why Tags Are Cool",
    Content = "<html>... blah blah ...</html>"
};

blogEntry.AddTags("Tags", "StateServer", "Samples", "Cool Things");
cache[Guid.NewGuid()] = blogEntry;    // write blogEntry out to the cache

Finally, we can query out all BlogEntry instances that have both of the tags "Cool Things" and "Samples" set with code like this:

var coolSamples = from b in cache.QueryObjects<BlogEntry>()
                  where b.HasAllTags("Cool Things", "Samples")
                  select b;

string aCoolSampleTitle = coolSamples.First().Title;
