The GridOutputFormat

The Grid Output Format (of type GridOutputFormat) creates the Grid Record Writer, which writes the key/value pairs emitted by the reducers to a named map or a named cache within the in-memory data grid (IMDG). The grid output format does not preserve the order of key/value pairs, and if two pairs share the same key, only one of them is saved. Because named maps and named caches do not preserve ordering, ScaleOut hServer does not sort the keys when the grid output format is used.
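The effect of duplicate keys can be illustrated with a small sketch. It uses a plain java.util.HashMap as a stand-in for the keyed store; the class KeyedOutputSketch and its methods are hypothetical and are not part of the hServer API. A HashMap keeps the most recent value for a key, whereas this section does not say which of the colliding values the grid keeps, so only the number of stored entries should be read as representative.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a keyed output store. This is a plain HashMap,
// NOT the hServer NamedMap or NamedCache API.
class KeyedOutputSketch {
    private final Map<Integer, String> store = new HashMap<>();

    // Mimics a record writer that stores each emitted pair under its key.
    void write(int key, String value) {
        store.put(key, value); // a write with an existing key replaces the stored value
    }

    int size() {
        return store.size();
    }

    boolean containsKey(int key) {
        return store.containsKey(key);
    }

    public static void main(String[] args) {
        KeyedOutputSketch out = new KeyedOutputSketch();
        out.write(1, "alpha");
        out.write(2, "beta");
        out.write(1, "gamma"); // same key as the first pair
        // Three pairs were emitted, but only two entries are stored.
        System.out.println(out.size());
    }
}
```

If every emitted value has to survive, the reducer needs to emit unique keys or combine the values for a key into a single output value before writing.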

Using a NamedMap for Output

To configure the GridOutputFormat to write to a named map, pass the named map as a configuration property by calling setNamedMap(…). The following example sets up the grid output format and associates it with a named map in the IMDG:

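// Obtain the named map that will receive the reducer output
NamedMap<IntWritable, Text> outputMap = NamedMapFactory.getMap("myMap");
// ...
// Select the grid output format and bind it to the named map
job.setOutputFormatClass(GridOutputFormat.class);
GridOutputFormat.setNamedMap(job, outputMap);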

Using a NamedCache for Output

To configure the GridOutputFormat to write to a named cache, set the cache name as a configuration property by calling setNamedCache(…). The following example sets up the grid output format and associates it with a named cache in the IMDG:

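// Obtain the named cache that will receive the reducer output
NamedCache writablecacheO = CacheFactory.getCache("MyOutputCache");
// ...
// Select the grid output format; it is given the cache name, not the cache object
job.setOutputFormatClass(GridOutputFormat.class);
GridOutputFormat.setNamedCache(job, "MyOutputCache");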

If a named cache is used for output, the reducer’s output key should be one of the following: Text, String, CachedObjectId, UUID or byte[]. Values should implement Writable or Serializable. If the values are Writable, a custom serializer should be set for the named cache before accessing the data set through the named cache’s access methods (see section 3.5).
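The key type restriction can be checked before a job is submitted. The following is a minimal sketch, not part of hServer: the class CacheKeyTypeCheck and the method isSupportedKeyType are hypothetical names, and the check matches on simple class names only, so it runs without the Hadoop or ScaleOut libraries on the classpath. A stricter check would compare against the actual classes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Hypothetical helper: verifies that a reducer output key class is one of the
// key types listed above for named cache output.
class CacheKeyTypeCheck {
    // Simple class names of the supported key types (assumption: matching by
    // simple name is good enough for a pre-flight check).
    private static final Set<String> SUPPORTED = new HashSet<>(
            Arrays.asList("Text", "String", "CachedObjectId", "UUID", "byte[]"));

    static boolean isSupportedKeyType(Class<?> keyClass) {
        // getSimpleName() yields "byte[]" for byte[].class
        return SUPPORTED.contains(keyClass.getSimpleName());
    }

    public static void main(String[] args) {
        System.out.println(isSupportedKeyType(String.class));  // listed type
        System.out.println(isSupportedKeyType(UUID.class));    // listed type
        System.out.println(isSupportedKeyType(byte[].class));  // listed type
        System.out.println(isSupportedKeyType(Integer.class)); // not listed
    }
}
```

Such a check could run in the job driver, right before the call to setNamedCache(…), so that an unsupported key type is reported at configuration time instead of during the reduce phase.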