Castor JDO - Caching concepts


Introduction
Caching and long transactions
Configuration
    Configuration sample - count-limited
    Configuration sample - time-limited
fifo and lru cache providers
    Configuration sample - fifo
    Configuration sample - lru
    Configuration sample - fifo (customized)
Caching and clustered environments
    Configuration sample - Coherence
    Configuration sample - Gigaspaces
Custom cache provider
    Cache implementation
    CacheFactory implementation
    Configuration
CacheManager - monitoring and clearing caches

Introduction

As explained in the introduction to Castor JDO, Castor supports many advanced features, caching among them. The sections below describe the caching-related features of Castor JDO; understanding them is required to use Castor JDO in a performant and secure way.

In general, performance caches improve application performance by reducing the number of read operations against the persistent storage: they store and reuse the last read or committed values of an object. Performance caches do not affect the behavior of short transactions or locking. They only affect persistent objects that have been released from any transactional context.

A performance cache implementation has been available since Castor 0.8.6. At a technical level, Castor maintains a separate (performance) cache for each object type specified in the JDO mapping provided, allowing users to specify the type and capacity of the cache for each object type individually.

By default, the following cache types are available:


Name	Vendor	Version	Distributable?	Open source/commercial	High volume/performance	Added in release
none	Built-in	-	No	Open Source	No
unlimited	Built-in	-	No	Open Source	No
count-limited	Built-in	-	No	Open Source	No
time-limited	Built-in	-	No	Open Source	No
coherence	Tangosol Coherence	2.5	Yes	Commercial	Yes	0.9.9
jcs	JCS	1.2.5	Yes	Open source	Yes	0.9.9
fkcache	FKCache	1.0-beta6	No	Open Source	No	0.9.9
oscache	OSCache	2.5	Yes	Open Source	No	1.0
fifo	Built-in	-	No	Open Source	Yes	1.0
lru	Built-in	-	No	Open Source	Yes	1.0
ehcache	Built-in	-	Yes	Open Source	?	1.0.1
gigaspaces	GigaSpaces	5.0	Yes	Commercial	Yes	1.0.1

As some of these cache providers can be used in a distributed mode, Castor JDO can be used in a clustered (multi-JVM) environment. Please see the section on caching and clustered environments below for a short summary of this feature.

By definition, all built-in performance caches are write-through, because all changes made to objects as part of a transaction should be persisted into the cache at commit time without delay.
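The write-through behaviour described above can be illustrated with a small, self-contained sketch. It is not Castor's implementation: the class name WriteThroughSketch and the use of a plain Map as a stand-in for the persistent storage are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative write-through cache; a hypothetical stand-in, not Castor code. */
class WriteThroughSketch<K, V> {
    private final Map<K, V> store;                     // stands in for the database
    private final Map<K, V> cache = new HashMap<>();   // the performance cache

    WriteThroughSketch(Map<K, V> store) {
        this.store = store;
    }

    /** At commit time the value goes to the store and to the cache in one step. */
    void commit(K id, V value) {
        store.put(id, value);
        cache.put(id, value);
    }

    /** Reads are served from the cache when possible, otherwise from the store. */
    V load(K id) {
        V cached = cache.get(id);
        if (cached != null) {
            return cached;
        }
        V loaded = store.get(id);
        if (loaded != null) {
            cache.put(id, loaded);   // remember the last read value
        }
        return loaded;
    }

    boolean isCached(K id) {
        return cache.containsKey(id);
    }
}
```

After a commit, a later load of the same identity is answered from the cache without touching the store, which is the read reduction the introduction refers to.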

For problems related to the use of performance caches, please consult the relevant entries in the JDO FAQ.

Caching and long transactions

Currently, performance caches serve a dual purpose: they also act as dirty-checking caches for long transactions. This limitation implies that an object's availability in the performance cache determines the allowed time span of a long transaction.

This can become an issue when performance caches of type 'count-limited' or 'time-limited' are used, where objects will eventually be disposed. If an application tries to update an object that has been disposed from the dirty-checking cache, an ObjectModifiedException will be thrown.
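The interaction between cache disposal and long transactions can be modelled with the following sketch. It assumes a timestamp-style comparison for dirty checking and uses IllegalStateException as a stand-in for ObjectModifiedException; the class and method names are hypothetical and do not reflect Castor's internals.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of a dirty-checking cache for long transactions. */
class DirtyCheckSketch {
    // identity -> timestamp of the last committed state known to the cache
    private final Map<Object, Long> timestamps = new HashMap<>();

    /** Called when an object is loaded or committed. */
    void remember(Object identity, long timestamp) {
        timestamps.put(identity, timestamp);
    }

    /** Simulates a count-limited or time-limited cache disposing an entry. */
    void dispose(Object identity) {
        timestamps.remove(identity);
    }

    /** Called when a detached object is brought back for update. */
    void checkUpdate(Object identity, long timestampSeenByClient) {
        Long cached = timestamps.get(identity);
        if (cached == null) {
            // The entry was disposed, so no dirty check is possible any more.
            throw new IllegalStateException("no dirty-check entry for " + identity);
        }
        if (cached.longValue() != timestampSeenByClient) {
            throw new IllegalStateException("object modified concurrently: " + identity);
        }
    }
}
```

In this model a long transaction can only succeed while the entry is still present, which is why the lifetime of a cache entry bounds the time span of the long transaction.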

Configuration

The DTD declaration is as follows:

<!ELEMENT cache-type  ( param* )>
<!ATTLIST cache-type
    type           ( none | count-limited | time-limited | unlimited |
                     coherence | fkcache | jcache | jcs | oscache |
                     fifo | lru | ehcache | gigaspaces ) "count-limited"
    debug          (true|false) "false"
    capacity       NMTOKEN  #IMPLIED>

<!ELEMENT param EMPTY>
<!ATTLIST param
          name   NMTOKEN  #REQUIRED
          value  NMTOKEN  #REQUIRED>

With release 1.0 of Castor the DTD has changed, but it remains backward compatible with the old one. It now allows you to enable debugging of cache access for a specific class and to pass individual configuration parameters to each cache instance. Of the current built-in cache types, only count-limited and time-limited support parameters. Parameter names are case sensitive, and parameters unknown to a cache type are silently ignored.

Note that three parameter names are reserved for internal use. If you specify a parameter named type, name or debug, its value will silently be overwritten with one used internally.
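The effect of the reserved parameter names can be pictured with the sketch below. How Castor merges parameters internally is not shown in this document, so the method effectiveParams, its arguments and the idea that name carries the class name are assumptions used purely for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical illustration of reserved parameter names being overwritten. */
class CacheParamsSketch {

    /**
     * Merges user-supplied param values with internally managed ones.
     * The reserved names type, name and debug always end up with the
     * internal value, whatever the user specified.
     */
    static Map<String, String> effectiveParams(Map<String, String> userParams,
                                               String type,
                                               String className,
                                               boolean debug) {
        Map<String, String> merged = new LinkedHashMap<>(userParams);
        merged.put("type", type);                    // overwrites a user "type"
        merged.put("name", className);               // overwrites a user "name"
        merged.put("debug", String.valueOf(debug));  // overwrites a user "debug"
        return merged;
    }
}
```

Because parameter names are case sensitive, a user parameter called Capacity would be carried along as a separate key and not be treated as capacity.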

Configuration sample - count-limited

A count-limited least-recently-used cache (LRU) for 500 objects can be specified by:

<cache-type type="count-limited" capacity="500"/>

<cache-type type="count-limited">
    <param name="capacity" value="500"/>
</cache-type>

If both the capacity attribute and a parameter with name="capacity" are specified, the parameter value takes precedence over the attribute value.
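For readers unfamiliar with count-limited LRU caches, the eviction policy can be sketched with java.util.LinkedHashMap in access order. This is a generic illustration of the policy, not the code of Castor's CountLimited class; the class name CountLimitedSketch is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Generic count-limited LRU map; illustrates the policy only. */
class CountLimitedSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int capacity;

    CountLimitedSketch(int capacity) {
        super(16, 0.75f, true);   // true = keep entries in access order
        this.capacity = capacity;
    }

    /** Drops the least recently used entry once the capacity is exceeded. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

With a capacity of 500, as in the sample above, inserting object number 501 would dispose of whichever cached object has gone unused the longest.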

Configuration sample - time-limited

A time-limited first-in-first-out cache (FIFO) that expires objects after 15 minutes can be specified by:

<cache-type type="time-limited" capacity="900"/>

<cache-type type="time-limited">
    <param name="ttl" value="900"/>
</cache-type>

If both the capacity attribute and a parameter with name="ttl" are specified, the parameter value takes precedence over the attribute value.
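The expiry rule of a time-limited cache (900 seconds, i.e. 15 minutes, in the sample above) can be sketched as follows. This is a generic illustration and not Castor's TimeLimited implementation; the class name is hypothetical, and the current time is passed in explicitly so that the behaviour is easy to follow and to test.

```java
import java.util.HashMap;
import java.util.Map;

/** Generic time-limited cache; illustrates expiry after a fixed ttl. */
class TimeLimitedSketch<K, V> {

    private static final class Entry<T> {
        final T value;
        final long expiresAtMillis;

        Entry(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlMillis;

    TimeLimitedSketch(long ttlSeconds) {
        this.ttlMillis = ttlSeconds * 1000L;
    }

    void put(K key, V value, long nowMillis) {
        entries.put(key, new Entry<>(value, nowMillis + ttlMillis));
    }

    /** Returns the value, or null if it is absent or its ttl has elapsed. */
    V get(K key, long nowMillis) {
        Entry<V> entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (nowMillis >= entry.expiresAtMillis) {
            entries.remove(key);   // expired: dispose of the entry
            return null;
        }
        return entry.value;
    }
}
```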

The debug attribute can be used to enable debugging for objects of a single class. In addition to setting this attribute to true, you also need to set the logging level of org.castor.cache.Cache to debug.

Note
The default cache-type is count-limited with a capacity of 30. This will be used when no cache-type is specified in the mapping for a class.

fifo and lru cache providers

The cache types fifo and lru are based on a set of articles by William Grosso on the O'Reilly Network, and provide a simplified, 1.3-compatible implementation of a Hashbelt algorithm.

Hashbelts are simple, in principle. Instead of walking all objects and finding out when they're supposed to expire, use a "conveyor belt" approach. At any particular point in time, objects going into the cache go into the front of the conveyor belt. After a certain amount of time or when the size limit of a container has been reached, move the conveyor belt - put a new, empty container at the front of the conveyor belt to catch new objects, and the one that drops off of the end of the conveyor belt is, by definition, ready for garbage collection.

As seen in his system, you can use a set of pluggable strategies to implement the actual hashbelt bits. A container strategy allows you to change out the implementation of the container itself - from simple hashtable-based implementations, up through more complex uses of soft referenced or hashset-based implementations, depending on what you need and what you want it to be used for. A pluggable "expire behavior" handler allows you to determine what action is taken on something which drops off of the bottom of the conveyor belt.
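The conveyor belt idea can be sketched in a few lines. The class below is a simplified illustration written for this guide, not the code of Castor's hashbelt package: it uses plain HashMap containers, leaves the timing of rotate() to the caller, and hands the dropped container back so that an "expire behavior" handler could inspect it.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Simplified hashbelt: a fixed number of containers on a conveyor belt. */
class HashbeltSketch<K, V> {
    private final Deque<Map<K, V>> belt = new ArrayDeque<>();

    HashbeltSketch(int containers) {
        for (int i = 0; i < containers; i++) {
            belt.addLast(new HashMap<>());
        }
    }

    /** New objects always go into the container at the front of the belt. */
    void put(K key, V value) {
        for (Map<K, V> container : belt) {
            container.remove(key);   // drop any stale copy further down the belt
        }
        belt.peekFirst().put(key, value);
    }

    /** Looks the key up in every container, front to back. */
    V get(K key) {
        for (Map<K, V> container : belt) {
            V value = container.get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    /** Moves the belt one step and returns the container that dropped off. */
    Map<K, V> rotate() {
        Map<K, V> dropped = belt.removeLast();
        belt.addFirst(new HashMap<>());
        return dropped;
    }
}
```

No per-object expiry time is tracked; an object expires simply because its container reaches the end of the belt. This sketch behaves in a FIFO manner, since get() never moves an entry.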

Unlike all other cache types, the fifo and lru cache types offer various configuration options. Both of them have 6 parameters to configure their behaviour.

parameter

description

containers

The number of containers in the conveyor belt. For example: if a container drops off the conveyor belt every 30 seconds, and you want a cache that lasts for 5 minutes, you want 300 / 30 = 10 containers on the belt. Every 30 seconds, another clean container goes on the front of the conveyor belt, and everything in the last container gets discarded. If not specified, 10 containers are used by default.
For systems with fine granularity, you are free to use a large number of containers; but the system is most efficient when the user decides on a "sweet spot" determining both the number of containers to be managed on the whole and the optimal number of buckets in those containers for managing. This is ultimately a performance/accuracy tradeoff, with the actual discard-from-cache time being further from the mark as the rotation time goes up. Also, the number of objects discarded at once when the capacity limit is reached depends on the number of containers.

capacity

Maximum capacity of the whole cache. If there are, for example, ten containers on the belt and the capacity has been set to 1000, each container will hold a maximum of 1000/10 = 100 objects. Therefore, if the capacity limit is reached and the last container gets dropped from the belt, up to 100 objects are discarded at once. By default the capacity is set to 0, which causes the capacity limit to be ignored so that the cache can hold an unlimited number of objects.

ttl

The maximum time an object lives in the cache. If there are, for example, ten containers and ttl is set to 300 seconds (5 minutes), a new container will be put at the front of the belt every 300/10 = 30 seconds while another is dropped at the end at the same time. Due to the granularity of 30 seconds, an object may stay in the cache for up to 5 minutes 30 seconds. The default value for ttl is 60 seconds. If ttl is set to 0, objects live in the cache for an unlimited time and may only be discarded because of a capacity limit.

monitor

The monitor interval in minutes at which the hashbelt cache reports the current number of containers used and objects cached. If set to 0 (default), monitoring is disabled.

container-class

The implementation of org.castor.cache.hashbelt.container.Container interface to be used for all containers of the cache. Castor provides the following 3 implementations of the Container interface.


-	org.castor.cache.hashbelt.container.FastIteratingContainer
-	org.castor.cache.hashbelt.container.MapContainer
-	org.castor.cache.hashbelt.container.WeakReferenceContainer

If not specified the MapContainer will be used as default.

reaper-class

Specific reapers yield different behaviors. The GC reaper, the default, just dumps the contents to the garbage collector. However, custom implementations may want to actually do something when a bucket drops off the end; see the javadocs on other available reapers to find a reaper strategy that meets your behavior requirements. Apart from the default org.castor.cache.hashbelt.reaper.NullReaper, we provide 3 abstract implementations of the org.castor.cache.hashbelt.reaper.Reaper interface:


-	org.castor.cache.hashbelt.reaper.NotifyingReaper
-	org.castor.cache.hashbelt.reaper.RefreshingReaper
-	org.castor.cache.hashbelt.reaper.ReinsertingReaper

to be extended by your custom implementation.
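The arithmetic that ties containers, capacity and ttl together can be written down as a small helper. The class and method names are hypothetical; the sketch only restates the relationships given in the parameter descriptions and assumes integer division with non-zero arguments.

```java
/** Hypothetical helper restating the hashbelt parameter arithmetic. */
class HashbeltMath {

    /** Seconds between two belt rotations, e.g. 300 / 10 = 30. */
    static int rotationIntervalSeconds(int ttlSeconds, int containers) {
        return ttlSeconds / containers;
    }

    /** Maximum number of objects per container, e.g. 1000 / 10 = 100. */
    static int containerCapacity(int capacity, int containers) {
        return capacity / containers;
    }

    /** Containers needed for a given lifetime and rotation interval. */
    static int containersFor(int ttlSeconds, int rotationIntervalSeconds) {
        return ttlSeconds / rotationIntervalSeconds;
    }

    /** Upper bound on an object's lifetime: ttl plus one rotation interval. */
    static int maxLifetimeSeconds(int ttlSeconds, int containers) {
        return ttlSeconds + rotationIntervalSeconds(ttlSeconds, containers);
    }
}
```

More containers shorten the rotation interval and therefore tighten the expiry accuracy, at the price of more containers to search and manage.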

Configuration sample - fifo

A fifo cache with default values explained above is specified by:

<mapping>
    ...
    <class name="com.xyz.MyOtherObject">
       ...
       <cache-type type="fifo"/>
       ...
    </class>
    ...
</mapping>

Configuration sample - lru

An lru cache with capacity=300 and ttl=300 is defined by:

<mapping>
    ...
    <class name="com.xyz.MyOtherObject">
       ...
       <cache-type type="lru" capacity="300"/>
       ...
    </class>
    ...
</mapping>

or better by:

<mapping>
    ...
    <class name="com.xyz.MyOtherObject">
       ...
       <cache-type type="lru">
          <param name="capacity" value="300"/>
          <param name="ttl" value="300"/>
       </cache-type>
       ...
    </class>
    ...
</mapping>

Configuration sample - fifo (customized)

An example of a customized configuration of a fifo cache is:

<mapping>
    ...
    <class name="com.xyz.MyOtherObject">
       ...
       <cache-type type="fifo">
          <param name="containers" value="10"/>
          <param name="capacity" value="1000"/>
          <param name="ttl" value="600"/>
          <param name="monitor" value="5"/>
          <param name="container-class" value="org.castor.cache.hashbelt.container.WeakReferenceContainer"/>
          <param name="reaper-class" value="org.castor.cache.hashbelt.reaper.NullReaper"/>
       </cache-type>
       ...
    </class>
    ...
</mapping>

Caching and clustered environments

All of the cache providers added with release 0.9.9 are distributed caches per se or can be configured to operate in such a mode. This effectively allows Castor JDO to be used in a clustered J2EE (multi-JVM) environment, where Castor JDO runs on each of the cluster instances, and where cache state is automatically synchronized between these instances.

In such an environment, Castor JDO will make use of the underlying cache provider to replicate/distribute the content of a specific cache between the various JDOManager instances. Through the distribution mechanism of the cache provider, a client of a Castor JDO instance on one JVM will see any updates to domain objects made through any other JVM/JDO instance.

Configuration sample - Coherence

The following class mapping, for example, ...

<mapping>
    ...
    <class name="com.xyz.MyOtherObject">
       ...
        <cache-type type="coherence" />
       ...
    </class>
    ...
</mapping>

defines that for all objects of type com.xyz.MyOtherObject Tangosol's Coherence cache provider should be used.

Configuration sample - Gigaspaces

The following class mapping, for example, ...

<mapping>
    ...
    <class name="com.xyz.MyOtherObject">
       ...
        <cache-type type="gigaspaces" />
       ...
    </class>
    ...
</mapping>

defines that for all objects of type com.xyz.MyOtherObject the Gigaspaces cache provider should be used. As Gigaspaces supports various cache and cluster modes, this cache provider allows product-specific configuration as shown below:

<mapping>
    ...
    <class name="com.xyz.MyOtherObject">
       ...
        <cache-type type="gigaspaces" >
           <param name="spaceURL" value="/./" />
           <param name="spaceProperties" value="useLocalCache" />
        </cache-type>
       ...
    </class>
    ...
</mapping>

Custom cache provider

As of release 0.9.6, Castor allows for the addition of user-defined cache implementations. Whilst Castor provides a set of pre-built cache providers, offering a variety of different cache algorithms, special needs still might require the application developer to implement a custom cache algorithm. Castor facilitates such need by making available standardized interfaces and an easy to understand recipe for integrating a custom cache provider with Castor.

As explained in the API docs for the persist package, LockEngine implements a persistence engine that caches objects in memory for performance reasons and thus reduces the number of operations against the persistent storage.

The main component of this package is the interface Cache, which declares the external functionality of a (performance) cache. Existing (and future) cache implementations (have to) implement this interface, which is closely modelled after the java.util.Map interface.

Below is a summary of the steps required to build a custom cache provider and register it with Castor JDO:

Create a class that implements Cache.
Create a class that implements CacheFactory.
Register your custom cache implementation with Castor JDO in the castor.properties file.

Cache implementation

Please create a class that implements the interface Cache.

To assist users in this task, an AbstractBaseCache class has been supplied, from which users can derive their custom Cache implementations if they wish. Please consult existing Cache implementations such as TimeLimited or CountLimited for code samples.

/**
 * My own cache implementation
 */ 
 public class CustomCache extends AbstractBaseCache {
 
    ...
    
 }

CacheFactory implementation

Please add a class that implements the CacheFactory interface, and make sure that you provide valid values for the two properties name and className.

To assist users in this task, an AbstractCacheFactory class has been supplied, from which users can derive their custom CacheFactory implementations if they wish. Please consult existing CacheFactory implementations such as TimeLimitedFactory or CountLimitedFactory for code samples.

/**
 * My own cache factory implementation
 */ 
 public class CustomCacheFactory extends AbstractCacheFactory {
 
    /**
     * The name of the factory
     */
    private static final String NAME = "custom";

    /**
     * Full class name of the underlying cache implementation.
     */
    private static final String CLASS_NAME = "my.company.project.CustomCache"; 
    
    /**
     * Returns the short alias for this factory instance.
     * @return The short alias name. 
     */
    public String getName() {
        return NAME;
    }
    
    /**
     * Returns the full class name of the underlying cache implementation.
     * @return The full cache class name. 
     */
    public String getCacheClassName() {
        return CLASS_NAME;   
    }
    
 }

Configuration

The file castor.properties holds a property org.castor.cache.Factories that lists the available cache types through their related CacheFactory implementations.

# 
# Cache implementations
# 
org.castor.cache.Factories=\
  org.castor.cache.simple.NoCacheFactory,\
  org.castor.cache.simple.TimeLimitedFactory,\
  org.castor.cache.simple.CountLimitedFactory,\
  org.castor.cache.simple.UnlimitedFactory,\
  org.castor.cache.distributed.FKCacheFactory,\
  org.castor.cache.distributed.JcsCacheFactory,\
  org.castor.cache.distributed.JCacheFactory,\
  org.castor.cache.distributed.CoherenceCacheFactory,\
  org.castor.cache.distributed.OsCacheFactory,\
  org.castor.cache.hashbelt.FIFOHashbeltFactory,\
  org.castor.cache.hashbelt.LRUHashbeltFactory

To add your custom cache implementation, please append the fully-qualified class name of your CacheFactory implementation to this list as shown below:

# 
# Cache implementations
# 
org.castor.cache.Factories=\
  org.castor.cache.simple.NoCacheFactory,\
  org.castor.cache.simple.TimeLimitedFactory,\
  org.castor.cache.simple.CountLimitedFactory,\
  org.castor.cache.simple.UnlimitedFactory,\
  org.castor.cache.distributed.FKCacheFactory,\
  org.castor.cache.distributed.JcsCacheFactory,\
  org.castor.cache.distributed.JCacheFactory,\
  org.castor.cache.distributed.CoherenceCacheFactory,\
  org.castor.cache.distributed.OsCacheFactory,\
  org.castor.cache.hashbelt.FIFOHashbeltFactory,\
  org.castor.cache.hashbelt.LRUHashbeltFactory,\
  org.whatever.somewhere.nevermind.CustomCacheFactory
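The backslash continuations in this property follow standard java.util.Properties syntax, so the whole list is read as one comma-separated value. The sketch below shows this with the JDK only; how Castor itself splits the value is not documented here, so the helper class and the entry my.company.project.CustomCacheFactory used in the test input are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Reads a multi-line factory list the way java.util.Properties sees it. */
class FactoriesPropertySketch {

    static List<String> factoryClassNames(String propertiesText) {
        Properties props = new Properties();
        try {
            // A trailing backslash joins the next line; its leading blanks are dropped.
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String value = props.getProperty("org.castor.cache.Factories", "");
        List<String> names = new ArrayList<>();
        for (String part : value.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }
}
```

A missing backslash or comma on the line before a newly appended entry would cut the list short, so it is worth checking the last two lines after editing the file.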

CacheManager - monitoring and clearing caches

Sometimes it is necessary to interact with Castor's (performance) caches, e.g. to (selectively) clear a Castor cache of its content, or to inquire whether a particular object instance (as identified by its identity) is already cached.

For this purpose a CacheManager can be obtained from a Database instance by issuing the following code:

JDO jdo = ....;
Database db = jdo.getDatabase();
CacheManager manager = db.getCacheManager();

This instance can subsequently be used to selectively clear the Castor performance cache using one of the following methods:

- expireCache()

- expireCache(Class,Object)

- expireCache(Class,Object[])

- expireCache(Class[])

To inquire whether an object has already been cached, please use the following method:

- isCached(Class,Object)

Please note that once you have closed the Database instance from which you have obtained the CacheManager, the CacheManager cannot be used anymore and will throw a PersistenceException.
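To make the behaviour of these methods concrete without requiring a running Castor instance, the following in-memory model mirrors some of the method names listed above. It is a hypothetical stand-in and not the CacheManager class itself: the model class, its cache() and close() helpers, and the use of IllegalStateException in place of PersistenceException are all assumptions.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** In-memory model of selective cache expiry; not the Castor CacheManager. */
class CacheManagerModel {
    private final Map<Class<?>, Set<Object>> cached = new HashMap<>();
    private boolean closed;

    /** Test helper: marks an object of the given type and identity as cached. */
    void cache(Class<?> type, Object identity) {
        cached.computeIfAbsent(type, t -> new HashSet<>()).add(identity);
    }

    boolean isCached(Class<?> type, Object identity) {
        ensureOpen();
        Set<Object> identities = cached.get(type);
        return identities != null && identities.contains(identity);
    }

    /** Expires everything. */
    void expireCache() {
        ensureOpen();
        cached.clear();
    }

    /** Expires one object of one type. */
    void expireCache(Class<?> type, Object identity) {
        ensureOpen();
        Set<Object> identities = cached.get(type);
        if (identities != null) {
            identities.remove(identity);
        }
    }

    /** Expires all objects of the given types. */
    void expireCache(Class<?>[] types) {
        ensureOpen();
        for (Class<?> type : types) {
            cached.remove(type);
        }
    }

    /** Models closing the Database the manager was obtained from. */
    void close() {
        closed = true;
    }

    private void ensureOpen() {
        if (closed) {
            throw new IllegalStateException("owning Database has been closed");
        }
    }
}
```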

Copyright © 1999-2005 ExoLab Group, Intalio Inc., and Contributors. All rights reserved.

Java, EJB, JDBC, JNDI, JTA, Sun, Sun Microsystems are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and in other countries. XML, XML Schema, XSLT and related standards are trademarks or registered trademarks of MIT, INRIA, Keio or others, and a product of the World Wide Web Consortium. All other product names mentioned herein are trademarks of their respective owners.