Monday, March 1, 2010

iBatis "Cache Miss", Part 3 Confguring the CacheModel

In this final blog of the series we analyze the iBatis CacheModel and how to configure it correctly. The first blog of the series discussed how to use JBoss logging to verify the problem, and the previous blog demonstrated how to write a test to automatically verify caching.

Let's start with the simple example from the other blogs: the location_type table. It has no foreign keys to other tables, so we only have to worry about changes to this one table. Let's examine the location_type iBatis CacheModel.

<cacheModel id="locationType_cache"
type="LRU"
readOnly="true"
serialize="false">
<flushInterval hours="24"/>
<flushOnExecute statement="LocationType.insertLocationType"/>
<flushOnExecute statement="LocationType.updateLocationType"/>
<flushOnExecute statement="LocationType.removeLocationType"/>
<property name="size" value="10"/>
</cacheModel>

There are four attributes defined on the CacheModel. The first is id, the name you reference on the select definitions in the iBatis XML file. The next is type, the caching strategy used; the two common choices are LRU (Least Recently Used) and FIFO (First-In-First-Out). We chose LRU so the most recently used objects would be kept in the cache. The last two attributes, readOnly and serialize, are the ones that determine how the cache model works (or doesn't, if misconfigured).
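
You tie the cache to a query by referencing the CacheModel's id on the select definition. A minimal sketch, with an assumed statement id and result class:

<!-- Cached select; statement id and result class are illustrative -->
<select id="getAllLocationTypes"
        resultClass="com.example.LocationType"
        cacheModel="locationType_cache">
  SELECT * FROM location_type
</select>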

I ran across a blog post by Clinton that helped me understand what I was doing wrong: in it he explains how the readOnly and serialize attributes play together. I will replicate the meat of his analysis here.

Setting readOnly to false (meaning the cached objects can be modified) and serialize to false (meaning the data will not be serialized) forces iBatis to limit the caching to the current session request. This essentially means that a second request from the client will not use the cache. This was the situation we found ourselves in.
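
For reference, here is a minimal sketch of that misconfiguration, with both attributes set to false:

<!-- Broken for our purposes: the cache is scoped to the session that populated it -->
<cacheModel id="locationType_cache"
            type="LRU"
            readOnly="false"
            serialize="false">
  <flushInterval hours="24"/>
  <property name="size" value="10"/>
</cacheModel>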

There are two possible configurations that will allow caching to work; you will have to decide which is appropriate for your project. The first is readOnly=true, serialize=false, the configuration shown above. This caches the objects in a single cache accessed by all users. It assumes the objects are read-only and will not change, or that if they do change you have the appropriate flushOnExecute statements defined.

The second configuration you can use is readOnly=false, serialize=true. This allows each user to have their own copy of the cached data, so the tables can be updated without negatively affecting other users. The downside of this approach is memory usage: with each user holding their own cached copies, you can run yourself out of memory.
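
A minimal sketch of that configuration; note that because iBatis hands each caller a serialized copy, the cached classes must implement java.io.Serializable:

<cacheModel id="locationType_cache"
            type="LRU"
            readOnly="false"
            serialize="true">
  <flushInterval hours="24"/>
  <property name="size" value="10"/>
</cacheModel>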

On our project we went with the single cache shared among all users. This placed the burden on us to manage the cache model. How do you do that? There are some other properties that can be set on the CacheModel. One is the size of the cache for a given table: it keeps the most recently used objects in the cache, up to the quantity specified. The finest control is telling the cache model which statements to flush on, using the flushOnExecute elements. There is also flushInterval, which lets you say "if this cache hasn't been flushed in N hours, flush it anyway"; the sketch below shows the other units it accepts.
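
flushInterval is not limited to hours; the iBatis 2 DTD also allows minutes, seconds, and milliseconds attributes (worth verifying against your version):

<!-- Flush every 30 minutes instead of every 24 hours -->
<flushInterval minutes="30"/>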

How do you know which statements need to cause the cache to flush? If a statement changes the database, it has to flush the cache: adding a new record, modifying an existing record, or deleting a record all qualify. Examining the CacheModel definition above, you will see that it does just that.

That was the simple case. What if your table joins to other tables in its select statements? Then cache management becomes more complex: you need to flush not only when your table's contents are modified, but also when the contents of any table you join to change. Consider the following CacheModel:

<cacheModel id="locations_cache"
type="LRU"
readOnly="true"
serialize="false">
<flushInterval hours="24"/>
<flushOnExecute statement="Location.insertLocation"/>
<flushOnExecute statement="Location.updateLocation"/>
<flushOnExecute statement="Location.deleteLocation"/>
<flushOnExecute statement="LocationType.insertLocationType"/>
<flushOnExecute statement="LocationType.updateLocationType"/>
<flushOnExecute statement="LocationType.removeLocationType"/>
<property name="size" value="250"/>
</cacheModel>

Location contains a reference to its location type, so all the selects performed on location join to location_type to get the type information. Right after we enabled caching we ran into a problem: when locations were cached and a location_type's name was updated, the change was not reflected in the cache for 24 hours. After investigating, we realized that the location cache was not being cleared when location types were modified.

This forced us to start using iBatis namespaces so we could reference statements from one iBatis XML file within another. In the CacheModel above you can see that the cache is flushed any time the location or location_type table is modified.
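
Namespaces are enabled in the sqlMapConfig settings, and each sqlMap declares its own namespace. A minimal sketch, with assumed file, class, and column names:

<!-- sqlMapConfig.xml: turn on statement namespaces -->
<settings useStatementNamespaces="true"
          cacheModelsEnabled="true"/>

<!-- LocationType.xml: statements become LocationType.* -->
<sqlMap namespace="LocationType">
  <update id="updateLocationType" parameterClass="com.example.LocationType">
    UPDATE location_type SET name = #name# WHERE id = #id#
  </update>
</sqlMap>

With this in place, the locations_cache above can reference LocationType.updateLocationType in its flushOnExecute elements.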

Our project uses Spring 2.5.6 and iBatis 2.3.4. As a result of this finding, we also uncovered a problem with iBatis 2.3.4: it does not support transactional caching. This means that even though Spring rolls back the transaction after each test method, any changes made to the cache are not rolled back or flushed. We discovered this when the automated test suite started failing.

To see this problem, you can write a test class with two tests. In the first test, save a new record and validate that it was saved. At the end of the test the database is rolled back, but the cache still contains the record. In the second test method, do a "get all" and record the number of records returned; let's say it was 2. Save a new record, which forces the cache to flush, and get all the records again. If everything were working as expected, the count would now be 3. However, because the cache still held the record from the first test, when the save is done and the cache is flushed, the get will again return 2.

That's with tests. Can this happen in a live system? The answer is yes. Suppose the saves are complex in nature (i.e., more than one DAO call is required to save the information) and selects performed at different points during the save process update the cache. If an error occurs before the save has completely succeeded and the transaction is rolled back, the database will be cleaned up, but the cache will still contain the records it attempted to save.

iBatis 3 is supposed to solve that problem; however, there is no Spring support for iBatis 3 yet. It is planned for Spring version 3.1. For now, you must be very careful using caching on complex save operations.
