Object Relational Mapping (ORM) Using NHibernate - Part 8 of 8


In parts 1-7 of the article series, I completed coding various types of associations using NHibernate.

This is the last of the eight-part article series. In this article, we answer the question from the end of the first article: "How do we manage a persistent object across sessions?" The background section is very important, since it introduces the topics required for understanding the discussion here.

Background

All ORM applications use a persistence manager for functionality such as database writes, reads, and queries. In NHibernate, this is exposed through interfaces such as ISession, ITransaction, and ICriteria. Each session must first open a transaction, and all activity on objects happens within the boundary of that transaction. (Calling it a transaction is a simplification. In ORM literature the boundary is generally called a unit of work, which is not necessarily a database transaction. The word "transaction" conveys the idea well enough here and saves an explanation of what a unit of work is. That topic is extensive and interesting, and vital for developing an ORM library like NHibernate, but it is not needed in order to use ORM software for application development.)

Each session is associated with a PersistenceContext. The persistence context provides functionality such as maintaining a unique object identity (called identity scoping), repeatable reads, avoiding recursive loading of objects in a graph, automatic transactional writes, and dirty checking. People who have worked with ORM code will recognize it immediately from that list: yes, it is the good old IdentityMap. For everyone else, I explain the idea here, because it is essential to what follows. Note that the explanation is based on our previous experience in developing an ORM library, not on NHibernate specifically, but the idea is almost the same for most ORM libraries.

As said earlier, the first step for ORM software is to fix the identity of a persistent object so that it matches exactly one database row. The next step is to ensure that this identity is unique, so that only one persistent object represents a particular database row. This avoids ambiguity.

Uniqueness is usually enforced with an in-memory hashtable that uses the database primary key as the key and the persistent object as the value. Whenever the ORM software looks up a persistent object, whether while executing a query or when getting it by id (i.e., the primary key value), it first checks this hashtable using the primary key. If the object is there, it is returned straight from the hashtable. Otherwise, the ORM software creates the object from the values returned by the database query, adds it to the hashtable, and then returns it to the client code. Every persistent object used by the client code is stored in this hashtable, which is the first-level cache of the ORM software. When you ask for the same object repeatedly, only the first request is queried against the DB; later requests are served from this cache. This is called a repeatable read.

The main question is where to keep this hashtable. If it is kept at the process level and stores all objects of a process, access to it from multiple threads has to be synchronized. So the general idea is to keep the hashtable at the thread level: all persistent objects modified in a session and its transactions within that thread are maintained in this per-thread hashtable. This is called the persistence context cache (which also gives a rough picture of how an ORM library is built).
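The hashtable lookup described above can be sketched in a few lines of plain C#. This is only an illustration of the idea, not NHibernate's implementation: Customer, FakeDatabase, and IdentityMap are hypothetical names invented for the sketch, and the "database" is an in-memory dictionary that counts how often it is read.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity used only for this sketch.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Stand-in for the database: counts how many times a row is actually read.
public class FakeDatabase
{
    private readonly Dictionary<int, string> rows = new Dictionary<int, string>
    {
        { 1, "Alice" },
        { 2, "Bob" }
    };

    public int QueryCount { get; private set; }

    public Customer LoadCustomer(int id)
    {
        QueryCount++;
        return rows.TryGetValue(id, out var name)
            ? new Customer { Id = id, Name = name }
            : null;
    }
}

// Identity map: primary key -> the single in-memory object for that row.
public class IdentityMap
{
    private readonly Dictionary<int, Customer> cache = new Dictionary<int, Customer>();
    private readonly FakeDatabase database;

    public IdentityMap(FakeDatabase database)
    {
        this.database = database;
    }

    public Customer GetById(int id)
    {
        // Cache hit: return the object already tracked for this key.
        if (cache.TryGetValue(id, out var cached))
        {
            return cached;
        }

        // Cache miss: read from the database, remember the object, return it.
        var loaded = database.LoadCustomer(id);
        if (loaded != null)
        {
            cache[id] = loaded;
        }
        return loaded;
    }
}

public static class Program
{
    public static void Main()
    {
        var database = new FakeDatabase();
        var map = new IdentityMap(database);

        var first = map.GetById(1);
        var second = map.GetById(1);

        Console.WriteLine(ReferenceEquals(first, second)); // True: same instance both times
        Console.WriteLine(database.QueryCount);            // 1: only the first call hit the database
    }
}
```

Both calls return the same instance, which is the identity scoping described above, and the second call never reaches the database, which is the repeatable read.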
Dirty checking, as the name suggests, means finding out whether an object was modified within the scope of a single transaction. ORM software does this in one of two ways:

1. Store a snapshot of the persistent object at the start of the transaction and compare the object against it at the end of the transaction.
2. Keep an IsDirty flag on the object and set it whenever the object is modified.

NHibernate evidently uses the first method, because we do not set any flags while modifying persistent objects, and our classes do not inherit from a well-defined NHibernate interface, which the second method would require.

So what is the use of dirty checking? At the end of a successful transaction, NHibernate synchronizes the changes of all modified (dirty) persistent objects to the database. You don't have to do anything; persistent objects are managed by NHibernate. The snapshot method suits a sophisticated library like NHibernate because it allows the generated UPDATE statements to be fine-tuned to include only the fields that were actually modified instead of all the fields of the object.

So now we are familiar with the persistence context. The persistence context and its cache are associated with each session separately. The diagram in Figure 1 gives a rough idea of the thread-level persistence context cache.
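Snapshot-based dirty checking can be sketched as follows. This is a simplified illustration under stated assumptions, not how NHibernate is written: Customer and SnapshotTracker are hypothetical names, the tracked properties are listed by hand, and the SQL text is only meant to show that an update can be limited to the changed fields.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity: a plain class with no base class and no IsDirty flag.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string City { get; set; }
}

public class SnapshotTracker
{
    // Property values captured when the object started being tracked.
    private readonly Dictionary<string, object> snapshot = new Dictionary<string, object>();
    private readonly Customer tracked;

    public SnapshotTracker(Customer customer)
    {
        tracked = customer;
        foreach (var pair in ReadState(customer))
        {
            snapshot[pair.Key] = pair.Value;
        }
    }

    // Names of the properties whose current value differs from the snapshot.
    public List<string> FindDirtyProperties()
    {
        var dirty = new List<string>();
        foreach (var pair in ReadState(tracked))
        {
            if (!Equals(snapshot[pair.Key], pair.Value))
            {
                dirty.Add(pair.Key);
            }
        }
        return dirty;
    }

    // Builds an UPDATE that touches only the changed columns; null if nothing changed.
    public string BuildUpdateSql()
    {
        var dirty = FindDirtyProperties();
        if (dirty.Count == 0)
        {
            return null;
        }
        var assignments = dirty.ConvertAll(name => name + " = @" + name);
        return "UPDATE Customer SET " + string.Join(", ", assignments) + " WHERE Id = @Id";
    }

    // The tracked properties, in a fixed order.
    private static KeyValuePair<string, object>[] ReadState(Customer c)
    {
        return new[]
        {
            new KeyValuePair<string, object>("Name", c.Name),
            new KeyValuePair<string, object>("City", c.City)
        };
    }
}

public static class Program
{
    public static void Main()
    {
        var customer = new Customer { Id = 1, Name = "Alice", City = "Oslo" };
        var tracker = new SnapshotTracker(customer);

        customer.City = "Bergen"; // ordinary property set, no flag involved

        Console.WriteLine(tracker.BuildUpdateSql());
        // UPDATE Customer SET City = @City WHERE Id = @Id
    }
}
```

The client code only sets a property; the change is discovered later by comparing against the snapshot, which is why the entity class needs no flag and no special base type.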

PERSISTENT-CONTEXT-CACHE

Figure 1 - Persistent Context Cache: Objects already in the cache are retrieved directly from the hashtable (the cache) in two steps. Objects not in the cache take three steps, because the item first has to be brought into the cache before it is returned to the client. The method session.getbyid(identifier) is shown to illustrate retrieval by identifier, i.e., the primary key.

Code Example

A persistent object in NHibernate can be in one of four states: Transient, Persistent, Removed, and Detached. Objects instantiated with the "new" operator are Transient objects. They do not have a db id. When a transient object is saved to the db, it is associated with a db id and becomes a Persistent object. Objects loaded from the db also have a db id and are likewise Persistent objects. These are the two main object states in ORM. NHibernate introduces a third state called Detached. It is vital to understand this state, and the persistence context cache was introduced in the background of this article precisely to explain it.

When the session closes, the persistent context cache associated with the session also closes. However, it is possible that the reference to a persistent object exists, outliving the life of the session in which it was created. The state of these persistent objects outside the scope of a persistent context cache is said to be detached.

The final, less commonly used state is "REMOVED". It is the intermediate state of an object that has been deleted from the db in a session while the transaction in which the deletion was called has not yet completed. Until the transaction completes, the object is in the REMOVED state, after which it transitions to the Transient state. Figure 2 below gives a rough idea of the states of an object:

DIFFERENT-STATES-OF-OBJECT

Figure 2 - Various states of objects
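The transitions between the four states can be written down as a small state machine. This is a simplified sketch of the description above, not an NHibernate type: ObjectState and ObjectLifecycle are hypothetical names, and only the transitions mentioned in this article are modeled (for example, loading an object from the db, which also yields a Persistent object, is left out).

```csharp
using System;

public enum ObjectState { Transient, Persistent, Detached, Removed }

public static class ObjectLifecycle
{
    // A new object is saved in a session.
    public static ObjectState Save(ObjectState current) =>
        Move(current, ObjectState.Transient, ObjectState.Persistent, "Save");

    // The session that tracks the object is closed.
    public static ObjectState CloseSession(ObjectState current) =>
        Move(current, ObjectState.Persistent, ObjectState.Detached, "CloseSession");

    // The detached object is brought into another session.
    public static ObjectState Reattach(ObjectState current) =>
        Move(current, ObjectState.Detached, ObjectState.Persistent, "Reattach");

    // The object is deleted, but the transaction is still open.
    public static ObjectState Delete(ObjectState current) =>
        Move(current, ObjectState.Persistent, ObjectState.Removed, "Delete");

    // The transaction that contains the delete completes.
    public static ObjectState CommitDelete(ObjectState current) =>
        Move(current, ObjectState.Removed, ObjectState.Transient, "CommitDelete");

    private static ObjectState Move(ObjectState current, ObjectState required, ObjectState next, string action)
    {
        if (current != required)
        {
            throw new InvalidOperationException(action + " is not valid for a " + current + " object.");
        }
        return next;
    }
}

public static class Program
{
    public static void Main()
    {
        var state = ObjectState.Transient;          // created with "new"
        state = ObjectLifecycle.Save(state);         // Persistent
        state = ObjectLifecycle.CloseSession(state); // Detached
        state = ObjectLifecycle.Reattach(state);     // Persistent again
        state = ObjectLifecycle.Delete(state);       // Removed
        state = ObjectLifecycle.CommitDelete(state); // Transient
        Console.WriteLine(state);                    // Transient
    }
}
```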

The problem with the fetch at the end of the first article should now be easy to understand. We got an object using a helper method in the DBRepository class. The DBRepository instance uses a session in the getItemById method to retrieve the object from the DB and closes that session at the end of the method. When we later tried to consume the object, we got an exception. The reason is that NHibernate defaults to lazy=true: the associations of the retrieved object are not loaded with it, and a proxy is kept in their place. This is the lazy loading feature of NHibernate. When such a proxy is accessed after its session has been closed, it throws an exception. By setting lazy=false, we got the full object instead of proxies for its associations. That is one solution, but it cannot be used everywhere.

With the background given here and our knowledge of the different object states, we can now see the real problem. When DBRepository closes the session after returning the object we need, the persistence context cache associated with that session is closed as well. What we have in hand is a detached object. A detached object cannot be used directly in another session, because it does not exist in that session's persistence context cache. It has to be brought into the persistence context of the other session by a process called reattaching. The method to call is:

newsession.Update(detachedObject);

This attaches the object to the new session. Unmodified persistent objects can also be reattached using the Lock() method.

In scenarios where the new session is already loaded with a different instance of an object having the same db identifier as the old object that you want to load into the new session's persistent context cache, the method to use is:

new_object = newsession.Merge(old_object);

This merge call copies the state of old_object into new_object, the instance managed by the new session. After the call, the client should use only new_object.
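The difference between the two reattachment styles can be illustrated with a toy session in plain C#. This is a sketch of the semantics described above, not NHibernate code: ToySession and Customer are hypothetical, the Load method stands in for reading a row inside the session, and the exception type is a placeholder (NHibernate reports the conflicting case with its own exception type).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity used only for this sketch.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Toy session: just the per-session cache and the two reattachment styles.
public class ToySession
{
    private readonly Dictionary<int, Customer> cache = new Dictionary<int, Customer>();

    // True only if this exact instance is the one tracked for its id.
    public bool Contains(Customer c) =>
        cache.TryGetValue(c.Id, out var tracked) && ReferenceEquals(tracked, c);

    // Stands in for loading an object inside this session.
    public Customer Load(int id, string name)
    {
        var c = new Customer { Id = id, Name = name };
        cache[id] = c;
        return c;
    }

    // Update-style reattachment: the detached instance itself becomes tracked.
    public void Update(Customer detached)
    {
        if (cache.TryGetValue(detached.Id, out var existing) && !ReferenceEquals(existing, detached))
        {
            throw new InvalidOperationException(
                "A different object with the same identifier is already in this session.");
        }
        cache[detached.Id] = detached;
    }

    // Merge-style reattachment: state is copied onto the tracked instance, which is returned.
    public Customer Merge(Customer detached)
    {
        if (!cache.TryGetValue(detached.Id, out var tracked))
        {
            // A real ORM would load the row from the database at this point.
            tracked = new Customer { Id = detached.Id };
            cache[detached.Id] = tracked;
        }
        tracked.Name = detached.Name;
        return tracked;
    }
}

public static class Program
{
    public static void Main()
    {
        var oldObject = new Customer { Id = 1, Name = "Alice (edited while detached)" };

        var newSession = new ToySession();
        newSession.Load(1, "Alice"); // the new session already tracks id 1

        var newObject = newSession.Merge(oldObject);

        Console.WriteLine(newObject.Name);                 // Alice (edited while detached)
        Console.WriteLine(newSession.Contains(oldObject)); // False: oldObject stays detached
        Console.WriteLine(newSession.Contains(newObject)); // True
    }
}
```

Calling Update(oldObject) in the same situation would throw in this sketch, which is why the merge style is the one to use when the new session already holds an instance with the same identifier.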

The solution to the problem we had in the first article is to use these reattachment techniques. With reattachment, we now know how to use objects across sessions.
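To tie this back to the original failure, here is a toy reproduction of the lazy loading problem and its fix by reattachment. Everything in it is a stand-in written for illustration: ToySession, Customer, and the Orders association are hypothetical, GetItemById only mimics the repository helper mentioned earlier, and a real NHibernate proxy works differently internally.

```csharp
using System;
using System.Collections.Generic;

public class ToySession
{
    public bool IsOpen { get; private set; } = true;

    public void Close() => IsOpen = false;

    // Stands in for loading an object from the database in this session.
    public Customer GetItemById(int id) => new Customer(id, this);

    // Update-style reattachment: bind the detached object to this session.
    public void Update(Customer detached) => detached.BindTo(this);

    // Stands in for the query that fetches an association on demand.
    public List<string> FetchOrders(int customerId) =>
        new List<string> { "order A of customer " + customerId };
}

public class Customer
{
    private ToySession session;
    private List<string> orders; // stays null until first access (lazy)

    public Customer(int id, ToySession owner)
    {
        Id = id;
        session = owner;
    }

    public int Id { get; }

    public void BindTo(ToySession newSession) => session = newSession;

    public List<string> Orders
    {
        get
        {
            if (orders == null)
            {
                // The association can only be fetched through an open session.
                if (!session.IsOpen)
                {
                    throw new InvalidOperationException(
                        "Lazy load failed: the owning session is closed.");
                }
                orders = session.FetchOrders(Id);
            }
            return orders;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var repositorySession = new ToySession();
        var customer = repositorySession.GetItemById(1);
        repositorySession.Close(); // the repository closes its session; customer is now detached

        try
        {
            var unused = customer.Orders;
        }
        catch (InvalidOperationException e)
        {
            Console.WriteLine(e.Message); // Lazy load failed: the owning session is closed.
        }

        var newSession = new ToySession();
        newSession.Update(customer);              // reattach to an open session
        Console.WriteLine(customer.Orders.Count); // 1
    }
}
```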

Conclusion

This concludes the eight-part article series. I will write more articles on querying with NHibernate, fetching strategies in NHibernate, and full samples of object conversations using reattachment in the months to come, when I find time for it. Until then, enjoy NHibernate.