Sunday, October 11, 2015

Introduction to Hashing and Hash Functions

Searching for an element in an unordered array has a cost of O(n). Binary search does a better job by repeatedly dividing a sorted array into two halves and eliminating half of the remaining elements at each step; it achieves O(log2 n) lookups but requires the array to be sorted first. In order to achieve near constant time lookup, i.e. O(1), we can use a hash table, with the key acting as an index to the value. A hash table is an implementation of the dictionary abstract data type. When the key is an integer, it can be taken directly as the index. However, if the key is any other object type, we need a hashing strategy that uses a small amount of memory to store a large range of keys (hash codes) without sacrificing the quality of hashing.

In Java, hashCode and equals must follow a basic contract: if two objects have the same hash code they may not be equal, but if two objects are equal, both must yield the same hash code value (and the hash code must be deterministic for a given object).
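For example, the following sketch (using a hypothetical Point class) keeps the two methods consistent: equal points always produce the same hash code, while unequal points are allowed to collide.

 public final class Point {
   private final int x, y;
   public Point(int x, int y) { this.x = x; this.y = y; }
   @Override public boolean equals(Object o) {
     if (!(o instanceof Point)) return false;
     Point p = (Point) o;
     return x == p.x && y == p.y;
   }
   @Override public int hashCode() {
     return 31 * x + y; // deterministic: equal (x, y) pairs always hash alike
   }
 }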

A hash code collision occurs when two different objects have the same hash code. In other words, if we try to store two unequal keys by their hash code value, they fall into the same location. A hash table must resolve such collisions so that distinct keys remain retrievable, which preserves the basic quality of the table.

To reduce the likelihood of collision, computer science recommends uniform hashing, where each item to be hashed has an equal probability of being placed into any hash location (otherwise known as a memory slot), regardless of the elements already placed. In other words, keys are distributed evenly across the allocated memory range rather than clustered together. By common sense, there is a greater likelihood of bumping into someone in a crowded room than in a large open space. Perfect hashing is a technique for building a hash table with no collisions; it is only possible when we know all of the keys in advance, and it guarantees O(1) insert and lookup. A related, more idealistic concept is minimal perfect hashing, where the resulting table contains exactly one entry for each key and no empty slots; it has limited use because it too is impossible to achieve unless you know all keys in advance. Two other properties of hashing for practical use are optimal usage of available memory and the speed of calculating the hash. If we have a near perfect hash function that takes forever to calculate or expects a huge amount of memory, it will not be practical.

A universal hash function has a probability of collision between any two distinct keys of at most 1/M (where M is the number of buckets). In simple words, a hash function drawn from a universal family should not do a worse job than randomly assigning keys over a decent number of buckets/slots. The related SUHA (simple uniform hashing assumption) principle assumes keys are distributed this way; under it, the expected search time with chaining at load factor N/M is O(1 + N/M).

The load factor of a hash table is N/M, where N is the number of items hashed and M is the size of the table.

There is always a trade-off between space and time. With decent speed and manageable space, a hash function will eventually be susceptible to collisions. There are two types of collision resolution strategies:

  1. Open Hashing or Separate Chaining : instead of storing a single key per slot, store a linked list of all keys with the same hash. A linked list is preferred over a simple array so that when a key is deleted we do not have to reinitialize or resize anything. Nevertheless, this strategy requires an overflow area or auxiliary set for storing colliding keys and some way to link them to the index of the main storage slot. Not to mention that lookup performance degrades with the length of the chain.
  2. Closed Hashing or Probing : here we need to find an alternate location for the key if a collision occurs. There are several techniques for finding that alternate location (a small probing sketch follows this list):
  • Linear probing : look for an alternate key location starting from the slot adjacent to the original hash, moving towards the end of the table and then wrapping around from the beginning to the preceding slot. Linear probing suffers from clustering problems: secondary clustering, where the probe path is the same for colliding keys, and primary clustering, where a key hashes into the probe path of other collisions and extends the cluster.
  • Quadratic Probing : quadratic probing uses the following sequence for finding alternate locations: (hash(key) + 1) MOD size, (hash(key) + 4) MOD size, (hash(key) + 9) MOD size, (hash(key) + 16) MOD size, ... i.e. the offset is the next square, obtained by adding successive odd numbers to the previous increment.
    One issue which must be resolved is whether we will actually be able to find an open slot with this method under reasonable circumstances. A relatively easy number-theoretic result ensures that if the capacity is prime and the load is under 50%, then the probe sequence will succeed.
  • Double-hash probing : a second hash function is used in combination with the main hash function. This helps eliminate the clustering problem and distributes keys more evenly across the allocated space, reducing the chances of collision, at the price of added complexity and some slowness. The probe sequence with double hashing is: h1(key) MOD size, (h1(key) + 1*h2(key)) MOD size, (h1(key) + 2*h2(key)) MOD size, (h1(key) + 3*h2(key)) MOD size, and so on. The secondary hash must never evaluate to 0. A good choice for h2 is [P - (key MOD P)] where P is a prime number.
In some cases re-hashing is needed: all existing keys are transferred into a new, larger table by hashing them again. This spreads the keys out, so insert, delete and lookup on the new table perform better. Re-hashing is an expensive operation.
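To make probing concrete, here is a minimal linear-probing sketch in Java; the class name and the fixed table size are illustrative assumptions, and resizing and deletion are omitted.

 // A minimal linear-probing table for int keys (a sketch, not production code).
 // Keys are boxed so that null can mark an empty slot; the table is assumed
 // to never fill up completely.
 public class LinearProbingSet {
   private final Integer[] slots = new Integer[16];
   private int index(int key) { return (key & 0x7fffffff) % slots.length; }
   public void insert(int key) {
     int i = index(key);
     while (slots[i] != null && slots[i] != key) {
       i = (i + 1) % slots.length; // probe the next slot, wrapping around
     }
     slots[i] = key;
   }
   public boolean contains(int key) {
     int i = index(key);
     while (slots[i] != null) {
       if (slots[i] == key) return true;
       i = (i + 1) % slots.length;
     }
     return false;
   }
 }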

Hash Functions:

Java's hashCode for Integer is the int primitive value held by the Integer. For Long, the result is the exclusive OR of the two halves of the primitive long value held by the Long object.


Integer, which is the return type of hashCode, has a word size of 32 bits in Java. Long is 8 bytes, i.e. 64 bits. If a non-negative long value falls in the integer range, its hash code computes to the same value (for negative values the sign bits of the upper half alter the result).
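In code, the Long computation looks like this:

 long value = 42L;
 int hash = (int) (value ^ (value >>> 32)); // XOR of the two 32-bit halves; 42 here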

The hash code for a String object in Java is computed as

s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]

31 is considered a good prime for distribution. 31 is also a Mersenne prime (like 127 or 8191), a prime that is one less than a power of 2. For such numbers, MOD can be done with one shift and one subtract if the machine's divide instruction is slow, and multiplication by 31 can be strength-reduced to a left shift and a subtraction: 31 * x = (x << 5) - x.
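Putting the formula and the shift trick together, a sketch of the String-style hash:

 static int stringHash(String s) {
   int h = 0;
   for (int i = 0; i < s.length(); i++) {
     h = (h << 5) - h + s.charAt(i); // h = 31*h + s[i]
   }
   return h;
 }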

Division Method: the hash is the remainder after dividing the key by the size of the hash table. The size should not be a power of two or one less than a power of two, and m should not be a power of 10 either. A good value of m is a prime number not close to a power of two.

h(key) = key MOD m

The Knuth variant of the division hash function is:

h(key) = key(key + 3) MOD m

Multiplication Method: it can be used when the table size is a power of two. Here the key is multiplied by a constant and the product is right-shifted to compute the hash code.
h(key) = (key * A) >> x

Knuth suggested a magic fraction for calculating the constant:

A = Size of table * (sqrt(5) - 1)/2 = m * 0.6180339887498948, where the fraction is derived from the golden ratio.

The number of right shifts should equal the number of bits in the key minus log2 of the table size. For instance, for a 1024-position table (2^10) and a 16-bit key, you should shift the product key * A right six (16 - 10) places.
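Here is a sketch of both methods in Java; the multiplication constant 40503 is an assumed 16-bit scaling of Knuth's fraction (about 2^16 * 0.618), matching the 1024-slot example above:

 static int divisionHash(int key, int m) {
   return key % m; // assumes a non-negative key and prime m
 }
 static int multiplicationHash(int key) {
   int k = 40503;                    // ~ 2^16 * 0.6180339887 (assumption)
   int product = (key * k) & 0xFFFF; // keep the low 16 bits of the product
   return product >>> 6;            // 16-bit key, 1024 slots: shift 16 - 10 = 6
 }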

CRC 32 : this hash is computed for checksums, to verify the integrity of data downloaded from a site or to compare files (typically certificate files or compressed files). Java provides java.util.zip.CRC32 for this. A CRC of a data stream is the remainder after performing a long division of the data (treated as a large binary number) by a fixed polynomial, using exclusive OR instead of subtraction at each long division step. This corresponds to computing a remainder in the field of polynomials with binary coefficients. CRCs can be computed very quickly in specialized hardware, and fast software CRC algorithms rely on accessing precomputed tables of data.

CRC32 checksum = new CRC32();
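checksum.update("hello world".getBytes()); // feed in the data bytes (may be called repeatedly)
long crc = checksum.getValue();            // the 32-bit checksum as an unsigned long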

The bit-mixing at the heart of such a rotating hash looks somewhat like this (a simplified sketch, not the table-driven CRC32 itself):

 highorder = h & 0xf8000000    // extract high-order 5 bits from h
                               // 0xf8000000 is the hexadecimal representation
                               //   of the 32-bit number with the first five
                               //   bits = 1 and the other bits = 0
 h = h << 5                    // shift h left by 5 bits
 h = h ^ (highorder >> 27)     // move the high-order 5 bits to the low-order
                               //   end and XOR into h
 h = h ^ ki                    // XOR h and ki (the next key byte)

One-at-a-time hashing and cryptographic hashes (SHA, MD5) : SHA-1 and MD5 are cryptographically secure hash functions. They can be computed by converting a string to bytes, or while reading bytes in one at a time. The JDK provides java.security.MessageDigest to compute these hash values.
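A minimal sketch with the JDK (getInstance throws NoSuchAlgorithmException, which is left unhandled here):

 import java.security.MessageDigest;

 byte[] data = "hello world".getBytes();
 byte[] md5 = MessageDigest.getInstance("MD5").digest(data);    // 16-byte digest
 byte[] sha1 = MessageDigest.getInstance("SHA-1").digest(data); // 20-byte digest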

Fowler–Noll–Vo hash function (FNV) : the purpose of FNV hashing is to be fast while maintaining a low collision rate. It is best suited for predictable texts such as URLs, machine names, host names, IP addresses, router tables, file names, person names, country names, station names and domain entity names such as stock tickers. There are several variations of the FNV hash. FNV and Knuth's methods are the basis for many advanced hash functions, although they are rarely used in their original form nowadays. Please refer to Wikipedia for details on FNV.
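For illustration, a sketch of the 32-bit FNV-1a variant (the constants are the published FNV offset basis and prime):

 static int fnv1a(byte[] data) {
   int hash = 0x811C9DC5;  // 2166136261, the 32-bit offset basis
   for (byte b : data) {
     hash ^= (b & 0xFF);   // xor in the next byte
     hash *= 0x01000193;   // multiply by the 32-bit FNV prime 16777619
   }
   return hash;
 }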

Murmur Hash: Murmur hash thoroughly mixes the bits of a value by repeated application of multiplication by a magic number and left rotation (with rotation constants such as 15):
x *= m;
x = rotate_left(x,r);

m and r are magic constants determined experimentally, and x is the hash state. Multiply+rotate has a few weaknesses when used in a hash function, so some implementations use multiply+shift+xor instead. Java implementations are widely available.

Paul Hsieh's SuperFastHash : SuperFastHash uses one-at-a-time hashing as its model. A brief description of the avalanche behavior and of the magic numbers used for rotation and multiplication can be found on the author's website, which also links to implementations, including Java ports.

Google's CityHash : not suitable for cryptography, CityHash mixes input bits thoroughly and is intended for strings. The C++ code for the CityHash family of hash functions has been released by Google as open source.

Cuckoo hashing: cuckoo hashing achieves constant average-time insertion and constant worst-case search. Each item has two possible slots: put it in either slot if one is empty; if not, eject the item occupying one of the two slots, move it to its other slot, and recur. "The name derives from the behavior of some species of cuckoo, where the mother bird pushes eggs out of another bird's nest to lay her own." If you get into a relocation cycle, rehash everything. The maximum load of a slot with uniform hashing is log n / log log n, which improves to log log n by choosing the less loaded of the two slots.

Djb2, SDBM, Lose Lose : these simple string hash functions are well documented online; djb2 is sketched below.
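As an illustration, djb2 (Dan Bernstein's hash) in Java:

 static int djb2(String s) {
   int hash = 5381;
   for (int i = 0; i < s.length(); i++) {
     hash = ((hash << 5) + hash) + s.charAt(i); // hash * 33 + c
   }
   return hash;
 }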

Old profiling numbers for all the major hash functions implemented in C can be found online. New and improved hash functions are still an active research subject, specifically targeted at various usages. In my personal experience the SuperFastHash implementation worked well for integer arrays and strings. Bob Jenkins has analyzed different hashing techniques and compared them against his own implementation on his website.

Many of these hash functions are implemented, some in experimental form, in Google's Guava library under com.google.common.hash. We should test all of them and choose what works best for the expected data set.
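For example, something along these lines worked with Guava at the time (the exact hashString signature varies between Guava versions, so treat this as a sketch):

 import com.google.common.base.Charsets;
 import com.google.common.hash.HashCode;
 import com.google.common.hash.Hashing;

 HashCode hc = Hashing.murmur3_32().hashString("hello world", Charsets.UTF_8);
 int bucket = Hashing.consistentHash(hc, 64); // map the hash onto 64 buckets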

Monday, December 31, 2012

Spring property placeholder trick, one property file for all environments

To specify environment specific properties in Spring, such as the JDBC URL, the old school technique is to create one property file per environment, with the environment name embedded in each file name, and to point the Spring property placeholder at a different file per environment. The drawback with this approach is that developers are lazy about updating so many property files, and common properties often get moved to yet another file, which increases the count further. Here is a cleaner approach you might like. It uses a property of a property to dynamically build the property key for each environment. Of course, you won't be putting the production password in a property file; use AOP to inject the password at runtime.

 # one properties file for all environments

dev.dataSource.alias=my_dev_datasource_alias
dev.dataSource.username=dev_user
dev.dataSource.password=dev_pass
test.dataSource.alias=my_test_datasource_alias
test.dataSource.username=test_user
test.dataSource.password=test_pass
prod.dataSource.alias=my_prod_datasource_alias
prod.dataSource.username=prod_user
prod.dataSource.password=prod_pass

Here is how you use this file in the Spring context configuration; ${env} is supplied per environment and selects the key prefix:

 <bean id="dataSource" class="oracle.jdbc.pool.OracleDataSource" destroy-method="close">
   <property name="connectionCachingEnabled" value="true" />
   <!-- ${env} resolves first (e.g. to 'dev'), then the resulting key such as
        ${dev.dataSource.alias} is looked up in the single properties file -->
   <property name="URL" value="${${env}.dataSource.alias}" />
   <property name="user" value="${${env}.dataSource.username}" />
   <property name="password" value="${${env}.dataSource.password}" />
   <!-- connection cache properties etc. -->
 </bean>
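For completeness, a sketch of the placeholder configurer wiring; the properties file name and the use of a -Denv system property are assumptions:

 <!-- ${env} falls back to a system property, e.g. the JVM started with -Denv=dev -->
 <bean class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
   <property name="location" value="classpath:environments.properties" />
 </bean>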

How to toggle a fast view in rcp application?

If you want to toggle a fast view in your RCP application, here is a tip. This can be done with a custom show/hide button. Using showView or hideView of the active page could cause duplicate views or views with different sizes. I am using an Eclipse internal mechanism, as the following code is adapted from the Eclipse source itself, but it works very well. In the following code, cmd is the command for the button or menu where you want to add the toggle view functionality.

 IWorkbenchWindow window = HandlerUtil.getActiveWorkbenchWindowChecked(event);
 List<Parameterization> parameters = new ArrayList<Parameterization>();
 // view id
 IParameter iparam1 = cmd.getParameter(IWorkbenchCommandConstants.VIEWS_SHOW_VIEW_PARM_ID);
 parameters.add(new Parameterization(iparam1, YourView.ID));
 // mark it as a fast view
 IParameter iparam = cmd.getParameter(IWorkbenchCommandConstants.VIEWS_SHOW_VIEW_PARM_FASTVIEW);
 parameters.add(new Parameterization(iparam, "true"));
 // build the parameterized command
 ParameterizedCommand pc = new ParameterizedCommand(cmd,
     parameters.toArray(new Parameterization[parameters.size()]));
 // execute the command
 IHandlerService handlerService = (IHandlerService) window.getService(IHandlerService.class);
 handlerService.executeCommand(pc, null);

Wednesday, November 21, 2012

Spring Directory (Java)

It has been quite a while since I posted on Spring. Spring has come a long way since I started using it in 2006. Spring provides modularity, productivity, portability and testability. From architecture 101, the side effects of using Spring would be performance and availability, as a general rule of thumb. However, every case is different and you should run your own tests to determine how much overhead Spring adds to your application. Spring also adds layering overhead depending on how much loose coupling and high cohesion exists between services built on top of other services.

Spring source YouTube channel has great videos for getting familiar with these technologies in less time.

I wanted to make a comprehensive list of all the features in spring for Java developers. If any of the following links become outdated or you know about any spring module not mentioned below, please let me know.

Spring Core
Notable features of Spring Core are dependency injection, singletons, support for multiple configuration files, annotation processing, bean lifecycle management, Spring AOP interceptors, bean scopes, property placeholders, the p and util namespaces, the Spring expression language, adding bean dependencies as entries in Collections and Maps, JUnit integration, and the ability to export Spring beans as JMX MBeans. Spring task executors provide a wrapper over java.util.concurrent. Spring's auto component scanning and autowiring are also powerful features of the core. Spring Core provides object<->XML mapping integration based on OXM and Castor, and Spring email integration is based on JavaMail.

Spring MVC:
Spring MVC has a small learning curve and is super easy to configure. Spring Security goes well with Spring MVC.

Spring Web Flow
It is an extension to spring MVC for consistent navigation. It is useful for shopping cart or workflow type web design where you do not have to worry about browser back button and other nuances.

Spring Flex and Spring BlazeDS Integration
Another extension to Spring MVC using integration with Adobe BlazeDS. Here the client side interface is Adobe Flex application rather than HTML.

Spring JDBC
A convenient wrapper over plain old JDBC. I find it very useful for quick application development using just JDBC. It has nice object mappers to convert select results into POJOs, as sketched below.
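For example, a sketch with a hypothetical Person POJO and table:

 JdbcTemplate jdbc = new JdbcTemplate(dataSource);
 List<Person> people = jdbc.query("select id, name from person",
     new RowMapper<Person>() {
       public Person mapRow(ResultSet rs, int rowNum) throws SQLException {
         return new Person(rs.getLong("id"), rs.getString("name"));
       }
     });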

Spring Data Access
One of the most widely used features of Spring is Spring ORM, which provides a convenient data access API over Hibernate, JDO, JPA or iBATIS. In my personal experience, I found Hibernate JPA to be more powerful than plain Spring JPA. However, you need to be careful: using Hibernate-specific annotations and wrappers to get the Hibernate session will defeat the purpose of using Spring as an abstraction over JPA. That said, Hibernate JPA provides a much richer feature set and portability through Hibernate Query Language. As a general guideline, I prefer iBATIS for selects, JDBC batch for bulk inserts, JPA for complex outer-join relationships, and entities for updates. I have never used JDO, but I suppose it is the only option if you are using a non-traditional EIS such as SAP.

Spring Data - JPA
Spring jpa actually falls under Spring data. It helps in developing JPA based data access layer.

Spring Data - QueryDSL
QueryDSL provides type-safe query construction for JPA, JDO or SQL. It performs compile-time checks and provides programmatic nesting of query parts. Spring provides support for the QueryDSL SQL module; for JPA, you have to rely on criteria queries.

Spring Batch
Spring batch provides many features required to perform asynchronous bulk operations. Some of these benefits includes logging or tracing, transaction management and job stats.

Spring Integration
Spring Integration is a lightweight messaging framework based on enterprise integration patterns. Other features of Spring Integration are remoting and scheduling.

Spring Data - GemFire
Spring's integration with VMware vFabric GemFire is targeted at highly available, distributed and concurrent systems with large transaction throughput on a large number of cheaper machines. GemFire is an in-memory database cluster with an SQL-friendly interface, and it has an optional memory+disk hybrid mode.

Spring Data - Apache Hadoop
Hadoop is a widely popular tool for achieving scalability in large distributed systems. Hadoop provides command line utilities for its maintenance, and Spring provides dependency injection to configure Hadoop. Hadoop-based workflows are supported through Spring Batch.

Spring Data- REST
This should arguably be part of Spring MVC, but it is officially part of Spring Data. Currently it supports exporting the CRUD operations of JPA repositories as REST services via the Spring MVC servlet. There are plans to extend this capability to NoSQL data stores like MongoDB.

Spring Data - Commons
It has common interfaces for all the Spring Data projects and a metadata model for persisting Java classes. If you need to create a Spring Data integration for your favorite NoSQL flavor that is not supported out of the box, you can build it on top of Commons.

Spring Data- Grails
Grails is a Groovy-based web framework. This project provides the plumbing between Grails Object Relational Mapping (GORM) and Hibernate or NoSQL data stores.

Spring Data - NoSQL
At the time of writing this post, Spring Data had integration with Redis (key-value store), MongoDB (document store), Neo4j (graph database), and HBase (column-oriented data store based on Google's BigTable).

Spring Web Services
Spring Web Services incorporates best practices around contract-first, document-driven web service development.

Spring Security
Spring Security has integration with JAAS, OpenID, LDAP, CAS, Kerberos/SPNEGO, X.509 certificate based authentication, Web Flow security and WS-Security.

Spring Mobile
Spring has provided an abstraction layer for the Android app development framework.

Spring Social
Spring Social provides integration with the open APIs provided by Facebook and Twitter. You can define new integrations for other social media.

Spring Cloud
There is no special integration for cloud providers in Spring. Spring is supported by default in VMware's PaaS (platform as a service) offering, Cloud Foundry; no change is needed to port a Spring application onto a Cloud Foundry based cloud. If you are using Spring with Amazon EC2 or other cloud providers, please drop a comment about your experience.

Spring Roo
Spring Roo is a rapid application development platform. It provides a set of build and productivity tools developed as Roo add-ons.

Spring AMQP
AMQP (Advanced Message Queuing Protocol) is a messaging standard primarily supported by RabbitMQ. Unlike API-level standards such as JMS, AMQP standardizes the wire protocol itself. If you are planning to migrate from JMS to AMQP, articles comparing the two are a good place to start.

Monday, February 20, 2012

Mixing Spring, Hibernate JPA and plain JDBC

There are various ways to mix Spring, Hibernate JPA and plain JDBC. I will show you one example derived from best practices given in the Spring and Hibernate documentation. Before we dive into the boilerplate code, here are some arguments for why you would ever want to mix these technologies:

  1. Performance: JPA works great when the domain model is simple (it degrades beyond third and fourth normal form). The reason for this performance degradation is not JPA itself but the SQL generated by JPA to read and write such complex entity relationships.
  2. Virtual entity relationships: you want to realize an entity relationship differently from how it is actually defined in the database. For example, you have one set of tables defining all hardware products and another set for online products like software, and you want to combine them inside a virtual composite entity. You may want to collect the individual products in separate SQL queries and fill the composite yourself because the JPA mapping takes too long; in SQL you can prepare a set-based query that fetches all the records into a flat result set.
  3. JDBC batch: you want to use the power of plain old JDBC batching to insert many records into the same or different tables (see the sketch after this list). You don't have to use JDBC batch every time: adding multiple child entities to a parent entity within a transaction and calling persist will internally fire a JDBC batch, assuming you pre-fetch your primary keys and do not depend on PK auto-generation.
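As an illustration of the JDBC batching mentioned in point 3, a sketch (the VALUE table, its columns and the MyRow type are hypothetical, and conn is an open java.sql.Connection):

 PreparedStatement ps = conn.prepareStatement("insert into VALUE (id, value) values (?, ?)");
 for (MyRow row : rows) {
   ps.setLong(1, row.getId());
   ps.setString(2, row.getValue());
   ps.addBatch();                  // queue the statement instead of executing it now
 }
 int[] counts = ps.executeBatch(); // execute the queued inserts together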
Whatever the reason, if you finally decide that you need to mix JDBC with Spring and JPA, then read on:

Step 1: Define your persistence unit in persistence.xml. You will need to replace the dialect according to your database vendor.
 <persistence xmlns="http://java.sun.com/xml/ns/persistence" version="1.0">
   <persistence-unit name="myJPAPeristenceUnit" transaction-type="RESOURCE_LOCAL">
     <properties>
       <!-- replace with your database vendor's dialect -->
       <property name="hibernate.dialect" value="org.hibernate.dialect.MySQLDialect" />
       <!-- Print SQL to stdout -->
       <property name="hibernate.show_sql" value="false" />
       <property name="hibernate.jdbc.batch_size" value="10"/>
     </properties>
   </persistence-unit>
 </persistence>

Step 2: Define the data source and entity manager factory. It is advisable to inject the entity manager factory instead of an entity manager, as Hibernate JPA knows best when to instantiate entity managers based on your transaction settings. Anti-patterns such as entity-manager-per-application, entity-manager-per-request and entity-manager-per-session are discouraged; the reasons are well explained in the Hibernate documentation. In the following example, I have MyGreatService using MyGreatDao. Here is a sample application context using a transaction manager:

 <?xml version="1.0" encoding="UTF-8"?>
 <beans xmlns="http://www.springframework.org/schema/beans"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:p="http://www.springframework.org/schema/p"
      xmlns:tx="http://www.springframework.org/schema/tx" xmlns:context="http://www.springframework.org/schema/context"
      xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
           http://www.springframework.org/schema/tx http://www.springframework.org/schema/tx/spring-tx.xsd
           http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd">
      <!-- enables @PersistenceContext injection; package names below are placeholders -->
      <context:annotation-config />
      <bean id="dataSource" class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close">
           <property name="driverClassName" value="com.mysql.jdbc.Driver"/>
           <property name="url" value="jdbc:mysql://mdc-mysql-host/mydb"/>
           <property name="username" value="sa"/>
           <property name="password" value=""/>
      </bean>
      <bean id="entityManagerFactory"
           class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
           <property name="persistenceXmlLocation" value="classpath:META-INF/persistence.xml" />
           <property name="persistenceUnitName" value="myJPAPeristenceUnit" />
           <property name="dataSource" ref="dataSource" />
           <property name="jpaVendorAdapter" ref="jpaVendorAdapter"/>
           <property name="jpaDialect" ref="jpaDialect"/>
      </bean>
      <bean id="myGreatDao" class="com.example.MyGreatDaoImpl" />
      <bean id="myGreatService" class="com.example.MyGreatServiceImpl">
           <property name="myGreatDao" ref="myGreatDao" />
      </bean>
      <bean id="jpaDialect" class="org.springframework.orm.jpa.vendor.HibernateJpaDialect"/>
      <bean id="jpaVendorAdapter" class="org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter"/>
      <bean id="transactionManager" class="org.springframework.orm.jpa.JpaTransactionManager">
           <property name="entityManagerFactory" ref="entityManagerFactory" />
           <property name="dataSource" ref="dataSource" />
           <property name="jpaDialect" ref="jpaDialect" />
      </bean>
      <tx:annotation-driven transaction-manager="transactionManager" />
 </beans>

Step 3: There is a very important piece for this implementation to work. You should declare the entity manager inside the DAO with the @PersistenceContext annotation so that the container can inject the entity manager at runtime. Whether an existing entity manager is reused or a new one is created depends on how you manage your transaction boundary. The transaction boundary should be defined at the public methods of the service; in a web application it should start at the entry point, where a user action controller passes input and expects output. The boundary can be declared at the interface level or the implementation level; here I define transactions at the implementation level.

MyGreatDaoImpl class

 @Transactional(propagation = Propagation.REQUIRED)
 public class MyGreatDaoImpl implements MyGreatDao {
   @PersistenceContext(unitName = "myJPAPeristenceUnit")
   EntityManager entityManager;

   public String getValueUsingJPAQL(long id) {
     List<Value> values = entityManager
         .createQuery("Select value from Value value where value.id = ?1")
         .setParameter(1, id).getResultList();
     if (values.size() > 0) {
       return values.get(0).getValue();
     }
     return null;
   }

   public Value getValueUsingJPA(long id) {
     return entityManager.find(Value.class, id);
   }

   public String getValueUsingJDBC(long id) {
     ResultSet rs = null;
     Statement statement = null;
     Connection conn = null;
     try {
       conn = getConnection();
       String sqlquery = "select value from VALUE where id=" + id;
       statement = conn.createStatement();
       rs = statement.executeQuery(sqlquery);
       if (null != rs && rs.next()) {
         return rs.getString(1); // result set columns start from 1
       }
       /* or use the hibernate session instead:
       Session sess = getSession();
       String sqlHibernatequery = "select value from VALUE where id=:id";
       SQLQuery sq = sess.createSQLQuery(sqlHibernatequery)
           .addScalar("value", StandardBasicTypes.STRING).setLong("id", id);
       List<String> valueList = sq.list();
       if (null != valueList && !valueList.isEmpty()) {
         return valueList.get(0);
       }
       */
     } catch (SQLException e) {
       // log and handle as appropriate
     } finally {
       // close result set
       // close statement
       // do not close the connection; it will be closed when the transaction finishes
     }
     return null;
   }

   public Connection getConnection() throws HibernateException {
     Session session = getSession();
     return session.connection();
   }

   private Session getSession() {
     return (Session) entityManager.getDelegate();
   }
 }

MyGreatDao Interface

 public interface MyGreatDao {
   public String getValueUsingJPAQL(long id);
   public Value getValueUsingJPA(long id);
   public String getValueUsingJDBC(long id);
   public Connection getConnection() throws HibernateException;
 }

Here is how a typical service class with transaction annotation will look like:

 @Transactional(rollbackFor = Exception.class)
 public class MyGreatServiceImpl implements MyGreatService {
      private MyGreatDao myGreatDao;
      public MyGreatDao getMyGreatDao() {
           return myGreatDao;
      }
      public void setMyGreatDao(MyGreatDao mg) {
           this.myGreatDao = mg;
      }
      public String readOnlyMethod(long id) {
           return myGreatDao.getValueUsingJPAQL(id);
      }
      @Transactional(propagation = Propagation.REQUIRED, readOnly = false)
      public void readWriteMethod(MyGreatVO myVO) {
           // read-write work goes here
      }
 }

I have omitted the code for package declarations and imports in the above listings. Also, this is a read-only example, but once you get the plain old JDBC connection you can do all the read/write and batch operations using Statements and PreparedStatements. One more quick trick before I close: there is a handy utility (Transformers) in Hibernate to get a typed list from SQLQuery.list(). If you define a value object (Java bean) whose property names match the selected column names, you can use the following code to avoid mapping each row by hand:
 sq.setResultTransformer(Transformers.aliasToBean(MyJavaBeanVO.class));
 List<MyJavaBeanVO> voList = sq.list();

Wednesday, November 16, 2011

Using Java Annotation with Spring AOP

Here is a short refresher for your spring aop vocabulary.

"Spring AOP uses Aspect to Advice about possible actions with respect to Target object at the Join Point matching Point Cut expression". Aspects are linked to Proxy by run time weaving. Unlike AspectJ, a join point is always a method in Spring.
You can write Spring AOP advice that works off your own custom Java annotations. Here is a simple example; please read about the Spring AOP limitations at the end.

Define an annotation named MethodAnnotation:
 import java.lang.annotation.ElementType;
 import java.lang.annotation.Retention;
 import java.lang.annotation.RetentionPolicy;
 import java.lang.annotation.Target;

 @Retention(RetentionPolicy.RUNTIME)
 @Target(ElementType.METHOD)
 public @interface MethodAnnotation {
      String comment();
 }

Define an aspect for the annotation. In this example we are using @Around advice; the @annotation binding hands the annotation instance to the advice:
 import org.aspectj.lang.ProceedingJoinPoint;
 import org.aspectj.lang.annotation.Around;
 import org.aspectj.lang.annotation.Aspect;

 @Aspect
 public class AspectForMethodAnnotation {
      @Around("@annotation(methodAnnotation)")
      public Object executeAroundMethod(ProceedingJoinPoint pjp,
                MethodAnnotation methodAnnotation) throws Throwable {
           // do what you want with the annotation attributes
           String commentFromAnnotation = methodAnnotation.comment();
           // you can use the join point arguments in any way you like
           for (Object object : pjp.getArgs()) {
                // inspect or log each argument here
           }
           Object ret = pjp.proceed();
           // do something after method execution
           return ret;
      }
 }

There are a few points to note here. First of all, Java annotations will give you the method arguments through ProceedingJoinPoint.getArgs() but not the exact variable names (unless you compile the classes with the debug option). So if you have any logic based on an argument name, consider passing the index of that argument within the annotation. To make auto-proxying discover your aspect, you can declare it as @Component or @Configurable.

That's all. You can enable auto-proxying in applicationcontext.xml as below:
 <?xml version="1.0" encoding="UTF-8"?>
 <beans xmlns="http://www.springframework.org/schema/beans"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xmlns:aop="http://www.springframework.org/schema/aop"
      xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
           http://www.springframework.org/schema/aop http://www.springframework.org/schema/aop/spring-aop.xsd">
      <aop:aspectj-autoproxy />
      <bean id="methodAnnotationAspect" class="com.mdc.spring.annotationexample.AspectForMethodAnnotation" />
 </beans>

To apply our annotation inside a bean available within the Spring context, simply annotate the method:
 @MethodAnnotation(comment = "World's best method")
 public void bestMethod(String arg1, UrObject arg2) {
      // method body runs inside the around advice
 }

There are some caveats to this approach. If you annotate multiple methods within the same bean and one method calls another, Spring will not fire the AOP advice for the nested call. This is because Spring AOP is based on runtime proxies: once the proxy has invoked a method on its instance, subsequent method invocations happen on the actual instance, without knowledge of the Spring environment.

If the annotation class could exist in multiple packages, place the fully qualified name of the annotation interface inside @annotation(...).

More info and examples of pointcuts can be found in the Spring documentation.

47 Degrees has extended this concept to implement dynamic pub-sub functionality.

Saturday, October 01, 2011

Preprocess bean property before assigning to bean in Spring applicationcontext

There can be a need to evaluate, validate or preprocess a value before assigning it as a property of a Spring bean. One practical use of this technique is to decrypt a password before assigning it to a data source. Spring 3 provides an elegant way to do this without extending any Spring class, made possible by the Expression Language supported from Spring 3 onwards.

 <bean id="datasourcePreprocessor" class=""> <!-- your preprocessor class goes here -->
   <property name="inputPassword" value="HighlyEncryptedPassword"/>
   <property name="user" value="userName"/>
   <!-- other properties -->
 </bean>
 <bean id="dataSource" class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close">
   <property name="driverClassName" value="oracle.jdbc.driver.OracleDriver"/>
   <property name="url" value="jdbc:oracle:thin:@my.db.server:1521:SID"/>
   <property name="username" value="#{ datasourcePreprocessor.user }"/>
   <property name="password" value="#{ datasourcePreprocessor.outputPassword }"/>
 </bean>

You can implement the decrypt logic in the setInputPassword or getOutputPassword() method of your preprocessor class.
Another side tip of this facility is that you can refer to any system property by simply putting something like value="#{ systemProperties['...'] }". 'systemProperties' is predefined in Spring 3.


Monday, September 26, 2011

How to avoid duplicate entities from hibernate outer join

Ever wondered why we get duplicate entities in the result of an outer join from Hibernate? We might expect Hibernate to be smart enough to remove duplicates based on the primary key (id) field, but it does not happen by default. If you run into this issue, you can use DistinctRootEntityResultTransformer to reduce your result set to distinct entities:

 String hql = "select p from Parent p left outer join fetch p.children where p.name = :name";
 Query q = session.createQuery(hql);
 q.setString("name", "John");
 List<Parent> parents = DistinctRootEntityResultTransformer.INSTANCE.transformList(q.list());

If you are using @OneToMany(fetch = FetchType.EAGER), then a Set should be used for the collection to avoid duplicate entities.

Wednesday, January 26, 2011

How to auto scan jpa entities? (using spring configuration)

If you configure Spring with annotation-config and a DataSource, Spring provides an easy way to auto-scan entity classes within a package; the details of that configuration are in the Spring reference documentation. You might think it should be equally simple to let Spring scan all JPA persistence entities when you define an EntityManagerFactory instead of a DataSource. Well, by default Spring scans @Entity classes from the same path as persistence.xml, and there is no easy way to specify a package name instead of a relative or absolute path. Here are the steps to achieve the same; it really helps to specify a package scan like you can with the DataSource configuration.
Step 1: Define a custom scanner bean, say myMagicScanner.
Step 2: While defining the EntityManagerFactory, plug that bean in as a PersistenceUnitPostProcessor.
Step 3: Write the Java class for the scanner utility used by the myMagicScanner bean, implementing org.springframework.orm.jpa.persistenceunit.PersistenceUnitPostProcessor. A sketch follows.
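A minimal sketch of such a scanner (the class and package names follow the post's references; the basePackage property and its wiring are assumptions):

 package com.springjpa.util;

 import java.util.Set;
 import javax.persistence.Entity;
 import org.springframework.beans.factory.config.BeanDefinition;
 import org.springframework.context.annotation.ClassPathScanningCandidateComponentProvider;
 import org.springframework.core.type.filter.AnnotationTypeFilter;
 import org.springframework.orm.jpa.persistenceunit.MutablePersistenceUnitInfo;
 import org.springframework.orm.jpa.persistenceunit.PersistenceUnitPostProcessor;

 public class MyMagicScannerBean implements PersistenceUnitPostProcessor {
   private String basePackage; // e.g. "com.springjpa.domain", set via spring config

   public void setBasePackage(String basePackage) {
     this.basePackage = basePackage;
   }

   public void postProcessPersistenceUnitInfo(MutablePersistenceUnitInfo pui) {
     // scan the classpath for classes annotated with @Entity
     ClassPathScanningCandidateComponentProvider scanner =
         new ClassPathScanningCandidateComponentProvider(false);
     scanner.addIncludeFilter(new AnnotationTypeFilter(Entity.class));
     Set<BeanDefinition> candidates = scanner.findCandidateComponents(basePackage);
     for (BeanDefinition bd : candidates) {
       // register each discovered entity with the persistence unit
       pui.addManagedClassName(bd.getBeanClassName());
     }
   }
 }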

Monday, February 11, 2008

Hibernate property default for hbm2ddl

While using hbm2ddl for schema creation and update, if you want it to set a default value for the new column, use "default" property in hbm.

<property name="initialSize" type="int">
  <column name="INITIAL_SIZE" not-null="false" unique="false" sql-type="INTEGER" default="3"/>
</property>

Although the Hibernate documentation also mentions adding the insert="false" attribute, it seems to work even without it. The resulting effect is the same as using DEFAULT in the alter/create table SQL statements for schema creation/update.